BERT fine-tuned for Named Entity Recognition (CoNLL-2003)

A fine-tuned version of bert-base-cased for Named Entity Recognition (NER), trained on the CoNLL-2003 English dataset as part of working through the Hugging Face LLM Course, Chapter 7. It achieves the following results on the evaluation set:

  • Loss: 0.0599
  • Precision: 0.9319
  • Recall: 0.9507
  • F1: 0.9412
  • Accuracy: 0.9867

Model details

Attribute          Value
Base model         bert-base-cased
Architecture       Transformer Encoder (BERT)
Task               Token Classification (NER)
Training dataset   CoNLL-2003 (English)
Training epochs    3
Learning rate      2e-5
Weight decay       0.01
Hardware           Google Colab (T4 GPU)

Entity types

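The model recognises four entity types, tagged in IOB2 format (a B- prefix marks the first token of an entity, an I- prefix marks a continuation, and O marks a token outside any entity):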

Label   Description
PER     Person
ORG     Organisation
LOC     Location
MISC    Miscellaneous
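
As a rough illustration of what IOB2 tagging implies for the label set, the four entity types above expand into nine token-level labels. The sketch below builds that list; the ordering and the names `label_names`, `id2label` and `label2id` are assumptions for illustration, so the authoritative mapping is whatever the model's own configuration reports.

```python
# Entity types listed in the table above.
ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]

# IOB2 uses "O" for non-entity tokens plus a B-/I- pair per entity type.
# The ordering below is an assumption; check the model's config for the real one.
label_names = ["O"] + [
    f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")
]

# Lookup tables in both directions.
id2label = dict(enumerate(label_names))
label2id = {label: idx for idx, label in id2label.items()}

print(label_names)
```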

Usage

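from transformers import pipeline

# aggregation_strategy="simple" merges word pieces into whole-entity spans
ner = pipeline(
    "token-classification",
    model="AlexStamp/bert-finetuned-ner",
    aggregation_strategy="simple",
)

# Each result is a dict with entity_group, score, word, start and end
for entity in ner("Alexis works at CERN in Switzerland."):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))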

Training procedure

Fine-tuning used the Hugging Face Trainer API with DataCollatorForTokenClassification. Evaluation used the seqeval library, which computes entity-level precision, recall and F1. This is stricter than token-level accuracy, since an entity counts as correct only when its entire span and type are predicted correctly.
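
To make the difference between entity-level and token-level scoring concrete, here is a minimal pure-Python sketch. It is not seqeval's implementation: the helpers `extract_spans`, `entity_scores` and `token_accuracy` are hypothetical, the tag sequences are made up, and stray I- tags without a preceding B- are simply ignored.

```python
def extract_spans(tags):
    """Return the set of (entity_type, start, end) spans in an IOB2 tag sequence."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        # An open span ends at "O", at any "B-", or at an "I-" of another type.
        if start is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != etype
        ):
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # Simplification: an "I-" tag with no open span is ignored.
    return spans


def entity_scores(gold, pred):
    """Entity-level precision, recall and F1 with exact span and type matching."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def token_accuracy(gold, pred):
    """Fraction of positions where the predicted tag equals the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Made-up example: the prediction misses the second token of a two-token ORG.
gold = ["B-PER", "O", "O", "B-ORG", "I-ORG", "O", "B-LOC"]
pred = ["B-PER", "O", "O", "B-ORG", "O", "O", "B-LOC"]

precision, recall, f1 = entity_scores(gold, pred)
accuracy = token_accuracy(gold, pred)
print(f"entity-level F1: {f1:.3f}, token-level accuracy: {accuracy:.3f}")
```

A single wrong token leaves six of seven tags correct, yet the whole ORG entity is lost at the entity level, which is why the entity-level score drops further than the token-level one.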

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
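
The sketch below works through what these settings imply for the step count and the linear learning-rate schedule. It rests on assumptions the card does not state: a training split of 14,041 examples, a single device, no gradient accumulation and no warmup. The function `linear_lr` is a hypothetical helper that mirrors the usual linear warmup-and-decay rule, not code taken from the training run.

```python
import math

TRAIN_EXAMPLES = 14_041  # assumption: size of the CoNLL-2003 training split
BATCH_SIZE = 8           # train_batch_size from the list above
NUM_EPOCHS = 3           # num_epochs from the list above
BASE_LR = 2e-5           # learning_rate from the list above

# One optimizer step per batch (assumes one device, no gradient accumulation).
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Learning rate after `step` updates under linear warmup then linear decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * steps_per_epoch
    print(f"end of epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```

Under these assumptions the step counts come out at 1756 per epoch and 5268 in total, which agrees with the Step column in the training results.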

Training results

Training Loss   Epoch   Step   Validation Loss   Precision   Recall   F1       Accuracy
0.0759          1.0     1756   0.0651            0.8905      0.9310   0.9103   0.9812
0.0355          2.0     3512   0.0681            0.9321      0.9473   0.9397   0.9853
0.0224          3.0     5268   0.0599            0.9319      0.9507   0.9412   0.9867
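
F1 is the harmonic mean of precision and recall, so the reported values can be sanity-checked from the other two columns. The short sketch below does that for the three epochs; `f1_score` is a hypothetical helper, and small differences in the fourth decimal place are expected because the precision and recall figures are themselves rounded.

```python
# (epoch, precision, recall, reported F1) copied from the training results.
rows = [
    (1, 0.8905, 0.9310, 0.9103),
    (2, 0.9321, 0.9473, 0.9397),
    (3, 0.9319, 0.9507, 0.9412),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for epoch, precision, recall, reported in rows:
    computed = f1_score(precision, recall)
    print(f"epoch {epoch}: computed {computed:.4f}, reported {reported:.4f}")
```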

Framework versions

  • Transformers 5.12.0
  • PyTorch 2.11.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Limitations

  • Trained on English news wire text (Reuters corpus); may generalise poorly to other domains or languages
  • bert-base-cased is case-sensitive by design, which is appropriate for NER but means casing errors in input text can degrade performance

Notes

This model was trained as a portfolio exercise. The base model bert-base-cased was chosen over bert-base-uncased deliberately: capitalisation is a strong signal for entity detection, so NER benefits from a model that preserves case.
