huseyincenik/conll_ner_with_bert

This model is a fine-tuned version of bert-base-uncased on the CoNLL-2003 dataset for Named Entity Recognition (NER).

Model description

This model has been trained to perform Named Entity Recognition (NER) and is based on the BERT architecture. It was fine-tuned on the CoNLL-2003 dataset, a standard dataset for NER tasks.

Intended uses & limitations

Intended Uses

  • Named Entity Recognition: This model is designed to identify and classify named entities in text into categories such as location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).

Limitations

  • Domain Specificity: The model was fine-tuned on the CoNLL-2003 dataset, which consists of news articles. It may not generalize well to other domains or types of text not represented in the training data.
  • Subword Tokens: The model may occasionally tag subword tokens as entities, requiring post-processing to handle these cases.
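
One common mitigation for the subword issue (not part of the original card, just a standard pattern with the transformers pipeline) is to let the pipeline merge subword pieces back into whole-word entity spans via its aggregation_strategy argument; a minimal sketch:

from transformers import pipeline

# aggregation_strategy="simple" merges B-/I- subword predictions into whole-word
# entity spans, so partial word pieces are not reported as separate entities.
ner = pipeline(
    "token-classification",
    model="huseyincenik/conll_ner_with_bert",
    aggregation_strategy="simple",
)

# Hypothetical example sentence; the output is a list of dicts with keys such as
# entity_group, score, word, start, and end.
print(ner("Angela Merkel visited the European Parliament in Brussels."))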

Training and evaluation data

  • Training Dataset: CoNLL-2003

  • Training Evaluation Metrics:

    Label         Precision  Recall  F1-Score  Support
    B-PER         0.98       0.98    0.98      11273
    I-PER         0.98       0.99    0.99      9323
    B-ORG         0.88       0.92    0.90      10447
    I-ORG         0.81       0.92    0.86      5137
    B-LOC         0.86       0.94    0.90      9621
    I-LOC         1.00       0.08    0.14      1267
    B-MISC        0.81       0.73    0.77      4793
    I-MISC        0.83       0.36    0.50      1329
    Micro Avg     0.90       0.90    0.90      53190
    Macro Avg     0.89       0.74    0.75      53190
    Weighted Avg  0.90       0.90    0.89      53190
  • Validation Evaluation Metrics:

    Label         Precision  Recall  F1-Score  Support
    B-PER         0.97       0.98    0.97      3018
    I-PER         0.98       0.98    0.98      2741
    B-ORG         0.86       0.91    0.88      2056
    I-ORG         0.77       0.81    0.79      900
    B-LOC         0.86       0.94    0.90      2618
    I-LOC         1.00       0.10    0.18      281
    B-MISC        0.77       0.74    0.76      1231
    I-MISC        0.77       0.34    0.48      390
    Micro Avg     0.90       0.89    0.89      13235
    Macro Avg     0.87       0.73    0.74      13235
    Weighted Avg  0.90       0.89    0.88      13235
  • Test Evaluation Metrics:

    Label         Precision  Recall  F1-Score  Support
    B-PER         0.96       0.95    0.96      2714
    I-PER         0.98       0.99    0.98      2487
    B-ORG         0.81       0.87    0.84      2588
    I-ORG         0.74       0.87    0.80      1050
    B-LOC         0.81       0.90    0.85      2121
    I-LOC         0.89       0.12    0.22      276
    B-MISC        0.75       0.67    0.71      996
    I-MISC        0.85       0.49    0.62      241
    Micro Avg     0.87       0.88    0.87      12473
    Macro Avg     0.85       0.73    0.75      12473
    Weighted Avg  0.87       0.88    0.86      12473
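
The per-label breakdowns above have the shape of scikit-learn's classification_report computed over per-token tags (excluding O). A minimal, hypothetical sketch of producing such a report; the toy y_true / y_pred lists below are placeholders, not the model's actual predictions:

from sklearn.metrics import classification_report

# Toy placeholders, only to show the report shape; in practice y_true and y_pred
# would be flat per-token tag lists for a whole split, with padding and special
# tokens filtered out.
y_true = ["B-PER", "I-PER", "O", "B-LOC", "O"]
y_pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]

labels = ["B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
print(classification_report(y_true, y_pred, labels=labels, digits=2, zero_division=0))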

Training procedure

Training Hyperparameters

  • Optimizer: AdamWeightDecay

    • Learning Rate: 2e-05
    • Decay Schedule: PolynomialDecay
    • Warmup Steps: 0.1
    • Weight Decay Rate: 0.01
  • Training Precision: float32

Training results

Train Loss  Validation Loss  Epoch
0.1016      0.0254           0
0.0228      0.0180           1

Optimizer Details

from transformers import create_optimizer

batch_size = 32
num_train_epochs = 2
# tokenized_conll is the tokenized CoNLL-2003 DatasetDict (tokenization not shown here)
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

# AdamWeightDecay with a polynomial-decay learning-rate schedule and weight decay of 0.01
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1,
)
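
For context (not taken from the original card), this optimizer would typically be plugged into the TensorFlow version of the model roughly as follows; the dataset preparation details are assumptions and shown only as comments:

from transformers import TFAutoModelForTokenClassification

# Hypothetical sketch: load the base model with one output per CoNLL-2003 tag
# (9 labels including O) and compile it with the optimizer created above.
model = TFAutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=9
)
model.compile(optimizer=optimizer)  # no loss passed; Transformers TF models compute it internally

# tf_train_set / tf_validation_set would be tf.data.Dataset objects built from
# tokenized_conll, e.g. via model.prepare_tf_dataset(...); not shown here.
# model.fit(tf_train_set, validation_data=tf_validation_set, epochs=num_train_epochs)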

How to Use

Using a Pipeline

from transformers import pipeline

pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")
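
A quick, hypothetical usage example (the sentence is illustrative only); with the default settings each token gets its own prediction dict with keys such as entity, score, word, start, and end:

# Hypothetical example sentence
print(pipe("George Washington lived in Virginia."))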

Loading the Model and Tokenizer Directly

from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
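
A minimal inference sketch with the directly loaded model (PyTorch shown here; the example sentence and variable names are illustrative, not from the original card):

import torch

text = "The European Commission met in Brussels."  # hypothetical example
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Map each token's highest-scoring class id back to its tag name
predicted_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
tags = [model.config.id2label[i.item()] for i in predicted_ids]
print(list(zip(tokens, tags)))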

Label Abbreviations

Abbreviation  Description
O             Outside of a named entity
B-MISC        Beginning of a miscellaneous entity right after another miscellaneous entity
I-MISC        Miscellaneous entity
B-PER         Beginning of a person’s name right after another person’s name
I-PER         Person’s name
B-ORG         Beginning of an organization right after another organization
I-ORG         Organization
B-LOC         Beginning of a location right after another location
I-LOC         Location

CoNLL-2003 English Dataset Statistics

This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how it was created in the CoNLL-2003 paper.

# of examples per entity type

Dataset  LOC    MISC   ORG    PER
Train    7140   3438   6321   6600
Dev      1837   922    1341   1842
Test     1668   702    1661   1617

# of articles/sentences/tokens per dataset

Dataset  Articles  Sentences  Tokens
Train    946       14,987     203,621
Dev      216       3,466      51,362
Test     231       3,684      46,435
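
A quick way to inspect these splits with the datasets library listed under Framework versions; this snippet is a sketch, not part of the original training code, and the exact counts may vary slightly with the dataset revision:

from datasets import load_dataset

# Load CoNLL-2003 from the Hugging Face Hub (recent datasets versions may ask
# for trust_remote_code=True for script-based datasets).
conll = load_dataset("conll2003")

# Sentences per split and the NER tag set used by this model
print({split: conll[split].num_rows for split in conll})
print(conll["train"].features["ner_tags"].feature.names)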

Framework versions

  • Transformers 4.45.0.dev0
  • TensorFlow 2.17.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1