stefan-it's picture
readme: add initial version
60f2a42
|
raw
history blame
2.49 kB
metadata
language: en
license: mit
datasets:
  - conll2003
widget:
  - text: My name is Clara Clever and I live in Berkeley , California .

T5 Base Model for Named Entity Recognition (NER, CoNLL-2003)

In this repository, we open source a T5 Base model, that was fine-tuned on the official CoNLL-2003 NER dataset.

We use the great TANL library from Amazon for fine-tuning the model.

The exact approach of fine-tuning is presented in the "TANL: Structured Prediction as Translation between Augmented Natural Languages" paper from Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang and Stefano Soatto.

Fine-Tuning

We use the same hyper-parameter settings as used in the official implementation with one minor change. Instead of using 8 V100 GPUs, we train the model on one V100 GPU and used gradient accumulation. The slighly modified configuration file (config.ini) then looks like:

[conll03]
datasets = conll03
model_name_or_path = t5-base
num_train_epochs = 10
max_seq_length = 256
max_seq_length_eval = 512
per_device_train_batch_size = 4
per_device_eval_batch_size = 4
do_train = True
do_eval = True
do_predict = True
gradient_accumulation_steps = 8

It took around 2 hours to fine-tune that model on the 14,041 training sentences of CoNLL-2003 dataset.

Evaluation

On the development set, the following evaluation results could be achieved:

{
"entity_precision": 0.9536446086664427,
"entity_recall": 0.9555705149781218,
"entity_f1": 0.9546065904505716,
"entity_precision_no_type": 0.9773261672824992,
"entity_recall_no_type": 0.9792998990238977,
"entity_f1_no_type": 0.9783120376597176
}

The evaluation results on the test set looks like:

{
"entity_precision": 0.912182296231376,
"entity_recall": 0.9213881019830028,
"entity_f1": 0.9167620893155995,
"entity_precision_no_type": 0.953900087642419,
"entity_recall_no_type": 0.9635269121813032,
"entity_f1_no_type": 0.9586893332158901
}

To summarize: On the development set, 95.46% F1-Score and 91.68% on test set were achieved with this model. The paper reported a F1-Score of 91.7%.

License

The models is licensed under MIT.

Acknowledgments

Thanks to the generous support from the Hugging Face team, it is possible to download both cased and uncased models from their S3 storage 🤗