---
license: apache-2.0
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
datasets:
- eriktks/conll2003
base_model:
- google-bert/bert-base-cased
---

# BERT Base Cased Fine-Tuned on CoNLL2003 for English Named Entity Recognition (NER)

This model is a fine-tuned version of [BERT-base-cased](https://huggingface.co/google-bert/bert-base-cased) on the [CoNLL2003](https://huggingface.co/datasets/eriktks/conll2003) dataset for Named Entity Recognition (NER) in English. The CoNLL2003 dataset contains four types of named entities: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC).

## Model Details

- Model Architecture: BERT (Bidirectional Encoder Representations from Transformers)
- Pre-trained Base Model: bert-base-cased
- Dataset: CoNLL2003 (NER task)
- Language: English
- Fine-tuned for: Named Entity Recognition (NER)
- Entities recognized:
  - PER: Person
  - LOC: Location
  - ORG: Organization
  - MISC: Miscellaneous entities

## Use Cases

This model is well suited to tasks that require identifying and classifying named entities in English text, such as:

- Information extraction from unstructured text
- Content classification and tagging
- Automated text summarization
- Question answering systems with a focus on entity recognition

## How to Use

You can load this model with Hugging Face's Transformers library (a sketch of merging subword predictions into whole entities follows at the end of this card):

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("MrRobson9/bert-base-cased-finetuned-conll2003-english-ner")
model = AutoModelForTokenClassification.from_pretrained("MrRobson9/bert-base-cased-finetuned-conll2003-english-ner")

nlp_ner = pipeline("ner", model=model, tokenizer=tokenizer)

result = nlp_ner("John lives in New York and works for the United Nations.")
print(result)
```

## Performance

| Accuracy | Precision | Recall | F1-score |
|:--------:|:---------:|:------:|:--------:|
|  0.991   |   0.946   | 0.953  |  0.950   |

## License

This model is licensed under the same terms as the BERT-base-cased model and the CoNLL2003 dataset. Please ensure compliance with all respective licenses when using this model.
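
## Grouping Subword Predictions into Entities

By default, the pipeline above returns one prediction per subword token with `B-`/`I-` prefixed labels (e.g. `B-LOC`, `I-LOC` for "New" and "York"). If you want whole-entity spans instead, the Transformers `pipeline` accepts an `aggregation_strategy` argument; this is a standard Transformers option, not something specific to this model. A minimal sketch:

```python
from transformers import pipeline

# "simple" merges consecutive token predictions that belong to the same
# entity into a single span with one label and an averaged score.
nlp_ner = pipeline(
    "ner",
    model="MrRobson9/bert-base-cased-finetuned-conll2003-english-ner",
    aggregation_strategy="simple",
)

result = nlp_ner("John lives in New York and works for the United Nations.")
print(result)
# Expected output shape (scores and offsets will vary), e.g.:
# [{'entity_group': 'PER', 'word': 'John', ...},
#  {'entity_group': 'LOC', 'word': 'New York', ...},
#  {'entity_group': 'ORG', 'word': 'United Nations', ...}]
```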