---
license: apache-2.0
datasets:
- eriktks/conll2003
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- distilbert/distilbert-base-cased
---
# DistilBERT Base Cased Fine-Tuned on CoNLL2003 for English Named Entity Recognition (NER)
This model is a fine-tuned version of [DistilBERT-base-cased](https://huggingface.co/distilbert/distilbert-base-cased) on the [CoNLL2003](https://huggingface.co/datasets/eriktks/conll2003) dataset for Named Entity Recognition (NER) in English. The CoNLL2003 dataset contains four types of named entities: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC).
## Model Details
- Model Architecture: DistilBERT (a distilled, lighter version of BERT — Bidirectional Encoder Representations from Transformers)
- Pre-trained Base Model: distilbert-base-cased
- Dataset: CoNLL2003 (NER task)
- Languages: English
- Fine-tuned for: Named Entity Recognition (NER)
- Entities recognized:
- PER: Person
- LOC: Location
- ORG: Organization
- MISC: Miscellaneous entities
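Under the hood, these four entity types are expressed with the BIO (IOB2) tagging scheme, giving nine token-level labels. The mapping below is a sketch following the `ner_tags` ordering of the `eriktks/conll2003` dataset; verify it against `model.config.id2label`, since the checkpoint's ordering is an assumption here:

```python
# BIO/IOB2 label scheme for CoNLL2003 NER (nine labels).
# Ordering assumed from the eriktks/conll2003 `ner_tags` feature;
# check model.config.id2label for the authoritative mapping.
ID2LABEL = {
    0: "O",
    1: "B-PER", 2: "I-PER",
    3: "B-ORG", 4: "I-ORG",
    5: "B-LOC", 6: "I-LOC",
    7: "B-MISC", 8: "I-MISC",
}

def entity_type(label: str) -> str:
    """Strip the B-/I- prefix, returning the bare entity type (or 'O')."""
    return label.split("-", 1)[1] if "-" in label else label

print(entity_type("B-PER"))  # PER
```

`B-` marks the first token of an entity and `I-` marks continuation tokens, which lets adjacent entities of the same type be kept apart.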
## Use Cases
This model is ideal for tasks that require identifying and classifying named entities within English text, such as:
- Information extraction from unstructured text
- Content classification and tagging
- Automated text summarization
- Question answering systems with a focus on entity recognition
## How to Use
To use this model in your code, you can load it via Hugging Face’s Transformers library:
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")
model = AutoModelForTokenClassification.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")

# Build a token-classification (NER) pipeline
nlp_ner = pipeline("ner", model=model, tokenizer=tokenizer)
result = nlp_ner("John lives in New York and works for the United Nations.")
print(result)
```
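By default the pipeline returns one prediction per sub-word token, each carrying a BIO label. A minimal sketch of grouping those token-level predictions into entity spans, assuming the default output dicts with `entity`, `word`, `start`, and `end` keys (the sample data below is illustrative, not actual model output):

```python
# Merge token-level BIO predictions into character-level entity spans.
# Assumes each token dict has "entity" (BIO label), "start", "end".
def group_entities(tokens, text):
    spans = []
    for tok in tokens:
        label = tok["entity"]
        etype = label.split("-", 1)[-1]  # PER, LOC, ORG, MISC
        # Start a new span on a B- tag, or when the type changes
        if label.startswith("B-") or not spans or spans[-1]["type"] != etype:
            spans.append({"type": etype, "start": tok["start"], "end": tok["end"]})
        else:
            spans[-1]["end"] = tok["end"]  # I- tag continues the current span
    return [{**s, "text": text[s["start"]:s["end"]]} for s in spans]

# Hypothetical pipeline output for illustration:
text = "John lives in New York"
tokens = [
    {"entity": "B-PER", "word": "John", "start": 0, "end": 4},
    {"entity": "B-LOC", "word": "New", "start": 14, "end": 17},
    {"entity": "I-LOC", "word": "York", "start": 18, "end": 22},
]
print(group_entities(tokens, text))
```

In recent versions of Transformers you can get grouped entities directly by passing `aggregation_strategy="simple"` when constructing the pipeline.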
## Performance
|accuracy |precision |recall |f1-score|
|:-------:|:--------:|:-----:|:------:|
| 0.987 | 0.937 | 0.941 | 0.939 |
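For reference, CoNLL-2003 precision, recall, and F1 are conventionally computed at the entity level: a prediction counts as correct only if both the span and the type match exactly. A minimal sketch of that computation (the exact evaluation setup behind the table above is an assumption; `seqeval` is the usual tool):

```python
# Entity-level precision/recall/F1 over (type, start, end) spans,
# following the strict exact-match CoNLL convention.
def prf(pred, gold):
    pred, gold = set(pred), set(gold)
    tp = len(pred & gold)                       # exact span+type matches
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Illustrative example: one exact match, one boundary error
gold = {("PER", 0, 4), ("LOC", 14, 22)}
pred = {("PER", 0, 4), ("LOC", 14, 17)}
print(prf(pred, gold))  # (0.5, 0.5, 0.5)
```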
## License
This model is licensed under the same terms as the DistilBERT-base-cased model (Apache 2.0) and the CoNLL2003 dataset. Please ensure compliance with all respective licenses when using this model.