|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- eriktks/conll2003 |
|
language: |
|
- en |
|
metrics: |
|
- accuracy |
|
- precision |
|
- recall |
|
- f1 |
|
base_model: |
|
- distilbert/distilbert-base-cased |
|
--- |
|
|
|
# DistilBERT Base Cased Fine-Tuned on CoNLL2003 for English Named Entity Recognition (NER) |
|
|
|
This model is a fine-tuned version of [DistilBERT-base-cased](https://huggingface.co/distilbert/distilbert-base-cased) on the [CoNLL2003](https://huggingface.co/datasets/eriktks/conll2003) dataset for Named Entity Recognition (NER) in English. The CoNLL2003 dataset contains four types of named entities: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC). |
|
|
|
## Model Details |
|
- Model Architecture: DistilBERT (a distilled, lighter version of BERT)

- Pre-trained Base Model: distilbert-base-cased
|
- Dataset: CoNLL2003 (NER task) |
|
- Languages: English |
|
- Fine-tuned for: Named Entity Recognition (NER) |
|
- Entities recognized: |
|
- PER: Person |
|
- LOC: Location |
|
- ORG: Organization |
|
- MISC: Miscellaneous entities |
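CoNLL2003 labels tokens with the IOB2 scheme (`B-` marks the beginning of an entity, `I-` a continuation, `O` no entity), so each entity type above appears as a `B-`/`I-` tag pair. As a minimal illustrative sketch (not part of this model's code), here is how IOB2 tag sequences decode into entity spans:

```python
# Minimal sketch: decode IOB2 tags (as used in CoNLL2003) into entity spans.
def decode_iob2(tokens, tags):
    """Group parallel (token, tag) sequences into (entity_text, entity_type) pairs."""
    entities, current_tokens, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and current_type != tag[2:]):
            # A new entity starts; close any entity still open
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-"):
            current_tokens.append(token)  # continue the current entity
        else:  # "O" ends any open entity
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities

tokens = ["John", "lives", "in", "New", "York", "."]
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC", "O"]
print(decode_iob2(tokens, tags))  # [('John', 'PER'), ('New York', 'LOC')]
```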
|
|
|
## Use Cases |
|
This model is ideal for tasks that require identifying and classifying named entities within English text, such as: |
|
|
|
- Information extraction from unstructured text |
|
- Content classification and tagging |
|
- Automated text summarization |
|
- Question answering systems with a focus on entity recognition |
|
|
|
## How to Use |
|
To use this model in your code, you can load it via Hugging Face’s Transformers library: |
|
|
|
```python

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")

model = AutoModelForTokenClassification.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")

# Build an NER pipeline and run it on a sample sentence
nlp_ner = pipeline("ner", model=model, tokenizer=tokenizer)

result = nlp_ner("John lives in New York and works for the United Nations.")

print(result)

```
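The pipeline returns one dictionary per predicted token piece, each with an `entity` tag, a confidence `score`, the matched `word`, and character offsets. A common post-processing step is filtering low-confidence predictions. The sketch below uses an illustrative, hand-written result list in that format (not actual output of this model):

```python
# Illustrative sample in the shape the "ner" pipeline produces (not real model output)
sample_result = [
    {"entity": "B-PER", "score": 0.998, "word": "John", "start": 0, "end": 4},
    {"entity": "B-LOC", "score": 0.995, "word": "New", "start": 14, "end": 17},
    {"entity": "I-LOC", "score": 0.42, "word": "York", "start": 18, "end": 22},
]

def filter_by_confidence(results, threshold=0.9):
    """Keep only predictions whose confidence score meets the threshold."""
    return [r for r in results if r["score"] >= threshold]

confident = filter_by_confidence(sample_result)
print([r["word"] for r in confident])  # ['John', 'New']
```

If you prefer whole entities rather than subword pieces, the Transformers pipeline can merge them for you: passing `aggregation_strategy="simple"` when constructing the pipeline groups adjacent pieces into complete entity spans.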
|
|
|
## Performance |
|
|accuracy |precision |recall |f1-score| |
|
|:-------:|:--------:|:-----:|:------:| |
|
| 0.987 | 0.937 | 0.941 | 0.939 | |
|
|
|
## License |
|
This model is released under the Apache 2.0 license, the same license as the DistilBERT-base-cased base model. The CoNLL2003 dataset is distributed under its own terms of use; please ensure compliance with all respective licenses when using this model.