--- language: bn datasets: - wikiann examples: widget: - text: "মারভিন দি মারসিয়ান" example_title: "Sentence_1" - text: "লিওনার্দো দা ভিঞ্চি" example_title: "Sentence_2" - text: "বসনিয়া ও হার্জেগোভিনা" example_title: "Sentence_3" - text: "সাউথ ইস্ট ইউনিভার্সিটি" example_title: "Sentence_4" - text: "মানিক বন্দ্যোপাধ্যায় লেখক" example_title: "Sentence_5" ---

Bengali Named Entity Recognition

Fine-tuning bert-base-multilingual-cased on Wikiann dataset for performing NER on Bengali language. ## Label ID and its corresponding label name | Label ID | Label Name| | -------- | ----- | |0 | O | | 1 | B-PER | | 2 | I-PER | | 3 | B-ORG| | 4 | I-ORG | | 5 | B-LOC | | 6 | I-LOC |

Results

| Name | Overall F1 | LOC F1 | ORG F1 | PER F1 | | ---- | -------- | ----- | ---- | ---- | | Train set | 0.997927 | 0.998246 | 0.996613 | 0.998769 | | Validation set | 0.970187 | 0.969212 | 0.956831 | 0.982079 | | Test set | 0.9673011 | 0.967120 | 0.963614 | 0.970938 | Example ```py from transformers import AutoTokenizer, AutoModelForTokenClassification from transformers import pipeline tokenizer = AutoTokenizer.from_pretrained("Suchandra/bengali_language_NER") model = AutoModelForTokenClassification.from_pretrained("Suchandra/bengali_language_NER") nlp = pipeline("ner", model=model, tokenizer=tokenizer) example = "মারভিন দি মারসিয়ান" ner_results = nlp(example) ner_results ```