---
language: bn
datasets:
- wikiann
examples:
widget:
- text: "আমি, সাকিব হোসেন হিমেল, ডাটা সায়েন্সে স্নাতকোত্তর করছি, বর্তমানে জার্মানির বার্লিনে থাকি, গত বছর বাংলাদেশ থেকে এসেছি।"
  example_title: "Sentence_1"
- text: "হোর্হেলুইস বোর্হেস"
  example_title: "Sentence_2"
- text: "বাংলাদেশ জাতীয় ক্রিকেট দল"
  example_title: "Sentence_3"
- text: "কুড়িগ্রাম উপজেলা"
  example_title: "Sentence_4"
- text: "লিওনার্দো দা ভিঞ্চি"
  example_title: "Sentence_5"
- text: "রিয়াল মাদ্রিদ ফুটবল ক্লাব"
  example_title: "Sentence_6"
---

<h1>Named Entity Recognition on Bangla Language</h1>
A BERT model fine-tuned for named entity recognition (NER) on Bengali text with Hugging Face Transformers.


## Label ID to Label Name Mapping

| Label ID | Label Name |
| -------- | ---------- |
| 0 | O |
| 1 | B-PER |
| 2 | I-PER |
| 3 | B-ORG |
| 4 | I-ORG |
| 5 | B-LOC |
| 6 | I-LOC |
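
The label names follow the BIO scheme: a `B-` tag opens an entity, an `I-` tag of the same type continues it, and `O` marks tokens outside any entity. The sketch below shows one way to turn a sequence of predicted label IDs into entity spans. The `decode_entities` helper and the sample predictions are hypothetical and written for illustration; they are not part of this repository, and only the ID-to-name mapping comes from the table above.

```python
# Label mapping copied from the table above.
ID2LABEL = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ORG", 4: "I-ORG", 5: "B-LOC", 6: "I-LOC"}


def decode_entities(tokens, label_ids):
    """Group word-level tokens into (entity_type, text) spans using BIO tags."""
    entities = []
    current_type, current_tokens = None, []
    for token, label_id in zip(tokens, label_ids):
        label = ID2LABEL[label_id]
        if label.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one first.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" or a mismatched I- tag closes any open entity.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Hypothetical word-level predictions, for illustration only.
tokens = ["সাকিব", "হোসেন", "বার্লিনে", "থাকি"]
label_ids = [1, 2, 5, 0]
print(decode_entities(tokens, label_ids))
```

In practice the model predicts labels per sub-word token, so the `pipeline` aggregation shown in the usage section is usually the simpler route; this helper only makes the meaning of the tags concrete.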

<h1>Evaluation and Validation</h1>

| Split | Precision | Recall | F1 | Accuracy |
| ----- | --------- | ------ | -- | -------- |
| Train/Val set | 0.963899 | 0.964770 | 0.964334 | 0.981252 |
| Test set | 0.952855 | 0.965105 | 0.958941 | 0.981349 |
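
As a quick sanity check, the reported F1 values can be recomputed from precision and recall. The sketch below assumes F1 is the standard harmonic mean of the two; the card does not state which evaluation library produced the numbers, so treat this as a consistency check and not a description of the original evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "Train/Val set": (0.963899, 0.964770, 0.964334),
    "Test set": (0.952855, 0.965105, 0.958941),
}

for name, (precision, recall, reported_f1) in reported.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed F1 = {computed:.6f}, reported F1 = {reported_f1:.6f}")
```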


Usage with Transformers `AutoModelForTokenClassification`:

```py
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_id = "engineersakibcse47/NER_on_Bangla_Language"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model_ner = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" merges sub-word tokens into whole entities
pipe = pipeline("ner", model=model_ner, tokenizer=tokenizer, aggregation_strategy="simple")

sample = "বসনিয়া ও হার্জেগোভিনা"

# Each result is a dict with entity_group, score, word, start, and end
result = pipe(sample)
print(result)
```
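
With `aggregation_strategy="simple"`, the pipeline returns a list of dicts, one per detected entity. The sketch below shows one possible way to post-process that list by grouping entity text by type and dropping low-confidence predictions. The `group_by_entity_type` helper, the `min_score` threshold, and the sample data are assumptions made for illustration; the sample is hand-written and is not real output from this model.

```python
from collections import defaultdict


def group_by_entity_type(results, min_score=0.0):
    """Collect entity text by type from aggregated NER pipeline output."""
    grouped = defaultdict(list)
    for item in results:
        # Keep only predictions at or above the confidence threshold.
        if item["score"] >= min_score:
            grouped[item["entity_group"]].append(item["word"])
    return dict(grouped)


# Hypothetical results, hand-written for illustration (not real model output).
# Real pipeline output also carries "start" and "end" character offsets.
example_results = [
    {"entity_group": "PER", "score": 0.99, "word": "সাকিব হোসেন হিমেল"},
    {"entity_group": "LOC", "score": 0.97, "word": "বার্লিনে"},
    {"entity_group": "LOC", "score": 0.42, "word": "ডাটা"},
]

print(group_by_entity_type(example_results, min_score=0.5))
```

The threshold is a design choice: a higher `min_score` trades recall for precision, so it is worth tuning on held-out data instead of fixing it in advance.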