---
license: apache-2.0
datasets:
- conll2003
- ai4privacy/pii-masking-200k
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: token-classification
---

## Model Details

### Model Description

This model is ELECTRA-small fine-tuned for named entity recognition (NER). It currently predicts three entity types:

1. Location
2. Person
3. Organization

- **Developed by:** விபின் (Vipin)
- **Model type:** Google's ELECTRA-small discriminator
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Google's ELECTRA-small discriminator

### Model Sources

- **Repository:** https://huggingface.co/google/electra-small-discriminator

## Uses

This model uses a tokenizer from the DistilBERT family, so it may predict different entities for different subwords of the same word. For example, "ashwin" can be split so that "ash" is tagged as Person while "win" is tagged as Location. To avoid this, set `aggregation_strategy` to `"max"` when using the transformers pipeline.

### Out-of-Scope Use

The model may not work well on some long sentences.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer
from transformers import pipeline

model = AutoModelForTokenClassification.from_pretrained("rv2307/electra-small-ner")
tokenizer = AutoTokenizer.from_pretrained("rv2307/electra-small-ner")

nlp = pipeline("ner",
               model=model,
               tokenizer=tokenizer,
               device="cpu",
               aggregation_strategy="max")
```

## Training Details

### Training Procedure

The model was fine-tuned for 6 epochs with a learning rate of 3e-4.

```
[39168/39168 41:18, Epoch 6/6]
Step     Training Loss  Validation Loss  Precision  Recall    F1        Accuracy
10000    0.086300       0.088625         0.863476   0.876271  0.869827  0.972581
20000    0.059800       0.079611         0.894612   0.884521  0.889538  0.976563
30000    0.050400       0.074552         0.895812   0.902591  0.899188  0.978380
```

## Evaluation

The final validation loss for this model is approximately 0.07.
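
The precision, recall, and F1 figures in the training log above are entity-level scores of the kind produced by `seqeval`. The snippet below is a minimal sketch of how such scores can be computed with the `evaluate` library; the toy BIO-tagged sequences are illustrative assumptions, not outputs from this model's actual validation run.

```python
# Minimal sketch: entity-level metrics with seqeval via the evaluate library.
import evaluate

seqeval = evaluate.load("seqeval")

# BIO-tagged sequences, one list of tags per sentence (CoNLL-2003 scheme).
# These examples are made up for illustration.
references = [["B-PER", "I-PER", "O", "B-ORG", "O"]]
predictions = [["B-PER", "I-PER", "O", "B-LOC", "O"]]

results = seqeval.compute(predictions=predictions, references=references)
print(results["overall_precision"], results["overall_recall"],
      results["overall_f1"], results["overall_accuracy"])
```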
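
For completeness, here is a short end-to-end inference sketch using the `nlp` pipeline created in the getting-started snippet. The sample sentence is an arbitrary illustration, and the printed labels and scores depend on the checkpoint.

```python
# Run the aggregated NER pipeline on a sample sentence.
# Assumes `nlp` was created as in the getting-started snippet above.
text = "Sundar Pichai is the CEO of Google, headquartered in Mountain View."

for entity in nlp(text):
    # With aggregation_strategy="max", each dict holds the grouped entity
    # span, its label, and a confidence score.
    print(f"{entity['word']!r:30} {entity['entity_group']:12} {entity['score']:.3f}")
```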