---
tags:
- token-classification
language:
- fi
widget:
- text: Asun Brysselissä, Euroopan pääkaupungissa.
datasets:
- drvenabili/autotrain-data-turku-ner
- turku_ner_corpus
co2_eq_emissions:
  emissions: 0.2165403288824756
license: apache-2.0
pipeline_tag: token-classification
---

# Info

This model is fine-tuned for named entity recognition (NER) in Finnish. The base model is Turku NLP's [bert-base-finnish-uncased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-uncased-v1), and the fine-tuning dataset is Turku NLP's [turku_ner_corpus](https://huggingface.co/datasets/turku_ner_corpus/).

The model is released under Apache 2.0.

Please cite the training dataset if you use this model:

```bibtex
@inproceedings{luoma-etal-2020-broad,
    title = "A Broad-coverage Corpus for {F}innish Named Entity Recognition",
    author = {Luoma, Jouni and Oinonen, Miika and Pyyk{\"o}nen, Maria and Laippala, Veronika and Pyysalo, Sampo},
    booktitle = "Proceedings of The 12th Language Resources and Evaluation Conference",
    year = "2020",
    url = "https://www.aclweb.org/anthology/2020.lrec-1.567",
    pages = "4615--4624",
}
```

# Validation Metrics

- Loss: 0.075
- Accuracy: 0.982
- Precision: 0.879
- Recall: 0.868
- F1: 0.873

# Test Metrics

### Overall Metrics

- Accuracy: 0.986
- Precision: 0.857
- Recall: 0.872
- F1: 0.864

### Per-entity metrics

```json
{
  "DATE": {
    "precision": 0.925,
    "recall": 0.9736842105263158,
    "f1": 0.9487179487179489,
    "number": "114"
  },
  "EVENT": {
    "precision": 0.3,
    "recall": 0.42857142857142855,
    "f1": 0.3529411764705882,
    "number": "7"
  },
  "LOC": {
    "precision": 0.9057239057239057,
    "recall": 0.9372822299651568,
    "f1": 0.9212328767123287,
    "number": "287"
  },
  "ORG": {
    "precision": 0.8274111675126904,
    "recall": 0.7836538461538461,
    "f1": 0.8049382716049382,
    "number": "208"
  },
  "PER": {
    "precision": 0.88,
    "recall": 0.9225806451612903,
    "f1": 0.9007874015748031,
    "number": "310"
  },
  "PRO": {
    "precision": 0.6081081081081081,
    "recall": 0.569620253164557,
    "f1": 0.5882352941176471,
    "number": "79"
  }
}
```

## Usage

You can query the model through the hosted Inference API with cURL:

```bash
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "Asun Brysselissä, Euroopan pääkaupungissa."}' \
  https://api-inference.huggingface.co/models/iguanodon-ai/bert-base-finnish-uncased-ner
```

Or load it with the Python API of the `transformers` library:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForTokenClassification.from_pretrained("iguanodon-ai/bert-base-finnish-uncased-ner")
tokenizer = AutoTokenizer.from_pretrained("iguanodon-ai/bert-base-finnish-uncased-ner")

# Tokenize the input and run a forward pass; outputs.logits holds the per-token label scores
inputs = tokenizer("Asun Brysselissä, Euroopan pääkaupungissa.", return_tensors="pt")
outputs = model(**inputs)
```
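
The `outputs` object from the Python snippet above contains raw per-token scores, not entity spans. As a minimal sketch of one way to post-process them, the helper below merges word pieces and their predicted tags into `(entity_type, text)` pairs. It rests on assumptions not stated in this card: that the model's labels follow a BIO scheme such as `B-LOC` / `I-LOC` / `O`, and that the tokenizer emits WordPiece continuations prefixed with `##`. The function name `group_entities` is hypothetical, and the tokens and tags in the example are hand-written for illustration, not actual model output.

```python
from typing import List, Tuple

SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def group_entities(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged word pieces into (entity_type, text) spans.

    Hypothetical helper: assumes BIO labels ("B-LOC", "I-LOC", "O") and
    WordPiece continuation tokens that start with "##".
    """
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_words: List[str] = []

    def flush() -> None:
        # Close the entity that is currently open, if any.
        nonlocal current_type, current_words
        if current_type is not None:
            entities.append((current_type, " ".join(current_words)))
        current_type, current_words = None, []

    for token, label in zip(tokens, labels):
        if token in SPECIAL_TOKENS:
            continue
        if token.startswith("##"):
            # Continuation piece: glue it onto the previous word of the open entity.
            if current_type is not None:
                current_words[-1] += token[2:]
            continue
        if label == "O":
            flush()
            continue
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or entity_type != current_type:
            # A "B-" tag or a change of type starts a new entity.
            flush()
            current_type = entity_type
        current_words.append(token)

    flush()
    return entities


# Hand-written tokens and tags for illustration only (not real model output).
example_tokens = ["[CLS]", "asun", "bryssel", "##issä", ",", "euroopan", "pääkaupungissa", ".", "[SEP]"]
example_labels = ["O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O", "O", "O"]
print(group_entities(example_tokens, example_labels))
# prints [('LOC', 'brysselissä'), ('LOC', 'euroopan')]
```

To feed real predictions into the helper, something along these lines should work with the snippet above: take the highest-scoring label id per token with `outputs.logits.argmax(dim=-1)[0].tolist()`, map each id to its tag through `model.config.id2label`, and recover the word pieces with `tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])`.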