Info
This model is Turku NLP's bert-base-finnish-uncased-v1 fine-tuned for named entity recognition (NER) on Turku NLP's turku_ner_corpus dataset.
The model is released under the Apache 2.0 license.
Please cite the training dataset if you use this model:
@inproceedings{luoma-etal-2020-broad,
  title = "A Broad-coverage Corpus for {F}innish Named Entity Recognition",
  author = {Luoma, Jouni and Oinonen, Miika and Pyyk{\"o}nen, Maria and Laippala, Veronika and Pyysalo, Sampo},
  booktitle = "Proceedings of the 12th Language Resources and Evaluation Conference",
  year = "2020",
  url = "https://www.aclweb.org/anthology/2020.lrec-1.567",
  pages = "4615--4624",
}
Validation Metrics
- Loss: 0.075
- Accuracy: 0.982
- Precision: 0.879
- Recall: 0.868
- F1: 0.873
Test Metrics
Overall Metrics
- Accuracy: 0.986
- Precision: 0.857
- Recall: 0.872
- F1: 0.864
Per-entity Metrics
- DATE: precision 0.925, recall 0.974, F1 0.949 (support: 114)
- EVENT: precision 0.300, recall 0.429, F1 0.353 (support: 7)
- LOC: precision 0.906, recall 0.937, F1 0.921 (support: 287)
- ORG: precision 0.827, recall 0.784, F1 0.805 (support: 208)
- PER: precision 0.880, recall 0.923, F1 0.901 (support: 310)
- PRO: precision 0.608, recall 0.570, F1 0.588 (support: 79)
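The per-entity breakdown above is the kind produced by entity-level NER evaluation, for example with the seqeval library. Assuming seqeval was used (the card does not state the evaluation script), a minimal sketch of computing such metrics from gold and predicted IOB tag sequences:

from seqeval.metrics import classification_report

# Toy gold and predicted IOB tag sequences, one inner list per sentence
gold = [["B-PER", "I-PER", "O", "B-LOC", "O"]]
pred = [["B-PER", "I-PER", "O", "B-ORG", "O"]]

# Prints precision, recall, F1, and support for each entity type
print(classification_report(gold, pred, digits=3))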
Usage
You can query the model through the Hugging Face Inference API with cURL:
$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "Asun Brysselissä, Euroopan pääkaupungissa."}' https://api-inference.huggingface.co/models/iguanodon-ai/bert-base-finnish-uncased-ner
Or load it directly with the Transformers Python API:
from transformers import AutoModelForTokenClassification, AutoTokenizer
# Load the fine-tuned NER model and its tokenizer
model = AutoModelForTokenClassification.from_pretrained("iguanodon-ai/bert-base-finnish-uncased-ner")
tokenizer = AutoTokenizer.from_pretrained("iguanodon-ai/bert-base-finnish-uncased-ner")
# Tokenize the sentence, run a forward pass, and map each token's top logit to its label
inputs = tokenizer("Asun Brysselissä, Euroopan pääkaupungissa.", return_tensors="pt")
outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)
labels = [model.config.id2label[p.item()] for p in predictions[0]]
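For end-to-end entity extraction, the model can also be wrapped in the Transformers token-classification pipeline, which merges subword predictions into labelled spans. A minimal sketch; the aggregation_strategy value is a suggested default rather than something the model card specifies:

from transformers import pipeline

# Build a token-classification pipeline that groups subword tokens into entity spans
ner = pipeline("token-classification",
               model="iguanodon-ai/bert-base-finnish-uncased-ner",
               aggregation_strategy="simple")

# Returns a list of dicts with entity_group, score, word, start, and end
print(ner("Asun Brysselissä, Euroopan pääkaupungissa."))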