---
tags:
- token-classification
language:
- fi
widget:
- text: Asun Brysselissä, Euroopan pääkaupungissa.
datasets:
- drvenabili/autotrain-data-turku-ner
- turku_ner_corpus
co2_eq_emissions:
  emissions: 0.2165403288824756
license: apache-2.0
pipeline_tag: token-classification
---
# Info
This model is fine-tuned for named entity recognition (NER) in Finnish. The base model is Turku NLP's [bert-base-finnish-uncased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-uncased-v1), and the fine-tuning dataset is Turku NLP's [turku_ner_corpus](https://huggingface.co/datasets/turku_ner_corpus/).
The model is released under Apache 2.0.
Please cite the training dataset if you use this model:
```bibtex
@inproceedings{luoma-etal-2020-broad,
title = "A Broad-coverage Corpus for {F}innish Named Entity Recognition",
author = {Luoma, Jouni and Oinonen, Miika and Pyyk{\"o}nen, Maria and Laippala, Veronika and Pyysalo, Sampo},
booktitle = "Proceedings of The 12th Language Resources and Evaluation Conference",
year = "2020",
url = "https://www.aclweb.org/anthology/2020.lrec-1.567",
pages = "4615--4624",
}
```
# Validation Metrics
- Loss: 0.075
- Accuracy: 0.982
- Precision: 0.879
- Recall: 0.868
- F1: 0.873
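
F1 is the harmonic mean of precision and recall. A minimal sketch that recomputes it from the validation figures above; the helper name `f1_from_pr` is hypothetical and not part of any library used by this model:

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Validation precision and recall as reported in this card.
validation_f1 = f1_from_pr(0.879, 0.868)
print(round(validation_f1, 3))  # 0.873
```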
# Test Metrics
### Overall Metrics
- Accuracy: 0.986
- Precision: 0.857
- Recall: 0.872
- F1: 0.864
### Per-entity metrics
```json
{
  "DATE": {
    "precision": 0.925,
    "recall": 0.9736842105263158,
    "f1": 0.9487179487179489,
    "number": "114"
  },
  "EVENT": {
    "precision": 0.3,
    "recall": 0.42857142857142855,
    "f1": 0.3529411764705882,
    "number": "7"
  },
  "LOC": {
    "precision": 0.9057239057239057,
    "recall": 0.9372822299651568,
    "f1": 0.9212328767123287,
    "number": "287"
  },
  "ORG": {
    "precision": 0.8274111675126904,
    "recall": 0.7836538461538461,
    "f1": 0.8049382716049382,
    "number": "208"
  },
  "PER": {
    "precision": 0.88,
    "recall": 0.9225806451612903,
    "f1": 0.9007874015748031,
    "number": "310"
  },
  "PRO": {
    "precision": 0.6081081081081081,
    "recall": 0.569620253164557,
    "f1": 0.5882352941176471,
    "number": "79"
  }
}
```
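
The per-entity figures can be related to the overall test metrics. The sketch below assumes that `number` is the count of gold entities of each type and that the overall scores are micro-averaged (pooled over all entity types); neither is stated in this card. Under those assumptions it recovers true-positive and predicted counts from the per-entity precision and recall, and the pooled scores land within 0.001 of the overall test metrics. The function and variable names are hypothetical.

```python
# Per-entity precision, recall and support, copied from the JSON above.
per_entity = {
    "DATE": {"precision": 0.925, "recall": 0.9736842105263158, "number": "114"},
    "EVENT": {"precision": 0.3, "recall": 0.42857142857142855, "number": "7"},
    "LOC": {"precision": 0.9057239057239057, "recall": 0.9372822299651568, "number": "287"},
    "ORG": {"precision": 0.8274111675126904, "recall": 0.7836538461538461, "number": "208"},
    "PER": {"precision": 0.88, "recall": 0.9225806451612903, "number": "310"},
    "PRO": {"precision": 0.6081081081081081, "recall": 0.569620253164557, "number": "79"},
}


def micro_average(metrics):
    """Pool counts over entity types and return (precision, recall, f1)."""
    tp_total = pred_total = gold_total = 0
    for scores in metrics.values():
        gold = int(scores["number"])             # assumed: gold entity count
        tp = round(scores["recall"] * gold)      # recall = tp / gold
        pred = round(tp / scores["precision"])   # precision = tp / predicted
        tp_total += tp
        pred_total += pred
        gold_total += gold
    precision = tp_total / pred_total
    recall = tp_total / gold_total
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = micro_average(per_entity)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

A macro average (the plain mean of the six per-entity scores) would come out noticeably lower, since the small `EVENT` and `PRO` classes score well below the others.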
# Usage
You can use cURL to access this model:
```bash
$ curl -X POST \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"inputs": "Asun Brysselissä, Euroopan pääkaupungissa."}' \
    https://api-inference.huggingface.co/models/iguanodon-ai/bert-base-finnish-uncased-ner
```
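
The same request can be prepared from Python. This is a minimal sketch using only the standard library; it mirrors the URL, headers, and payload of the cURL call, and the helper name `build_request` is hypothetical. It only builds the request object and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the cURL example in this card.
API_URL = "https://api-inference.huggingface.co/models/iguanodon-ai/bert-base-finnish-uncased-ner"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL call (not sent here)."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")


# Replace YOUR_API_KEY with a real token before sending.
request = build_request("Asun Brysselissä, Euroopan pääkaupungissa.", "YOUR_API_KEY")
```

To send it, pass `request` to `urllib.request.urlopen` and decode the response body with `json.loads`. The response schema is not described in this card, so it is not shown here.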
Or use the Python API:
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForTokenClassification.from_pretrained("iguanodon-ai/bert-base-finnish-uncased-ner")
tokenizer = AutoTokenizer.from_pretrained("iguanodon-ai/bert-base-finnish-uncased-ner")

# Tokenize the input as PyTorch tensors and run a forward pass
inputs = tokenizer("Asun Brysselissä, Euroopan pääkaupungissa.", return_tensors="pt")
outputs = model(**inputs)
```
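
The model produces one score vector per token, so per-token tags (for instance the argmax of `outputs.logits` mapped through `model.config.id2label`) still need to be grouped into entities. A minimal sketch of that grouping step, assuming BIO-style tags such as `B-LOC`, `I-LOC`, and `O`; check `model.config.id2label` for the actual label set. The helper `group_bio` is hypothetical, and the tokens and tags in the example are hand-written for illustration, not real model output. The sketch works on whole words and does not merge WordPiece subword pieces.

```python
from typing import List, Optional, Tuple


def group_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the span in progress, if there is one.
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_type):
            # A B- tag, or an I- tag that does not continue the current
            # span, starts a new entity.
            flush()
            current_tokens, current_type = [token], label
        elif prefix == "I":
            current_tokens.append(token)
        else:
            # "O" (or anything unrecognised) ends the current entity.
            flush()
            current_tokens, current_type = [], None
    flush()
    return spans


# Hand-written illustration, lowercased because the model is uncased.
example_tokens = ["matti", "meikäläinen", "asuu", "helsingissä"]
example_tags = ["B-PER", "I-PER", "O", "B-LOC"]
print(group_bio(example_tokens, example_tags))
# [('matti meikäläinen', 'PER'), ('helsingissä', 'LOC')]
```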