---
tags:
- token-classification
language:
- fi
widget:
- text: Asun Brysselissä, Euroopan pääkaupungissa.
datasets:
- drvenabili/autotrain-data-turku-ner
- turku_ner_corpus
co2_eq_emissions:
  emissions: 0.2165403288824756
license: cc-by-sa-4.0
pipeline_tag: token-classification
---
# Info
This model is a fine-tuned version of Turku NLP's [bert-base-finnish-uncased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-uncased-v1) for Finnish named entity recognition (NER). The fine-tuning dataset is Turku NLP's [turku_ner_corpus](https://huggingface.co/datasets/turku_ner_corpus).
Please cite the original dataset if you use this model:
```bibtex
@inproceedings{luoma-etal-2020-broad,
title = "A Broad-coverage Corpus for {F}innish Named Entity Recognition",
author = {Luoma, Jouni and Oinonen, Miika and Pyyk{\"o}nen, Maria and Laippala, Veronika and Pyysalo, Sampo},
booktitle = "Proceedings of The 12th Language Resources and Evaluation Conference",
year = "2020",
url = "https://www.aclweb.org/anthology/2020.lrec-1.567",
pages = "4615--4624",
}
```
# Validation Metrics
- Loss: 0.075
- Accuracy: 0.982
- Precision: 0.879
- Recall: 0.868
- F1: 0.873
# Test Metrics
### Overall Metrics
- Accuracy: 0.986
- Precision: 0.857
- Recall: 0.872
- F1: 0.864
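As a quick sanity check, F1 is the harmonic mean of precision and recall, so the F1 values listed for both splits can be recomputed from the precision and recall given next to them. The sketch below uses only the three-decimal figures from this card; the helper name `f1_score` is chosen for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Three-decimal figures copied from the validation and test metric lists
validation_f1 = f1_score(0.879, 0.868)
test_f1 = f1_score(0.857, 0.872)

print(f"validation F1: {validation_f1:.3f}")  # validation F1: 0.873
print(f"test F1: {test_f1:.3f}")              # test F1: 0.864
```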
### Per-entity metrics
```json
{
  "DATE": {
    "precision": 0.925,
    "recall": 0.9736842105263158,
    "f1": 0.9487179487179489,
    "number": "114"
  },
  "EVENT": {
    "precision": 0.3,
    "recall": 0.42857142857142855,
    "f1": 0.3529411764705882,
    "number": "7"
  },
  "LOC": {
    "precision": 0.9057239057239057,
    "recall": 0.9372822299651568,
    "f1": 0.9212328767123287,
    "number": "287"
  },
  "ORG": {
    "precision": 0.8274111675126904,
    "recall": 0.7836538461538461,
    "f1": 0.8049382716049382,
    "number": "208"
  },
  "PER": {
    "precision": 0.88,
    "recall": 0.9225806451612903,
    "f1": 0.9007874015748031,
    "number": "310"
  },
  "PRO": {
    "precision": 0.6081081081081081,
    "recall": 0.569620253164557,
    "f1": 0.5882352941176471,
    "number": "79"
  }
}
```
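The per-entity figures can be related back to the overall test metrics. The sketch below assumes that `number` is the count of gold entities of each type and that the overall precision and recall are micro-averaged over entity spans; neither assumption is stated in this card. Under those assumptions, the true-positive and predicted counts can be recovered from the per-entity precision and recall.

```python
# Precision, recall, and support ("number") copied from the per-entity JSON
per_entity = {
    "DATE":  {"precision": 0.925,              "recall": 0.9736842105263158,  "number": 114},
    "EVENT": {"precision": 0.3,                "recall": 0.42857142857142855, "number": 7},
    "LOC":   {"precision": 0.9057239057239057, "recall": 0.9372822299651568,  "number": 287},
    "ORG":   {"precision": 0.8274111675126904, "recall": 0.7836538461538461,  "number": 208},
    "PER":   {"precision": 0.88,               "recall": 0.9225806451612903,  "number": 310},
    "PRO":   {"precision": 0.6081081081081081, "recall": 0.569620253164557,   "number": 79},
}

true_positives = 0  # correctly predicted entities, summed over types
predicted = 0       # entities the model predicted, summed over types
gold = 0            # entities in the gold annotation, summed over types
for scores in per_entity.values():
    tp = round(scores["recall"] * scores["number"])  # recall = tp / gold
    true_positives += tp
    predicted += round(tp / scores["precision"])     # precision = tp / predicted
    gold += scores["number"]

micro_precision = true_positives / predicted
micro_recall = true_positives / gold

print(true_positives, predicted, gold)
print(f"micro precision: {micro_precision:.4f}")
print(f"micro recall: {micro_recall:.4f}")
```

With these inputs the counts come out as 877 true positives, 1023 predictions, and 1005 gold entities, giving a precision of about 0.8573 and a recall of about 0.8726. That agrees with the reported 0.857 and 0.872 if those figures were truncated to three decimals rather than rounded.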
## Usage
You can query this model with cURL:
```bash
$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "Asun Brysselissä, Euroopan pääkaupungissa."}' https://api-inference.huggingface.co/models/drvenabili/autotrain-turku-ner-65992136346
```
Or with the Python API:
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForTokenClassification.from_pretrained("drvenabili/bert-base-finnish-uncased-ner")
tokenizer = AutoTokenizer.from_pretrained("drvenabili/bert-base-finnish-uncased-ner")

# Tokenize a Finnish sentence and run a forward pass
inputs = tokenizer("Asun Brysselissä, Euroopan pääkaupungissa.", return_tensors="pt")
outputs = model(**inputs)  # outputs.logits holds one score per token and label
```
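The forward pass in the Python example returns per-token logits, not entities. Typically the highest-scoring label per token is taken (for example with `outputs.logits.argmax(-1)` and `model.config.id2label`) and consecutive tags are merged into spans. The sketch below covers only the merging step and assumes the labels follow a BIO scheme (`B-LOC`, `I-LOC`, `O`), which this card does not state. The function name `group_entities` and the tag sequence are illustrative and hand-written; they are not output from this model.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Merge BIO-tagged tokens into (entity_text, entity_type) spans."""
    entities: List[Tuple[str, Optional[str]]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        # A span starts on B-, or on an I- tag that does not continue the open span
        starts_new = prefix == "B" or (prefix == "I" and label != current_type)
        if tag == "O" or starts_new:
            # Close the span that was open, if any
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        if tag != "O":
            current_tokens.append(token)
            current_type = label
    # Close a span that runs to the end of the sentence
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hand-written tags for illustration only, not actual model output
tokens = ["Asun", "Brysselissä", ",", "Euroopan", "pääkaupungissa", "."]
tags = ["O", "B-LOC", "O", "B-LOC", "O", "O"]
print(group_entities(tokens, tags))
```

This sketch works on whole words. A real decoding step would also need to skip special tokens and rejoin WordPiece sub-tokens, which is omitted here.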