Create README.md
README.md
---
tags:
- token-classification
- sequence-tagger-model
language: sv
datasets:
- suc3_1
widget:
- text: "Emil bor i Lönneberga"
---

# KB-BERT for NER

## Cased data

This model is based on [KB-BERT](https://huggingface.co/KB/bert-base-swedish-cased) and was fine-tuned on the [SUC 3.1](https://huggingface.co/datasets/KBLab/suc3_1) corpus, using the _simple_ tags and cased data.
For this model we used a variant of the data that does **not** use BIO encoding to distinguish the beginning (B) and inside (I) of named-entity tags.
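
To make the difference concrete, here is a minimal sketch of what dropping the B/I distinction means for tag sequences. The helper names `bio_to_simple` and `group_entities` and the tag names `PER` and `LOC` are illustrative assumptions; they are not taken from the model's actual label set or training code.

```python
from itertools import groupby


def bio_to_simple(tags):
    """Strip the B-/I- prefix from each tag; 'O' and bare tags pass through."""
    return [t.split("-", 1)[1] if t[:2] in ("B-", "I-") else t for t in tags]


def group_entities(tokens, tags, outside="O"):
    """Merge runs of consecutive tokens that share the same non-outside tag."""
    entities = []
    for tag, run in groupby(zip(tokens, tags), key=lambda pair: pair[1]):
        if tag != outside:
            entities.append((tag, " ".join(tok for tok, _ in run)))
    return entities


# Hypothetical tags for the widget sentence (assumed label names).
tokens = ["Emil", "bor", "i", "Lönneberga"]
bio_tags = ["B-PER", "O", "O", "B-LOC"]

simple_tags = bio_to_simple(bio_tags)
print(simple_tags)
print(group_entities(tokens, simple_tags))
```

One consequence of tags without B/I markers is that two directly adjacent entities of the same type cannot be told apart: with this grouping they are merged into a single span.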

The model was trained on the training data only, with the best model chosen by its performance on the validation data.
You can find more information about the model and its performance on our blog: https://kb-labb.github.io/posts/2022-02-07-suc31