File size: 820 Bytes
48014e7
 
 
 
07fae38
48014e7
 
38afe97
48014e7
 
 
 
 
 
 
 
2e0e293
48014e7
 
 
a713067
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
---
tags:
- token-classification
- sequence-tagger-model
- bert
language: sv
datasets:
- KBLab/sucx3_ner
widget:
- text: "Emil bor i Lönneberga"
---

# KB-BERT for NER

## Cased data

This model is based on [KB-BERT](https://huggingface.co/KB/bert-base-swedish-cased) and was fine-tuned on the [SUCX 3.0 - NER](https://huggingface.co/datasets/KBLab/sucx3_ner) corpus, using the _simple_ tags and cased data.
For this model we used a variation of the data that did **not** use BIO-encoding to differentiate between the beginnings (B), and insides (I) of named entity tags.

The model was trained on the training data only, with the best model chosen by its performance on the validation data.
You find more information about the model and the performance on our blog: https://kb-labb.github.io/posts/2022-02-07-sucx3_ner