---
widget:
- text: Simon dog i <mask> i går.
license: mit
datasets:
- ChangeIsKey/kubhist2
language:
- sv
library_name: transformers
---

This is a RoBERTa model trained on kubhist2, a corpus of Swedish newspapers (https://spraakbanken.gu.se/en/resources/kubhist2, https://spraakbanken.gu.se/blogg/index.php/2019/09/15/the-kubhist-corpus-of-swedish-newspapers/). A Hugging Face version of kubhist2 is available here: https://huggingface.co/datasets/ChangeIsKey/kubhist2

This is a work in progress: the quality of the model, just like the quality of the training data, is far from great.

It is shared here with no guarantee whatsoever and will likely change. Use at your own risk.
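As a minimal sketch of how the model might be queried, the snippet below uses the `transformers` fill-mask pipeline with the widget sentence from this card. The repository name in `MODEL_ID` is a hypothetical placeholder (this card does not state it), and the helper names `format_predictions` and `predict_mask` are illustrative only.

```python
# Hypothetical placeholder: replace with this model's actual Hub repository name.
MODEL_ID = "your-namespace/your-kubhist2-roberta"


def format_predictions(predictions, top_n=3):
    """Return (token, score) pairs from fill-mask output, best score first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"].strip(), round(p["score"], 4)) for p in ranked[:top_n]]


def predict_mask(text, model_id=MODEL_ID, top_n=3):
    """Load the model and return the top candidates for the <mask> token."""
    # Imported lazily so format_predictions works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    return format_predictions(fill_mask(text, top_k=top_n), top_n)


if __name__ == "__main__":
    # The widget example from this card; assumes <mask> is the mask token.
    print(predict_mask("Simon dog i <mask> i går."))
```

Given the OCR noise in the training data, the top candidates may include misspelled or fragmentary tokens, so it can help to inspect several predictions rather than only the first.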

### Discussion of Biases
This model is trained on historical data. As such, outdated views may be present in the data.

### Other Known Limitations
The data comes from an OCR process. The text is therefore imperfect, especially in the earlier decades.

### Contact
Simon Hengchen