
This is a RoBERTa model trained on kubhist2 (https://spraakbanken.gu.se/en/resources/kubhist2, https://spraakbanken.gu.se/blogg/index.php/2019/09/15/the-kubhist-corpus-of-swedish-newspapers/). A Hugging Face version of kubhist2 is available at https://huggingface.co/datasets/ChangeIsKey/kubhist2

This is a work in progress. The quality of the model, like the quality of the training data, is far from great.

The model is shared here with no guarantee whatsoever and will likely change. Use at your own risk.
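Since this is a RoBERTa model, one natural way to try it is masked-token prediction. The following is a minimal sketch, not an official usage recipe: it assumes the `transformers` library is installed, that the checkpoint loads through the standard `fill-mask` pipeline, and that the repository id is `ChangeIsKey/roberta-kubhist2`. The `[BLANK]` placeholder, the helper names, and the Swedish example sentence are hypothetical and chosen only for illustration.

```python
# Assumption: repository id of this model on the Hugging Face Hub.
MODEL_ID = "ChangeIsKey/roberta-kubhist2"

# Hypothetical placeholder used in templates; it is swapped for the
# tokenizer's real mask token before the model is queried.
PLACEHOLDER = "[BLANK]"


def insert_mask(template: str, mask_token: str) -> str:
    """Replace the single placeholder in `template` with `mask_token`."""
    if template.count(PLACEHOLDER) != 1:
        raise ValueError(f"template must contain exactly one {PLACEHOLDER}")
    return template.replace(PLACEHOLDER, mask_token)


def predict_blank(template: str, top_k: int = 5):
    """Return (token, score) pairs for the blank in `template`.

    Downloads the model on first use, so it needs network access.
    """
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    # Read the mask token from the tokenizer instead of hard-coding it.
    sentence = insert_mask(template, fill.tokenizer.mask_token)
    predictions = fill(sentence, top_k=top_k)
    return [(p["token_str"].strip(), p["score"]) for p in predictions]


# Example call (illustrative sentence; output depends on the checkpoint):
# for token, score in predict_blank("Konungen reste till [BLANK]."):
#     print(f"{token}\t{score:.3f}")
```

Given the caveats above, predictions are best treated as exploratory output rather than as reliable results.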

Discussion of Biases

This model is trained on historical data, so outdated views might be present in the data.

Other Known Limitations

The data comes from an OCR process, so the text contains recognition errors, especially in the earlier decades.

Contact

Simon Hengchen, iguanodon.ai

Model size: 78.1M params (Safetensors; tensor types: I64, F32)

Dataset used to train ChangeIsKey/roberta-kubhist2: kubhist2 (https://huggingface.co/datasets/ChangeIsKey/kubhist2)