Commit 3c24570 (committed by NielsOerbaek)
1 Parent(s): 41e3d31
Update README.md
README.md CHANGED
@@ -15,7 +15,7 @@ A text classification model for determining if a social media post in Danish or
 
 The model is based on the north/t5_large_scand (by Per E. Kummervold, not publicly available) which is a Scandinavian language pretrained for 1.700.000 steps starting with the mT5 checkpoint on a Scandinavian corpus (Bokmål, Nynorsk, Danish, Swedish and Icelandic (+ a tiny bit Faeroyish)).
 
-The model is finetuned for 20.000 steps in batches of 8. The data consists of ~70k Norwegian and ~67k Danish social media posts which have been classified as either 'verbal attack' or 'nothing', making it a text-to-text model restricted to do classification. The model is described in Danish in [this report](https://
+The model is finetuned for 20.000 steps in batches of 8. The data consists of ~70k Norwegian and ~67k Danish social media posts which have been classified as either 'verbal attack' or 'nothing', making it a text-to-text model restricted to do classification. The model is described in Danish in [this report](https://www.ogtal.dk/assets/files/230403-Analyse-Tall-Angrep-hat-i-den-offentlige-debatten-paa-Facebook.pdf).
 
 
 - **Developed by:** The development team at Analyse & Tal
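
Review note: the fine-tuning figures in the changed paragraph (20.000 steps, batches of 8, ~70k Norwegian and ~67k Danish posts) imply a rough number of passes over the data. The sketch below works that out. It assumes that each step consumes exactly one batch (no gradient accumulation) and that all posts were used for training, neither of which the README states; the function names are illustrative only.

```python
def examples_seen(steps: int, batch_size: int) -> int:
    """Total training examples processed, assuming one batch per step."""
    return steps * batch_size


def approx_epochs(steps: int, batch_size: int, dataset_size: int) -> float:
    """Approximate number of passes over the dataset."""
    return examples_seen(steps, batch_size) / dataset_size


# Figures taken from the README paragraph; the post counts are approximate.
STEPS = 20_000
BATCH_SIZE = 8
N_POSTS = 70_000 + 67_000  # ~70k Norwegian + ~67k Danish

seen = examples_seen(STEPS, BATCH_SIZE)
epochs = approx_epochs(STEPS, BATCH_SIZE, N_POSTS)
print(f"examples seen: {seen}, approx. epochs: {epochs:.2f}")
```

Under those assumptions the model saw each post a little more than once on average; if part of the data was held out for evaluation, the real figure would be somewhat higher.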