Based on the Finnish pretrained T5 model, version small-nl24.

Training data: around 300k samples from the following datasets:
- Wikipedia
- Yle Finnish News Archive 2011-2018
- Yle Finnish News Archive 2019-2020
- Finnish News Agency Archive (STT)
- The Suomi24 Sentences Corpus

Tested with 1000 samples from the datasets above:
- Median CER: 1.1%
- Mean CER: 4.2%

More detailed info coming later...
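
The card does not state how CER (character error rate) was computed. A common definition is the character-level Levenshtein distance between the reference and the model output, divided by the reference length. The sketch below assumes that definition; the function names `edit_distance`, `cer`, and `summarize` are hypothetical and are not taken from this model's evaluation code.

```python
from statistics import mean, median


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def summarize(pairs):
    """Return (median CER, mean CER) over (reference, hypothesis) pairs."""
    scores = [cer(ref, hyp) for ref, hyp in pairs]
    return median(scores), mean(scores)
```

Under this definition, a mean well above the median, as reported here, is consistent with most samples having very low error and a smaller number having much higher error.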