---
license: cc-by-nc-sa-4.0
---

This repository contains KenLM models for the Ukrainian language.

Metrics for the NEWS models (tested with the [wav2vec2-xls-r-300m](https://huggingface.co/Yehor/wav2vec2-xls-r-300m-uk-with-small-lm) acoustic model):

| Model | CER | WER |
|-|-|-|
| no LM | 0.0412 | 0.2206 |
| lm-3gram-10k (alpha=0.1) | 0.0398 | 0.2191 |
| lm-4gram-10k (alpha=0.1) | 0.0398 | 0.219 |
| lm-5gram-10k (alpha=0.1) | 0.0398 | 0.219 |
| lm-3gram-30k | 0.038 | 0.2023 |
| lm-4gram-30k | 0.0379 | 0.2018 |
| lm-5gram-30k | 0.0379 | 0.202 |
| lm-3gram-50k | 0.0348 | 0.1826 |
| lm-4gram-50k | 0.0347 | 0.1818 |
| lm-5gram-50k | 0.0347 | 0.1821 |
| lm-3gram-100k | 0.031 | 0.1588 |
| lm-4gram-100k | 0.0308 | 0.1579 |
| lm-5gram-100k | 0.0308 | 0.1579 |
| lm-3gram-300k | 0.0261 | 0.1294 |
| lm-4gram-300k | 0.0261 | 0.1293 |
| lm-5gram-300k | 0.0261 | 0.1293 |
| lm-3gram-500k | 0.0248 | 0.1209 |
| lm-4gram-500k | 0.0247 | 0.1207 |
| lm-5gram-500k | 0.0247 | 0.1209 |

The model files are available under the "Files and versions" section.

Attribution for the NEWS models:

- Chaplynskyi, D. et al. (2021) lang-uk Ukrainian Ubercorpus [Data set]. https://lang.org.ua/uk/corpora/#anchor4
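
For readers who want to interpret the CER and WER columns in the metrics table, here is a minimal sketch of how such rates are commonly computed, assuming the standard edit-distance definitions. The evaluation script and text normalization actually used for the table are not stated in this repository, so the function names below (`edit_distance`, `wer`, `cer`, `relative_reduction`) are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline, improved):
    """Relative error reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical transcript pair, not taken from the evaluation data.
    print(wer("привіт світ", "привіт світе"))
    print(cer("привіт світ", "привіт світе"))
    # WER values from the table: "no LM" vs. lm-4gram-500k.
    print(relative_reduction(0.2206, 0.1207))
```

Applied to the table, `relative_reduction(0.2206, 0.1207)` gives roughly 0.45, meaning that the lm-4gram-500k model removes about 45% of the word errors made without a language model.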