---
license: cc-by-sa-4.0
datasets:
- cjvt/cc_gigafida
- cjvt/solar3
- cjvt/sloleks
language:
- sl
tags:
- word order correction
---

# T5-slo-word-order-corrector

This T5 model corrects word order within sentence sections. Sentences are split into sections at commas and conjunctions.

## Model Output Example

Suppose we have the following Slovenian text:

_Popravi model besedilo, v katerem vrstni je red nekaterih besed napačen._

The model might return the following text (note: the prediction was chosen for demonstration and explanation, not for reproducibility):

_Model popravi besedilo, v katerem je vrstni red nekaterih besed napačen._

In the input sentence, the sections `Popravi model besedilo` and `v katerem vrstni je red nekaterih besed napačen` have incorrect word order, so the model reorders `Popravi model` to `Model popravi` and `vrstni je` to `je vrstni`.

## More details

Testing the model on **generated** test sets gives the following results (combining detection and correction of words with incorrect word order):

- `Precision`: 0.937
- `Recall`: 0.869
- `F1`: 0.902

## Acknowledgement

The authors acknowledge the financial support from the Slovenian Research and Innovation Agency - research core funding No. P6-0411: Language Resources and Technologies for Slovene and research project No. J7-3159: Empirical foundations for digitally-supported development of writing skills.

## Authors

Thanks to Martin Božič, Marko Robnik-Šikonja and Špela Arhar Holdt for developing these models.
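
## Sectioning sketch

The model card states that sentences are split into sections at commas and conjunctions, but it does not document the exact procedure. The following is a minimal sketch of one way such a split could work, not the authors' implementation. The function name `split_into_sections`, the short `CONJUNCTIONS` list, and the choice to start a new section at the conjunction are all assumptions made for illustration.

```python
# Hypothetical, deliberately short conjunction list; the list used by the
# model's actual preprocessing is not documented in this model card.
CONJUNCTIONS = {"in", "ali", "ter"}


def split_into_sections(sentence):
    """Split a sentence at commas and before each listed conjunction."""
    sections = []
    for chunk in sentence.split(","):
        current = []
        for word in chunk.split():
            # Assumption: a conjunction opens a new section.
            if word.lower() in CONJUNCTIONS and current:
                sections.append(" ".join(current))
                current = []
            current.append(word)
        if current:
            sections.append(" ".join(current))
    return sections


example = "Popravi model besedilo, v katerem vrstni je red nekaterih besed napačen."
sections = split_into_sections(example)
for section in sections:
    print(section)
```

For the example sentence from this model card, the sketch yields the same two sections that the example discusses, with the final period still attached to the second one.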
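
## Checking the reported F1

F1 is conventionally the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes it from the precision and recall reported in this model card. The helper name `f1_score` is introduced here for illustration only and is not part of the model's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in this model card.
reported_precision = 0.937
reported_recall = 0.869

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.902
```

The recomputed value rounds to the reported F1 of 0.902.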