---
language:
- uk
library_name: transformers
pipeline_tag: text2text-generation
tags:
- grammatical error correction
- GEC
- Grammar
---
|
This model was trained by the Pravopysnyk team for the Ukrainian NLP shared task on Ukrainian grammatical error correction. It is mBART-50-large, set up as a Ukrainian-to-Ukrainian translation task and fine-tuned on UA-GEC augmented with a custom dataset produced by our synthetic error generation.
|
The code for error generation will be uploaded to GitHub soon, and the detailed procedure is described in our paper. For this model we added the following to UA-GEC:

- 5k sentences generated by round-trip translation (ukr → rus → ukr)
- 10k sentences generated by our punctuation error generation script
- 2k dilution sentences (fully correct sentences sampled from our dataset)
- 10k russism sentences (errors generated by our russism error generation)
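As an illustration of what punctuation and russism corruption can look like, here is a minimal sketch. The phrase table and the comma-dropping rule are illustrative assumptions for this sketch, not our actual generation script (which is described in the paper):

```python
import random

# Illustrative russism phrase table (correct Ukrainian -> russism calque).
# These two entries are assumptions for the sketch, not our full dictionary.
RUSSISMS = {
    "брати участь": "приймати участь",
    "протягом року": "на протязі року",
}


def add_punctuation_errors(sentence: str, rng: random.Random) -> str:
    """Corrupt punctuation by randomly dropping commas with probability 0.5."""
    chars = []
    for ch in sentence:
        if ch == "," and rng.random() < 0.5:
            continue  # drop this comma to create a punctuation error
        chars.append(ch)
    return "".join(chars)


def add_russisms(sentence: str) -> str:
    """Replace correct phrases with their russism calques."""
    for correct, russism in RUSSISMS.items():
        sentence = sentence.replace(correct, russism)
    return sentence


if __name__ == "__main__":
    rng = random.Random(0)
    src = "Ми хочемо брати участь, і це важливо."
    corrupted = add_russisms(add_punctuation_errors(src, rng))
    print(corrupted)
```

A real pipeline would keep the original sentence as the target and the corrupted one as the source, yielding (source, target) pairs for the translation-style fine-tuning described above.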
|
|
|
We hope you find this description helpful!