This model was fine-tuned on erroneous sentences from the
train subset of the UA-GEC corpus, introduced in the paper "UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language".
Only sentences containing errors were used: 8,874 sentences for training and 987 sentences for validation. The training arguments were defined as follows:
- `batch_size = 8`
- `num_train_epochs = 6`
- `learning_rate = 5e-5`
- `weight_decay = 0.01`
- `optim = "adafactor"`
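
The settings above can be collected into a single configuration, which also makes the implied number of optimizer steps easy to check. This is a minimal sketch, not the original training script. The keyword names follow Hugging Face `TrainingArguments`, and mapping `batch_size` to `per_device_train_batch_size` is an assumption, as are single-device training, no gradient accumulation, and keeping the last partial batch.

```python
import math

# Size of the training split, as reported in this model card.
TRAIN_SENTENCES = 8874

# Hyperparameters from the model card. The key names are an assumption:
# they mirror Hugging Face TrainingArguments keywords, with batch_size
# mapped to per_device_train_batch_size.
training_kwargs = {
    "per_device_train_batch_size": 8,
    "num_train_epochs": 6,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    "optim": "adafactor",
}

# Assumes one device, no gradient accumulation, and that the final
# partial batch of each epoch is kept rather than dropped.
steps_per_epoch = math.ceil(
    TRAIN_SENTENCES / training_kwargs["per_device_train_batch_size"]
)
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
```

With the `transformers` library installed, a dictionary like this could be unpacked into `Seq2SeqTrainingArguments(output_dir="ua-gec-finetune", **training_kwargs)`, where the output directory name is hypothetical.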