This is a BART-large model fine-tuned on roughly 58,000 aligned sentence pairs in English and Middle English, collected from the works of Geoffrey Chaucer, John Wycliffe, and the Gawain Poet.
It includes special characters such as þ.
This model reflects the spelling inconsistencies characteristic of Middle English.
Because the model is trained largely on poetry, with some prose, it performs best on text of those kinds.
Performance can be improved by sentence tokenizing input data and translating sentence-by-sentence.
Removing contractions (hadn't -> had not) also boosts performance.
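
The two preprocessing tips above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the author's own pipeline: the helper names (`expand_contractions`, `split_sentences`, `prepare_input`) are hypothetical, the contraction table is a small sample rather than a complete list, and the regex-based sentence splitter is a rough stand-in for a proper sentence tokenizer.

```python
import re

# Small illustrative table of negative contractions; extend as needed.
CONTRACTIONS = {
    "hadn't": "had not",
    "didn't": "did not",
    "don't": "do not",
    "isn't": "is not",
    "wasn't": "was not",
    "can't": "cannot",
    "won't": "will not",
}

_CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text):
    """Replace known contractions with their expanded forms."""
    # Normalize typographic apostrophes so the table lookup matches.
    text = text.replace("\u2019", "'")

    def _replace(match):
        word = match.group(0)
        expanded = CONTRACTIONS[word.lower()]
        # Preserve a leading capital, e.g. "Didn't" -> "Did not".
        if word[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded

    return _CONTRACTION_RE.sub(_replace, text)


def split_sentences(text):
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def prepare_input(text):
    """Return a list of contraction-free sentences, one per model call."""
    return [expand_contractions(s) for s in split_sentences(text)]
```

Each string returned by `prepare_input` would then be passed to the model on its own, and the outputs joined back together. The naive splitter will break on abbreviations such as "Mr."; a dedicated sentence tokenizer is likely a better choice for real input.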