BART Translation model

For further models, scripts and details, see our repository or our demo site.

  • Source language: English

  • Target language: Hungarian

  • BART base model:

    • Pretrained on English WikiText-103 and Hungarian Wikipedia
    • Finetuned on subcorpora from OPUS
      • Segments: 56,837,602
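
Usage

As a quick illustration, the model can be run for English-to-Hungarian translation with the Hugging Face transformers library. This is a minimal sketch, not the authors' exact setup: the checkpoint identifier below is a placeholder and the beam size is an assumption; only the 128-token limits come from the Limitations section.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder checkpoint name -- substitute the actual model ID from the repository.
model_id = "NYTK/translation-bart-en-hu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The input is expected to be pre-tokenized (see Limitations below).
text = "The weather is nice today ."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
output_ids = model.generate(**inputs, num_beams=5, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))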

Limitations

  • The input text must be pre-tokenized (tokenizer: HuSpaCy); a sketch of this step follows the list
  • max_source_length = 128
  • max_target_length = 128
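
The card names HuSpaCy as the tokenizer but does not show the exact invocation, so the following pre-tokenization step is a sketch under that assumption: it whitespace-joins HuSpaCy tokens and truncates to max_source_length. Truncating at 128 word tokens is only an approximation, since the model's subword vocabulary may split words further.

import huspacy

# Load the default HuSpaCy pipeline (a one-time huspacy.download() may be
# required first). Only the rule-based tokenizer is needed here, so the
# statistical components are bypassed via nlp.tokenizer.
nlp = huspacy.load()

def pretokenize(text, max_tokens=128):
    # Whitespace-join tokens and truncate to the model's max_source_length.
    tokens = [token.text for token in nlp.tokenizer(text)]
    return " ".join(tokens[:max_tokens])

print(pretokenize("Don't forget: the input text must be tokenized!"))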

Results

Model    BLEU     chrF-3   chrF-6
Google   25.30    54.09    49.0
BART     36.89    60.77    56.4
mT5      27.69    53.73    48.57
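
The card does not state which scorer produced these numbers; as a sketch, BLEU and chrF can be computed with sacrebleu, where chrF-3 and chrF-6 are taken to mean character n-gram orders 3 and 6. The file names below are placeholders.

from sacrebleu.metrics import BLEU, CHRF

# Hypothetical files: one hypothesis/reference per line, in parallel order.
with open("hypotheses.hu", encoding="utf-8") as f:
    hyps = [line.strip() for line in f]
with open("references.hu", encoding="utf-8") as f:
    refs = [line.strip() for line in f]

print(BLEU().corpus_score(hyps, [refs]))
print(CHRF(char_order=3).corpus_score(hyps, [refs]))  # chrF-3
print(CHRF(char_order=6).corpus_score(hyps, [refs]))  # chrF-6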

Citation

If you use this model, please cite the following paper:

@inproceedings{laki-yang-mt,
    title = {{Jobban fordítunk magyarra, mint a Google!}},
    booktitle = {XVIII. Magyar Számítógépes Nyelvészeti Konferencia},
    year = {2022},
    publisher = {Szegedi Tudományegyetem, Informatikai Intézet},
    address = {Szeged, Magyarország},
    author = {Laki, László and Yang, Zijian Győző},
    pages = {357--372}
}