
[ss-en] Siswati to English Translation Model based on M2M100 and The South African Gov-ZA multilingual corpus

The model was created from Siswati-English aligned sentences in the South African Gov-ZA multilingual corpus.

The data set contains cabinet statements from the South African government, maintained by the Government Communication and Information System (GCIS). The data was scraped from the government's website: https://www.gov.za/cabinet-statements
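As a rough illustration, a model of this kind could be loaded through the Hugging Face `transformers` M2M100 classes. This is a minimal sketch under stated assumptions, not the authors' documented usage: `MODEL_ID` is a hypothetical placeholder (this card does not state the repository id), and it is assumed that the checkpoint keeps the standard M2M100 tokenizer with the language codes `ss` (Siswati) and `en` (English).

```python
from typing import List

# Assumption: hypothetical placeholder, replace with the model's actual Hub id.
MODEL_ID = "your-namespace/your-ss-en-model"
SRC_LANG = "ss"  # Siswati code in the stock M2M100 tokenizer (assumed unchanged)
TGT_LANG = "en"  # English


def translate(sentences: List[str], model_id: str = MODEL_ID) -> List[str]:
    """Translate Siswati sentences to English with an M2M100-style checkpoint."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # Tell the tokenizer which language the input is in.
    tokenizer.src_lang = SRC_LANG
    encoded = tokenizer(sentences, return_tensors="pt", padding=True)

    # Force the decoder to start with the English language token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(TGT_LANG),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate([...])` with a list of Siswati sentences would download the checkpoint on first use, so it needs network access and the `transformers`, `sentencepiece`, and `torch` packages.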


Authors

  • Vukosi Marivate - @vukosi
  • Matimba Shingange
  • Richard Lastrucci
  • Isheanesu Joseph Dzingirai
  • Jenalea Rajab

BibTeX entry and citation info

    title = "Preparing the Vuk{'}uzenzele and {ZA}-gov-multilingual {S}outh {A}frican multilingual corpora",
    author = "Richard Lastrucci and Isheanesu Dzingirai and Jenalea Rajab and Andani Madodonga and Matimba Shingange and Daniel Njini and Vukosi Marivate",
    booktitle = "Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023)",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.rail-1.3",
    pages = "18--25"

Paper - Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora

Model size: 486M params