#### Languages:

- Source language: English
- Target language: isiZulu

#### Model Details:

- Model: transformer
- Architecture: MarianMT
- Pre-processing: normalization + SentencePiece

#### Pre-trained Model:

- https://huggingface.co/Helsinki-NLP/opus-mt-en-xh

#### Corpus:

- Umsuka English-isiZulu Parallel Corpus (https://zenodo.org/record/5035171#.Yh5NIOhBy3A)

#### Benchmark:

| Benchmark | BLEU (Train) | BLEU (Test) |
|-----------|--------------|-------------|
| Umsuka    | 17.61        | 13.73       |

#### GitHub:

- https://github.com/umair-nasir14/Geographical-Distance-Is-The-New-Hyperparameter

#### Citation:

```
@article{umair2022geographical,
  title={Geographical Distance Is The New Hyperparameter: A Case Study Of Finding The Optimal Pre-trained Language For English-isiZulu Machine Translation},
  author={Umair Nasir, Muhammad and Amos Mchechesi, Innocent},
  journal={arXiv e-prints},
  pages={arXiv--2205},
  year={2022}
}
```
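#### Example Usage:

A minimal usage sketch with the Hugging Face `transformers` library, assuming the standard MarianMT loading API; it loads the pre-trained `Helsinki-NLP/opus-mt-en-xh` checkpoint listed above (a fine-tuned English-isiZulu checkpoint would be loaded the same way by swapping the model name). The example sentence is illustrative, not taken from the corpus.

```python
# Sketch: load the MarianMT checkpoint listed in this card and translate a sentence.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-xh"  # pre-trained base model from the card above
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

src_text = ["The weather is beautiful today."]  # illustrative input sentence

# Tokenize the source text, generate a translation, and decode it back to a string.
batch = tokenizer(src_text, return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```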