chrisjay committed on
Commit
722c1fd
1 Parent(s): a3954f8

model checkpoints

Files changed (2)
  1. README.md +57 -0
  2. mmt_translation.pt +3 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ # MMTAfrica
+ [Paper](https://aclanthology.org/2021.wmt-1.48/) - [Installation](#installation) - [Example](#example) - [Model checkpoint](#model-checkpoint) - [Citation](#citation)
+
+
+ This repository contains the official implementation of the MMTAfrica paper ([Emezue & Dossou, WMT 2021](https://aclanthology.org/2021.wmt-1.48/)).
+
+ We focus on the task of multilingual machine translation for African languages in the 2021 WMT Shared Task: Large-Scale Multilingual Machine Translation. We introduce MMTAfrica, the first many-to-many multilingual translation system for six African languages: Fon (fon), Igbo (ibo), Kinyarwanda (kin), Swahili/Kiswahili (swa), Xhosa (xho), and Yoruba (yor) and two non-African languages: English (eng) and French (fra). For multilingual translation concerning African languages, we introduce a novel backtranslation and reconstruction objective, BT\&REC, inspired by the random online back translation and T5 modeling framework respectively, to effectively leverage monolingual data. Additionally, we report improvements from MMTAfrica over the FLORES 101 benchmarks (spBLEU gains ranging from +0.58 in Swahili to French to +19.46 in French to Xhosa).
+
+ ## Installation
+ To avoid conflicts with your existing Python setup, we suggest working in a virtual environment:
+ ```bash
+ python -m venv mmtenv
+ source mmtenv/bin/activate
+ ```
+
+ Follow these instructions to install MMTAfrica:
+ ```bash
+ git clone https://github.com/edaiofficial/mmtafrica.git
+ cd mmtafrica
+ pip install -r requirements.txt
+ ```
+
+ ## Example
+ ```bash
+ python mmtafrica.py
+ ```
+ See the available command-line arguments [here](https://github.com/edaiofficial/mmtafrica/blob/main/mmtafrica.py#L772-L860).
+
+ ### Reproducing our paper
+ The data for our paper's experiments is stored in the `/experiments` folder. To train MMTAfrica from scratch and reproduce our experiments using that data, run:
+ ```bash
+ cd experiments
+ python ../mmtafrica.py --model_name='mmtafrica' --homepath="<YOUR HOMEPATH>"
+ ```
+ By default, `homepath` is the current working directory from which you run the code.
+
+ ## Model checkpoint
+ Our model checkpoint is saved [here](https://drive.google.com/file/d/1gUINHLRQC06HGGeP211-x3IIr3WS84Iy/view?usp=sharing).
+
+
+ ## Citation
+ ```
+ @inproceedings{emezue-dossou-2021-mmtafrica,
+ title = "{MMTA}frica: Multilingual Machine Translation for {A}frican Languages",
+ author = "Emezue, Chris Chinenye and
+ Dossou, Bonaventure F. P.",
+ booktitle = "Proceedings of the Sixth Conference on Machine Translation",
+ month = nov,
+ year = "2021",
+ address = "Online",
+ publisher = "Association for Computational Linguistics",
+ url = "https://aclanthology.org/2021.wmt-1.48",
+ pages = "398--411",
+ abstract = "In this paper, we focus on the task of multilingual machine translation for African languages and describe our contribution in the 2021 WMT Shared Task: Large-Scale Multilingual Machine Translation. We introduce MMTAfrica, the first many-to-many multilingual translation system for six African languages: Fon (fon), Igbo (ibo), Kinyarwanda (kin), Swahili/Kiswahili (swa), Xhosa (xho), and Yoruba (yor) and two non-African languages: English (eng) and French (fra). For multilingual translation concerning African languages, we introduce a novel backtranslation and reconstruction objective, BT{\&}REC, inspired by the random online back translation and T5 modelling framework respectively, to effectively leverage monolingual data. Additionally, we report improvements from MMTAfrica over the FLORES 101 benchmarks (spBLEU gains ranging from +0.58 in Swahili to French to +19.46 in French to Xhosa).",
+ }
+ ```
+
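The README's Installation section recommends working inside a virtual environment before running `pip install -r requirements.txt`. As a minimal sketch that is not part of this commit, the following Python snippet checks whether the current interpreter is running inside an environment created by `python -m venv`. The function name `in_virtualenv` is hypothetical and chosen only for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment.

    Inside an environment created by `python -m venv`, sys.prefix points at
    the environment directory while sys.base_prefix still points at the
    base Python installation, so the two values differ.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; consider activating one first.")
```

Running this right after `source mmtenv/bin/activate` is one way to confirm that the activation took effect before installing dependencies.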
mmt_translation.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f625c6b29607333df7b65d9ca693d5b89a5e724b1263bc4f5151938b07a4917b
+ size 2329707789
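
The three lines added for `mmt_translation.pt` are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 object ID, and the size in bytes of the real file. The sketch below is illustrative and not part of this commit. It parses the pointer text and checks a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the code assumes the pointer uses simple `key value` lines as shown above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Check a local file's size and hash against a parsed pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["hash_algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so a multi-gigabyte file is never fully in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:f625c6b29607333df7b65d9ca693d5b89a5e724b1263bc4f5151938b07a4917b
size 2329707789
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer['hash_algo']} digest, {pointer['size'] / 2**30:.2f} GiB")
```

After fetching the real file (for example with `git lfs pull`), calling `matches_pointer("mmt_translation.pt", pointer)` would confirm that the download is complete and uncorrupted.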