# English to Igbo

Author: iroro orife

## Data

- The JW300 English-Igbo dataset.

## Model

- Default Masakhane Transformer translation model.
- [Link to google drive folder with models](https://drive.google.com/drive/folders/1bVPKPkaivIT9k23ydbSlVj3Qwd3GJZf0)

## Analysis

The dataset requires more preprocessing to remove special characters and Scripture chapter/verse names and figures.

One very nice aspect of the Igbo translations is that the model predicts the proper tonal and orthographic diacritic forms. This is not a feature that is available with Google Translate!

Example 1

```sh
Source: It’s not about the alcohol .
Reference: Nsogbu ya abụghị na ịṅụ mmanya na - aba n’anya na - agụ ya .
Hypothesis: Ọ bụghị banyere mmanya na - aba n’anya .
```

Example 2

```sh
Source: Is this also the case with your neighborhood ?
Reference: Ọ̀ bụ otú a ka ọ dịkwa n’agbata obi gị ?
Hypothesis: Nke a ọ̀ bụkwa ihe banyere ndị agbata obi gị ?
```

## Results

Tokenization | BLEU dev | BLEU test
--- | --- | ---
BPE | 33.51 | 34.85
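
The extra preprocessing mentioned in the Analysis section could be sketched roughly as follows. This is not the pipeline used to train the model. The names `SCRIPTURE_REF`, `ALLOWED_PUNCT`, and `clean_line` are hypothetical, and the citation pattern is only an assumption about how chapter/verse references appear in the tokenized JW300 text (for example `John 3 : 16` or `1 Cor. 13:4-7`). Letters and combining marks are kept so that Igbo diacritics survive the cleanup.

```python
import re
import unicodedata

# Assumed shape of a Scripture citation: an optional book number, a book name,
# then chapter:verse with optional spaces around the colon and optional
# verse ranges or lists. This is a guess and will need tuning on real data.
SCRIPTURE_REF = re.compile(
    r"(?:\b[1-3]\s+)?\b[^\W\d_]+\.?\s+\d+\s?:\s?\d+(?:\s?[-,]\s?\d+)*"
)

# Punctuation to keep; everything else that is not a letter, combining mark,
# digit, or whitespace is dropped. The set is an illustrative choice.
ALLOWED_PUNCT = set(".,?!:;'’\"-()")


def clean_line(text: str) -> str:
    """Remove assumed Scripture citations and stray special characters."""
    # Normalize so that precomposed and decomposed diacritics compare equal.
    text = unicodedata.normalize("NFC", text)
    text = SCRIPTURE_REF.sub(" ", text)
    kept = [
        ch
        for ch in text
        if ch.isspace()
        or ch in ALLOWED_PUNCT
        or unicodedata.category(ch)[0] in {"L", "M", "N"}
    ]
    # Collapse the whitespace left behind by removed spans.
    return " ".join("".join(kept).split())


if __name__ == "__main__":
    print(clean_line("Read John 3 : 16 today ."))
    print(clean_line("Ọ̀ bụ otú a ka ọ dịkwa n’agbata obi gị ?"))
```

The pattern is deliberately simple and will also remove any word that happens to precede a `number:number` sequence, such as a clock time, so it should be checked against a sample of the corpus before being applied to both sides of the parallel data.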