
English to Igbo

Author: Iroro Orife

Data

- The JW300 English-Igbo dataset.

Model

Analysis

The dataset needs further preprocessing to remove special characters and Scripture chapter/verse names and figures. One very nice aspect of the Igbo translations is that the model predicts the proper tonal and orthographic diacritic forms. This is not a feature that is available with Google Translate!
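The preprocessing mentioned above could look something like the following. This is only a minimal sketch, not the pipeline used for this model: the `SCRIPTURE_REF` pattern, the `DROP_CATEGORIES` set and the `clean_line` helper are all hypothetical, and the citation shape (for example `John 3 : 16` or `1 Cor . 13 : 4 - 7`, with spaces around punctuation as in the tokenized examples) is an assumption about how references appear in the corpus. Text is normalized to NFC so that the Igbo diacritics are kept in a consistent form rather than stripped.

```python
import re
import unicodedata

# Assumed shape of a Scripture citation in tokenized text: optional book
# number, a book name (optionally abbreviated with a dot), then
# chapter : verse figures with optional ranges or lists.
# This is a heuristic for illustration, not a rule taken from the dataset.
SCRIPTURE_REF = re.compile(
    r"(?:\b[1-3]\s+)?"        # optional book number, e.g. "1 "
    r"\b[^\W\d_]+\s*\.?\s*"   # book name, optional abbreviation dot
    r"\d+\s*:\s*\d+"          # chapter : verse
    r"(?:\s*[-,]\s*\d+)*"     # verse ranges or lists, e.g. " - 7"
)

# Unicode categories treated as "special characters" here (an assumption):
# control, format, private-use, unassigned, and miscellaneous symbols.
DROP_CATEGORIES = {"Cc", "Cf", "Co", "Cn", "So"}


def clean_line(text: str) -> str:
    """Remove citation-like spans and stray symbols, keeping diacritics."""
    # NFC keeps letters such as "ọ" and their tone marks in one canonical form.
    text = unicodedata.normalize("NFC", text)
    text = SCRIPTURE_REF.sub(" ", text)
    # Keep whitespace so that removing a tab does not glue two words together.
    text = "".join(
        ch for ch in text
        if ch.isspace() or unicodedata.category(ch) not in DROP_CATEGORIES
    )
    # Collapse runs of whitespace left behind by the removals.
    return " ".join(text.split())
```

For instance, `clean_line("Read John 3 : 16 today .")` drops the citation and leaves the rest of the sentence. Because the pattern does not check the book name against a list, it will also remove any ordinary word that happens to precede a `digits : digits` span, so a real pipeline would probably want a list of book names for both languages.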

Example 1

    Source: It’s not about the alcohol .
    Reference: Nsogbu ya abụghị na ịṅụ mmanya na - aba n’anya na - agụ ya .
    Hypothesis: Ọ bụghị banyere mmanya na - aba n’anya .

Example 2

    Source: Is this also the case with your neighborhood ?
    Reference: Ọ̀ bụ otú a ka ọ dịkwa n’agbata obi gị ?
    Hypothesis: Nke a ọ̀ bụkwa ihe banyere ndị agbata obi gị ?

Results

| Tokenization | BLEU dev | BLEU test |
|--------------|----------|-----------|
| BPE          | 33.51    | 34.85     |