English to Kikuyu

Author: Kathleen Siminyu

Data

- The JW300 English-Kikuyu dataset, 106699 lines.

Model

- Link to Google Drive folder with the model: https://drive.google.com/open?id=1kjb2hXaSaG-Esl_SHJIptTZ9-Gw-er6f

Analysis

- Tried out different BPE settings and managed some improvements on the baseline. The highest test BLEU score I recorded was at BPE 20000. It might be worth exploring different BPE settings for the source and target languages. Results from the different settings are included in the analysis below.

BPE 4000

    BLEU dev: 23.83
    BLEU test: 36.06

BPE 5000

    2020-05-02 16:15:52,007 -  dev bleu:  23.90 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:19:02,219 - test bleu:  36.53 [Beam search decoding with beam size = 5 and alpha = 1.0]

BPE 10000

    2020-05-02 16:24:16,890 -  dev bleu:  25.08 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:27:32,801 - test bleu:  37.39 [Beam search decoding with beam size = 5 and alpha = 1.0]

BPE 15000

    2020-05-02 16:29:11,281 -  dev bleu:  25.60 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:32:22,789 - test bleu:  37.75 [Beam search decoding with beam size = 5 and alpha = 1.0]

BPE 25000

    2020-05-02 16:38:37,910 -  dev bleu:  25.13 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:41:35,004 - test bleu:  37.35 [Beam search decoding with beam size = 5 and alpha = 1.0]

BPE 30000

    2020-05-02 16:43:32,174 -  dev bleu:  25.36 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:46:28,748 - test bleu:  37.35 [Beam search decoding with beam size = 5 and alpha = 1.0]

BPE 35000

    2020-05-02 16:48:32,963 -  dev bleu:  25.95 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:51:30,421 - test bleu:  37.77 [Beam search decoding with beam size = 5 and alpha = 1.0]

Results

BPE 20000

    2020-05-02 16:33:57,175 -  dev bleu:  25.14 [Beam search decoding with beam size = 5 and alpha = 1.0]
    2020-05-02 16:36:53,670 - test bleu:  37.85 [Beam search decoding with beam size = 5 and alpha = 1.0]
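
Comparing the settings by eye gets tedious as more runs are added. The following is a minimal sketch of how log lines in the format shown above could be parsed and ranked. The names `LOGS`, `parse_bleu` and `best_by` are hypothetical helpers written for this illustration, not part of the training toolkit, and only three of the runs are copied in to keep it short.

```python
import re

# A subset of the log lines reported above, keyed by BPE setting.
LOGS = {
    5000: [
        "2020-05-02 16:15:52,007 -  dev bleu:  23.90 [Beam search decoding with beam size = 5 and alpha = 1.0]",
        "2020-05-02 16:19:02,219 - test bleu:  36.53 [Beam search decoding with beam size = 5 and alpha = 1.0]",
    ],
    20000: [
        "2020-05-02 16:33:57,175 -  dev bleu:  25.14 [Beam search decoding with beam size = 5 and alpha = 1.0]",
        "2020-05-02 16:36:53,670 - test bleu:  37.85 [Beam search decoding with beam size = 5 and alpha = 1.0]",
    ],
    35000: [
        "2020-05-02 16:48:32,963 -  dev bleu:  25.95 [Beam search decoding with beam size = 5 and alpha = 1.0]",
        "2020-05-02 16:51:30,421 - test bleu:  37.77 [Beam search decoding with beam size = 5 and alpha = 1.0]",
    ],
}

# Matches e.g. "-  dev bleu:  23.90" and captures the split name and the score.
BLEU_PATTERN = re.compile(r"-\s+(dev|test) bleu:\s+([0-9.]+)")


def parse_bleu(lines):
    """Return {"dev": score, "test": score} for the log lines of one run."""
    scores = {}
    for line in lines:
        match = BLEU_PATTERN.search(line)
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


def best_by(scores_by_bpe, split):
    """Return the BPE setting with the highest BLEU on the given split."""
    return max(scores_by_bpe, key=lambda bpe: scores_by_bpe[bpe][split])


SCORES = {bpe: parse_bleu(lines) for bpe, lines in LOGS.items()}

if __name__ == "__main__":
    for bpe in sorted(SCORES):
        print(f"BPE {bpe}: dev {SCORES[bpe]['dev']:.2f}, test {SCORES[bpe]['test']:.2f}")
    print("Best on dev:", best_by(SCORES, "dev"))
    print("Best on test:", best_by(SCORES, "test"))
```

Ranking by dev and by test separately is deliberate: in the runs listed here the two splits do not favour the same setting, so it helps to see both before choosing a model.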