metadata

tags:
  - generated_from_trainer
datasets:
  - open_subtitles
metrics:
  - bleu
model-index:
  - name: opus-mt-en-id-open-subtitles
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: open_subtitles
          type: open_subtitles
          config: en-id
          split: train
          args: en-id
        metrics:
          - name: Bleu
            type: bleu
            value: 30.2272

opus-mt-en-id-open-subtitles

This model was trained from scratch on the en-id configuration of the open_subtitles dataset. At the end of training (epoch 25, step 703125), it achieves the following results on the evaluation set:

Loss: 2.3148
Bleu: 30.2272

Model description

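An English-to-Indonesian (en-id) sequence-to-sequence translation model, as indicated by the model name and the text2text-generation task in the metadata. More information needed.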

Intended uses & limitations

More information needed

Training and evaluation data

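Training and evaluation used the en-id configuration of the open_subtitles dataset. The metadata lists only the train split, so how the evaluation set was held out is not documented. More information needed.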

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 4000
num_epochs: 25
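
To make the schedule concrete, the sketch below computes the learning rate at a given optimizer step from the values listed above. It assumes the usual shape of a linear schedule with warmup (a linear ramp from 0 to the peak over the warmup steps, then a linear decay to 0 at the last step) and takes the total of 703125 steps from the final row of the training results. The function name `linear_lr` is illustrative and not part of any library.

```python
PEAK_LR = 1e-4          # learning_rate
WARMUP_STEPS = 4000     # lr_scheduler_warmup_steps
TOTAL_STEPS = 703125    # 25 epochs x 28125 steps per epoch (from the results table)


def linear_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the learning rate at a few illustrative steps.
    for step in (0, 2000, 4000, 28125, 703125):
        print(step, linear_lr(step))
```

Under these assumptions the warmup covers 4000 of the 28125 steps in the first epoch (about 14%), so the learning rate is already decaying for most of training.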

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu
1.5356	1.0	28125	1.5619	31.8599
1.4703	2.0	56250	1.6047	31.8339
1.3857	3.0	84375	1.6281	32.0796
1.313	4.0	112500	1.6619	31.7391
1.2468	5.0	140625	1.6706	31.9009
1.1831	6.0	168750	1.6924	31.4491
1.1232	7.0	196875	1.7252	31.7229
1.0649	8.0	225000	1.7483	31.7093
1.0078	9.0	253125	1.7697	31.4902
0.9516	10.0	281250	1.8026	31.4342
0.8969	11.0	309375	1.8364	31.2466
0.8436	12.0	337500	1.8747	31.1737
0.7916	13.0	365625	1.9035	31.0118
0.7406	14.0	393750	1.9414	30.9409
0.6912	15.0	421875	1.9776	30.9562
0.6439	16.0	450000	2.0221	30.582
0.5983	17.0	478125	2.0588	30.4478
0.5544	18.0	506250	2.1023	30.4601
0.5126	19.0	534375	2.1367	30.4802
0.474	20.0	562500	2.1790	30.4211
0.438	21.0	590625	2.2131	30.3327
0.4039	22.0	618750	2.2484	30.196
0.3737	23.0	646875	2.2779	30.1145
0.3475	24.0	675000	2.3022	30.2635
0.326	25.0	703125	2.3148	30.2272
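
In the table, validation loss rises every epoch while training loss falls, and BLEU peaks early. The sketch below locates the best epochs from the logged values. The (epoch, validation loss, BLEU) tuples are copied from the table above; the variable names are illustrative.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
RESULTS = [
    (1, 1.5619, 31.8599), (2, 1.6047, 31.8339), (3, 1.6281, 32.0796),
    (4, 1.6619, 31.7391), (5, 1.6706, 31.9009), (6, 1.6924, 31.4491),
    (7, 1.7252, 31.7229), (8, 1.7483, 31.7093), (9, 1.7697, 31.4902),
    (10, 1.8026, 31.4342), (11, 1.8364, 31.2466), (12, 1.8747, 31.1737),
    (13, 1.9035, 31.0118), (14, 1.9414, 30.9409), (15, 1.9776, 30.9562),
    (16, 2.0221, 30.582), (17, 2.0588, 30.4478), (18, 2.1023, 30.4601),
    (19, 2.1367, 30.4802), (20, 2.1790, 30.4211), (21, 2.2131, 30.3327),
    (22, 2.2484, 30.196), (23, 2.2779, 30.1145), (24, 2.3022, 30.2635),
    (25, 2.3148, 30.2272),
]

# Epoch with the highest BLEU and epoch with the lowest validation loss.
best_bleu = max(RESULTS, key=lambda row: row[2])
best_loss = min(RESULTS, key=lambda row: row[1])
final = RESULTS[-1]

# True if validation loss increases from every epoch to the next.
val_loss_always_rises = all(
    later[1] > earlier[1] for earlier, later in zip(RESULTS, RESULTS[1:])
)

if __name__ == "__main__":
    print("best BLEU (epoch, val loss, BLEU):", best_bleu)
    print("lowest validation loss (epoch, val loss, BLEU):", best_loss)
    print("final epoch (epoch, val loss, BLEU):", final)
    print("validation loss rises every epoch:", val_loss_always_rises)
```

On these numbers, the epoch 3 checkpoint scores about 1.85 BLEU higher than the final epoch 25 checkpoint, so an earlier checkpoint or early stopping may be worth considering if intermediate checkpoints were saved.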

Framework versions

Transformers 4.26.1
PyTorch 2.0.0
Datasets 2.10.1
Tokenizers 0.11.0
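
When reproducing the results, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the helper name `compare_versions` is illustrative, and the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI package names.

```python
from importlib import metadata

# Versions listed in this model card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "2.0.0",
    "datasets": "2.10.1",
    "tokenizers": "0.11.0",
}


def compare_versions(expected: dict) -> dict:
    """Map each package name to (expected_version, installed_version_or_None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, installed {installed}")
```

Installed PyTorch builds may report a local suffix after the version number (for example a CUDA tag), so an exact string match is not always the right comparison.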