BounharAbdelaziz/Terjman-Ultra
metadata
license: cc-by-nc-4.0
base_model: facebook/nllb-200-1.3B
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: Terjman-Ultra
    results: []

Terjman-Ultra

This model is a fine-tuned version of facebook/nllb-200-1.3B. The fine-tuning dataset is not specified in this card. The model achieves the following results on the evaluation set (final checkpoint, step 56050):

  • Loss: 2.7070
  • Bleu: 4.6998
  • Gen Len: 35.6088

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 25
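
The relationship between these values can be sketched in a few lines of plain Python. This is a minimal illustration, not the training script used for this model: it assumes a single device (suggested by 4 × 4 = 16), takes the total step count of 56050 from the last row of the training results, and assumes the conventional linear warmup followed by linear decay that `lr_scheduler_type: linear` with a warmup ratio usually denotes. The constant and function names are hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.03

# Assumption: final optimizer step reported in the training results.
TOTAL_STEPS = 56050
# Assumption: one device, since 4 * 4 already equals the reported total of 16.
NUM_DEVICES = 1


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def warmup_steps(total_steps: int = TOTAL_STEPS, ratio: float = WARMUP_RATIO) -> int:
    """Number of warmup steps implied by the warmup ratio (rounded up)."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int, base_lr: float = LEARNING_RATE, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given step: linear ramp up, then linear decay to zero."""
    warmup = warmup_steps(total_steps)
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    print("warmup steps:", warmup_steps())
    for s in (0, 1000, warmup_steps(), 28025, TOTAL_STEPS):
        print(f"step {s:>6}: lr = {linear_lr(s):.3e}")
```

Under these assumptions the learning rate peaks at 3e-05 after the warmup phase and reaches zero at the final step.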

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 3.203 | 0.9999 | 2242 | 2.9015 | 4.3057 | 36.7548 |
| 2.9175 | 1.9998 | 4484 | 2.7602 | 4.4286 | 35.708 |
| 2.8558 | 2.9997 | 6726 | 2.7303 | 4.629 | 35.562 |
| 2.8696 | 4.0 | 8969 | 2.7195 | 4.6537 | 35.562 |
| 2.8604 | 4.9999 | 11211 | 2.7144 | 4.6905 | 35.5702 |
| 2.8509 | 5.9998 | 13453 | 2.7112 | 4.599 | 35.5427 |
| 2.853 | 6.9997 | 15695 | 2.7098 | 4.6625 | 35.5317 |
| 2.8475 | 8.0 | 17938 | 2.7081 | 4.6901 | 35.6419 |
| 2.8192 | 8.9999 | 20180 | 2.7082 | 4.5474 | 35.6391 |
| 2.8395 | 9.9998 | 22422 | 2.7077 | 4.722 | 35.6088 |
| 2.8395 | 10.9997 | 24664 | 2.7076 | 4.752 | 35.5868 |
| 2.8362 | 12.0 | 26907 | 2.7074 | 4.6664 | 35.562 |
| 2.8673 | 12.9999 | 29149 | 2.7072 | 4.7004 | 35.6639 |
| 2.8465 | 13.9998 | 31391 | 2.7076 | 4.6715 | 35.5923 |
| 2.8281 | 14.9997 | 33633 | 2.7075 | 4.7045 | 35.5647 |
| 2.8191 | 16.0 | 35876 | 2.7068 | 4.7487 | 35.6253 |
| 2.874 | 16.9999 | 38118 | 2.7076 | 4.71 | 35.6006 |
| 2.8666 | 17.9998 | 40360 | 2.7069 | 4.6047 | 35.6281 |
| 2.8645 | 18.9997 | 42602 | 2.7063 | 4.6664 | 35.6088 |
| 2.8458 | 20.0 | 44845 | 2.7070 | 4.6552 | 35.5813 |
| 2.8501 | 20.9999 | 47087 | 2.7074 | 4.6919 | 35.5647 |
| 2.8309 | 21.9998 | 49329 | 2.7074 | 4.623 | 35.6226 |
| 2.854 | 22.9997 | 51571 | 2.7072 | 4.6495 | 35.5978 |
| 2.8407 | 24.0 | 53814 | 2.7070 | 4.6879 | 35.5482 |
| 2.8129 | 24.9972 | 56050 | 2.7070 | 4.6998 | 35.6088 |
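
Validation loss and BLEU both plateau after the first few epochs, so the final checkpoint is not necessarily the strongest one. The short sketch below, which only re-reads the step, validation loss, and BLEU columns of the results above, shows one way to locate the best evaluation point under each metric. The variable names are hypothetical, and the card does not say which checkpoint was actually published.

```python
# (step, validation_loss, bleu) copied from the training results above.
ROWS = [
    (2242, 2.9015, 4.3057), (4484, 2.7602, 4.4286), (6726, 2.7303, 4.629),
    (8969, 2.7195, 4.6537), (11211, 2.7144, 4.6905), (13453, 2.7112, 4.599),
    (15695, 2.7098, 4.6625), (17938, 2.7081, 4.6901), (20180, 2.7082, 4.5474),
    (22422, 2.7077, 4.722), (24664, 2.7076, 4.752), (26907, 2.7074, 4.6664),
    (29149, 2.7072, 4.7004), (31391, 2.7076, 4.6715), (33633, 2.7075, 4.7045),
    (35876, 2.7068, 4.7487), (38118, 2.7076, 4.71), (40360, 2.7069, 4.6047),
    (42602, 2.7063, 4.6664), (44845, 2.7070, 4.6552), (47087, 2.7074, 4.6919),
    (49329, 2.7074, 4.623), (51571, 2.7072, 4.6495), (53814, 2.7070, 4.6879),
    (56050, 2.7070, 4.6998),
]

# Highest BLEU and lowest validation loss across all evaluation points.
best_bleu = max(ROWS, key=lambda row: row[2])
best_loss = min(ROWS, key=lambda row: row[1])
final = ROWS[-1]

if __name__ == "__main__":
    print(f"best BLEU:     {best_bleu[2]} at step {best_bleu[0]}")
    print(f"lowest loss:   {best_loss[1]} at step {best_loss[0]}")
    print(f"final results: loss {final[1]}, BLEU {final[2]} at step {final[0]}")
```

On these numbers, BLEU peaks at 4.752 (step 24664) and validation loss bottoms out at 2.7063 (step 42602), while the final step reports 4.6998 and 2.7070. The differences are small enough that they may fall within evaluation noise.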

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1