ardzdirect

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2738
  • Wer: 0.4396
  • Bleu: 0.3030
  • Rouge: {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 20
  • mixed_precision_training: Native AMP

Training results

Training Loss Epoch Step Validation Loss Wer Bleu Rouge
2.9801 0.8316 100 0.4689 0.6744 0.0952 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.5487 1.6570 200 0.3861 0.6112 0.1243 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4999 2.4823 300 0.3645 0.5976 0.1358 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.482 3.3077 400 0.3483 0.5838 0.1788 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4421 4.1331 500 0.3513 0.5801 0.1489 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4465 4.9647 600 0.3361 0.5699 0.1562 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4425 5.7900 700 0.3482 0.5886 0.1368 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4049 6.6154 800 0.3304 0.5537 0.1708 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4011 7.4407 900 0.3155 0.5451 0.2087 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.4033 8.2661 1000 0.3319 0.5522 0.1835 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3804 9.0915 1100 0.2983 0.5172 0.2160 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3672 9.9231 1200 0.2957 0.5170 0.2282 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3661 10.7484 1300 0.2999 0.5152 0.2346 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3599 11.5738 1400 0.2952 0.4998 0.2372 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3475 12.3992 1500 0.3041 0.4961 0.2493 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3387 13.2245 1600 0.2971 0.5205 0.2432 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3474 14.0499 1700 0.2948 0.4808 0.2616 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3238 14.8815 1800 0.2809 0.4860 0.2669 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3156 15.7069 1900 0.2740 0.4544 0.2857 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3185 16.5322 2000 0.2736 0.4627 0.2799 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3148 17.3576 2100 0.2793 0.4455 0.2959 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2996 18.1830 2200 0.2705 0.4451 0.2943 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.3112 19.0083 2300 0.2708 0.4440 0.2986 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}
0.2916 19.8399 2400 0.2738 0.4396 0.3030 {'rouge1': 0.0, 'rouge2': 0.0, 'rougeL': 0.0, 'rougeLsum': 0.0}

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Downloads last month
7
Safetensors
Model size
965M params
Tensor type
F32
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for ilyes25/ardzdirect

Finetuned
(273)
this model