kab-dz

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3296
  • WER: 0.5537
  • BLEU: 0.1782 (1–4-gram precisions: 0.4624 / 0.2400 / 0.1316 / 0.0734; brevity penalty: 0.9848; length ratio: 0.9849; translation length: 9074; reference length: 9213)
  • ROUGE: 0.0 (rouge1, rouge2, rougeL, rougeLsum all 0.0)
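
The WER reported above is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal sketch in plain Python (the `wer` function is illustrative, not part of this repository):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(ref)
```

A WER of 0.5537 means roughly 55 word edits per 100 reference words; values above 1.0 (as in the first training epochs) occur when the hypothesis requires more edits than the reference has words.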

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
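
The effective batch size and the learning-rate trajectory follow directly from these values. A small sketch (pure Python, illustrative names; `total=3630` assumes 121 optimizer steps per epoch for 30 epochs, though the logged run stops at epoch 23):

```python
# Effective batch size: per-device batch size x gradient accumulation steps.
train_batch_size = 8
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 32

# Linear schedule with warmup: ramp from 0 to the peak learning rate over
# `warmup` steps, then decay linearly back to 0 by the final step.
def linear_lr(step: int, peak: float = 1e-4, warmup: int = 500, total: int = 3630) -> float:
    if step < warmup:
        return peak * step / warmup
    return peak * max(0.0, (total - step) / (total - warmup))
```

For example, `linear_lr(250)` is half the peak rate (mid-warmup), `linear_lr(500)` is the full 1e-4 peak, and the rate reaches 0 at the final step.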

Training results

| Training Loss | Epoch | Step | Validation Loss | WER    | BLEU   |
|---------------|-------|------|-----------------|--------|--------|
| 8.3957        | 1.0   | 121  | 6.4435          | 1.0002 | 0.0000 |
| 3.8246        | 2.0   | 242  | 1.7852          | 1.0036 | 0.0000 |
| 0.8242        | 3.0   | 363  | 0.5552          | 0.7259 | 0.0598 |
| 0.6641        | 4.0   | 484  | 0.4531          | 0.6539 | 0.0974 |
| 0.5731        | 5.0   | 605  | 0.4109          | 0.6272 | 0.1223 |
| 0.5674        | 6.0   | 726  | 0.3918          | 0.6109 | 0.1306 |
| 0.5257        | 7.0   | 847  | 0.3782          | 0.6064 | 0.1328 |
| 0.5374        | 8.0   | 968  | 0.3713          | 0.5998 | 0.1277 |
| 0.5153        | 9.0   | 1089 | 0.3626          | 0.5952 | 0.1341 |
| 0.5001        | 10.0  | 1210 | 0.3580          | 0.5899 | 0.1381 |
| 0.4820        | 11.0  | 1331 | 0.3538          | 0.5894 | 0.1434 |
| 0.4755        | 12.0  | 1452 | 0.3485          | 0.5860 | 0.1497 |
| 0.4663        | 13.0  | 1573 | 0.3582          | 0.5948 | 0.1379 |
| 0.4862        | 14.0  | 1694 | 0.3405          | 0.5753 | 0.1676 |
| 0.4745        | 15.0  | 1815 | 0.3422          | 0.5763 | 0.1536 |
| 0.4736        | 16.0  | 1936 | 0.3341          | 0.5685 | 0.1729 |
| 0.4583        | 17.0  | 2057 | 0.3318          | 0.5657 | 0.1717 |
| 0.4551        | 18.0  | 2178 | 0.3335          | 0.5633 | 0.1670 |
| 0.4481        | 19.0  | 2299 | 0.3296          | 0.5607 | 0.1730 |
| 0.4514        | 20.0  | 2420 | 0.3267          | 0.5555 | 0.1795 |
| 0.4522        | 21.0  | 2541 | 0.3293          | 0.5527 | 0.1792 |
| 0.4390        | 22.0  | 2662 | 0.3274          | 0.5484 | 0.1837 |
| 0.4342        | 23.0  | 2783 | 0.3296          | 0.5537 | 0.1782 |

BLEU is the corpus-level score; the full breakdown for the final checkpoint (n-gram precisions, brevity penalty, length statistics) appears in the evaluation results above. ROUGE (rouge1, rouge2, rougeL, rougeLsum) was 0.0 at every checkpoint.
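
The reported BLEU values can be cross-checked from their components: corpus BLEU is the brevity penalty times the geometric mean of the 1–4-gram precisions, with BP = exp(1 − reference_length / translation_length) when the candidate is shorter than the reference. A quick verification against the final checkpoint, using the values logged above:

```python
import math

# Final-checkpoint BLEU components, copied from the results above.
precisions = [0.46242010138858275, 0.24001479289940827,
              0.13158998741434763, 0.0734417780641005]
translation_length, reference_length = 9074, 9213

# Brevity penalty: exp(1 - r/c), applied because the candidate is shorter.
bp = math.exp(1 - reference_length / translation_length)

# BLEU = BP * geometric mean of the modified n-gram precisions.
log_mean = sum(math.log(p) for p in precisions) / len(precisions)
bleu = bp * math.exp(log_mean)
```

Both quantities reproduce the logged values (brevity penalty ≈ 0.9848, BLEU ≈ 0.1782).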

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size: 965M parameters (tensor type F32, Safetensors format)
Model tree for ilyes25/kab-dz: fine-tuned from facebook/mms-1b-all