wav2vec2-large-mms-1b-testkabtodz2

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3050
  • Wer: 0.3912
  • Bleu: 0.3965
  • Rouge: {'rouge1': 0.6866479942224073, 'rouge2': 0.49419944096729274, 'rougeL': 0.6863865403045579, 'rougeLsum': 0.6862003777203931}
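
The card does not say which implementation produced the Wer value. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words), the metric for one utterance can be computed as below. The function name and the example strings are hypothetical, and a corpus-level score may be aggregated over all utterances rather than averaged per sentence.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against 4 reference words.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, the reported Wer of 0.3912 corresponds to roughly 39 word errors per 100 reference words.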

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
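
Two of the values above follow from the others, and the sketch below works through the arithmetic. It assumes a single training device (consistent with 8 × 4 = 32) and roughly 6150 optimizer steps in total, estimated from the results table where step 6100 falls at epoch 5.95; neither figure is stated in the card. The schedule function mirrors the usual linear warmup followed by linear decay to zero and is an illustration, not the training code itself.

```python
LEARNING_RATE = 1e-3
WARMUP_STEPS = 100
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1      # assumption: not stated in the card
TOTAL_STEPS = 6150   # assumption: estimated from the results table


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
for step in (0, 50, 100, 3000, 6150):
    print(step, linear_lr(step))
```

With these assumptions the learning rate peaks at 0.001 at step 100 and reaches zero at the final step.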

Training results

Training Loss Epoch Step Validation Loss Wer Bleu Rouge
4.8641 0.0975 100 0.5230 0.5717 0.2226 {'rouge1': 0.5519119696796888, 'rouge2': 0.33242491862912893, 'rougeL': 0.5508885674519883, 'rougeLsum': 0.5508823290573681}
0.802 0.1950 200 0.4593 0.5114 0.2648 {'rouge1': 0.58953299187648, 'rouge2': 0.37419159025327825, 'rougeL': 0.5887974575821588, 'rougeLsum': 0.5887229664506309}
0.7738 0.2925 300 0.4312 0.4973 0.2784 {'rouge1': 0.5944044018037185, 'rouge2': 0.3820164110303814, 'rougeL': 0.5940712199651539, 'rougeLsum': 0.5938586337310308}
0.7313 0.3901 400 0.4266 0.4986 0.2750 {'rouge1': 0.5902592746297463, 'rouge2': 0.3793940643871797, 'rougeL': 0.5898845762564204, 'rougeLsum': 0.5897563640930832}
0.7433 0.4876 500 0.4189 0.4877 0.2935 {'rouge1': 0.602750482474579, 'rouge2': 0.39647848573284306, 'rougeL': 0.6023053374343627, 'rougeLsum': 0.602343728896114}
0.7161 0.5851 600 0.4086 0.4900 0.2882 {'rouge1': 0.6029911807487445, 'rouge2': 0.395504534464239, 'rougeL': 0.6024831110103502, 'rougeLsum': 0.6024934714069899}
0.7132 0.6826 700 0.3988 0.4816 0.2969 {'rouge1': 0.607575303103311, 'rouge2': 0.4004789110948897, 'rougeL': 0.6072235649946007, 'rougeLsum': 0.6071928634776437}
0.7112 0.7801 800 0.3911 0.4806 0.3018 {'rouge1': 0.6073472190198619, 'rouge2': 0.4003632127294614, 'rougeL': 0.6069395068299949, 'rougeLsum': 0.6069252916295935}
0.6906 0.8776 900 0.4150 0.4860 0.3025 {'rouge1': 0.6205375795316523, 'rouge2': 0.40823156015499745, 'rougeL': 0.6194362181055922, 'rougeLsum': 0.6195256177263742}
0.6809 0.9751 1000 0.3911 0.4624 0.3211 {'rouge1': 0.6319197384342741, 'rouge2': 0.4253312675520845, 'rougeL': 0.6313844014257174, 'rougeLsum': 0.63132350437317}
0.6617 1.0731 1100 0.3689 0.4498 0.3370 {'rouge1': 0.6322575330436386, 'rouge2': 0.43072845459598497, 'rougeL': 0.6316818340415837, 'rougeLsum': 0.6316415313394874}
0.6796 1.1706 1200 0.3763 0.4528 0.3355 {'rouge1': 0.6304769195021532, 'rouge2': 0.42696564640627155, 'rougeL': 0.6297729935133036, 'rougeLsum': 0.6297530510684646}
0.6632 1.2682 1300 0.3739 0.4479 0.3354 {'rouge1': 0.6313161561048481, 'rouge2': 0.426488391370188, 'rougeL': 0.6306511289347073, 'rougeLsum': 0.630651790827732}
0.6636 1.3657 1400 0.3690 0.4476 0.3408 {'rouge1': 0.64202594743763, 'rouge2': 0.4378429738943459, 'rougeL': 0.6414380729875171, 'rougeLsum': 0.6415206228179132}
0.6536 1.4632 1500 0.3642 0.4533 0.3356 {'rouge1': 0.6280357205734783, 'rouge2': 0.42568559943236817, 'rougeL': 0.6275443914084615, 'rougeLsum': 0.6274534461139677}
0.6446 1.5607 1600 0.3653 0.4368 0.3507 {'rouge1': 0.6461420249002992, 'rouge2': 0.4438001627474606, 'rougeL': 0.6456928379840208, 'rougeLsum': 0.6455444711615679}
0.6553 1.6582 1700 0.3626 0.4456 0.3390 {'rouge1': 0.6432331806097094, 'rouge2': 0.4401191070889086, 'rougeL': 0.6427475958875217, 'rougeLsum': 0.642852459332723}
0.6572 1.7557 1800 0.3609 0.4324 0.3575 {'rouge1': 0.6454328653360675, 'rouge2': 0.44388267210664833, 'rougeL': 0.6453086687834093, 'rougeLsum': 0.6449502001807803}
0.6444 1.8532 1900 0.3672 0.4484 0.3365 {'rouge1': 0.6395083185418058, 'rouge2': 0.43679257290650747, 'rougeL': 0.6392403411016865, 'rougeLsum': 0.6392631517211713}
0.6473 1.9508 2000 0.3486 0.4240 0.3616 {'rouge1': 0.6587693372834835, 'rouge2': 0.45810889018933787, 'rougeL': 0.6584142817709787, 'rougeLsum': 0.6584115986696275}
0.6471 2.0488 2100 0.3552 0.4378 0.3456 {'rouge1': 0.6417328756997103, 'rouge2': 0.44050602927893867, 'rougeL': 0.6411885518559381, 'rougeLsum': 0.6412466877393013}
0.6407 2.1463 2200 0.3534 0.4326 0.3514 {'rouge1': 0.6448482875294254, 'rouge2': 0.44273171458956706, 'rougeL': 0.6444301511527669, 'rougeLsum': 0.6444244429017728}
0.6217 2.2438 2300 0.3484 0.4357 0.3464 {'rouge1': 0.6465100353141731, 'rouge2': 0.446484151305944, 'rougeL': 0.6463539104358466, 'rougeLsum': 0.6461423326163024}
0.6331 2.3413 2400 0.3487 0.4420 0.3445 {'rouge1': 0.6432138087685608, 'rouge2': 0.4430196912630696, 'rougeL': 0.6428205290032315, 'rougeLsum': 0.6428858114522856}
0.6249 2.4388 2500 0.3488 0.4503 0.3447 {'rouge1': 0.632184865709364, 'rouge2': 0.43333846471754733, 'rougeL': 0.6318449953893892, 'rougeLsum': 0.6316670922841026}
0.6218 2.5363 2600 0.3413 0.4251 0.3612 {'rouge1': 0.654535927905691, 'rouge2': 0.45408373566098514, 'rougeL': 0.6540658630428152, 'rougeLsum': 0.653914599912157}
0.6113 2.6338 2700 0.3420 0.4413 0.3504 {'rouge1': 0.6394928862097133, 'rouge2': 0.4399224799200752, 'rougeL': 0.6391920754076109, 'rougeLsum': 0.6393017372316758}
0.6192 2.7314 2800 0.3365 0.4227 0.3658 {'rouge1': 0.6569316485671998, 'rouge2': 0.4594423103476325, 'rougeL': 0.656243913947681, 'rougeLsum': 0.6562233301855882}
0.6044 2.8289 2900 0.3363 0.4353 0.3587 {'rouge1': 0.646183071116983, 'rouge2': 0.4481004150293554, 'rougeL': 0.64580671285873, 'rougeLsum': 0.6456593928176401}
0.6195 2.9264 3000 0.3300 0.4129 0.3764 {'rouge1': 0.6657770242527834, 'rouge2': 0.46935666332141, 'rougeL': 0.6655864650059968, 'rougeLsum': 0.6655324555835426}
0.5829 3.0244 3100 0.3264 0.4179 0.3745 {'rouge1': 0.6625863240137326, 'rouge2': 0.46648204129527227, 'rougeL': 0.6621263834369671, 'rougeLsum': 0.6621527460167836}
0.5928 3.1219 3200 0.3297 0.4127 0.3817 {'rouge1': 0.673095495338516, 'rouge2': 0.47826999550174626, 'rougeL': 0.6727886265659615, 'rougeLsum': 0.6728029081394497}
0.5887 3.2194 3300 0.3223 0.4090 0.3824 {'rouge1': 0.6688601636397324, 'rouge2': 0.4750190169053594, 'rougeL': 0.66856603378048, 'rougeLsum': 0.6684573932745119}
0.5945 3.3169 3400 0.3247 0.4081 0.3825 {'rouge1': 0.6749468235719442, 'rouge2': 0.4809339895704339, 'rougeL': 0.6748059627732026, 'rougeLsum': 0.6744011490382518}
0.5963 3.4144 3500 0.3245 0.4051 0.3833 {'rouge1': 0.6768289087359305, 'rouge2': 0.48033010945324517, 'rougeL': 0.6763928574950594, 'rougeLsum': 0.6763194694880247}
0.596 3.5119 3600 0.3231 0.4049 0.3870 {'rouge1': 0.6754344734887452, 'rouge2': 0.480847056896067, 'rougeL': 0.6749397480251109, 'rougeLsum': 0.6749071522513619}
0.5981 3.6095 3700 0.3244 0.4063 0.3837 {'rouge1': 0.6729111725992345, 'rouge2': 0.477890058685856, 'rougeL': 0.6725502207828387, 'rougeLsum': 0.6721684033198132}
0.6034 3.7070 3800 0.3233 0.4013 0.3868 {'rouge1': 0.67786147843042, 'rouge2': 0.48410465236909106, 'rougeL': 0.6774730911128821, 'rougeLsum': 0.6775752673148032}
0.5905 3.8045 3900 0.3181 0.4026 0.3881 {'rouge1': 0.6746463587397786, 'rouge2': 0.48087177712472506, 'rougeL': 0.6741599935315957, 'rougeLsum': 0.6742392547974307}
0.578 3.9020 4000 0.3135 0.4002 0.3909 {'rouge1': 0.6766370103189397, 'rouge2': 0.48446493874533947, 'rougeL': 0.676219158730177, 'rougeLsum': 0.676396292595394}
0.6072 3.9995 4100 0.3155 0.4069 0.3845 {'rouge1': 0.6705085415083452, 'rouge2': 0.4788008366001088, 'rougeL': 0.6702966838791513, 'rougeLsum': 0.6702456065782014}
0.5705 4.0975 4200 0.3139 0.3963 0.3960 {'rouge1': 0.6822347354700067, 'rouge2': 0.490574259076373, 'rougeL': 0.681798081565976, 'rougeLsum': 0.6815869681930599}
0.582 4.1950 4300 0.3134 0.4071 0.3828 {'rouge1': 0.6726798708257792, 'rouge2': 0.48009448280339134, 'rougeL': 0.6722160672844482, 'rougeLsum': 0.6722426197160876}
0.5689 4.2925 4400 0.3108 0.4038 0.3913 {'rouge1': 0.6756523337259654, 'rouge2': 0.48440150230871104, 'rougeL': 0.6749977405034011, 'rougeLsum': 0.6751590961316083}
0.5731 4.3901 4500 0.3099 0.4011 0.3941 {'rouge1': 0.6749760552086685, 'rouge2': 0.48412826753355265, 'rougeL': 0.6747442225929513, 'rougeLsum': 0.6747095374256473}
0.5831 4.4876 4600 0.3080 0.3916 0.3991 {'rouge1': 0.6868557645416755, 'rouge2': 0.49595091165875926, 'rougeL': 0.6865384428143437, 'rougeLsum': 0.686681358757959}
0.5712 4.5851 4700 0.3063 0.3906 0.4052 {'rouge1': 0.6867997497042635, 'rouge2': 0.49669116102302024, 'rougeL': 0.6862084016634582, 'rougeLsum': 0.6865160268262884}
0.5683 4.6826 4800 0.3094 0.4085 0.3875 {'rouge1': 0.6681678051625316, 'rouge2': 0.47516161044719774, 'rougeL': 0.6677845942070786, 'rougeLsum': 0.6677001134976471}
0.5496 4.7801 4900 0.3062 0.3941 0.3982 {'rouge1': 0.6834352161678257, 'rouge2': 0.4927639366977512, 'rougeL': 0.6830659548513661, 'rougeLsum': 0.6830753004552528}
0.5728 4.8776 5000 0.3032 0.3891 0.4023 {'rouge1': 0.6880263785743472, 'rouge2': 0.4987834022242583, 'rougeL': 0.6876201767041785, 'rougeLsum': 0.6875855637537824}
0.563 4.9751 5100 0.3045 0.3895 0.4023 {'rouge1': 0.6887255112012787, 'rouge2': 0.49861382354522943, 'rougeL': 0.6883970083813754, 'rougeLsum': 0.6882883604966845}
0.5494 5.0731 5200 0.3021 0.3843 0.4087 {'rouge1': 0.6939325331780102, 'rouge2': 0.5040785087564004, 'rougeL': 0.6935542668132652, 'rougeLsum': 0.6934608880900174}
0.5618 5.1706 5300 0.3017 0.3853 0.4080 {'rouge1': 0.6936578398564996, 'rouge2': 0.5037794471603186, 'rougeL': 0.6930974809519181, 'rougeLsum': 0.6931407281548427}
0.553 5.2682 5400 0.3011 0.3837 0.4096 {'rouge1': 0.6923051375209295, 'rouge2': 0.5026897974184028, 'rougeL': 0.6918655537482983, 'rougeLsum': 0.6917266777521216}
0.5509 5.3657 5500 0.3015 0.3831 0.4086 {'rouge1': 0.6944360586381928, 'rouge2': 0.5044384360519536, 'rougeL': 0.693898682984476, 'rougeLsum': 0.6938576528845959}
0.5628 5.4632 5600 0.3028 0.3842 0.4066 {'rouge1': 0.694715154463311, 'rouge2': 0.5047600478198726, 'rougeL': 0.6940227272152517, 'rougeLsum': 0.6942445051207934}
0.554 5.5607 5700 0.3036 0.3881 0.4012 {'rouge1': 0.6889158380662912, 'rouge2': 0.49795640341672526, 'rougeL': 0.6886375969693838, 'rougeLsum': 0.6884327077528181}
0.5577 5.6582 5800 0.3038 0.3887 0.3999 {'rouge1': 0.6884965753359451, 'rouge2': 0.49573428094504507, 'rougeL': 0.6880012859807018, 'rougeLsum': 0.6879626944690841}
0.5459 5.7557 5900 0.3037 0.3875 0.4022 {'rouge1': 0.6900102070748361, 'rouge2': 0.49876755030616227, 'rougeL': 0.689684294718296, 'rougeLsum': 0.6895918260247067}
0.5519 5.8532 6000 0.3043 0.3902 0.3983 {'rouge1': 0.6871252623368083, 'rouge2': 0.4951939028642095, 'rougeL': 0.6867325292840226, 'rougeLsum': 0.6868102516080106}
0.5673 5.9508 6100 0.3050 0.3912 0.3965 {'rouge1': 0.6866479942224073, 'rouge2': 0.49419944096729274, 'rougeL': 0.6863865403045579, 'rougeLsum': 0.6862003777203931}
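
The reported evaluation results are those of the last row (step 6100), while the lowest Wer in the table is 0.3831 at step 5500. A small parsing sketch such as the following can be used to pick out a checkpoint by any metric. It assumes each row is six whitespace-separated numbers followed by the Rouge dictionary; only four rows are copied here for brevity, and the names `SAMPLE_ROWS` and `parse_row` are hypothetical.

```python
import ast

# Four rows copied verbatim from the table above (steps 100, 5400, 5500, 6100).
SAMPLE_ROWS = """\
4.8641 0.0975 100 0.5230 0.5717 0.2226 {'rouge1': 0.5519119696796888, 'rouge2': 0.33242491862912893, 'rougeL': 0.5508885674519883, 'rougeLsum': 0.5508823290573681}
0.553 5.2682 5400 0.3011 0.3837 0.4096 {'rouge1': 0.6923051375209295, 'rouge2': 0.5026897974184028, 'rougeL': 0.6918655537482983, 'rougeLsum': 0.6917266777521216}
0.5509 5.3657 5500 0.3015 0.3831 0.4086 {'rouge1': 0.6944360586381928, 'rouge2': 0.5044384360519536, 'rougeL': 0.693898682984476, 'rougeLsum': 0.6938576528845959}
0.5673 5.9508 6100 0.3050 0.3912 0.3965 {'rouge1': 0.6866479942224073, 'rouge2': 0.49419944096729274, 'rougeL': 0.6863865403045579, 'rougeLsum': 0.6862003777203931}
"""


def parse_row(line: str) -> dict:
    """Split one table row into its seven columns."""
    # maxsplit=6 keeps the Rouge dictionary (which contains spaces) intact.
    train_loss, epoch, step, val_loss, wer, bleu, rouge = line.split(maxsplit=6)
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "wer": float(wer),
        "bleu": float(bleu),
        "rouge": ast.literal_eval(rouge),  # safe parse of the dict literal
    }


rows = [parse_row(line) for line in SAMPLE_ROWS.strip().splitlines()]
best = min(rows, key=lambda row: row["wer"])   # lower Wer is better
final = rows[-1]
print("best Wer:", best["wer"], "at step", best["step"])
print("final Wer:", final["wer"], "at step", final["step"])
```

Swapping the key for `val_loss` (lower is better) or `bleu` (higher is better, so use `max`) selects by a different criterion.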

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
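
To compare a local environment against the versions listed above, a check along these lines may help. This is a sketch, assuming the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`; the helper name `version_mismatches` is hypothetical, and an exact string match is stricter than most setups need (a CPU-only PyTorch build, for instance, will not report the `+cu124` suffix).

```python
from importlib import metadata

# Versions as listed in this card.
EXPECTED = {
    "transformers": "4.49.0",
    "torch": "2.6.0+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def version_mismatches(expected=EXPECTED, lookup=metadata.version):
    """Return {package: (expected, found)} for every package that differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches().items():
    print(f"{package}: card lists {wanted}, found {found}")
```

The `lookup` parameter exists only so the function can be exercised without the packages installed.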

Model size

  • 965M params (Safetensors)
  • Tensor type: F32