wav2vec2-large-mms-1b-testkabtodz2
This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set (these match the last logged checkpoint, step 6100):
- Loss: 0.3050
- Wer: 0.3912
- Bleu: 0.3965
- Rouge: {'rouge1': 0.6866479942224073, 'rouge2': 0.49419944096729274, 'rougeL': 0.6863865403045579, 'rougeLsum': 0.6862003777203931}
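For readers unfamiliar with the Wer figure, the following is a minimal sketch of how word error rate is conventionally defined: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The card does not say which library or text normalization produced the reported value, so the function name `word_error_rate` and the example strings are illustrative assumptions, not the evaluation code actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    ASSUMPTION: whitespace tokenization and no text normalization; the
    card does not document what the real evaluation did.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical strings, chosen only to exercise the function.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a Wer of 0.3912 would mean roughly 39 word-level errors per 100 reference words. Corpus-level tools usually divide total edits by total reference words across all utterances rather than averaging per-utterance scores.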
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 6
- mixed_precision_training: Native AMP
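The relationship between some of these values can be sketched in plain Python. This is an illustration under stated assumptions, not the training script: it assumes a single device (consistent with 8 × 4 = 32), and it estimates the total step count from the results table (step 100 is logged at epoch 0.0975, so about 1025 optimizer steps per epoch). The helper names `effective_batch_size` and `linear_lr` are hypothetical; `linear_lr` follows the usual shape of a linear schedule with warmup (ramp from 0 to the peak rate, then linear decay to 0).

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-3
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 100
NUM_EPOCHS = 6

# ASSUMPTION: one device; the card does not state the device count.
NUM_DEVICES = 1

# ASSUMPTION: estimated from the results table, not stated in the card.
STEPS_PER_EPOCH = 1025
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 50, 100, 3000, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Because the total step count is an estimate, the decay values are approximate; the warmup portion depends only on lr_scheduler_warmup_steps and the peak learning rate.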
Training results
Training Loss | Epoch | Step | Validation Loss | Wer | Bleu | Rouge |
---|---|---|---|---|---|---|
4.8641 | 0.0975 | 100 | 0.5230 | 0.5717 | 0.2226 | {'rouge1': 0.5519119696796888, 'rouge2': 0.33242491862912893, 'rougeL': 0.5508885674519883, 'rougeLsum': 0.5508823290573681} |
0.802 | 0.1950 | 200 | 0.4593 | 0.5114 | 0.2648 | {'rouge1': 0.58953299187648, 'rouge2': 0.37419159025327825, 'rougeL': 0.5887974575821588, 'rougeLsum': 0.5887229664506309} |
0.7738 | 0.2925 | 300 | 0.4312 | 0.4973 | 0.2784 | {'rouge1': 0.5944044018037185, 'rouge2': 0.3820164110303814, 'rougeL': 0.5940712199651539, 'rougeLsum': 0.5938586337310308} |
0.7313 | 0.3901 | 400 | 0.4266 | 0.4986 | 0.2750 | {'rouge1': 0.5902592746297463, 'rouge2': 0.3793940643871797, 'rougeL': 0.5898845762564204, 'rougeLsum': 0.5897563640930832} |
0.7433 | 0.4876 | 500 | 0.4189 | 0.4877 | 0.2935 | {'rouge1': 0.602750482474579, 'rouge2': 0.39647848573284306, 'rougeL': 0.6023053374343627, 'rougeLsum': 0.602343728896114} |
0.7161 | 0.5851 | 600 | 0.4086 | 0.4900 | 0.2882 | {'rouge1': 0.6029911807487445, 'rouge2': 0.395504534464239, 'rougeL': 0.6024831110103502, 'rougeLsum': 0.6024934714069899} |
0.7132 | 0.6826 | 700 | 0.3988 | 0.4816 | 0.2969 | {'rouge1': 0.607575303103311, 'rouge2': 0.4004789110948897, 'rougeL': 0.6072235649946007, 'rougeLsum': 0.6071928634776437} |
0.7112 | 0.7801 | 800 | 0.3911 | 0.4806 | 0.3018 | {'rouge1': 0.6073472190198619, 'rouge2': 0.4003632127294614, 'rougeL': 0.6069395068299949, 'rougeLsum': 0.6069252916295935} |
0.6906 | 0.8776 | 900 | 0.4150 | 0.4860 | 0.3025 | {'rouge1': 0.6205375795316523, 'rouge2': 0.40823156015499745, 'rougeL': 0.6194362181055922, 'rougeLsum': 0.6195256177263742} |
0.6809 | 0.9751 | 1000 | 0.3911 | 0.4624 | 0.3211 | {'rouge1': 0.6319197384342741, 'rouge2': 0.4253312675520845, 'rougeL': 0.6313844014257174, 'rougeLsum': 0.63132350437317} |
0.6617 | 1.0731 | 1100 | 0.3689 | 0.4498 | 0.3370 | {'rouge1': 0.6322575330436386, 'rouge2': 0.43072845459598497, 'rougeL': 0.6316818340415837, 'rougeLsum': 0.6316415313394874} |
0.6796 | 1.1706 | 1200 | 0.3763 | 0.4528 | 0.3355 | {'rouge1': 0.6304769195021532, 'rouge2': 0.42696564640627155, 'rougeL': 0.6297729935133036, 'rougeLsum': 0.6297530510684646} |
0.6632 | 1.2682 | 1300 | 0.3739 | 0.4479 | 0.3354 | {'rouge1': 0.6313161561048481, 'rouge2': 0.426488391370188, 'rougeL': 0.6306511289347073, 'rougeLsum': 0.630651790827732} |
0.6636 | 1.3657 | 1400 | 0.3690 | 0.4476 | 0.3408 | {'rouge1': 0.64202594743763, 'rouge2': 0.4378429738943459, 'rougeL': 0.6414380729875171, 'rougeLsum': 0.6415206228179132} |
0.6536 | 1.4632 | 1500 | 0.3642 | 0.4533 | 0.3356 | {'rouge1': 0.6280357205734783, 'rouge2': 0.42568559943236817, 'rougeL': 0.6275443914084615, 'rougeLsum': 0.6274534461139677} |
0.6446 | 1.5607 | 1600 | 0.3653 | 0.4368 | 0.3507 | {'rouge1': 0.6461420249002992, 'rouge2': 0.4438001627474606, 'rougeL': 0.6456928379840208, 'rougeLsum': 0.6455444711615679} |
0.6553 | 1.6582 | 1700 | 0.3626 | 0.4456 | 0.3390 | {'rouge1': 0.6432331806097094, 'rouge2': 0.4401191070889086, 'rougeL': 0.6427475958875217, 'rougeLsum': 0.642852459332723} |
0.6572 | 1.7557 | 1800 | 0.3609 | 0.4324 | 0.3575 | {'rouge1': 0.6454328653360675, 'rouge2': 0.44388267210664833, 'rougeL': 0.6453086687834093, 'rougeLsum': 0.6449502001807803} |
0.6444 | 1.8532 | 1900 | 0.3672 | 0.4484 | 0.3365 | {'rouge1': 0.6395083185418058, 'rouge2': 0.43679257290650747, 'rougeL': 0.6392403411016865, 'rougeLsum': 0.6392631517211713} |
0.6473 | 1.9508 | 2000 | 0.3486 | 0.4240 | 0.3616 | {'rouge1': 0.6587693372834835, 'rouge2': 0.45810889018933787, 'rougeL': 0.6584142817709787, 'rougeLsum': 0.6584115986696275} |
0.6471 | 2.0488 | 2100 | 0.3552 | 0.4378 | 0.3456 | {'rouge1': 0.6417328756997103, 'rouge2': 0.44050602927893867, 'rougeL': 0.6411885518559381, 'rougeLsum': 0.6412466877393013} |
0.6407 | 2.1463 | 2200 | 0.3534 | 0.4326 | 0.3514 | {'rouge1': 0.6448482875294254, 'rouge2': 0.44273171458956706, 'rougeL': 0.6444301511527669, 'rougeLsum': 0.6444244429017728} |
0.6217 | 2.2438 | 2300 | 0.3484 | 0.4357 | 0.3464 | {'rouge1': 0.6465100353141731, 'rouge2': 0.446484151305944, 'rougeL': 0.6463539104358466, 'rougeLsum': 0.6461423326163024} |
0.6331 | 2.3413 | 2400 | 0.3487 | 0.4420 | 0.3445 | {'rouge1': 0.6432138087685608, 'rouge2': 0.4430196912630696, 'rougeL': 0.6428205290032315, 'rougeLsum': 0.6428858114522856} |
0.6249 | 2.4388 | 2500 | 0.3488 | 0.4503 | 0.3447 | {'rouge1': 0.632184865709364, 'rouge2': 0.43333846471754733, 'rougeL': 0.6318449953893892, 'rougeLsum': 0.6316670922841026} |
0.6218 | 2.5363 | 2600 | 0.3413 | 0.4251 | 0.3612 | {'rouge1': 0.654535927905691, 'rouge2': 0.45408373566098514, 'rougeL': 0.6540658630428152, 'rougeLsum': 0.653914599912157} |
0.6113 | 2.6338 | 2700 | 0.3420 | 0.4413 | 0.3504 | {'rouge1': 0.6394928862097133, 'rouge2': 0.4399224799200752, 'rougeL': 0.6391920754076109, 'rougeLsum': 0.6393017372316758} |
0.6192 | 2.7314 | 2800 | 0.3365 | 0.4227 | 0.3658 | {'rouge1': 0.6569316485671998, 'rouge2': 0.4594423103476325, 'rougeL': 0.656243913947681, 'rougeLsum': 0.6562233301855882} |
0.6044 | 2.8289 | 2900 | 0.3363 | 0.4353 | 0.3587 | {'rouge1': 0.646183071116983, 'rouge2': 0.4481004150293554, 'rougeL': 0.64580671285873, 'rougeLsum': 0.6456593928176401} |
0.6195 | 2.9264 | 3000 | 0.3300 | 0.4129 | 0.3764 | {'rouge1': 0.6657770242527834, 'rouge2': 0.46935666332141, 'rougeL': 0.6655864650059968, 'rougeLsum': 0.6655324555835426} |
0.5829 | 3.0244 | 3100 | 0.3264 | 0.4179 | 0.3745 | {'rouge1': 0.6625863240137326, 'rouge2': 0.46648204129527227, 'rougeL': 0.6621263834369671, 'rougeLsum': 0.6621527460167836} |
0.5928 | 3.1219 | 3200 | 0.3297 | 0.4127 | 0.3817 | {'rouge1': 0.673095495338516, 'rouge2': 0.47826999550174626, 'rougeL': 0.6727886265659615, 'rougeLsum': 0.6728029081394497} |
0.5887 | 3.2194 | 3300 | 0.3223 | 0.4090 | 0.3824 | {'rouge1': 0.6688601636397324, 'rouge2': 0.4750190169053594, 'rougeL': 0.66856603378048, 'rougeLsum': 0.6684573932745119} |
0.5945 | 3.3169 | 3400 | 0.3247 | 0.4081 | 0.3825 | {'rouge1': 0.6749468235719442, 'rouge2': 0.4809339895704339, 'rougeL': 0.6748059627732026, 'rougeLsum': 0.6744011490382518} |
0.5963 | 3.4144 | 3500 | 0.3245 | 0.4051 | 0.3833 | {'rouge1': 0.6768289087359305, 'rouge2': 0.48033010945324517, 'rougeL': 0.6763928574950594, 'rougeLsum': 0.6763194694880247} |
0.596 | 3.5119 | 3600 | 0.3231 | 0.4049 | 0.3870 | {'rouge1': 0.6754344734887452, 'rouge2': 0.480847056896067, 'rougeL': 0.6749397480251109, 'rougeLsum': 0.6749071522513619} |
0.5981 | 3.6095 | 3700 | 0.3244 | 0.4063 | 0.3837 | {'rouge1': 0.6729111725992345, 'rouge2': 0.477890058685856, 'rougeL': 0.6725502207828387, 'rougeLsum': 0.6721684033198132} |
0.6034 | 3.7070 | 3800 | 0.3233 | 0.4013 | 0.3868 | {'rouge1': 0.67786147843042, 'rouge2': 0.48410465236909106, 'rougeL': 0.6774730911128821, 'rougeLsum': 0.6775752673148032} |
0.5905 | 3.8045 | 3900 | 0.3181 | 0.4026 | 0.3881 | {'rouge1': 0.6746463587397786, 'rouge2': 0.48087177712472506, 'rougeL': 0.6741599935315957, 'rougeLsum': 0.6742392547974307} |
0.578 | 3.9020 | 4000 | 0.3135 | 0.4002 | 0.3909 | {'rouge1': 0.6766370103189397, 'rouge2': 0.48446493874533947, 'rougeL': 0.676219158730177, 'rougeLsum': 0.676396292595394} |
0.6072 | 3.9995 | 4100 | 0.3155 | 0.4069 | 0.3845 | {'rouge1': 0.6705085415083452, 'rouge2': 0.4788008366001088, 'rougeL': 0.6702966838791513, 'rougeLsum': 0.6702456065782014} |
0.5705 | 4.0975 | 4200 | 0.3139 | 0.3963 | 0.3960 | {'rouge1': 0.6822347354700067, 'rouge2': 0.490574259076373, 'rougeL': 0.681798081565976, 'rougeLsum': 0.6815869681930599} |
0.582 | 4.1950 | 4300 | 0.3134 | 0.4071 | 0.3828 | {'rouge1': 0.6726798708257792, 'rouge2': 0.48009448280339134, 'rougeL': 0.6722160672844482, 'rougeLsum': 0.6722426197160876} |
0.5689 | 4.2925 | 4400 | 0.3108 | 0.4038 | 0.3913 | {'rouge1': 0.6756523337259654, 'rouge2': 0.48440150230871104, 'rougeL': 0.6749977405034011, 'rougeLsum': 0.6751590961316083} |
0.5731 | 4.3901 | 4500 | 0.3099 | 0.4011 | 0.3941 | {'rouge1': 0.6749760552086685, 'rouge2': 0.48412826753355265, 'rougeL': 0.6747442225929513, 'rougeLsum': 0.6747095374256473} |
0.5831 | 4.4876 | 4600 | 0.3080 | 0.3916 | 0.3991 | {'rouge1': 0.6868557645416755, 'rouge2': 0.49595091165875926, 'rougeL': 0.6865384428143437, 'rougeLsum': 0.686681358757959} |
0.5712 | 4.5851 | 4700 | 0.3063 | 0.3906 | 0.4052 | {'rouge1': 0.6867997497042635, 'rouge2': 0.49669116102302024, 'rougeL': 0.6862084016634582, 'rougeLsum': 0.6865160268262884} |
0.5683 | 4.6826 | 4800 | 0.3094 | 0.4085 | 0.3875 | {'rouge1': 0.6681678051625316, 'rouge2': 0.47516161044719774, 'rougeL': 0.6677845942070786, 'rougeLsum': 0.6677001134976471} |
0.5496 | 4.7801 | 4900 | 0.3062 | 0.3941 | 0.3982 | {'rouge1': 0.6834352161678257, 'rouge2': 0.4927639366977512, 'rougeL': 0.6830659548513661, 'rougeLsum': 0.6830753004552528} |
0.5728 | 4.8776 | 5000 | 0.3032 | 0.3891 | 0.4023 | {'rouge1': 0.6880263785743472, 'rouge2': 0.4987834022242583, 'rougeL': 0.6876201767041785, 'rougeLsum': 0.6875855637537824} |
0.563 | 4.9751 | 5100 | 0.3045 | 0.3895 | 0.4023 | {'rouge1': 0.6887255112012787, 'rouge2': 0.49861382354522943, 'rougeL': 0.6883970083813754, 'rougeLsum': 0.6882883604966845} |
0.5494 | 5.0731 | 5200 | 0.3021 | 0.3843 | 0.4087 | {'rouge1': 0.6939325331780102, 'rouge2': 0.5040785087564004, 'rougeL': 0.6935542668132652, 'rougeLsum': 0.6934608880900174} |
0.5618 | 5.1706 | 5300 | 0.3017 | 0.3853 | 0.4080 | {'rouge1': 0.6936578398564996, 'rouge2': 0.5037794471603186, 'rougeL': 0.6930974809519181, 'rougeLsum': 0.6931407281548427} |
0.553 | 5.2682 | 5400 | 0.3011 | 0.3837 | 0.4096 | {'rouge1': 0.6923051375209295, 'rouge2': 0.5026897974184028, 'rougeL': 0.6918655537482983, 'rougeLsum': 0.6917266777521216} |
0.5509 | 5.3657 | 5500 | 0.3015 | 0.3831 | 0.4086 | {'rouge1': 0.6944360586381928, 'rouge2': 0.5044384360519536, 'rougeL': 0.693898682984476, 'rougeLsum': 0.6938576528845959} |
0.5628 | 5.4632 | 5600 | 0.3028 | 0.3842 | 0.4066 | {'rouge1': 0.694715154463311, 'rouge2': 0.5047600478198726, 'rougeL': 0.6940227272152517, 'rougeLsum': 0.6942445051207934} |
0.554 | 5.5607 | 5700 | 0.3036 | 0.3881 | 0.4012 | {'rouge1': 0.6889158380662912, 'rouge2': 0.49795640341672526, 'rougeL': 0.6886375969693838, 'rougeLsum': 0.6884327077528181} |
0.5577 | 5.6582 | 5800 | 0.3038 | 0.3887 | 0.3999 | {'rouge1': 0.6884965753359451, 'rouge2': 0.49573428094504507, 'rougeL': 0.6880012859807018, 'rougeLsum': 0.6879626944690841} |
0.5459 | 5.7557 | 5900 | 0.3037 | 0.3875 | 0.4022 | {'rouge1': 0.6900102070748361, 'rouge2': 0.49876755030616227, 'rougeL': 0.689684294718296, 'rougeLsum': 0.6895918260247067} |
0.5519 | 5.8532 | 6000 | 0.3043 | 0.3902 | 0.3983 | {'rouge1': 0.6871252623368083, 'rouge2': 0.4951939028642095, 'rougeL': 0.6867325292840226, 'rougeLsum': 0.6868102516080106} |
0.5673 | 5.9508 | 6100 | 0.3050 | 0.3912 | 0.3965 | {'rouge1': 0.6866479942224073, 'rouge2': 0.49419944096729274, 'rougeL': 0.6863865403045579, 'rougeLsum': 0.6862003777203931} |
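Since the Rouge column is stored as a Python dict literal, the rows above can be read programmatically. The sketch below parses three rows copied verbatim from the table (steps 100, 5500, and 6100) and picks the one with the lowest Wer. The function name `parse_row` and the record field names are illustrative assumptions; only the numbers come from the table.

```python
import ast

# Three rows copied verbatim from the table above (not the full log).
ROWS = """\
4.8641 | 0.0975 | 100 | 0.5230 | 0.5717 | 0.2226 | {'rouge1': 0.5519119696796888, 'rouge2': 0.33242491862912893, 'rougeL': 0.5508885674519883, 'rougeLsum': 0.5508823290573681} |
0.5509 | 5.3657 | 5500 | 0.3015 | 0.3831 | 0.4086 | {'rouge1': 0.6944360586381928, 'rouge2': 0.5044384360519536, 'rougeL': 0.693898682984476, 'rougeLsum': 0.6938576528845959} |
0.5673 | 5.9508 | 6100 | 0.3050 | 0.3912 | 0.3965 | {'rouge1': 0.6866479942224073, 'rouge2': 0.49419944096729274, 'rougeL': 0.6863865403045579, 'rougeLsum': 0.6862003777203931} |
"""


def parse_row(line: str) -> dict:
    """Split one pipe-delimited row into typed fields."""
    cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
    train_loss, epoch, step, val_loss, wer, bleu, rouge = cells
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "wer": float(wer),
        "bleu": float(bleu),
        # The Rouge cell is a dict literal; literal_eval parses it safely.
        "rouge": ast.literal_eval(rouge),
    }


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best = min(records, key=lambda record: record["wer"])

if __name__ == "__main__":
    print(best["step"], best["wer"], round(best["rouge"]["rougeL"], 4))
```

Among the logged checkpoints, step 5500 has the lowest Wer (0.3831), slightly better than the last logged checkpoint at step 6100 (0.3912) whose metrics are reported at the top of the card.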
Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
Model tree for ilyes25/kab
Base model
facebook/mms-1b-all