xlsr-big-kcnnn

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53. The training dataset was not recorded (the card lists it as None). It achieves the following results on the evaluation set:

Loss: 0.0000
Wer: 0.0420
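
Wer is the word error rate. The card does not say which implementation produced the figure, so the following is only a minimal sketch of the standard definition (word-level edit distance divided by the number of reference words). The function name `word_error_rate` is hypothetical and not part of this repository.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of three reference words.
print(word_error_rate("the cat sat", "the cat sit"))
```

Under this definition, a Wer of 0.0420 corresponds to roughly four word errors per 100 reference words.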

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0004
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 132
num_epochs: 100
mixed_precision_training: Native AMP
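
As a rough illustration of how these settings interact, the sketch below recomputes the total train batch size and evaluates a linear schedule with warmup at a given step. It is written in plain Python rather than against the training script, which is not part of this card. The helper names are hypothetical, a single device is assumed, and the total step count is an estimate extrapolated from the last logged row of the results table (step 15400 at epoch 99.6764), not a value stated in the card.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.0004
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 132

# ASSUMPTION: the card does not state the total number of optimizer steps;
# 15450 is extrapolated from step 15400 being logged at epoch 99.6764.
ASSUMED_TOTAL_STEPS = 15450


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update (single device assumed)."""
    return per_device * accumulation * num_devices


def linear_warmup_lr(
    step: int,
    peak_lr: float = LEARNING_RATE,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = ASSUMED_TOTAL_STEPS,
) -> float:
    """Learning rate at `step`: linear ramp up to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 8 * 2 = 16
print(linear_warmup_lr(66))    # halfway through warmup
print(linear_warmup_lr(132))   # end of warmup, peak learning rate
```

The first value matches the total_train_batch_size of 16 listed above, which is consistent with training on a single device.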

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
2.213	1.2945	200	1.0902	0.8641
0.7297	2.5890	400	0.1257	0.1338
0.2259	3.8835	600	0.0404	0.0591
0.1269	5.1780	800	0.0286	0.0332
0.0854	6.4725	1000	0.0162	0.0289
0.0716	7.7670	1200	0.0185	0.0358
0.057	9.0615	1400	0.0130	0.0267
0.0444	10.3560	1600	0.0031	0.0205
0.0368	11.6505	1800	0.0036	0.0287
0.0396	12.9450	2000	0.0069	0.0480
0.0315	14.2395	2200	0.0030	0.0291
0.0352	15.5340	2400	0.0074	0.0119
0.0393	16.8285	2600	0.0013	0.0414
0.0291	18.1230	2800	0.0012	0.0068
0.0216	19.4175	3000	0.0035	0.0161
0.0248	20.7120	3200	0.0023	0.0070
0.0207	22.0065	3400	0.0015	0.0235
0.0212	23.3010	3600	0.0137	0.0360
0.0225	24.5955	3800	0.0008	0.0454
0.019	25.8900	4000	0.0005	0.0125
0.0195	27.1845	4200	0.0015	0.0316
0.0175	28.4790	4400	0.0032	0.0050
0.0196	29.7735	4600	0.0008	0.0056
0.017	31.0680	4800	0.0009	0.0193
0.0191	32.3625	5000	0.0002	0.0523
0.0165	33.6570	5200	0.0016	0.0094
0.0172	34.9515	5400	0.0030	0.0551
0.0098	36.2460	5600	0.0014	0.0468
0.0109	37.5405	5800	0.0012	0.0508
0.0104	38.8350	6000	0.0007	0.0472
0.0124	40.1294	6200	0.0008	0.0328
0.0147	41.4239	6400	0.0008	0.0336
0.0092	42.7184	6600	0.0010	0.0107
0.0097	44.0129	6800	0.0008	0.0291
0.0095	45.3074	7000	0.0002	0.0330
0.0088	46.6019	7200	0.0020	0.0209
0.0095	47.8964	7400	0.0006	0.0384
0.0085	49.1909	7600	0.0002	0.0470
0.0085	50.4854	7800	0.0001	0.0436
0.0109	51.7799	8000	0.0010	0.0422
0.0087	53.0744	8200	0.0012	0.0076
0.0099	54.3689	8400	0.0009	0.0348
0.0087	55.6634	8600	0.0002	0.0173
0.0094	56.9579	8800	0.0016	0.0183
0.006	58.2524	9000	0.0001	0.0105
0.0064	59.5469	9200	0.0001	0.0342
0.0054	60.8414	9400	0.0001	0.0394
0.0055	62.1359	9600	0.0000	0.0295
0.0058	63.4304	9800	0.0000	0.0289
0.007	64.7249	10000	0.0001	0.0480
0.0043	66.0194	10200	0.0000	0.0364
0.0045	67.3139	10400	0.0001	0.0309
0.0026	68.6084	10600	0.0000	0.0354
0.0031	69.9029	10800	0.0000	0.0352
0.0035	71.1974	11000	0.0004	0.0263
0.0038	72.4919	11200	0.0000	0.0225
0.0027	73.7864	11400	0.0001	0.0227
0.0037	75.0809	11600	0.0000	0.0366
0.0024	76.3754	11800	0.0000	0.0370
0.0028	77.6699	12000	0.0002	0.0245
0.0025	78.9644	12200	0.0003	0.0137
0.0022	80.2589	12400	0.0000	0.0320
0.0021	81.5534	12600	0.0000	0.0348
0.0023	82.8479	12800	0.0000	0.0255
0.002	84.1424	13000	0.0000	0.0257
0.0016	85.4369	13200	0.0000	0.0354
0.0018	86.7314	13400	0.0000	0.0442
0.0018	88.0259	13600	0.0000	0.0380
0.0014	89.3204	13800	0.0000	0.0390
0.0014	90.6149	14000	0.0000	0.0404
0.0014	91.9094	14200	0.0000	0.0430
0.0015	93.2039	14400	0.0000	0.0428
0.0008	94.4984	14600	0.0000	0.0428
0.0011	95.7929	14800	0.0000	0.0414
0.001	97.0874	15000	0.0000	0.0396
0.0008	98.3819	15200	0.0000	0.0416
0.0009	99.6764	15400	0.0000	0.0420
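
Validation loss and Wer do not move together in the later epochs: the lowest Wer in the table is 0.0050 at step 4400, while the final checkpoint reports 0.0420. The sketch below shows one way to pick the best row by Wer and to estimate the number of optimizer steps per epoch from a logged row. Only a handful of rows are copied in for brevity, and the helper names are hypothetical.

```python
# A few rows copied from the table above:
# (training loss, epoch, step, validation loss, WER)
ROWS = [
    (2.213, 1.2945, 200, 1.0902, 0.8641),
    (0.0291, 18.1230, 2800, 0.0012, 0.0068),
    (0.0175, 28.4790, 4400, 0.0032, 0.0050),
    (0.0196, 29.7735, 4600, 0.0008, 0.0056),
    (0.0009, 99.6764, 15400, 0.0000, 0.0420),
]


def best_by_wer(rows):
    """Return the logged row with the lowest word error rate."""
    return min(rows, key=lambda row: row[4])


def steps_per_epoch(row):
    """Estimate optimizer steps per epoch from one (epoch, step) pair."""
    return row[2] / row[1]


best = best_by_wer(ROWS)
print(f"best WER {best[4]:.4f} at step {best[2]} (epoch {best[1]:.4f})")
print(f"about {steps_per_epoch(ROWS[0]):.1f} optimizer steps per epoch")
```

Whether an earlier checkpoint would actually be preferable depends on how representative the evaluation set is, which the card does not document.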

Framework versions

Transformers 4.45.0.dev0
PyTorch 2.1.2
Datasets 2.20.0
Tokenizers 0.19.1
