
libri-smallw2v2-no-copy-mse-alpha-0.75-T-1

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card names neither). It achieves the following results on the evaluation set; a short sketch of how the WER metric is computed follows the list:

  • Loss: 299.5353
  • WER: 0.5607
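
WER is word error rate: the number of word-level substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the number of reference words, so 0.5607 means roughly 56 errors per 100 reference words. A minimal sketch of computing it with the Hugging Face evaluate library (the transcripts below are made-up placeholders, not outputs of this model):

```python
import evaluate  # pip install evaluate jiwer

# Placeholder transcripts; the card's 0.5607 comes from this model's
# predictions on its (unspecified) evaluation set.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

wer = evaluate.load("wer")
print(wer.compute(predictions=predictions, references=references))
# 1 substitution over 6 reference words -> 0.1667
```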

Model description

More information needed
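
The card itself gives no description. Reading only the model name, "no-copy-mse-alpha-0.75-T-1" looks like a knowledge-distillation setup: an MSE loss between student and teacher outputs mixed with a supervised loss, with interpolation weight alpha = 0.75 and softmax temperature T = 1. That is an inference from the name, not something the card states; a generic sketch of such an objective, purely as illustration:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, supervised_loss,
                      alpha=0.75, T=1.0):
    """Generic MSE-based distillation objective.

    Inferred from the model name only; the card does not document
    the actual training loss.
    """
    # MSE between temperature-scaled student and teacher logits.
    mse = F.mse_loss(student_logits / T, teacher_logits / T)
    # Interpolate between the distillation term and the task loss.
    return alpha * mse + (1.0 - alpha) * supervised_loss
```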

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 100
  • mixed_precision_training: Native AMP
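
For reference, one way these settings could be expressed with Transformers TrainingArguments; the output directory is a placeholder, and the Adam betas and epsilon listed above are the library defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="libri-smallw2v2-no-copy-mse-alpha-0.75-T-1",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 32 * 2 = 64
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=100,
    fp16=True,  # "Native AMP" mixed precision
)
```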

Training results

Training Loss Epoch Step Validation Loss WER
2621.6909 1.79 400 892.3344 1.0000
1182.3671 3.59 800 583.0377 0.9766
946.8591 5.38 1200 493.0189 0.9365
833.3073 7.17 1600 444.2031 0.9141
751.816 8.97 2000 400.5189 0.8937
698.0563 10.76 2400 369.2302 0.8649
644.5172 12.56 2800 345.4725 0.8458
600.5121 14.35 3200 323.2295 0.8151
561.2316 16.14 3600 307.8979 0.7950
523.6023 17.94 4000 291.1282 0.7720
496.2595 19.73 4400 279.5770 0.7517
466.5839 21.52 4800 267.7692 0.7263
440.6372 23.32 5200 258.0482 0.7028
417.6962 25.11 5600 254.2653 0.6902
392.6432 26.91 6000 252.7504 0.6774
378.9874 28.7 6400 246.1504 0.6714
361.5383 30.49 6800 243.2339 0.6630
345.9527 32.29 7200 242.4769 0.6494
335.28 34.08 7600 243.5441 0.6448
320.2823 35.87 8000 242.6982 0.6353
306.8673 37.67 8400 252.8057 0.6280
300.3173 39.46 8800 249.5198 0.6267
290.0972 41.26 9200 243.8252 0.6218
283.0466 43.05 9600 242.9062 0.6173
273.4801 44.84 10000 249.2953 0.6121
264.7652 46.64 10400 251.2528 0.6127
257.2499 48.43 10800 257.1318 0.6089
250.4637 50.22 11200 264.7531 0.6069
245.2013 52.02 11600 258.1169 0.6000
240.3053 53.81 12000 255.6198 0.5941
232.0262 55.61 12400 261.2134 0.5949
228.3372 57.4 12800 263.1902 0.5919
222.2224 59.19 13200 271.1831 0.5898
218.0379 60.99 13600 264.8538 0.5821
215.7583 62.78 14000 268.3278 0.5874
210.5649 64.57 14400 273.0802 0.5808
205.7684 66.37 14800 273.7237 0.5792
202.7417 68.16 15200 276.0477 0.5750
197.5293 69.96 15600 272.9896 0.5785
196.512 71.75 16000 274.7004 0.5777
193.9167 73.54 16400 275.8243 0.5727
190.2722 75.34 16800 283.2336 0.5743
187.5092 77.13 17200 284.5899 0.5723
182.8259 78.92 17600 287.1284 0.5732
184.1322 80.72 18000 285.4507 0.5704
180.0438 82.51 18400 283.7040 0.5684
179.6387 84.3 18800 288.0453 0.5670
176.7927 86.1 19200 290.4306 0.5662
173.8417 87.89 19600 298.6391 0.5663
174.4995 89.69 20000 299.5260 0.5634
171.9799 91.48 20400 301.1923 0.5653
170.4011 93.27 20800 299.7171 0.5667
170.4423 95.07 21200 296.5629 0.5639
169.7396 96.86 21600 299.2657 0.5622
166.9574 98.65 22000 299.5353 0.5607
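
The name ("smallw2v2") and the WER metric suggest a wav2vec2 CTC acoustic model. Assuming that, a minimal transcription sketch with Transformers looks like the following; the checkpoint path and the input audio are placeholders, not confirmed by the card:

```python
import numpy as np
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Hypothetical local checkpoint path; the card does not name the base model.
checkpoint = "./libri-smallw2v2-no-copy-mse-alpha-0.75-T-1"
processor = Wav2Vec2Processor.from_pretrained(checkpoint)
model = Wav2Vec2ForCTC.from_pretrained(checkpoint)
model.eval()

# Placeholder: one second of silence at 16 kHz stands in for real speech.
speech = np.zeros(16000, dtype=np.float32)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids))
```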

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.12.1
  • Datasets 2.8.0
  • Tokenizers 0.13.2