common8

This model is a fine-tuned version of wghts/checkpoint-20000 on the Persian (fa) subset of the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.3174
Wer: 0.3022
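
For readers unfamiliar with the metric, here is a minimal sketch of how word error rate (WER) is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. This is not necessarily the exact implementation used to produce the numbers above. The function name `wer` and the example sentences are illustrative assumptions.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length.

    Illustrative helper, not the evaluation script used for this model.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


exact = wer("the cat sat", "the cat sat")
one_substitution = wer("the cat sat", "the dog sat")
many_insertions = wer("hello world", "hello there big world now")
```

Because insertions count as errors while the denominator is the reference length, WER can exceed 1.0. That is why a value such as 1.0001 early in training is not necessarily a logging error.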

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 32
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 6
total_train_batch_size: 192
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 250.0
mixed_precision_training: Native AMP
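
The relationship between these values can be sketched in plain Python. The effective batch size below assumes a single device, which is consistent with 32 × 6 = 192 but is not stated in the card. The schedule assumes 43000 total steps (the last step in the results table) and uses the usual linear warmup followed by linear decay. All constant and function names are illustrative.

```python
TRAIN_BATCH_SIZE = 32   # train_batch_size from the list above
GRAD_ACCUM_STEPS = 6    # gradient_accumulation_steps
NUM_DEVICES = 1         # assumption: not stated in the card
PEAK_LR = 1e-06         # learning_rate
WARMUP_STEPS = 100      # lr_scheduler_warmup_steps
TOTAL_STEPS = 43000     # assumption: final step in the results table


def effective_batch_size(per_device: int, accum: int, devices: int) -> int:
    """Samples contributing to each optimizer update."""
    return per_device * accum * devices


def linear_lr(step: int,
              peak: float = PEAK_LR,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak, then linear decay to 0 at `total`."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


total_train_batch_size = effective_batch_size(
    TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES
)
lr_at_start = linear_lr(0)
lr_mid_warmup = linear_lr(50)
lr_at_peak = linear_lr(WARMUP_STEPS)
lr_at_end = linear_lr(TOTAL_STEPS)
```

Under these assumptions the learning rate reaches its 1e-06 peak after 100 steps and then decays for the rest of the run. If training was resumed from a checkpoint, the schedule actually applied may have differed.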

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
3.5847	1.93	500	3.5104	1.0
2.7858	3.86	1000	2.9601	1.0001
1.6827	5.79	1500	0.7853	0.7030
1.4656	7.72	2000	0.6076	0.6014
1.3693	9.65	2500	0.5114	0.5307
1.379	11.58	3000	0.4666	0.4940
1.2832	13.51	3500	0.4257	0.4593
1.1931	15.44	4000	0.4039	0.4427
1.2911	17.37	4500	0.3956	0.4295
1.1577	19.3	5000	0.3705	0.4114
1.1135	21.24	5500	0.3740	0.4010
1.19	23.17	6000	0.3611	0.3935
1.1008	25.1	6500	0.3503	0.3880
1.0805	27.03	7000	0.3427	0.3781
1.1556	28.96	7500	0.3442	0.3727
1.0596	30.89	8000	0.3398	0.3646
1.0219	32.82	8500	0.3312	0.3660
1.1042	34.75	9000	0.3287	0.3612
1.0273	36.68	9500	0.3236	0.3556
1.0383	38.61	10000	0.3217	0.3558
1.0498	40.54	10500	0.3205	0.3520
0.9969	42.47	11000	0.3125	0.3504
1.0658	44.4	11500	0.3120	0.3493
0.992	46.33	12000	0.3137	0.3476
0.9737	48.26	12500	0.3085	0.3413
1.0817	50.19	13000	0.3091	0.3418
0.9414	52.12	13500	0.3072	0.3344
0.9295	54.05	14000	0.3039	0.3322
1.0248	55.98	14500	0.2991	0.3325
0.9474	57.91	15000	0.3032	0.3348
0.928	59.85	15500	0.2999	0.3285
1.0321	61.78	16000	0.2982	0.3253
0.9255	63.71	16500	0.2970	0.3231
0.8928	65.64	17000	0.2993	0.3250
1.008	67.57	17500	0.2985	0.3222
0.9371	69.5	18000	0.2968	0.3216
0.9077	71.43	18500	0.3011	0.3299
1.0044	73.36	19000	0.3053	0.3306
0.9625	75.29	19500	0.3159	0.3295
0.9816	77.22	20000	0.3080	0.3304
0.9587	119.19	20500	0.3088	0.3284
0.9178	122.09	21000	0.3132	0.3320
1.0282	125.0	21500	0.3099	0.3266
0.9337	127.9	22000	0.3110	0.3317
0.8822	130.81	22500	0.3037	0.3247
0.9644	133.72	23000	0.3037	0.3238
0.9214	136.62	23500	0.3040	0.3234
0.9167	139.53	24000	0.3079	0.3203
0.9047	142.44	24500	0.3018	0.3177
0.8909	145.35	25000	0.3053	0.3181
0.9646	148.25	25500	0.3095	0.3229
0.8802	151.16	26000	0.3111	0.3192
0.8411	154.07	26500	0.3068	0.3123
0.9235	156.97	27000	0.3090	0.3177
0.8943	159.88	27500	0.3115	0.3179
0.8854	162.79	28000	0.3052	0.3157
0.8734	165.69	28500	0.3077	0.3124
0.8515	168.6	29000	0.3117	0.3128
0.912	171.51	29500	0.3039	0.3121
0.8669	174.42	30000	0.3120	0.3123
0.823	177.32	30500	0.3148	0.3118
0.9129	180.23	31000	0.3179	0.3101
0.8255	183.14	31500	0.3164	0.3114
0.8948	186.05	32000	0.3128	0.3101
0.8397	188.95	32500	0.3143	0.3068
0.8341	191.86	33000	0.3127	0.3136
0.873	194.76	33500	0.3149	0.3124
0.8232	197.67	34000	0.3166	0.3086
0.8002	200.58	34500	0.3149	0.3061
0.8621	203.49	35000	0.3160	0.3093
0.8123	206.39	35500	0.3141	0.3063
0.7995	209.3	36000	0.3174	0.3075
0.8271	212.21	36500	0.3173	0.3043
0.8059	215.12	37000	0.3176	0.3079
0.8835	218.02	37500	0.3169	0.3062
0.8027	220.93	38000	0.3203	0.3098
0.775	223.83	38500	0.3159	0.3068
0.8487	226.74	39000	0.3161	0.3072
0.7929	229.65	39500	0.3143	0.3037
0.7653	232.56	40000	0.3160	0.3048
0.8211	235.46	40500	0.3173	0.3031
0.7761	238.37	41000	0.3176	0.3025
0.7761	241.28	41500	0.3179	0.3027
0.7903	244.19	42000	0.3181	0.3016
0.7807	247.09	42500	0.3170	0.3027
0.8406	250.0	43000	0.3174	0.3022
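
The Epoch column jumps from 77.22 at step 20000 to 119.19 at step 20500. A quick check of steps per epoch on both sides of the jump, using only values copied from the table, suggests the epoch counter was computed with a different epoch length after step 20000. One plausible reading, not confirmed by the card, is that training was resumed from checkpoint-20000 with a different number of steps per epoch. The variable names are illustrative.

```python
# (step, epoch) pairs copied from the results table.
before_jump = (20000, 77.22)
after_jump = (20500, 119.19)
final_row = (43000, 250.0)

# Steps per epoch implied by the rows up to step 20000.
steps_per_epoch_early = before_jump[0] / before_jump[1]

# Steps per epoch implied by the final row, counted from step 0.
steps_per_epoch_late = final_row[0] / final_row[1]

# If the later epoch length is applied from step 0, the epoch value
# logged at step 20500 should be reproduced.
reconstructed_epoch = after_jump[0] / steps_per_epoch_late

print(f"early steps/epoch: {steps_per_epoch_early:.1f}")
print(f"late steps/epoch:  {steps_per_epoch_late:.1f}")
print(f"epoch at step 20500 under late rate: {reconstructed_epoch:.2f}")
```

The Step column is continuous across the jump, so the loss and WER curves are best read against Step rather than Epoch.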

Framework versions

Transformers 4.17.0.dev0
Pytorch 1.10.2
Datasets 1.18.3.dev0
Tokenizers 0.10.3

Model: ghofrani/xls-r-1b-fa-cv8