common8

This model is a fine-tuned version of wghts/checkpoint-20000 on the FA (Persian) configuration of the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3174
  • Wer: 0.3022
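
Wer here is taken to mean word error rate: the number of word-level substitutions, deletions, and insertions needed to turn the model transcript into the reference, divided by the number of reference words. The card does not say which implementation produced the figure, so the following is only a minimal standard-library sketch of the metric; the function name `word_error_rate` is illustrative and not part of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 1.0 when the hypothesis contains many extra words.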

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 192
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 250.0
  • mixed_precision_training: Native AMP
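
The batch-size and scheduler entries above can be checked with a little arithmetic. The sketch below is an illustration rather than the training script itself: it assumes a single device (which is what 32 × 6 = 192 implies) and takes the total step count, 43000, from the final row of the training results table. The function names are hypothetical; the schedule follows the usual linear warmup and linear decay shape, as in the `get_linear_schedule_with_warmup` helper in Transformers.

```python
TRAIN_BATCH_SIZE = 32    # train_batch_size from the card
GRAD_ACCUM_STEPS = 6     # gradient_accumulation_steps from the card
NUM_DEVICES = 1          # assumption: not stated in the card
BASE_LR = 1e-6           # learning_rate from the card
WARMUP_STEPS = 100       # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 43000      # assumption: last step listed in the results table


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
for step in (0, 50, 100, 21550, 43000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-06 at step 100, is half of that at step 21550, and reaches zero at step 43000.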

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 3.5847 | 1.93 | 500 | 3.5104 | 1.0 |
| 2.7858 | 3.86 | 1000 | 2.9601 | 1.0001 |
| 1.6827 | 5.79 | 1500 | 0.7853 | 0.7030 |
| 1.4656 | 7.72 | 2000 | 0.6076 | 0.6014 |
| 1.3693 | 9.65 | 2500 | 0.5114 | 0.5307 |
| 1.379 | 11.58 | 3000 | 0.4666 | 0.4940 |
| 1.2832 | 13.51 | 3500 | 0.4257 | 0.4593 |
| 1.1931 | 15.44 | 4000 | 0.4039 | 0.4427 |
| 1.2911 | 17.37 | 4500 | 0.3956 | 0.4295 |
| 1.1577 | 19.3 | 5000 | 0.3705 | 0.4114 |
| 1.1135 | 21.24 | 5500 | 0.3740 | 0.4010 |
| 1.19 | 23.17 | 6000 | 0.3611 | 0.3935 |
| 1.1008 | 25.1 | 6500 | 0.3503 | 0.3880 |
| 1.0805 | 27.03 | 7000 | 0.3427 | 0.3781 |
| 1.1556 | 28.96 | 7500 | 0.3442 | 0.3727 |
| 1.0596 | 30.89 | 8000 | 0.3398 | 0.3646 |
| 1.0219 | 32.82 | 8500 | 0.3312 | 0.3660 |
| 1.1042 | 34.75 | 9000 | 0.3287 | 0.3612 |
| 1.0273 | 36.68 | 9500 | 0.3236 | 0.3556 |
| 1.0383 | 38.61 | 10000 | 0.3217 | 0.3558 |
| 1.0498 | 40.54 | 10500 | 0.3205 | 0.3520 |
| 0.9969 | 42.47 | 11000 | 0.3125 | 0.3504 |
| 1.0658 | 44.4 | 11500 | 0.3120 | 0.3493 |
| 0.992 | 46.33 | 12000 | 0.3137 | 0.3476 |
| 0.9737 | 48.26 | 12500 | 0.3085 | 0.3413 |
| 1.0817 | 50.19 | 13000 | 0.3091 | 0.3418 |
| 0.9414 | 52.12 | 13500 | 0.3072 | 0.3344 |
| 0.9295 | 54.05 | 14000 | 0.3039 | 0.3322 |
| 1.0248 | 55.98 | 14500 | 0.2991 | 0.3325 |
| 0.9474 | 57.91 | 15000 | 0.3032 | 0.3348 |
| 0.928 | 59.85 | 15500 | 0.2999 | 0.3285 |
| 1.0321 | 61.78 | 16000 | 0.2982 | 0.3253 |
| 0.9255 | 63.71 | 16500 | 0.2970 | 0.3231 |
| 0.8928 | 65.64 | 17000 | 0.2993 | 0.3250 |
| 1.008 | 67.57 | 17500 | 0.2985 | 0.3222 |
| 0.9371 | 69.5 | 18000 | 0.2968 | 0.3216 |
| 0.9077 | 71.43 | 18500 | 0.3011 | 0.3299 |
| 1.0044 | 73.36 | 19000 | 0.3053 | 0.3306 |
| 0.9625 | 75.29 | 19500 | 0.3159 | 0.3295 |
| 0.9816 | 77.22 | 20000 | 0.3080 | 0.3304 |
| 0.9587 | 119.19 | 20500 | 0.3088 | 0.3284 |
| 0.9178 | 122.09 | 21000 | 0.3132 | 0.3320 |
| 1.0282 | 125.0 | 21500 | 0.3099 | 0.3266 |
| 0.9337 | 127.9 | 22000 | 0.3110 | 0.3317 |
| 0.8822 | 130.81 | 22500 | 0.3037 | 0.3247 |
| 0.9644 | 133.72 | 23000 | 0.3037 | 0.3238 |
| 0.9214 | 136.62 | 23500 | 0.3040 | 0.3234 |
| 0.9167 | 139.53 | 24000 | 0.3079 | 0.3203 |
| 0.9047 | 142.44 | 24500 | 0.3018 | 0.3177 |
| 0.8909 | 145.35 | 25000 | 0.3053 | 0.3181 |
| 0.9646 | 148.25 | 25500 | 0.3095 | 0.3229 |
| 0.8802 | 151.16 | 26000 | 0.3111 | 0.3192 |
| 0.8411 | 154.07 | 26500 | 0.3068 | 0.3123 |
| 0.9235 | 156.97 | 27000 | 0.3090 | 0.3177 |
| 0.8943 | 159.88 | 27500 | 0.3115 | 0.3179 |
| 0.8854 | 162.79 | 28000 | 0.3052 | 0.3157 |
| 0.8734 | 165.69 | 28500 | 0.3077 | 0.3124 |
| 0.8515 | 168.6 | 29000 | 0.3117 | 0.3128 |
| 0.912 | 171.51 | 29500 | 0.3039 | 0.3121 |
| 0.8669 | 174.42 | 30000 | 0.3120 | 0.3123 |
| 0.823 | 177.32 | 30500 | 0.3148 | 0.3118 |
| 0.9129 | 180.23 | 31000 | 0.3179 | 0.3101 |
| 0.8255 | 183.14 | 31500 | 0.3164 | 0.3114 |
| 0.8948 | 186.05 | 32000 | 0.3128 | 0.3101 |
| 0.8397 | 188.95 | 32500 | 0.3143 | 0.3068 |
| 0.8341 | 191.86 | 33000 | 0.3127 | 0.3136 |
| 0.873 | 194.76 | 33500 | 0.3149 | 0.3124 |
| 0.8232 | 197.67 | 34000 | 0.3166 | 0.3086 |
| 0.8002 | 200.58 | 34500 | 0.3149 | 0.3061 |
| 0.8621 | 203.49 | 35000 | 0.3160 | 0.3093 |
| 0.8123 | 206.39 | 35500 | 0.3141 | 0.3063 |
| 0.7995 | 209.3 | 36000 | 0.3174 | 0.3075 |
| 0.8271 | 212.21 | 36500 | 0.3173 | 0.3043 |
| 0.8059 | 215.12 | 37000 | 0.3176 | 0.3079 |
| 0.8835 | 218.02 | 37500 | 0.3169 | 0.3062 |
| 0.8027 | 220.93 | 38000 | 0.3203 | 0.3098 |
| 0.775 | 223.83 | 38500 | 0.3159 | 0.3068 |
| 0.8487 | 226.74 | 39000 | 0.3161 | 0.3072 |
| 0.7929 | 229.65 | 39500 | 0.3143 | 0.3037 |
| 0.7653 | 232.56 | 40000 | 0.3160 | 0.3048 |
| 0.8211 | 235.46 | 40500 | 0.3173 | 0.3031 |
| 0.7761 | 238.37 | 41000 | 0.3176 | 0.3025 |
| 0.7761 | 241.28 | 41500 | 0.3179 | 0.3027 |
| 0.7903 | 244.19 | 42000 | 0.3181 | 0.3016 |
| 0.7807 | 247.09 | 42500 | 0.3170 | 0.3027 |
| 0.8406 | 250.0 | 43000 | 0.3174 | 0.3022 |
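
The lowest validation loss and the lowest Wer in these results fall on different steps, and neither is the final step reported at the top of the card. A small sketch of how one might pick a checkpoint per metric is shown below; the rows are copied from a handful of entries in the results, and the helper `best_row` is a hypothetical name used only for illustration.

```python
# (step, validation_loss, wer) for a few rows of the training results
ROWS = [
    (18000, 0.2968, 0.3216),
    (20000, 0.3080, 0.3304),
    (30000, 0.3120, 0.3123),
    (42000, 0.3181, 0.3016),
    (43000, 0.3174, 0.3022),
]

COLUMN_INDEX = {"validation_loss": 1, "wer": 2}


def best_row(rows, column):
    """Return the row with the smallest value in the named column."""
    index = COLUMN_INDEX[column]
    return min(rows, key=lambda row: row[index])


print(best_row(ROWS, "validation_loss"))  # (18000, 0.2968, 0.3216)
print(best_row(ROWS, "wer"))              # (42000, 0.3181, 0.3016)
```

Which checkpoint counts as best depends on the metric: selecting on Wer favors step 42000, while selecting on validation loss favors step 18000.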

Framework versions

  • Transformers 4.17.0.dev0
  • PyTorch 1.10.2
  • Datasets 1.18.3.dev0
  • Tokenizers 0.10.3

Dataset used to train ghofrani/xls-r-1b-fa-cv8