adapter_head_full_const_lr_1e-4_l20-l23_const_lr_1e-7_l1-l19

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the common_voice_17_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3638
  • WER: 0.1946
  • CER: 0.0323

Model description

More information needed

Intended uses & limitations

More information needed
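
No usage details are given, so here is a minimal transcription sketch, not an official recipe. It assumes the repository ships a Wav2Vec2-BERT CTC head with a matching processor; the checkpoint path and the 16 kHz mono audio file `sample.wav` are placeholders.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

# Placeholder: substitute the actual repo id or local checkpoint path.
checkpoint = "path/to/checkpoint"

processor = AutoProcessor.from_pretrained(checkpoint)
model = Wav2Vec2BertForCTC.from_pretrained(checkpoint)
model.eval()

# w2v-bert-2.0 expects 16 kHz mono audio.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```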

Training and evaluation data

More information needed

Training procedure
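
The run name (`adapter_head_full_const_lr_1e-4_l20-l23_const_lr_1e-7_l1-l19`) suggests a layer-wise scheme: a constant learning rate of 1e-4 for the adapter, head, and encoder layers 20-23, and 1e-7 for encoder layers 1-19, even though the hyperparameter list below records only a single global rate. How that split was implemented is not documented; the following is a rough sketch using PyTorch optimizer parameter groups, with the name matching and layer indexing as explicit guesses.

```python
import torch
from transformers import Wav2Vec2BertForCTC

model = Wav2Vec2BertForCTC.from_pretrained("facebook/w2v-bert-2.0")

# Guessed split from the run name: small constant LR for lower encoder
# layers, larger constant LR for upper layers plus adapter and CTC head.
# Whether "l1-l19" is 0- or 1-based indexing is an assumption here.
low_lr_params, high_lr_params = [], []
for name, param in model.named_parameters():
    if "encoder.layers." in name:
        layer_idx = int(name.split("encoder.layers.")[1].split(".")[0])
        if layer_idx < 19:           # layers "1-19" -> low LR
            low_lr_params.append(param)
            continue
    high_lr_params.append(param)     # layers 20-23, adapter, head, rest

optimizer = torch.optim.Adam(
    [
        {"params": high_lr_params, "lr": 1e-4},
        {"params": low_lr_params, "lr": 1e-7},
    ],
    betas=(0.9, 0.999),
    eps=1e-8,
)
```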

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
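
As a rough translation into code, these settings correspond to a `transformers.TrainingArguments` along the following lines; the output directory is a placeholder and anything not listed above is left at its default.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./w2v-bert-2.0-ctc",   # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,     # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                         # "Native AMP" mixed precision
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```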

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    | CER    |
|---------------|---------|-------|-----------------|--------|--------|
| 0.1716        | 2.3077  | 750   | 0.2372          | 0.3536 | 0.0576 |
| 0.0888        | 4.6154  | 1500  | 0.2341          | 0.3066 | 0.0509 |
| 0.0487        | 6.9231  | 2250  | 0.2555          | 0.2823 | 0.0467 |
| 0.0221        | 9.2308  | 3000  | 0.2957          | 0.2668 | 0.0444 |
| 0.0193        | 11.5385 | 3750  | 0.3013          | 0.2461 | 0.0411 |
| 0.0162        | 13.8462 | 4500  | 0.3230          | 0.2584 | 0.0431 |
| 0.0107        | 16.1538 | 5250  | 0.3377          | 0.2454 | 0.0408 |
| 0.0106        | 18.4615 | 6000  | 0.3370          | 0.2473 | 0.0413 |
| 0.0111        | 20.7692 | 6750  | 0.3457          | 0.2448 | 0.0414 |
| 0.0084        | 23.0769 | 7500  | 0.3279          | 0.2302 | 0.0387 |
| 0.0083        | 25.3846 | 8250  | 0.3402          | 0.2308 | 0.0382 |
| 0.009         | 27.6923 | 9000  | 0.3411          | 0.2302 | 0.0384 |
| 0.0085        | 30.0    | 9750  | 0.3311          | 0.2292 | 0.0375 |
| 0.006         | 32.3077 | 10500 | 0.3492          | 0.2238 | 0.0371 |
| 0.0063        | 34.6154 | 11250 | 0.3560          | 0.2330 | 0.0381 |
| 0.0064        | 36.9231 | 12000 | 0.3584          | 0.2259 | 0.0379 |
| 0.0054        | 39.2308 | 12750 | 0.3484          | 0.2123 | 0.0351 |
| 0.0041        | 41.5385 | 13500 | 0.3565          | 0.2131 | 0.0356 |
| 0.0044        | 43.8462 | 14250 | 0.3522          | 0.2171 | 0.0363 |
| 0.0025        | 46.1538 | 15000 | 0.3702          | 0.2084 | 0.0350 |
| 0.0073        | 48.4615 | 15750 | 0.3579          | 0.2203 | 0.0360 |
| 0.0048        | 50.7692 | 16500 | 0.3462          | 0.2116 | 0.0353 |
| 0.0053        | 53.0769 | 17250 | 0.3264          | 0.2014 | 0.0337 |
| 0.0028        | 55.3846 | 18000 | 0.3560          | 0.2059 | 0.0343 |
| 0.0039        | 57.6923 | 18750 | 0.3685          | 0.2081 | 0.0348 |
| 0.0026        | 60.0    | 19500 | 0.3649          | 0.2075 | 0.0347 |
| 0.0027        | 62.3077 | 20250 | 0.3636          | 0.2091 | 0.0350 |
| 0.0038        | 64.6154 | 21000 | 0.3675          | 0.2147 | 0.0350 |
| 0.0024        | 66.9231 | 21750 | 0.3707          | 0.2050 | 0.0341 |
| 0.0045        | 69.2308 | 22500 | 0.3397          | 0.1961 | 0.0329 |
| 0.0032        | 71.5385 | 23250 | 0.3645          | 0.1985 | 0.0332 |
| 0.0041        | 73.8462 | 24000 | 0.3451          | 0.2047 | 0.0338 |
| 0.0018        | 76.1538 | 24750 | 0.3468          | 0.1935 | 0.0321 |
| 0.0045        | 78.4615 | 25500 | 0.3366          | 0.1982 | 0.0332 |
| 0.0023        | 80.7692 | 26250 | 0.3551          | 0.1996 | 0.0336 |
| 0.0022        | 83.0769 | 27000 | 0.3778          | 0.1948 | 0.0331 |
| 0.0026        | 85.3846 | 27750 | 0.3622          | 0.1950 | 0.0328 |
| 0.0013        | 87.6923 | 28500 | 0.3600          | 0.1908 | 0.0319 |
| 0.0032        | 90.0    | 29250 | 0.3632          | 0.1945 | 0.0324 |
| 0.0027        | 92.3077 | 30000 | 0.3436          | 0.1913 | 0.0320 |
| 0.002         | 94.6154 | 30750 | 0.3721          | 0.1985 | 0.0334 |
| 0.0022        | 96.9231 | 31500 | 0.3659          | 0.1966 | 0.0330 |
| 0.0025        | 99.2308 | 32250 | 0.3638          | 0.1946 | 0.0323 |
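
WER and CER above are the usual word- and character-error rates. A minimal sketch of reproducing them with the `evaluate` library, using hypothetical example strings in place of real decoded transcripts from the common_voice_17_0 evaluation split:

```python
import evaluate

wer = evaluate.load("wer")
cer = evaluate.load("cer")

# Hypothetical strings; in practice, decode the evaluation split.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```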

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

  • 606M parameters (Safetensors, F32)