metadata
license: apache-2.0
base_model: facebook/wav2vec2-base
tags:
  - automatic-speech-recognition
  - timit_asr
  - generated_from_trainer
datasets:
  - timit_asr
metrics:
  - wer
model-index:
  - name: wav2vec2-base-timit-fine-tuned
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: TIMIT_ASR - NA
          type: timit_asr
          config: clean
          split: test
          args: 'Config: na, Training split: train, Eval split: test'
        metrics:
          - name: Wer
            type: wer
            value: 0.41728125284530637

wav2vec2-base-timit-fine-tuned

This model is a fine-tuned version of facebook/wav2vec2-base on the TIMIT_ASR - NA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4275
  • Wer: 0.4173

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

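Per the model-index metadata, the model was trained on the train split of timit_asr and evaluated on the test split. More information needed.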

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
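
To illustrate how the learning_rate, lr_scheduler_type, and lr_scheduler_warmup_steps settings interact, here is a small sketch of a linear schedule with warmup: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the end of training. The function name `linear_schedule_lr` is hypothetical, and `total_steps=2320` is an assumption inferred from the training results (roughly 116 optimizer steps per epoch over 20 epochs), not a value stated in this card.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=1000, total_steps=2320):
    """Learning rate at a given optimizer step for linear warmup followed
    by linear decay. total_steps is an assumed value (see lead-in)."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Sample the schedule at a few steps to see its shape.
schedule_samples = {step: linear_schedule_lr(step) for step in (0, 500, 1000, 1660, 2320)}
```

Under these assumptions the peak rate of 0.0001 is reached only at step 1000, so the warmup covers a large share of the run.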

Training results

Training Loss   Epoch     Step   Validation Loss   Wer
3.1618          0.8621    100    3.1117            1.0
2.9798          1.7241    200    2.9736            1.0
2.9144          2.5862    300    2.9075            1.0
2.1714          3.4483    400    2.0945            1.0325
1.1579          4.3103    500    1.0451            0.8299
0.6087          5.1724    600    0.6754            0.6441
0.481           6.0345    700    0.5275            0.5761
0.3072          6.8966    800    0.4836            0.5264
0.332           7.7586    900    0.4403            0.5234
0.1876          8.6207    1000   0.4758            0.5222
0.2232          9.4828    1100   0.4508            0.4892
0.1332          10.3448   1200   0.4394            0.4740
0.1085          11.2069   1300   0.4466            0.4621
0.098           12.0690   1400   0.4230            0.4493
0.1219          12.9310   1500   0.4180            0.4460
0.1021          13.7931   1600   0.4179            0.4406
0.0741          14.6552   1700   0.4113            0.4309
0.0896          15.5172   1800   0.4392            0.4308
0.0492          16.3793   1900   0.4202            0.4313
0.0759          17.2414   2000   0.4348            0.4207
0.0406          18.1034   2100   0.4419            0.4205
0.074           18.9655   2200   0.4306            0.4200
0.0378          19.8276   2300   0.4273            0.4173
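
The evaluation log above can be inspected programmatically, for example to see which checkpoint is best under each metric and how many optimizer steps make up an epoch. The sketch below copies the epoch, step, validation loss, and Wer columns from the table; the `EvalRow` type and the variable names are hypothetical, and this is not how the published checkpoint was selected.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    wer: float


# Values copied from the training results table.
rows = [
    EvalRow(0.8621, 100, 3.1117, 1.0),     EvalRow(1.7241, 200, 2.9736, 1.0),
    EvalRow(2.5862, 300, 2.9075, 1.0),     EvalRow(3.4483, 400, 2.0945, 1.0325),
    EvalRow(4.3103, 500, 1.0451, 0.8299),  EvalRow(5.1724, 600, 0.6754, 0.6441),
    EvalRow(6.0345, 700, 0.5275, 0.5761),  EvalRow(6.8966, 800, 0.4836, 0.5264),
    EvalRow(7.7586, 900, 0.4403, 0.5234),  EvalRow(8.6207, 1000, 0.4758, 0.5222),
    EvalRow(9.4828, 1100, 0.4508, 0.4892), EvalRow(10.3448, 1200, 0.4394, 0.4740),
    EvalRow(11.2069, 1300, 0.4466, 0.4621), EvalRow(12.0690, 1400, 0.4230, 0.4493),
    EvalRow(12.9310, 1500, 0.4180, 0.4460), EvalRow(13.7931, 1600, 0.4179, 0.4406),
    EvalRow(14.6552, 1700, 0.4113, 0.4309), EvalRow(15.5172, 1800, 0.4392, 0.4308),
    EvalRow(16.3793, 1900, 0.4202, 0.4313), EvalRow(17.2414, 2000, 0.4348, 0.4207),
    EvalRow(18.1034, 2100, 0.4419, 0.4205), EvalRow(19.8276, 2300, 0.4273, 0.4173),
    EvalRow(18.9655, 2200, 0.4306, 0.4200),
]
rows.sort(key=lambda r: r.step)

# The two metrics disagree on the best evaluation point.
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_wer = min(rows, key=lambda r: r.wer)

# Optimizer steps per epoch, inferred from the first logged row.
steps_per_epoch = round(rows[0].step / rows[0].epoch)

print(f"lowest validation loss: step {best_by_loss.step} ({best_by_loss.val_loss})")
print(f"lowest Wer: step {best_by_wer.step} ({best_by_wer.wer})")
print(f"steps per epoch: {steps_per_epoch}")
```

Validation loss bottoms out well before Wer does, so which checkpoint counts as "best" depends on the metric chosen for selection.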

Framework versions

  • Transformers 4.42.0.dev0
  • PyTorch 2.3.0.post300
  • Datasets 2.19.1
  • Tokenizers 0.19.1