metadata
license: apache-2.0
tags:
  - audio-classification
  - generated_from_trainer
datasets:
  - superb
metrics:
  - accuracy
model-index:
  - name: w2v2-ks-jpqd-finetuned-student
    results: []

w2v2-ks-jpqd-finetuned-student

This model is a fine-tuned version of anton-l/wav2vec2-base-ft-keyword-spotting on the superb dataset. It achieves the following results on the evaluation set after the final training epoch (epoch 15, step 5985):

  • Loss: 0.0641
  • Accuracy: 0.9815

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 15.0
  • mixed_precision_training: Native AMP
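The derived quantities in the list above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device, takes the 399 steps per epoch from the results table, and assumes the warmup step count is rounded up from `total_steps * warmup_ratio`. The helper `linear_lr` is a hypothetical name that mirrors the usual linear warmup and linear decay shape; it is not code taken from the training run.

```python
import math

# Values copied from the hyperparameter list and the results table.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
STEPS_PER_EPOCH = 399  # optimizer steps reported at epoch 1.0
NUM_EPOCHS = 15

# Effective batch size per optimizer step (assumption: one device).
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Total optimizer steps and warmup steps (assumption: rounded up).
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)
print(total_steps)
print(warmup_steps)
```

Under these assumptions the effective batch size is 32 × 4 = 128, which matches the reported total_train_batch_size, and the learning rate peaks at 0.0002 once warmup ends.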

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 0.4606        | 1.0   | 399  | 0.1543          | 0.9723   |
| 14.8746       | 2.0   | 798  | 14.9490         | 0.9681   |
| 24.7043       | 3.0   | 1197 | 24.6662         | 0.9706   |
| 30.626        | 4.0   | 1596 | 30.4279         | 0.9732   |
| 33.4796       | 5.0   | 1995 | 33.3182         | 0.9750   |
| 34.4405       | 6.0   | 2394 | 34.2327         | 0.9744   |
| 34.1743       | 7.0   | 2793 | 34.0161         | 0.9741   |
| 33.47         | 8.0   | 3192 | 33.2669         | 0.9748   |
| 0.2278        | 9.0   | 3591 | 0.1125          | 0.9757   |
| 0.2259        | 10.0  | 3990 | 0.0848          | 0.9778   |
| 0.1629        | 11.0  | 4389 | 0.0734          | 0.9788   |
| 0.1658        | 12.0  | 4788 | 0.0736          | 0.9803   |
| 0.2264        | 13.0  | 5187 | 0.0658          | 0.9803   |
| 0.1564        | 14.0  | 5586 | 0.0677          | 0.9819   |
| 0.1716        | 15.0  | 5985 | 0.0641          | 0.9815   |
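To compare checkpoints, the per-epoch results above can be scanned for the best accuracy and the lowest validation loss. This is a minimal sketch: the values are copied from the results, while `EpochResult` and the variable names are hypothetical. The card does not explain why the losses for epochs 2 to 8 sit on a much larger scale, so accuracy may be the safer metric to compare across all epochs.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    train_loss: float
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# Rows copied from the training results, in the same column order.
RESULTS = [
    EpochResult(0.4606, 1, 399, 0.1543, 0.9723),
    EpochResult(14.8746, 2, 798, 14.9490, 0.9681),
    EpochResult(24.7043, 3, 1197, 24.6662, 0.9706),
    EpochResult(30.626, 4, 1596, 30.4279, 0.9732),
    EpochResult(33.4796, 5, 1995, 33.3182, 0.9750),
    EpochResult(34.4405, 6, 2394, 34.2327, 0.9744),
    EpochResult(34.1743, 7, 2793, 34.0161, 0.9741),
    EpochResult(33.47, 8, 3192, 33.2669, 0.9748),
    EpochResult(0.2278, 9, 3591, 0.1125, 0.9757),
    EpochResult(0.2259, 10, 3990, 0.0848, 0.9778),
    EpochResult(0.1629, 11, 4389, 0.0734, 0.9788),
    EpochResult(0.1658, 12, 4788, 0.0736, 0.9803),
    EpochResult(0.2264, 13, 5187, 0.0658, 0.9803),
    EpochResult(0.1564, 14, 5586, 0.0677, 0.9819),
    EpochResult(0.1716, 15, 5985, 0.0641, 0.9815),
]

# Epoch with the highest evaluation accuracy.
best_by_accuracy = max(RESULTS, key=lambda r: r.accuracy)

# Epoch with the lowest validation loss.
best_by_val_loss = min(RESULTS, key=lambda r: r.val_loss)

print(best_by_accuracy.epoch, best_by_accuracy.accuracy)
print(best_by_val_loss.epoch, best_by_val_loss.val_loss)
```

The two criteria pick different epochs: accuracy peaks at epoch 14 (0.9819), while the final epoch 15 has the lowest validation loss (0.0641) and is the one reported at the top of the card.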

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.13.1+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2