Accent Classification

Whisper-based audio classification model trained on the EdAcc dataset.

This model was trained for 15 epochs.
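A minimal inference sketch, assuming the checkpoint loads through the generic `audio-classification` pipeline in the `transformers` library, which this card does not confirm. The repository name is the one shown on the model page; `classify_accent` and the file name `sample.wav` are hypothetical names used only for illustration.

```python
MODEL_ID = "ml-for-speech/accent-classification"  # repository name from the model page


def classify_accent(audio_path, top_k=3):
    """Return the top_k predicted labels with scores for one audio file.

    Assumption: the checkpoint is compatible with the generic
    "audio-classification" pipeline; the card does not state this.
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=MODEL_ID)
    # Each prediction is a dict with "label" and "score" keys.
    return classifier(audio_path, top_k=top_k)


# Example call ("sample.wav" is a placeholder, not a file shipped with the model):
#   for prediction in classify_accent("sample.wav"):
#       print(prediction["label"], round(prediction["score"], 3))
```

Decoding a file path this way typically requires ffmpeg on the system; passing a raw waveform array is an alternative.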

Evaluation

After the final epoch (15), the model achieves the following results on the evaluation set:

  • Loss: 0.0340
  • Accuracy: 0.9960
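For readers unfamiliar with these two metrics, the sketch below shows how a mean loss and an accuracy can be computed from classifier logits. It assumes the reported loss is the mean softmax cross-entropy, which is the usual choice for single-label classification but is not stated in this card. The function names and the toy logits are illustrative only.

```python
import math


def cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax over logits."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[label]


def evaluate(batch_logits, labels):
    """Return (mean loss, accuracy) for a list of logit vectors and true labels."""
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, labels)]
    correct = sum(
        max(range(len(l)), key=l.__getitem__) == y  # argmax equals the true label
        for l, y in zip(batch_logits, labels)
    )
    return sum(losses) / len(labels), correct / len(labels)


# Toy data: two examples, three classes, both predicted correctly.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
toy_labels = [0, 1]
mean_loss, accuracy = evaluate(toy_logits, toy_labels)
```

A low loss together with a high accuracy, as reported above, means the correct class is both chosen and assigned high probability.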

Hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 0
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 15.0
  • mixed_precision_training: Native AMP
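The learning-rate settings above can be illustrated with a short sketch. It assumes the common linear schedule shape: a linear ramp from 0 to the peak rate during warmup, then a linear decay to 0 at the last step. The total of 4350 steps is taken from the training log in this card; the exact scheduler implementation used by the author is not stated, so treat this as an approximation.

```python
BASE_LR = 1e-05                    # learning_rate
TOTAL_STEPS = 4350                 # final step in the training log (15 epochs)
WARMUP_STEPS = TOTAL_STEPS // 10   # lr_scheduler_warmup_ratio 0.1 -> 435 steps


def learning_rate(step):
    """Learning rate at an optimizer step under linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay linearly from BASE_LR to 0 over the remaining steps.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)
```

Under these assumptions the peak rate is reached at step 435, roughly halfway through the second epoch.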

Training Results

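| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 2.75          | 1.0   | 290  | 2.6410          | 0.2788   |
| 1.7626        | 2.0   | 580  | 1.6589          | 0.6165   |
| 1.2434        | 3.0   | 870  | 1.0752          | 0.7746   |
| 0.903         | 4.0   | 1160 | 0.7628          | 0.8295   |
| 0.6878        | 5.0   | 1450 | 0.5528          | 0.8865   |
| 0.5063        | 6.0   | 1740 | 0.4051          | 0.9189   |
| 0.3312        | 7.0   | 2030 | 0.3041          | 0.9471   |
| 0.2562        | 8.0   | 2320 | 0.2236          | 0.9629   |
| 0.2163        | 9.0   | 2610 | 0.1633          | 0.9757   |
| 0.127         | 10.0  | 2900 | 0.1250          | 0.9813   |
| 0.1187        | 11.0  | 3190 | 0.0842          | 0.9888   |
| 0.0649        | 12.0  | 3480 | 0.0604          | 0.9920   |
| 0.0506        | 13.0  | 3770 | 0.0447          | 0.9945   |
| 0.0514        | 14.0  | 4060 | 0.0370          | 0.9956   |
| 0.0311        | 15.0  | 4350 | 0.0340          | 0.9960   |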
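The step counts in the log allow a rough estimate of the training set size. The sketch below assumes a single device, no gradient accumulation, and that a final partial batch is kept rather than dropped; none of these are stated in this card, so the result is a plausible range rather than a confirmed figure.

```python
TRAIN_BATCH_SIZE = 32          # train_batch_size
FINAL_STEP = 4350              # last step in the training log
NUM_EPOCHS = 15

steps_per_epoch = FINAL_STEP // NUM_EPOCHS                    # 290

# A final partial batch still counts as one step, so the example count is a range.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE             # 9280
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1   # 9249

print(f"{steps_per_epoch} steps per epoch, "
      f"about {min_examples} to {max_examples} training examples")
```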

Disclaimer

This model may make mistakes and should not be used in high-risk situations.

THE MODEL IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS MODEL INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS MODEL.

Model Details

  • Format: Safetensors
  • Model size: 20.7M params
  • Tensor type: F32