wav2vec2-xlsr-korean-dialect-recognition

This model is a fine-tuned version of fleek/wav2vec-large-xlsr-korean; the training dataset is not specified. It achieves the following results on the evaluation set at the final logged checkpoint (step 6000):

  • Loss: 0.0752
  • Accuracy: 0.9783
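
The card does not say how to run the model, so the following is only a minimal sketch under stated assumptions: that the checkpoint carries a sequence-classification head loadable through `AutoModelForAudioClassification`, that the repository ships a feature-extractor config, and that input audio is mono at 16 kHz (the rate used by XLSR wav2vec 2.0 models). The helper names `softmax`, `top_prediction`, and `classify` are illustrative, not part of any library.

```python
import math

# Repository id as shown on the model page; adjust if you use a local copy.
MODEL_ID = "Flitto/wav2vec2-xlsr-korean-dialect-recognition"


def softmax(scores):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(waveform, sampling_rate=16_000):
    """Classify one mono waveform (1-D float array).

    Assumption: the checkpoint loads as an audio-classification model.
    Heavy dependencies are imported here so the helpers above can be
    used without them.
    """
    import torch
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
    model = AutoModelForAudioClassification.from_pretrained(MODEL_ID)
    model.eval()

    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_prediction(logits, model.config.id2label)
```

Calling `classify(waveform)` downloads the weights on first use. The label names come from the checkpoint's `id2label` mapping, which this card does not document, so they may be generic placeholders.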

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
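
For readers who want to reproduce a similar run, the list above can be written as keyword arguments for `transformers.TrainingArguments`. This is a sketch, not the authors' training script: the values are copied from the list, the mapping of "Native AMP" to `fp16=True` and the single-device setup are assumptions, and the card does not state the dataset or output directory.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names.
training_kwargs = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "fp16": True,  # assumption: "Native AMP" refers to fp16 mixed precision
}

NUM_DEVICES = 1  # assumption: the card does not state the device count

# total_train_batch_size = per-device batch x accumulation steps x devices
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)
print(effective_batch_size)  # 8, matching total_train_batch_size
```

With transformers installed, these could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `"out"` is a placeholder path.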

Training results

Training Loss   Epoch    Step   Validation Loss   Accuracy
0.8264          0.0794    500   0.3492            0.8641
0.674           0.1588   1000   0.2810            0.8985
0.3338          0.2382   1500   0.2596            0.9269
0.3121          0.3176   2000   0.2037            0.9403
0.2074          0.3970   2500   0.1472            0.9494
0.4901          0.4764   3000   0.1448            0.9582
0.2544          0.5558   3500   0.1676            0.9535
0.2138          0.6352   4000   0.1057            0.9684
0.1705          0.7146   4500   0.1463            0.9551
0.4207          0.7940   5000   0.0907            0.9722
0.0229          0.8734   5500   0.0887            0.9738
0.203           0.9528   6000   0.0752            0.9783
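
Although the dataset is not described, the Epoch and Step columns allow a rough estimate of its size. The sketch below assumes that one step is one optimizer update over total_train_batch_size = 8 examples, and it inherits the rounding of the Epoch column, so the result is approximate.

```python
# Last row of the training results table: step 6000 at epoch 0.9528.
last_step = 6000
last_epoch = 0.9528
total_train_batch_size = 8  # from the hyperparameter list

# Optimizer steps needed to complete one full epoch.
steps_per_epoch = last_step / last_epoch

# Assumption: each step consumes exactly total_train_batch_size examples.
approx_train_examples = steps_per_epoch * total_train_batch_size

print(f"steps per epoch   ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_train_examples:.0f}")
```

Under these assumptions the training set comes out at roughly 50,000 examples, and the last logged evaluation falls about 5% short of a full epoch.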

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size: 316M params (Safetensors, tensor type F32)

Model repository: Flitto/wav2vec2-xlsr-korean-dialect-recognition