language: id
license: apache-2.0
tags:
  - audio-classification
  - generated_from_trainer
metrics:
  - accuracy
  - f1
model-index:
  - name: distil-wav2vec2-adult-child-id-cls-52m
    results: []

DistilWav2Vec2 Adult/Child Indonesian Speech Classifier 52M

DistilWav2Vec2 Adult/Child Indonesian Speech Classifier is an audio classification model based on the wav2vec 2.0 architecture. It is a distilled version of wav2vec2-adult-child-id-cls, trained on a private adult/child Indonesian speech classification dataset.

This model was trained using Hugging Face's PyTorch framework. All training was done on a Tesla P100 GPU provided by Kaggle. Training metrics were logged via TensorBoard.

Model

| Model | #params | Arch. | Training/Validation data |
| --- | --- | --- | --- |
| distil-wav2vec2-adult-child-id-cls-52m | 52M | wav2vec 2.0 | Adult/Child Indonesian Speech Classification Dataset |
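
A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and loads through the Transformers `audio-classification` pipeline. The repository id passed as `model_id`, the audio file name, and the label strings are placeholders for illustration; none of them are taken from this card. Reading an audio file by path may also require `ffmpeg` to be installed.

```python
def top_label(predictions):
    """Return the highest-scoring label from pipeline-style output.

    `predictions` is a list of {"label": str, "score": float} dicts, which is
    the shape the audio-classification pipeline returns.
    """
    return max(predictions, key=lambda p: p["score"])["label"]


def classify(audio_path, model_id):
    """Classify one audio file with a Hub checkpoint.

    `model_id` is a placeholder for the Hub repository that hosts this model.
    """
    # Imported lazily so the helper above can be used without Transformers.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_label(classifier(audio_path))


# Hypothetical call; replace both arguments with real values:
# classify("sample.wav", "<namespace>/distil-wav2vec2-adult-child-id-cls-52m")
```

The label names the model actually emits depend on its `id2label` configuration, so inspect the raw pipeline output before relying on a particular string.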

Evaluation Results

The model achieves the following results on evaluation:

| Dataset | Loss | Accuracy | F1 |
| --- | --- | --- | --- |
| Adult/Child Indonesian Speech Classification | 0.1560 | 94.89% | 0.9480 |

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 7
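
The hyperparameters above fit together with a little arithmetic, sketched here in plain Python. The effective batch size follows from the listed values; the total of 532 optimizer steps is taken from the last row of the training results. Two things are assumptions: that training ran on a single device, and that the warmup step count is rounded up, which mirrors how the Transformers `Trainer` handles `warmup_ratio`. The `linear_lr` function is an illustrative re-implementation of a linear schedule with warmup, not the library code.

```python
import math

train_batch_size = 32
gradient_accumulation_steps = 4
base_lr = 3e-05
warmup_ratio = 0.1
total_steps = 532  # final step reported in the training results

# Samples seen per optimizer update (single device assumed).
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Warmup length, rounded up (assumed to match the Trainer's behaviour).
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, warmup_steps, total_steps):
    print(step, linear_lr(step, base_lr, warmup_steps, total_steps))
```

Under these assumptions the learning rate peaks at 3e-05 when warmup ends and reaches zero at the final step.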

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
| --- | --- | --- | --- | --- | --- |
| 0.2494 | 1.0 | 76 | 0.1706 | 0.9454 | 0.9421 |
| 0.2015 | 2.0 | 152 | 0.1519 | 0.9483 | 0.9464 |
| 0.1674 | 3.0 | 228 | 0.1560 | 0.9489 | 0.9480 |
| 0.1596 | 4.0 | 304 | 0.1760 | 0.9449 | 0.9414 |
| 0.0873 | 5.0 | 380 | 0.1825 | 0.9478 | 0.9452 |
| 0.0996 | 6.0 | 456 | 0.1733 | 0.9478 | 0.9460 |
| 0.1055 | 7.0 | 532 | 0.1749 | 0.9454 | 0.9433 |
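
The reported evaluation figures (loss 0.1560, accuracy 94.89%, F1 0.9480) match the epoch 3 row. The card does not say how the final checkpoint was chosen, so the following is only a quick check of which epoch each metric would pick, using the values copied from the table above.

```python
# (epoch, validation_loss, accuracy, f1) copied from the training results
results = [
    (1, 0.1706, 0.9454, 0.9421),
    (2, 0.1519, 0.9483, 0.9464),
    (3, 0.1560, 0.9489, 0.9480),
    (4, 0.1760, 0.9449, 0.9414),
    (5, 0.1825, 0.9478, 0.9452),
    (6, 0.1733, 0.9478, 0.9460),
    (7, 0.1749, 0.9454, 0.9433),
]

best_by_loss = min(results, key=lambda r: r[1])
best_by_accuracy = max(results, key=lambda r: r[2])
best_by_f1 = max(results, key=lambda r: r[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest accuracy: epoch", best_by_accuracy[0])
print("highest F1: epoch", best_by_f1[0])
```

Accuracy and F1 both peak at epoch 3, while validation loss is lowest at epoch 2. Training loss keeps falling after epoch 3 while validation loss rises, which is consistent with mild overfitting in the later epochs.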

Disclaimer

Biases present in the pre-training datasets may carry over into this model's results and should be taken into account.

Authors

DistilWav2Vec2 Adult/Child Indonesian Speech Classifier was trained and evaluated by Ananto Joyoadikusumo. All computation and development were done on Kaggle.

Framework versions

  • Transformers 4.16.2
  • PyTorch 1.10.2+cu102
  • Datasets 1.18.3
  • Tokenizers 0.10.3