
Audio Classification

This repo contains code and notes for this tutorial.

Dataset

The GTZAN music genre dataset (1,000 30-second clips across 10 genres) is used.

Usage

export HUGGINGFACE_TOKEN=<your_token>
python main.py

Performance

Accuracy: 0.81 (default settings)
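
Accuracy here is just the fraction of clips whose predicted genre matches the reference label. A minimal plain-Python sketch of that metric (this is an illustration, not the evaluation code from main.py):

```python
def accuracy(predictions, references):
    """Fraction of predictions that match the reference labels."""
    assert len(predictions) == len(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Toy example: 4 of 5 predicted genre ids match.
accuracy([0, 1, 2, 3, 4], [0, 1, 2, 3, 0])  # → 0.8
```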

Notes

  1. 🤗 Datasets supports the train_test_split() method for splitting a dataset.

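The semantics of that split can be sketched in plain Python (shuffle deterministically, then slice); this is an illustration of what train_test_split() does, not the 🤗 Datasets implementation itself:

```python
import random

def train_test_split(examples, test_size=0.1, seed=42):
    """Shuffle examples with a fixed seed, then slice off a test set."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_test = int(len(examples) * test_size)
    return {"train": examples[n_test:], "test": examples[:n_test]}

splits = train_test_split(range(100), test_size=0.1)
# 90 train examples, 10 test examples, no overlap.
```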
  2. feature_extractor cannot resample audio on its own.

    • To resample, cast the audio column to the feature extractor's sampling rate with cast_column()
from datasets import Audio

gtzan = gtzan.cast_column("audio", Audio(sampling_rate=feature_extractor.sampling_rate))
  3. feature_extractor does the normalization and returns input_values and attention_mask.

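A rough sketch of that normalization and masking in plain Python, assuming zero-mean/unit-variance normalization and right-padding (how Wav2Vec2-style feature extractors behave); the real extractor works on arrays and has more options:

```python
def extract_features(waveforms, max_length):
    """Normalize each waveform to zero mean / unit variance, pad to max_length,
    and return a mask marking real samples (1) vs padding (0)."""
    input_values, attention_mask = [], []
    for wav in waveforms:
        mean = sum(wav) / len(wav)
        var = sum((x - mean) ** 2 for x in wav) / len(wav)
        normed = [(x - mean) / (var ** 0.5 + 1e-7) for x in wav]
        pad = max_length - len(normed)
        input_values.append(normed + [0.0] * pad)
        attention_mask.append([1] * len(wav) + [0] * pad)
    return {"input_values": input_values, "attention_mask": attention_mask}
```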
  4. .map() supports batched preprocessing.

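With batched=True, the mapped function receives a batch of examples at once instead of one example; a plain-Python sketch of that calling convention (not the library internals):

```python
def batched_map(examples, fn, batch_size=2):
    """Apply fn to successive batches and concatenate the results."""
    out = []
    for i in range(0, len(examples), batch_size):
        out.extend(fn(examples[i:i + batch_size]))
    return out

def double_batch(batch):
    # fn sees a whole batch (a list here), not a single example.
    return [x * 2 for x in batch]

batched_map([1, 2, 3, 4, 5], double_batch)  # → [2, 4, 6, 8, 10]
```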
  5. Why does AutoModelForAudioClassification.from_pretrained take label2id and id2label?

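On that last question: the mappings size the new classification head (10 outputs for GTZAN) and are stored in the model config, so predicted ids can be decoded back to genre names. Building them from GTZAN's standard genre list might look like this (variable names are my own):

```python
# The 10 GTZAN genres, in their conventional order.
genres = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

label2id = {name: i for i, name in enumerate(genres)}
id2label = {i: name for i, name in enumerate(genres)}

# These would then be passed to from_pretrained, e.g.:
# AutoModelForAudioClassification.from_pretrained(
#     checkpoint, num_labels=len(genres),
#     label2id=label2id, id2label=id2label)

id2label[7]  # → "pop"
```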