
HuBERT-Genre-Clf

Model description

This model is a fine-tuned version of DistilHuBERT for audio genre classification. DistilHuBERT is a distilled variant of the HuBERT model, designed for efficient audio processing. The classifier assigns an audio file to one of several musical genres, using the representations learned by DistilHuBERT.

Model Details:

  • Architecture: DistilHuBERT
  • Task: Audio Genre Classification
  • Genres: The GTZAN dataset covers ten genres (blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock). The exact label set used by this model is stored in model.config.id2label.
  • Dataset: GTZAN dataset
  • Training: The model was fine-tuned on GTZAN audio tracks spanning multiple genres.

Usage:

To use this model, you can load it with the transformers library as follows:

import librosa
import torch
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

model_name = "danilotpnta/HuBERT-Genre-Clf"

# Load the fine-tuned classifier and its matching feature extractor
model = AutoModelForAudioClassification.from_pretrained(model_name)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)

# Load the audio file, resampled to the rate the feature extractor expects
audio_file = "path_to_your_audio_file.wav"
audio, sr = librosa.load(audio_file, sr=feature_extractor.sampling_rate)

# Run inference without tracking gradients
inputs = feature_extractor(audio, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index to its genre label
predicted_class = logits.argmax(dim=-1).item()
print(f"Predicted genre: {model.config.id2label[predicted_class]}")
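The usage code above reports only the single best genre. When the runner-up genres are also of interest, the logits can be turned into probabilities with a softmax and ranked. The following is a minimal sketch, not part of the original card: the helper name top_k_genres is hypothetical, and the logits and labels in the example are made-up values used only for illustration.

```python
import math


def top_k_genres(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs from raw logits.

    `logits` is a flat list of floats, one per class.
    `id2label` maps integer class indices to label names.
    """
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Illustrative values only (assumption); they are not real model outputs.
example_logits = [2.0, 0.5, -1.0]
example_labels = {0: "blues", 1: "classical", 2: "country"}

for label, prob in top_k_genres(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as shown earlier, the helper could be called as top_k_genres(logits[0].tolist(), model.config.id2label).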

Performance:

The model achieves 80.63% accuracy on the GTZAN test set for genre classification. Possible applications include music recommendation systems and audio analysis tools.
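Accuracy here is the fraction of test clips whose predicted genre matches the reference label. The card does not describe the evaluation script, so the following is only a sketch of how such a figure is typically computed. The function name accuracy and the sample prediction and label lists are hypothetical.

```python
def accuracy(predicted_ids, true_ids):
    """Fraction of positions where the predicted class equals the true class."""
    if len(predicted_ids) != len(true_ids):
        raise ValueError("predictions and labels must have the same length")
    if not true_ids:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return correct / len(true_ids)


# Made-up class indices for illustration (assumption); not GTZAN results.
sample_predictions = [0, 2, 1, 1]
sample_labels = [0, 2, 2, 1]
print(f"Accuracy: {accuracy(sample_predictions, sample_labels):.2%}")
```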

Download model

Weights for this model are available in Safetensors and PyTorch formats.

Download them in the Files & versions tab.

License: MIT

Model size: 23.7M params (Safetensors, tensor type F32)