---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- audiofolder
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: wav2vec2-base-Drum_Kit_Sounds
  results: []
language:
- en
pipeline_tag: audio-classification
---

# wav2vec2-base-Drum_Kit_Sounds

This model is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base).

It achieves the following results on the evaluation set:
- Loss: 1.0887
- Accuracy: 0.7812
- F1
  - Weighted: 0.7692
  - Micro: 0.7812
  - Macro: 0.7845
- Recall
  - Weighted: 0.7812
  - Micro: 0.7812
  - Macro: 0.8187
- Precision
  - Weighted: 0.8717
  - Micro: 0.7812
  - Macro: 0.8534
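
The weighted, micro, and macro figures above differ only in how per-class scores are combined. The following is a minimal sketch of the three F1 averages in plain Python. The confusion matrix and label order are made up for illustration and are not this model's evaluation output. For single-label multiclass data, micro-averaged precision, recall, and F1 all reduce to accuracy, which is why the micro values above equal the accuracy.

```python
# Hypothetical 4-class confusion matrix (rows = true class, columns = predicted).
# The counts are illustrative only, not this model's actual predictions.
LABELS = ["kick", "overheads", "snare", "toms"]
CONFUSION = [
    [6, 0, 0, 0],
    [1, 2, 1, 0],
    [0, 0, 4, 1],
    [0, 1, 0, 4],
]


def per_class_f1(cm):
    """Return a list of (f1, support) pairs, one per class."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true class otherwise
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        scores.append((f1, sum(cm[k])))
    return scores


def f1_averages(cm):
    """Combine per-class F1 scores into macro, weighted, and micro averages."""
    scores = per_class_f1(cm)
    total = sum(support for _, support in scores)
    correct = sum(cm[k][k] for k in range(len(cm)))
    macro = sum(f1 for f1, _ in scores) / len(scores)                  # every class counts equally
    weighted = sum(f1 * support for f1, support in scores) / total    # classes weighted by support
    micro = correct / total                                           # pooled counts, equals accuracy here
    return {"macro": macro, "weighted": weighted, "micro": micro}


averages = f1_averages(CONFUSION)
print(averages)
```

Macro averaging is sensitive to weak performance on a rare class, while weighted averaging follows the class distribution of the evaluation set.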

## Model description

This model performs multiclass audio classification: given an audio sample, it predicts which type of drum is being hit. The classes are kick, overheads, snare, and toms.

For details on how the model was created, see the training notebook: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/blob/main/Audio-Projects/Classification/Audio-Drum_Kit_Sounds.ipynb
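
As a rough sketch of how a checkpoint like this might be used for inference, the snippet below wraps the `audio-classification` pipeline from Transformers. The function names `classify_drum_hit` and `top_prediction` are hypothetical helpers, and `model_id` is left as a parameter because this card does not state the repository id. The example scores and label spellings are made up to show the output shape. Decoding an audio file from a path typically also requires ffmpeg to be installed.

```python
def top_prediction(predictions):
    """Pick the highest-scoring entry from a list of {"label", "score"} dicts."""
    return max(predictions, key=lambda p: p["score"])


def classify_drum_hit(audio_path, model_id):
    """Classify one audio file with a fine-tuned checkpoint.

    `model_id` is whatever Hub repository id or local directory holds the
    model; it is an assumption of this sketch, not a value from the card.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_prediction(classifier(audio_path))


# Shape of the pipeline output, with made-up scores for illustration.
example_output = [
    {"label": "snare", "score": 0.61},
    {"label": "kick", "score": 0.22},
    {"label": "toms", "score": 0.11},
    {"label": "overheads", "score": 0.06},
]
print(top_prediction(example_output)["label"])
```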

## Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex problem using technology.

## Training and evaluation data

Dataset Source: https://www.kaggle.com/datasets/anubhavchhabra/drum-kit-sound-samples

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 12
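
To make the scheduler settings concrete, here is a small sketch of a linear schedule with warmup written in plain Python. It assumes 48 total optimizer steps (12 epochs of 4 steps each, as reported in the training results table) and assumes the warmup length is the warmup ratio times the total steps, rounded up. The function `linear_schedule_lr` is a hypothetical stand-in for illustration, not the library's scheduler.

```python
import math

BASE_LR = 3e-5       # learning_rate from the list above
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 48     # assumption: 12 epochs x 4 optimizer steps per epoch


def linear_schedule_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
                       warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    # Assumption: warmup steps are rounded up (5 steps for 48 total at ratio 0.1).
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1, 5, 24, 47, 48):
    print(step, f"{linear_schedule_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 3e-05 around step 5, early in the second epoch, and decays to zero by the end of epoch 12.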

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Weighted F1 | Micro F1 | Macro F1 | Weighted Recall | Micro Recall | Macro Recall | Weighted Precision | Micro Precision | Macro Precision |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----------:|:--------:|:--------:|:---------------:|:------------:|:------------:|:------------------:|:---------------:|:---------------:|
| 1.3743 | 1.0 | 4 | 1.3632 | 0.5625 | 0.5801 | 0.5625 | 0.5678 | 0.5625 | 0.5625 | 0.5670 | 0.6786 | 0.5625 | 0.6429 |
| 1.3074 | 2.0 | 8 | 1.3149 | 0.3438 | 0.2567 | 0.3438 | 0.2696 | 0.3438 | 0.3438 | 0.375 | 0.3067 | 0.3438 | 0.3148 |
| 1.2393 | 3.0 | 12 | 1.3121 | 0.2188 | 0.0785 | 0.2188 | 0.0897 | 0.2188 | 0.2188 | 0.25 | 0.0479 | 0.2188 | 0.0547 |
| 1.2317 | 4.0 | 16 | 1.3112 | 0.2812 | 0.1800 | 0.2812 | 0.2057 | 0.2812 | 0.2812 | 0.3214 | 0.2698 | 0.2812 | 0.3083 |
| 1.2107 | 5.0 | 20 | 1.2604 | 0.4375 | 0.3030 | 0.4375 | 0.3462 | 0.4375 | 0.4375 | 0.5 | 0.2552 | 0.4375 | 0.2917 |
| 1.1663 | 6.0 | 24 | 1.2112 | 0.4688 | 0.3896 | 0.4688 | 0.4310 | 0.4688 | 0.4688 | 0.5268 | 0.5041 | 0.4688 | 0.5404 |
| 1.1247 | 7.0 | 28 | 1.1746 | 0.5938 | 0.5143 | 0.5938 | 0.5603 | 0.5938 | 0.5938 | 0.6562 | 0.5220 | 0.5938 | 0.5609 |
| 1.0856 | 8.0 | 32 | 1.1434 | 0.5938 | 0.5143 | 0.5938 | 0.5603 | 0.5938 | 0.5938 | 0.6562 | 0.5220 | 0.5938 | 0.5609 |
| 1.0601 | 9.0 | 36 | 1.1417 | 0.6562 | 0.6029 | 0.6562 | 0.6389 | 0.6562 | 0.6562 | 0.7125 | 0.8440 | 0.6562 | 0.8217 |
| 1.0375 | 10.0 | 40 | 1.1227 | 0.6875 | 0.6582 | 0.6875 | 0.6831 | 0.6875 | 0.6875 | 0.7330 | 0.8457 | 0.6875 | 0.8237 |
| 1.0168 | 11.0 | 44 | 1.1065 | 0.7812 | 0.7692 | 0.7812 | 0.7845 | 0.7812 | 0.7812 | 0.8187 | 0.8717 | 0.7812 | 0.8534 |
| 1.0093 | 12.0 | 48 | 1.0887 | 0.7812 | 0.7692 | 0.7812 | 0.7845 | 0.7812 | 0.7812 | 0.8187 | 0.8717 | 0.7812 | 0.8534 |

### Framework versions

- Transformers 4.25.1
- Pytorch 1.12.1
- Datasets 2.8.0
- Tokenizers 0.12.1