xpariz10/ast-finetuned-audioset-10-10-0.4593_ft_ESC-50_aug_0-1

Audio Classification

audio-spectrogram-transformer

Generated from Trainer

xpariz10 committed on Apr 3, 2023

Commit ac71199 · 1 Parent(s): 1a45862

Update README.md

Files changed (1)

README.md +5 -5

@@ -26,19 +26,19 @@ It achieves the following results on the evaluation set:
 Training and evaluation data were augmented with audiomentations [GitHub: iver56/audiomentations](https://github.com/iver56/audiomentations) library and the following augmentation methods have been performed based on previous experiments [Elliott et al.: Tiny transformers for audio classification at the edge](https://arxiv.org/pdf/2103.12157.pdf):
-#Gain
 - each audio sample is amplified/attenuated by a random factor between 0.5 and 1.5 with a 0.3 probability
-#Noise
 - a random amount of Gaussian noise with a relative amplitude between 0.001 and 0.015 is added to each audio sample with a 0.5 probability
-#Speed adjust
 - duration of each audio sample is extended by a random amount between 0.5 and 1.5 with a 0.3 probability
-#Pitch shift
 - pitch of each audio sample is shifted by a random amount of semitones selected from the closed interval [-4,4] with a 0.3 probability
-#Time masking
 - a random fraction of length of each audio sample in the range of (0,0.02] is erased with a 0.3 probability

 Training and evaluation data were augmented with audiomentations [GitHub: iver56/audiomentations](https://github.com/iver56/audiomentations) library and the following augmentation methods have been performed based on previous experiments [Elliott et al.: Tiny transformers for audio classification at the edge](https://arxiv.org/pdf/2103.12157.pdf):
+**Gain**
 - each audio sample is amplified/attenuated by a random factor between 0.5 and 1.5 with a 0.3 probability
+**Noise**
 - a random amount of Gaussian noise with a relative amplitude between 0.001 and 0.015 is added to each audio sample with a 0.5 probability
+**Speed adjust**
 - duration of each audio sample is extended by a random amount between 0.5 and 1.5 with a 0.3 probability
+**Pitch shift**
 - pitch of each audio sample is shifted by a random amount of semitones selected from the closed interval [-4,4] with a 0.3 probability
+**Time masking**
 - a random fraction of length of each audio sample in the range of (0,0.02] is erased with a 0.3 probability
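
The augmentation list above can be read as five independent random decisions per audio sample, each with its own probability and parameter range. The following is a minimal standard-library sketch of that reading, not the author's training code and not the audiomentations API. The names `AUGMENTATIONS`, `sample_plan`, and `apply_simple` are hypothetical. It assumes parameters are drawn uniformly from the stated ranges, that the noise amplitude acts as the standard deviation of the Gaussian noise, and that erased samples are set to zero. Speed adjust and pitch shift are only sampled, not applied, because they need resampling that a short sketch cannot do faithfully.

```python
import random

# Probabilities and ranges copied from the README list above.
# The dictionary layout itself is an illustrative assumption.
AUGMENTATIONS = {
    "gain": {"p": 0.3, "range": (0.5, 1.5)},           # amplitude factor
    "noise": {"p": 0.5, "range": (0.001, 0.015)},      # relative noise amplitude
    "speed_adjust": {"p": 0.3, "range": (0.5, 1.5)},   # duration factor
    "pitch_shift": {"p": 0.3, "range": (-4, 4)},       # semitones
    "time_masking": {"p": 0.3, "range": (0.0, 0.02)},  # fraction of length erased
}


def sample_plan(rng):
    """Decide independently, per augmentation, whether it fires and with what value.

    Uniform sampling within each range is an assumption; the README only
    states the ranges and probabilities.
    """
    plan = {}
    for name, cfg in AUGMENTATIONS.items():
        if rng.random() < cfg["p"]:
            low, high = cfg["range"]
            plan[name] = rng.uniform(low, high)
    return plan


def apply_simple(samples, plan, rng):
    """Apply gain, noise, and time masking from `plan` to a list of floats.

    Speed adjust and pitch shift entries in `plan` are ignored here.
    """
    out = list(samples)
    if "gain" in plan:
        out = [x * plan["gain"] for x in out]
    if "noise" in plan:
        # Assumption: the sampled amplitude is the noise standard deviation.
        out = [x + rng.gauss(0.0, plan["noise"]) for x in out]
    if "time_masking" in plan:
        width = int(len(out) * plan["time_masking"])
        start = rng.randrange(0, len(out) - width + 1)
        # Assumption: "erased" means the masked span is set to silence.
        out[start:start + width] = [0.0] * width
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    plan = sample_plan(rng)
    augmented = apply_simple([0.1] * 16000, plan, rng)
    print(sorted(plan), len(augmented))
```

Because each augmentation is sampled separately, a given sample may receive none, one, or several of them. In audiomentations, the stated gain factors would have to be converted to decibels (a factor of 0.5 is about -6.0 dB and 1.5 is about +3.5 dB) if the transform expects gain in dB.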