> Note: The examples provided may not work on Safari, tablets, or iOS devices. Try an alternate approach.
## Dataset
- [UrbanSound8K](https://urbansounddataset.weebly.com/urbansound8k.html)
## Audio files
Audio files are converted to mel spectrograms, which generally perform better than other visual transformations of such audio files.
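The conversion step can be sketched roughly as follows. This is a minimal sketch, not the code used for this Space: it assumes `librosa`, `numpy`, and `matplotlib` are installed, and the function name `audio_to_mel_image`, the 22.05 kHz sample rate, the 128 mel bands, and the colormap are all illustrative choices.

```python
def audio_to_mel_image(wav_path, png_path, n_mels=128):
    """Save a mel spectrogram of one audio clip as a PNG image.

    Hypothetical helper: the name and the parameter values are assumptions.
    """
    # Local imports so the module still loads where these third-party
    # packages are not installed.
    import librosa
    import numpy as np
    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import matplotlib.pyplot as plt

    # Load as mono audio, resampled to 22.05 kHz.
    y, sr = librosa.load(wav_path, sr=22050)

    # Power spectrogram projected onto n_mels mel-spaced frequency bands.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)

    # Convert power to decibels so quiet and loud regions are both visible.
    mel_db = librosa.power_to_db(mel, ref=np.max)

    # Write the 2-D array as an image, with low frequencies at the bottom.
    plt.imsave(png_path, mel_db, origin="lower", cmap="magma")
    return mel_db.shape
```

Calling `audio_to_mel_image("clip.wav", "clip.png")` on each excerpt would produce one image per clip, which an image classifier can then consume.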
## Training
Fast.ai was used to train this classifier with a ResNet34 vision learner. With three epochs and minimal lines of code, it approaches 95% accuracy, validated on a 20% hold-out of the entire dataset of 8732 labelled sound excerpts across 10 classes.
| epoch | train_loss | valid_loss | accuracy | time |
|---|---|---|---|---|
| 0 | 1.462791 | 0.710250 | 0.775487 | 01:12 |

| epoch | train_loss | valid_loss | accuracy | time |
|---|---|---|---|---|
| 0 | 0.600056 | 0.309964 | 0.892325 | 00:40 |
| 1 | 0.260431 | 0.200901 | 0.945017 | 00:39 |
| 2 | 0.090158 | 0.164748 | 0.950745 | 00:40 |
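
A training run of this kind might look like the sketch below. It is an illustration under assumptions, not the exact script behind the numbers above: it assumes fastai 2.7 or later is installed and that the spectrogram images sit in one subfolder per class. The function name `train_classifier`, the folder layout, the image size, and the seed are hypothetical.

```python
def train_classifier(spectrogram_dir, epochs=3, valid_pct=0.2, seed=42):
    """Fine-tune a ResNet34 on a folder of spectrogram images.

    Hypothetical helper; assumes spectrogram_dir/<class_name>/*.png.
    """
    # Local import so the module still loads where fastai is not installed.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, accuracy, resnet34, vision_learner,
    )

    # Labels come from the parent folder names; valid_pct holds out a
    # random 20% of the images for validation.
    dls = ImageDataLoaders.from_folder(
        spectrogram_dir,
        valid_pct=valid_pct,
        seed=seed,
        item_tfms=Resize(224),
    )

    # ResNet34 backbone pretrained on ImageNet, reporting accuracy.
    learn = vision_learner(dls, resnet34, metrics=accuracy)

    # fine_tune first trains the new head with the backbone frozen,
    # then unfreezes and trains the whole network for `epochs` epochs.
    learn.fine_tune(epochs)
    return learn
```

The returned learner can be saved with `learn.export()` for later inference.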
## Classical Approaches
[Classical approaches on this dataset as of 2019](https://www.researchgate.net/publication/335862311_Evaluation_of_Classical_Machine_Learning_Techniques_towards_Urban_Sound_Recognition_on_Embedded_Systems)
## State of the Art Approaches
State-of-the-art methods for audio classification treat the problem as an image classification task. To turn audio samples into images, three [common](https://scottmduda.medium.com/urban-environmental-audio-classification-using-mel-spectrograms-706ee6f8dcc1) transformation approaches are:
- Linear Spectrograms
- Log Spectrograms
- [Mel Spectrograms](https://towardsdatascience.com/audio-deep-learning-made-simple-part-2-why-mel-spectrograms-perform-better-aad889a93505)
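
To illustrate how a mel spectrogram differs from a linear one, the sketch below computes frequency band edges that are equally spaced on the mel scale. It uses the widely used HTK-style formula `mel = 2595 * log10(1 + f / 700)`, which is an assumption here, since the source does not say which mel variant was used. The function names and the 0 to 8000 Hz range are illustrative.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 1 band edges in Hz, equally spaced in mels."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


# Eight bands between 0 Hz and 8 kHz (illustrative range).
edges = mel_band_edges(0.0, 8000.0, 8)

# Width of each band in Hz: narrow at low frequencies, wide at high ones.
widths = [upper - lower for lower, upper in zip(edges, edges[1:])]

for lower, upper in zip(edges, edges[1:]):
    print(f"{lower:8.1f} Hz to {upper:8.1f} Hz")
```

The band widths grow with frequency, so a mel spectrogram gives more vertical resolution to low frequencies, whereas a linear spectrogram spaces all bands evenly in Hz.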
## Credits
Thanks to [Kurian Benoy](https://kurianbenoy.com/) and the countless others who generously make their code public.