Respiratory audio classification model

This model classifies respiratory audio recordings from the ICBHI 2017 Challenge dataset as containing crackles, wheezes, both, or neither. The task is framed as multi-label classification over two labels, crackle and wheeze. The model uses the Audio Spectrogram Transformer (AST) encoder (MIT/ast-finetuned-audioset-14-14-0.443) with a lightweight classification head.
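
To make the multi-label framing concrete, here is a minimal sketch of how two per-label probabilities could be turned into one of the four outcomes. The label order (crackle, wheeze), the 0.5 threshold, and the helper name decode_prediction are assumptions for illustration; they are not taken from the released code.

```python
# Illustrative only: the output order (crackle, wheeze) and the 0.5 threshold
# are assumptions, not taken from the released code.
LABELS = ("crackle", "wheeze")


def decode_prediction(probs, threshold=0.5):
    """Map per-label probabilities to one of: crackle, wheeze, both, none."""
    if len(probs) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} probabilities, got {len(probs)}")
    # Each label is decided independently, which is what makes this multi-label.
    active = [name for name, p in zip(LABELS, probs) if p >= threshold]
    if len(active) == len(LABELS):
        return "both"
    return active[0] if active else "none"


print(decode_prediction([0.81, 0.12]))  # crackle
print(decode_prediction([0.70, 0.66]))  # both
print(decode_prediction([0.05, 0.20]))  # none
```

Because each label is thresholded on its own, the threshold could in principle be tuned per label, for example to trade wheeze precision for recall.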

The model was pushed to the Hugging Face Hub using the PyTorchModelHubMixin integration.

Dataset

Source: ICBHI 2017 Challenge

Performance metrics

Label	F1	Precision	Recall	AUC
Crackle	0.6756	0.6147	0.7500	0.7033
Wheeze	0.4853	0.6565	0.3849	0.8031
Macro Avg	0.5805	0.6356	0.5674	0.7532
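
As a quick consistency check on the table, the sketch below recomputes each F1 score from the reported precision and recall, then averages the two. It assumes F1 is the usual harmonic mean of precision and recall and that the Macro Avg row is the unweighted mean over the two labels. The variable and function names are illustrative.

```python
# Precision and recall copied from the table above.
reported = {
    "Crackle": {"precision": 0.6147, "recall": 0.7500},
    "Wheeze": {"precision": 0.6565, "recall": 0.3849},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


per_label_f1 = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}
# Macro average: unweighted mean over labels (assumed, see the note above).
macro_f1 = sum(per_label_f1.values()) / len(per_label_f1)

for name, value in per_label_f1.items():
    print(f"{name}: F1 = {value:.4f}")
print(f"Macro Avg: F1 = {macro_f1:.4f}")
```

The recomputed values agree with the F1 column to four decimal places. The gap between the two labels comes mostly from recall: wheeze recall (0.3849) is about half of crackle recall (0.7500).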

Usage

Run inference with s05_inference.py.
Install the required dependencies first; see the documentation for setup instructions.

Notes

See the notes in the documentation for additional details.

Contact

For any questions or further information, feel free to reach out via email: fabiocat@mit.edu.