This model is a fine-tuned whisper-large model, trained on 1M audio samples from the dataset mitermix/audiosnippets_long_1M and a 500K-sample emotion dataset.

Weights format: Safetensors
Model size: 1.54B params
Tensor type: F32
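As a rough guide to what the parameter count and tensor type imply for memory, the weight footprint can be estimated as parameters times bytes per parameter. The sketch below is only a back-of-the-envelope calculation: it ignores activations, audio features, and framework overhead, and the F16 figure is a hypothetical half-precision cast included for comparison, not a variant the model page lists.

```python
# Back-of-the-envelope estimate of the weight footprint.
# Ignores activations and runtime overhead, so real usage will be higher.

PARAMS = 1.54e9  # parameter count reported for this model

# Bytes per parameter for each tensor type.
# "F16" is a hypothetical cast for comparison; only F32 is listed for this model.
BYTES_PER_PARAM = {"F32": 4, "F16": 2}


def weight_size_gb(params: float, dtype: str) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


f32_gb = weight_size_gb(PARAMS, "F32")
f16_gb = weight_size_gb(PARAMS, "F16")
print(f"F32: {f32_gb:.2f} GB, F16: {f16_gb:.2f} GB")
```

Under these assumptions the F32 weights come to roughly 6.2 GB, which is the minimum to budget for before any inference overhead.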


Collection including cahya/whisper-large-emotion-v1.0