Open Whisper-style Speech Models (OWSM)
Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/
espnet/owsm_ctc_v4_1B
Automatic Speech Recognition • Updated • 20 • 1
Note: OWSM-CTC v4 is trained on a newly curated dataset from YODAS along with previous OWSM data, which significantly enhances multilingual performance.
espnet/owsm_v4_medium_1B
Automatic Speech Recognition • Updated • 8 • 1
Note: OWSM v4 (1B) is trained on a newly curated dataset from YODAS along with previous OWSM data, which significantly enhances multilingual performance.
espnet/owsm_v4_small_370M
Automatic Speech Recognition • Updated • 5 • 1
Note: OWSM v4 (370M) is trained on a newly curated dataset from YODAS along with previous OWSM data, which significantly enhances multilingual performance.
espnet/owsm_v4_base_102M
Automatic Speech Recognition • Updated • 5 • 1
Note: OWSM v4 (102M) is trained on a newly curated dataset from YODAS along with previous OWSM data, which significantly enhances multilingual performance.
espnet/owsm_ctc_v3.2_ft_1B
Automatic Speech Recognition • Updated • 64 • 4
Note: OWSM-CTC v3.1 is further fine-tuned on v3.2 data to improve long-form robustness.
espnet/owsm_ctc_v3.1_1B
Automatic Speech Recognition • Updated • 54 • 13
Note: (ACL'24) CTC-based non-autoregressive speech foundation model for multilingual ASR, ST, and LID.
espnet/owsm_v3.1_ebf
Automatic Speech Recognition • Updated • 215 • 17
Note: (INTERSPEECH'24) OWSM v3.1 medium with 1.02B parameters.
espnet/owsm_v3.1_ebf_small
Automatic Speech Recognition • Updated • 17 • 2
Note: (INTERSPEECH'24) OWSM v3.1 small with 367M parameters.
espnet/owsm_v3.1_ebf_base
Automatic Speech Recognition • Updated • 18 • 3
Note: (INTERSPEECH'24) OWSM v3.1 base with 101M parameters.
espnet/owsm_v3.1_ebf_small_lowrestriction
Automatic Speech Recognition • Updated • 7 • 2
Note: (INTERSPEECH'24) OWSM v3.1 small trained on a subset of data with low-restriction licenses.
espnet/owsm_v3.2
Automatic Speech Recognition • Updated • 11 • 5
Note: (INTERSPEECH'24) OWSM small with data cleaning.
espnet/owsm_v3
Automatic Speech Recognition • Updated • 5 • 27
espnet/owsm_v2_ebranchformer
Automatic Speech Recognition • Updated • 7
espnet/owsm_v2
Automatic Speech Recognition • Updated • 3 • 4
espnet/owsm_v1
Automatic Speech Recognition • Updated
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Paper • 2402.12654 • Published • 1
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Paper • 2401.16658 • Published • 14
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Paper • 2309.13876 • Published • 1