pyf98/owsm_ctc_v3.1_1B

---
tags:
- espnet
- audio
- automatic-speech-recognition
- speech-translation
- language-identification
language: multilingual
datasets:
- owsm_v3.1_ctc
license: cc-by-4.0
---

[OWSM-CTC](https://arxiv.org/abs/2402.12654) is an encoder-only speech foundation model based on multi-task self-conditioned CTC.
It is trained on 180k hours of public audio data for multilingual speech recognition, any-to-any speech translation, and language identification, following the design of the previous [encoder-decoder OWSM](https://arxiv.org/abs/2401.16658).
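
Because the model is encoder-only and CTC-based, its frame-level outputs can in principle be turned into text without an autoregressive decoder. The sketch below illustrates generic CTC greedy decoding (take the best label per frame, merge consecutive repeats, drop blanks) on a toy input. It is not the ESPnet inference code for this model: the function name `ctc_greedy_decode`, the blank index `0`, and the three-symbol vocabulary are assumptions made purely for illustration.

```python
from typing import List, Sequence

# Assumption: index 0 is the CTC blank symbol. The real model's vocabulary may differ.
BLANK_ID = 0


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = BLANK_ID
) -> List[int]:
    """Collapse per-frame scores into a label sequence with greedy CTC decoding."""
    # Step 1: pick the highest-scoring label for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # Step 2: merge consecutive repeats, then drop blanks.
    decoded: List[int] = []
    previous = None
    for label in best_path:
        if label != previous and label != blank_id:
            decoded.append(label)
        previous = label
    return decoded


# Toy example with a hypothetical vocabulary: 0 = <blank>, 1 = "h", 2 = "i".
toy_vocab = ["<blank>", "h", "i"]
toy_frames = [
    [0.10, 0.80, 0.10],  # "h"
    [0.20, 0.70, 0.10],  # "h" (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.20, 0.70],  # "i"
    [0.10, 0.10, 0.80],  # "i" (repeat, merged)
]
toy_text = "".join(toy_vocab[i] for i in ctc_greedy_decode(toy_frames))
```

A blank between two identical labels keeps both of them, which is how CTC represents genuinely repeated symbols.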