nelfproject/ASR_subtitles_v2


	---
	license: cc-by-nc-4.0
	language:
	- nl
	pipeline_tag: automatic-speech-recognition
	---

	# Model

	This repository contains the second version of our Automatic Speech Recognition and Subtitle Generation model. It features an improved architecture and was trained on 14,000 hours of subtitled Flemish broadcast speech.
	It can generate both an exact verbatim transcription with annotation tags and a fully formatted, cleaned-up subtitle transcription.
	Each of the two output types is produced by a separate decoder.

	This repository contains the large variant of the model, which has 180M parameters.

	Version: April 2024

	# Usage

	This repository hosts only the pre-trained model and its configuration files.
	To download the model, see the [Hugging Face Hub download instructions](https://huggingface.co/docs/hub/models-downloading).
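
As a minimal sketch of the download step, the snippet below uses `snapshot_download` from the `huggingface_hub` Python package to fetch every file in this repository. The helper name `download_model` and the default target directory are illustrative assumptions, not part of the NeLF codebase, and the repository id is taken from this model's page on the Hub.

```python
from pathlib import Path

# Repository id as shown on the Hugging Face Hub page of this model.
REPO_ID = "nelfproject/ASR_subtitles_v2"


def download_model(local_dir: str = "ASR_subtitles_v2", repo_id: str = REPO_ID) -> Path:
    """Download all files of the model repository and return the local folder.

    `download_model` and its defaults are hypothetical names for illustration.
    """
    # Imported inside the function so the helper can be defined even when the
    # dependency is missing; install it with `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download

    # snapshot_download returns the path of the folder holding the files.
    path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(path)
```

Calling `download_model()` needs network access and places the model and configuration files in `./ASR_subtitles_v2`. It only retrieves the files and does not run inference.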

	Usage of this model, as well as our other ASR models, is integrated in [our GitHub codebase](https://github.com/nelfproject/NeLF_Transcription_ASR).
	Please refer to the GitHub repository for installation instructions.

	# Webservice

	This model can also be accessed through the [webservice of the NeLF Project](https://nelfproject.be/web_service.php). After requesting access, you can upload audio or video files, which are then transcribed according to the settings you choose.

	# Citation

	If you use this model, please cite the research paper:
	```bibtex
	@article{poncelet2024,
	  author  = {Poncelet, Jakob and Van hamme, Hugo},
	  title   = {Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling},
	  year    = {2024},
	  journal = {arXiv preprint arXiv:2502.03212},
	  url     = {https://arxiv.org/abs/2502.03212}
	}
	```

	# Contact

	Jakob Poncelet: jakob.poncelet@kuleuven.be