The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System Paper • 2310.12378 • Published Oct 18, 2023
Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer Paper • 2306.08753 • Published Jun 14, 2023
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach Paper • 2309.05248 • Published Sep 11, 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models Paper • 2309.15701 • Published Sep 27, 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition Paper • 2310.06434 • Published Oct 10, 2023
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition Paper • 2309.15223 • Published Sep 26, 2023
Voice2Series: Reprogramming Acoustic Models for Time Series Classification Paper • 2106.09296 • Published Jun 17, 2021
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers Paper • 2307.03183 • Published Jul 6, 2023