MERaLiON/MERaLiON-AudioLLM-Whisper-SEA-LION

Automatic Speech Recognition

YingxuHe committed 13 days ago

Commit ea45b1a · verified · 1 Parent(s): 735299c

Update README.md

Files changed (1)

README.md +2 -4

@@ -66,10 +66,8 @@ MERaLiON-AudioLLM is trained to mainly address 6 tasks, namely `Automatic Speech
 `Spoken Dialogue Summarization` (SDS), `Speech Instruction` (SI), and `Paralinguistics` (PARA).
 We benchmark MERaLiON-AudioLLM with a series of test sets from the [AudioBench benchmark](https://github.com/AudioLLMs/AudioBench)
-against three well-known AudioLLMs: `Qwen2-Audio 7B`, `WavLLM`, and `SALMONN`. We also compared with a cascaded model,
-which feeds the transcriptions recognized by Whisper-large-v2 and the instruction prompts to a Gemma2 9B CPT SEA-LIONv3 Instruct model to
-get the responses. We tuned its hyperparameters and prompt template to optimise performance across
-various speech-to-text tasks. As is shown in the following table, MERaLiON-AudioLLM performs better in the Singapore local context,
+against three well-known AudioLLMs: `Qwen2-Audio 7B`, `WavLLM`, `SALMONN`, and a cascaded model.
+As is shown in the following table, MERaLiON-AudioLLM performs better in the Singapore local context,
 as evidenced by evaluation results on Singapore's [Multitask National Speech Corpus](https://huggingface.co/datasets/MERaLiON/Multitask-National-Speech-Corpus-v1) (MNSC) datasets.
 > [!NOTE]