suzii/vi-whisper-large-v3-turbo-v1

Automatic Speech Recognition


suzii committed 1 day ago

Commit e05c588 · verified · 1 parent: 1e9075e

Update README.md

Files changed (1)

README.md +32 -11

@@ -1,3 +1,24 @@
+---
+license: mit
+datasets:
+- capleaf/viVoice
+- NhutP/VSV-1100
+- doof-ferb/fpt_fosd
+- doof-ferb/infore1_25hours
+- google/fleurs
+- doof-ferb/LSVSC
+- quocanh34/viet_vlsp
+- linhtran92/viet_youtube_asr_corpus_v2
+- doof-ferb/infore2_audiobooks
+- linhtran92/viet_bud500
+language:
+- vi
+metrics:
+- wer
+base_model:
+- openai/whisper-large-v3-turbo
+library_name: transformers
+---
 # Fine-tuned Whisper-V3-Turbo for Vietnamese ASR
 This project involves fine-tuning the Whisper-V3-Turbo model to improve its performance for Automatic Speech Recognition (ASR) in the Vietnamese language. The model was trained for 240 hours using a single Nvidia A6000 GPU.
@@ -66,17 +87,17 @@ To use the fine-tuned model, follow the steps below:
 This project would not be possible without the following datasets:
-- [capleaf/viVoice](https://www.kaggle.com/datasets/capleaf/viVoice)
-- [NhutP/VSV-1100](https://www.kaggle.com/datasets/nhutp/vsv-1100)
-- [doof-ferb/fpt_fosd](https://www.kaggle.com/datasets/doof-ferb/fpt_fosd)
-- [doof-ferb/infore1_25hours](https://www.kaggle.com/datasets/doof-ferb/infore1-25hours)
-- [google/fleurs](https://www.kaggle.com/datasets/google/fleurs)
-- [doof-ferb/LSVSC](https://www.kaggle.com/datasets/doof-ferb/LSVSC)
-- [quocanh34/viet_vlsp](https://www.kaggle.com/datasets/quocanh34/viet-vlsp)
-- [linhtran92/viet_youtube_asr_corpus_v2](https://www.kaggle.com/datasets/linhtran92/viet-youtube-asr-corpus-v2)
-- [doof-ferb/infore2_audiobooks](https://www.kaggle.com/datasets/doof-ferb/infore2-audiobooks)
-- [linhtran92/viet_bud500](https://www.kaggle.com/datasets/linhtran92/viet-bud500)
+- [capleaf/viVoice](https://huggingface.co/datasets/capleaf/viVoice)
+- [NhutP/VSV-1100](https://huggingface.co/datasets/nhutp/vsv-1100)
+- [doof-ferb/fpt_fosd](https://huggingface.co/datasets/doof-ferb/fpt_fosd)
+- [doof-ferb/infore1_25hours](https://huggingface.co/datasets/doof-ferb/infore1_25hours)
+- [google/fleurs](https://huggingface.co/datasets/google/fleurs)
+- [doof-ferb/LSVSC](https://huggingface.co/datasets/doof-ferb/LSVSC)
+- [quocanh34/viet_vlsp](https://huggingface.co/datasets/quocanh34/viet-vlsp)
+- [linhtran92/viet_youtube_asr_corpus_v2](https://huggingface.co/datasets/linhtran92/viet_youtube_asr_corpus_v2)
+- [doof-ferb/infore2_audiobooks](https://huggingface.co/datasets/doof-ferb/infore2_audiobooks/)
+- [linhtran92/viet_bud500](https://huggingface.co/datasets/linhtran92/viet_bud500)
 ## License
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.