Tanel committed
Commit d091766
1 Parent(s): 8ca127d

New version of the model

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -63,7 +63,9 @@ For example:
 
 `ct2-transformers-converter --model TalTechNLP/whisper-large-et --output_dir whisper-large-et.ct2 --copy_files tokenizer.json --quantization float16`
 
-* Decode: `whisper-ctranslate2 --model_directory whisper-large-et.ct2 --task transcribe --language et --beam_size 5 some_file.mp3`
+* Decode:
+
+`whisper-ctranslate2 --model_directory whisper-large-et.ct2 --task transcribe --language et --beam_size 5 some_file.mp3`
 
 
 #### Limitations and bias
@@ -95,7 +97,7 @@ Finetuned using Espnet, and then comverted to transformers format using [this](h
 Finetuning procedure is similar to [this](https://huggingface.co/espnet/shihlun_asr_whisper_medium_finetuned_librispeech100) model.
 Finetuning was done for 3 epochs, with model averaging at the end of training.
 
-*Update*: 2023-10-03 bersion of the model is trained on long segments (like the original Whisper model) and
+*Update*: 2023-10-03 version of the model is trained on long segments (like the original Whisper model) and
 is therefore especially well suited to be used e.g. with [faster-whisper](https://github.com/guillaumekln/faster-whisper) to
 transcribe long speech recordings "end-to-end" (i.e., without any prior segmentation).
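
For reference, a minimal sketch of the faster-whisper usage the updated text points to. It assumes the `whisper-large-et.ct2` directory produced by the conversion command in the diff above; `some_file.mp3` is just a placeholder input, and device/compute type should be adjusted to the available hardware:

```python
# Minimal sketch: end-to-end transcription of a long recording with faster-whisper,
# without any prior segmentation of the audio.
from faster_whisper import WhisperModel

# "whisper-large-et.ct2" is the --output_dir from ct2-transformers-converter above;
# use device="cpu" and e.g. compute_type="int8" if no GPU is available.
model = WhisperModel("whisper-large-et.ct2", device="cuda", compute_type="float16")

# faster-whisper chunks the audio internally, so long recordings can be
# passed in directly; "some_file.mp3" is a placeholder.
segments, info = model.transcribe("some_file.mp3", language="et", beam_size=5)

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```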