Tanel committed
Commit d091766
1 Parent(s): 8ca127d

New version of the model

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -63,7 +63,9 @@ For example:
 
 `ct2-transformers-converter --model TalTechNLP/whisper-large-et --output_dir whisper-large-et.ct2 --copy_files tokenizer.json --quantization float16`
 
-* Decode: `whisper-ctranslate2 --model_directory whisper-large-et.ct2 --task transcribe --language et --beam_size 5 some_file.mp3`
+* Decode:
+
+`whisper-ctranslate2 --model_directory whisper-large-et.ct2 --task transcribe --language et --beam_size 5 some_file.mp3`
 
 
 #### Limitations and bias
@@ -95,7 +97,7 @@ Finetuned using Espnet, and then comverted to transformers format using [this](h
 Finetuning procedure is similar to [this](https://huggingface.co/espnet/shihlun_asr_whisper_medium_finetuned_librispeech100) model.
 Finetuning was done for 3 epochs, with model averaging at the end of training.
 
-*Update*: 2023-10-03 bersion of the model is trained on long segments (like the original Whisper model) and
+*Update*: 2023-10-03 version of the model is trained on long segments (like the original Whisper model) and
 is therefore especially well suited to be used e.g. with [faster-whisper](https://github.com/guillaumekln/faster-whisper) to
 transcribe long speech recordings "end-to-end" (i.e., without any prior segmentation).
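
For reference, a minimal sketch of the faster-whisper usage the updated text points to. It assumes the `whisper-large-et.ct2` directory produced by the conversion command in the diff above; `some_file.mp3` is just a placeholder input, and device/compute type should be adjusted to the available hardware:

```python
# Minimal sketch: end-to-end transcription of a long recording with faster-whisper,
# without any prior segmentation of the audio.
from faster_whisper import WhisperModel

# "whisper-large-et.ct2" is the --output_dir from ct2-transformers-converter above;
# use device="cpu" and e.g. compute_type="int8" if no GPU is available.
model = WhisperModel("whisper-large-et.ct2", device="cuda", compute_type="float16")

# faster-whisper chunks the audio internally, so long recordings can be
# passed in directly; "some_file.mp3" is a placeholder.
segments, info = model.transcribe("some_file.mp3", language="et", beam_size=5)

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```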