waveletdeboshir/whisper-base-ru-pruned-ft

@@ -4,7 +4,7 @@ language:
 - ru
 library_name: transformers
 pipeline_tag: automatic-speech-recognition
-base_model: waveletdeboshir/whisper-base-ru-pruned-finetuned
 tags:
 - asr
 - Pytorch
@@ -41,7 +41,7 @@ datasets:
 - mozilla-foundation/common_voice_15_0
 ---
-# Whisper-base-ru-pruned-finetuned
 ## Model info
 This is a finetuned version of the pruned whisper-base model ([waveletdeboshir/whisper-base-ru-pruned](https://huggingface.co/waveletdeboshir/whisper-base-ru-pruned)) for the Russian language.
@@ -50,7 +50,7 @@ Model was finetuned on russian part of [mozilla-foundation/common_voice_15_0](ht
 ## Metrics
-| metric | dataset | waveletdeboshir/whisper-base-ru-pruned | waveletdeboshir/whisper-small-ru-pruned-finetuned |
 | :------ | :------ | :------ | :------ |
 | WER (without punctuation) | common_voice_15_0_test |  |  |
 | WER | common_voice_15_0_test |  |  |
@@ -60,7 +60,7 @@ Model was finetuned on russian part of [mozilla-foundation/common_voice_15_0](ht
 Only 10% of tokens were kept: the special Whisper tokens (no language tokens except \<|ru|\> and \<|en|\>, no timestamp tokens), the 200 most popular tokens from the tokenizer, and the 4000 most popular Russian tokens, computed by tokenizing a Russian text corpus.
 Model size is 30% less than the original whisper-base:
-|  | openai/whisper-base | waveletdeboshir/whisper-base-ru-pruned-finetuned |
 | :------ | :------ | :------ |
 | n of parameters | 74 M | 48 M |
 | n of parameters (with proj_out layer) | 99 M | 50 M |
@@ -78,8 +78,8 @@ Model can be used as an original whisper:
 >>> wav, sr = torchaudio.load("audio.wav")
 >>> # load model and processor
->>> processor = WhisperProcessor.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-finetuned")
->>> model = WhisperForConditionalGeneration.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-finetuned")
 >>> input_features = processor(wav[0], sampling_rate=sr, return_tensors="pt").input_features

 - ru
 library_name: transformers
 pipeline_tag: automatic-speech-recognition
+base_model: waveletdeboshir/whisper-base-ru-pruned
 tags:
 - asr
 - Pytorch
 - mozilla-foundation/common_voice_15_0
 ---
+# Whisper-base-ru-pruned-ft
 ## Model info
 This is a finetuned version of the pruned whisper-base model ([waveletdeboshir/whisper-base-ru-pruned](https://huggingface.co/waveletdeboshir/whisper-base-ru-pruned)) for the Russian language.
 ## Metrics
+| metric | dataset | waveletdeboshir/whisper-base-ru-pruned | waveletdeboshir/whisper-small-ru-pruned-ft |
 | :------ | :------ | :------ | :------ |
 | WER (without punctuation) | common_voice_15_0_test |  |  |
 | WER | common_voice_15_0_test |  |  |
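
The metrics table reports WER both with and without punctuation. The card does not say how either figure was computed, so the following is only a minimal sketch of the usual definition (word-level edit distance divided by the number of reference words). The `normalize` helper, which lowercases and strips punctuation, is an assumed normalization rather than the author's, and the sample sentences are made up for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop punctuation (assumed normalization, not taken from the card)."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # substitution or exact match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference/hypothesis pair, for illustration only.
reference = "Привет, как дела?"
hypothesis = "привет как дела"

raw_wer = wer(reference, hypothesis)                          # case and punctuation count as errors
clean_wer = wer(normalize(reference), normalize(hypothesis))  # compared after normalization
```

On this pair the raw score penalizes the capital letter and the punctuation attached to two of the three words, while the normalized score does not, which is why the two rows of the table can differ noticeably for the same transcripts.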
 Only 10% of tokens were kept: the special Whisper tokens (no language tokens except \<|ru|\> and \<|en|\>, no timestamp tokens), the 200 most popular tokens from the tokenizer, and the 4000 most popular Russian tokens, computed by tokenizing a Russian text corpus.
 Model size is 30% less than the original whisper-base:
+|  | openai/whisper-base | waveletdeboshir/whisper-base-ru-pruned-ft |
 | :------ | :------ | :------ |
 | n of parameters | 74 M | 48 M |
 | n of parameters (with proj_out layer) | 99 M | 50 M |
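
The card describes the pruning only at a high level: keep the special tokens plus the most frequent tokens, and drop the rest of the vocabulary. As a rough illustration of that idea, and not the author's actual procedure, the sketch below selects token ids by frequency and slices a toy embedding table accordingly. The function names `select_kept_ids` and `prune_rows`, the toy vocabulary, and all values are hypothetical.

```python
from collections import Counter


def select_kept_ids(special_ids, corpus_token_ids, top_k):
    """Return sorted ids to keep: all special ids plus the top_k most frequent corpus ids."""
    counts = Counter(corpus_token_ids)
    special = set(special_ids)
    frequent = [tok for tok, _ in counts.most_common() if tok not in special][:top_k]
    return sorted(special | set(frequent))


def prune_rows(embedding, kept_ids):
    """Keep only the embedding rows of kept_ids and build the old-id -> new-id map."""
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    pruned = [embedding[old] for old in kept_ids]
    return pruned, old_to_new


# Toy data: 8-token vocabulary with 2-dimensional embeddings (hypothetical values).
embedding = [[float(i), float(i) + 0.5] for i in range(8)]
special_ids = [0, 7]                       # stand-ins for special tokens
corpus_token_ids = [3, 3, 3, 5, 5, 2, 7]   # stand-in for a tokenized text corpus

kept_ids = select_kept_ids(special_ids, corpus_token_ids, top_k=2)
pruned, old_to_new = prune_rows(embedding, kept_ids)
```

Because the token embedding and the `proj_out` layer both have one row per vocabulary entry, shrinking the vocabulary removes rows from both, which is consistent with the larger gap in the "with proj_out layer" row of the table. The `old_to_new` map is what a re-indexed tokenizer would need so that the kept tokens point at their new row positions.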
 >>> wav, sr = torchaudio.load("audio.wav")
 >>> # load model and processor
+>>> processor = WhisperProcessor.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-ft")
+>>> model = WhisperForConditionalGeneration.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-ft")
 >>> input_features = processor(wav[0], sampling_rate=sr, return_tensors="pt").input_features