waveletdeboshir
committed on
Update README.md
README.md
CHANGED
@@ -4,7 +4,7 @@ language:
 - ru
 library_name: transformers
 pipeline_tag: automatic-speech-recognition
-base_model: waveletdeboshir/whisper-base-ru-pruned
+base_model: waveletdeboshir/whisper-base-ru-pruned
 tags:
 - asr
 - Pytorch
@@ -41,7 +41,7 @@ datasets:
 - mozilla-foundation/common_voice_15_0
 ---
 
-# Whisper-base-ru-pruned-
+# Whisper-base-ru-pruned-ft
 
 ## Model info
 This is a finetuned version of pruned whisper-base model ([waveletdeboshir/whisper-base-ru-pruned](https://huggingface.co/waveletdeboshir/whisper-base-ru-pruned)) for Russian language.
@@ -50,7 +50,7 @@ Model was finetuned on russian part of [mozilla-foundation/common_voice_15_0](ht
 
 ## Metrics
 
-| metric | dataset | waveletdeboshir/whisper-base-ru-pruned | waveletdeboshir/whisper-small-ru-pruned-
+| metric | dataset | waveletdeboshir/whisper-base-ru-pruned | waveletdeboshir/whisper-small-ru-pruned-ft |
 | :------ | :------ | :------ | :------ |
 | WER (without punctuation) | common_voice_15_0_test | | |
 | WER | common_voice_15_0_test | | |
@@ -60,7 +60,7 @@ Model was finetuned on russian part of [mozilla-foundation/common_voice_15_0](ht
 Only 10% tokens was left including special whisper tokens (no language tokens except \<|ru|\> and \<|en|\>, no timestamp tokens), 200 most popular tokens from tokenizer and 4000 most popular Russian tokens computed by tokenization of russian text corpus.
 
 Model size is 30% less then original whisper-base:
-| | openai/whisper-base | waveletdeboshir/whisper-base-ru-pruned-
+| | openai/whisper-base | waveletdeboshir/whisper-base-ru-pruned-ft |
 | :------ | :------ | :------ |
 | n of parameters | 74 M | 48 M |
 | n of parameters (with proj_out layer) | 99 M | 50 M |
@@ -78,8 +78,8 @@ Model can be used as an original whisper:
 >>> wav, sr = torchaudio.load("audio.wav")
 
 >>> # load model and processor
->>> processor = WhisperProcessor.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-
->>> model = WhisperForConditionalGeneration.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-
+>>> processor = WhisperProcessor.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-ft")
+>>> model = WhisperForConditionalGeneration.from_pretrained("waveletdeboshir/whisper-base-ru-pruned-ft")
 
 >>> input_features = processor(wav[0], sampling_rate=sr, return_tensors="pt").input_features
 
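The README text shown in this diff says the pruned vocabulary keeps three groups of tokens: the special Whisper tokens, the 200 most popular tokens from the tokenizer, and the 4000 most popular Russian tokens counted over a tokenized Russian corpus. The diff does not show how that selection was actually done, so the following is only a minimal sketch under assumptions: the function name `select_kept_token_ids`, its parameters, and the toy token ids are all hypothetical, and frequency is counted with the standard-library `collections.Counter`.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def select_kept_token_ids(
    special_ids: Iterable[int],
    tokenizer_top_ids: Sequence[int],
    corpus_token_ids: Iterable[int],
    n_tokenizer: int = 200,
    n_corpus: int = 4000,
) -> List[int]:
    """Return the sorted union of the three token groups named in the README.

    Hypothetical helper: the real pruning script is not part of this diff.
    """
    # Group 1: special tokens that must survive pruning.
    kept = set(special_ids)
    # Group 2: the first n_tokenizer ids from a list already ranked by popularity.
    kept.update(tokenizer_top_ids[:n_tokenizer])
    # Group 3: the n_corpus most frequent ids observed in the tokenized corpus.
    counts = Counter(corpus_token_ids)
    kept.update(token_id for token_id, _ in counts.most_common(n_corpus))
    return sorted(kept)


# Toy data with made-up ids, using small limits so the effect is visible.
special = [100, 101]
tokenizer_top = [0, 1, 2, 3]
corpus = [7, 7, 7, 8, 8, 9, 1, 1, 1, 1]
kept_ids = select_kept_token_ids(
    special, tokenizer_top, corpus, n_tokenizer=2, n_corpus=2
)
print(kept_ids)
```

Returning a sorted list of unique ids keeps the result deterministic, which would matter if the kept ids were later used to index rows of an embedding matrix; that last step is not covered here.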