tensorops committed on
Commit
fe4293a
1 Parent(s): 1e18719

Update README.md

Files changed (1)
  1. README.md +10 -8
README.md CHANGED
@@ -6,7 +6,8 @@ tags:
  - whisper-event
  - generated_from_trainer
  datasets:
- - mozilla-foundation/common_voice_11_0
+ - mozilla-foundation/common_voice_13_0
+ - google/fleurs
  metrics:
  - wer
  model-index:
@@ -25,6 +26,7 @@ model-index:
  - name: Wer
  type: wer
  value: 8.44
+ library_name: transformers
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,8 +34,8 @@ should probably proofread and complete it, then remove this comment. -->

  # Whisper Medium (Thai): Combined V2

- This model is a fine-tuned version of [biodatlab/whisper-medium-th-1000iter](https://huggingface.co/biodatlab/whisper-medium-th-1000iter) on the mozilla-foundation/common_voice_11_0 th dataset.
- It achieves the following results on the evaluation set:
+ This model is a fine-tuned, augmented versions of [biodatlab/whisper-medium-th-1000iter](https://huggingface.co/biodatlab/whisper-medium-th-1000iter) on the mozilla-foundation/common_voice_13_0 th, google/fleurs, and curated datasets.
+ It achieves the following results (NOT-UP-TO-DATE) on the common-voice-11 evaluation set:
  - Loss: 0.1475
  - WER: 13.03 (without Tokenizer)
  - WER: 8.44 (with Deepcut Tokenizer)
@@ -45,7 +47,7 @@ Use the model with huggingface's `transformers` as follows:
  ```py
  from transformers import pipeline

- MODEL_NAME = "biodatlab/whisper-medium-th-combined-v2" # specify the model name
+ MODEL_NAME = "biodatlab/whisper-medium-th-combined" # specify the model name
  lang = "th" # change to Thai langauge

  device = 0 if torch.cuda.is_available() else "cpu"
@@ -96,10 +98,10 @@ The following hyperparameters were used during training:

  ### Framework versions

- - Transformers 4.26.0.dev0
- - Pytorch 1.13.0
- - Datasets 2.7.1
- - Tokenizers 0.13.2
+ - Transformers 4.31.0.dev0
+ - Pytorch 2.1.0
+ - Datasets 2.13.1
+ - Tokenizers 0.13.3

  ## Citation

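
The README in this commit reports two WER figures, 13.03 without a tokenizer and 8.44 with the Deepcut tokenizer. The gap presumably comes from how the text is split into words before scoring: Thai is written without spaces between words, so a whitespace split yields very coarse tokens. The sketch below illustrates that effect with a plain edit-distance WER and a pluggable `tokenize` argument. It is not the evaluation script behind the reported numbers. `toy_tokenize` is a hypothetical stand-in for a real word segmenter such as Deepcut, which is not imported here.

```python
from typing import Callable, List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str,
        tokenize: Callable[[str], List[str]] = str.split) -> float:
    """Word error rate in percent; `tokenize` decides what counts as a word."""
    ref_tokens = tokenize(reference)
    hyp_tokens = tokenize(hypothesis)
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical segmenter: fixed two-character chunks, for illustration only."""
    return [text[i:i + 2] for i in range(0, len(text), 2)]


reference = "aabbccdd"   # unsegmented text, standing in for a Thai sentence
hypothesis = "aabbccdx"  # one wrong character at the end

print(wer(reference, hypothesis))                         # 100.0
print(wer(reference, hypothesis, tokenize=toy_tokenize))  # 25.0
```

With the whitespace split, the whole string is one token, so a single wrong character counts as a 100% error. With segmentation, the same mistake affects one token out of four.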