jonatasgrosman committed on
Commit
6c23e25
1 Parent(s): 863305b

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -49,7 +49,7 @@ model-index:
 
 # Whisper Large Portuguese
 
-This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on Spanish using the train split of [Common Voice 11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0). When using this model, make sure that your speech input is sampled at 16kHz.
+This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on Spanish using the train split of [Common Voice 11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0).
 
 ## Usage
 
@@ -75,7 +75,7 @@ transcription = transcriber("path/to/my_audio.wav")
 
 ## Evaluation
 
-We perform evaluation of the model using the test split of two datasets, the [Common Voice 11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0) (same dataset used for the fine-tuning) and the [Fleurs](https://huggingface.co/datasets/google/fleurs) (dataset not seen during the fine-tuning). As Whisper can transcribe casing and punctuation, I performed the model evaluation in 2 different scenarios, one using the raw text and the other using the normalized text (lowercase + removal of punctuations). Additionally, for the Fleurs dataset, I evaluated the model in a scenario where there are no transcriptions of numerical values since the way these values are described in this dataset is different from how they are described in the dataset used in fine-tuning (Common Voice), so it is expected that this difference in the way of describing numerical values will affect the performance of the model for this type of transcription in Fleurs.
+I've performed the evaluation of the model using the test split of two datasets, the [Common Voice 11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0) (same dataset used for the fine-tuning) and the [Fleurs](https://huggingface.co/datasets/google/fleurs) (dataset not seen during the fine-tuning). As Whisper can transcribe casing and punctuation, I've performed the model evaluation in 2 different scenarios, one using the raw text and the other using the normalized text (lowercase + removal of punctuations). Additionally, for the Fleurs dataset, I've evaluated the model in a scenario where there are no transcriptions of numerical values since the way these values are described in this dataset is different from how they are described in the dataset used in fine-tuning (Common Voice), so it is expected that this difference in the way of describing numerical values will affect the performance of the model for this type of transcription in Fleurs.
 
 ### Common Voice 11
 
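
Note on the Evaluation paragraph: the README describes scoring in two scenarios, raw text and normalized text (lowercase + removal of punctuation), but the diff does not show the evaluation script itself. The following is only a minimal sketch of what such a comparison might look like. The helper names `normalize` and `word_error_rate` are hypothetical, the punctuation set is assumed to be ASCII-only, and the word error rate is computed here with a plain word-level edit distance rather than whatever tooling the author actually used.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace.

    Assumption: only ASCII punctuation is removed; accented letters are kept.
    """
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.lower().split())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "Olá, tudo bem?"
hypothesis = "olá tudo bem"

# Raw scenario: casing and punctuation differences count as errors.
raw_wer = word_error_rate(reference, hypothesis)
# Normalized scenario: both sides are lowercased and stripped of punctuation.
normalized_wer = word_error_rate(normalize(reference), normalize(hypothesis))
print(f"raw WER: {raw_wer:.3f}, normalized WER: {normalized_wer:.3f}")
```

In this toy pair the two scores differ only because of casing and punctuation, which is the effect the two evaluation scenarios are meant to separate from genuine recognition errors.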