lvzixin committed on
Commit 0353998
1 Parent(s): e11984f

Update README.md

  1. README.md +15 -1
README.md CHANGED
@@ -9,4 +9,18 @@ pipeline_tag: text-to-image
  tags:
  - medical
  - free tags
- ---
+ ---
+
+ # Whisper
+
+ Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours
+ of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains **without** the need
+ for fine-tuning.
+
+ Whisper was proposed in the paper [Robust Speech Recognition via Large-Scale Weak Supervision](https://arxiv.org/abs/2212.04356)
+ by Alec Radford et al. from OpenAI. The original code repository can be found [here](https://github.com/openai/whisper).
+
+ Whisper `large-v3` has the same architecture as the previous large models except the following minor differences:
+
+ 1. The input uses 128 Mel frequency bins instead of 80
+ 2. A new language token for Cantonese
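
Note (not part of the commit): the first difference listed in the added README text changes the shape of the log-Mel spectrogram fed to the encoder. The sketch below only does the arithmetic. The 16 kHz sample rate, 160-sample hop, and 30-second window are assumptions taken from the reference Whisper implementation, not from this README, and `log_mel_input_shape` is a hypothetical helper name used for illustration.

```python
from typing import Tuple

# Assumed constants from the reference Whisper implementation (not stated in this README).
SAMPLE_RATE = 16_000   # audio samples per second
HOP_LENGTH = 160       # samples between consecutive spectrogram frames
CHUNK_SECONDS = 30     # length of the audio window the encoder sees


def log_mel_input_shape(n_mels: int) -> Tuple[int, int]:
    """Return (mel bins, frames) for one padded 30-second input window."""
    n_samples = SAMPLE_RATE * CHUNK_SECONDS
    n_frames = n_samples // HOP_LENGTH
    return (n_mels, n_frames)


previous_large = log_mel_input_shape(80)   # earlier large checkpoints
large_v3 = log_mel_input_shape(128)        # large-v3, per the README text

print(previous_large, large_v3)  # (80, 3000) (128, 3000)
```

Under these assumptions only the first dimension changes, so features extracted with 80 Mel bins would not match the input that `large-v3` expects.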