pere committed on
Commit
9670916
1 Parent(s): 7724919

updated template

Files changed (1)
  1. README.md +17 -20
README.md CHANGED
@@ -30,35 +30,32 @@ widget:
30
 
31
  This model is trained for 200 additional steps on top of the model below. As a result, it outputs only lowercase text without punctuation. It is also considerably more verbatim and makes no attempt to correct grammatical errors in the text.
32
 
33
- # NB-Whisper Base Verbatim (Release Candidate)
34
-
35
- **IMPORTANT:** These models are currently Release Candidates. We are in the final stages of testing. If everything proceeds smoothly, we plan to officially release the models later this month.
36
 
37
  Introducing the **_Norwegian NB-Whisper Base Verbatim model_**, proudly developed by the National Library of Norway. NB-Whisper is a cutting-edge series of models designed for automatic speech recognition (ASR) and speech translation. These models are based on the work of [OpenAI's Whisper](https://arxiv.org/abs/2212.04356). Each model in the series has been trained for 250,000 steps, utilizing a diverse dataset of 8 million samples. These samples consist of aligned audio clips, each 30 seconds long, culminating in a staggering 66,000 hours of speech. For an in-depth understanding of our training methodology and dataset composition, keep an eye out for our upcoming article.
38
 
39
  | Model Size | Parameters | Model |
40
  |------------|------------|------------|
41
- | Tiny | 39M | [NB-Whisper Tiny](https://huggingface.co/NbAiLabBeta/nb-whisper-tiny) |
42
- | Base | 74M | [NB-Whisper Base](https://huggingface.co/NbAiLabBeta/nb-whisper-base) |
43
- | Small | 244M | [NB-Whisper Small](https://huggingface.co/NbAiLabBeta/nb-whisper-small) |
44
- | Medium | 769M | [NB-Whisper Medium](https://huggingface.co/NbAiLabBeta/nb-whisper-medium) |
45
- | Large | 1550M | [NB-Whisper Large](https://huggingface.co/NbAiLabBeta/nb-whisper-large) |
46
 
47
 
48
 
49
- ### Specialised Models
50
  While the main models are suitable for most transcription tasks, we demonstrate how easy it is to change the output of the main model. The following models are trained for 250 additional steps from the main models above and might be suitable for more targeted use cases:
51
  - **Verbatim version**: This lower-cased variant is more literal and suitable for tasks requiring detailed transcription, such as linguistic analysis.
52
- - **Semantic version**: This variant focuses less on verbatim accuracy but captures the essence of content, ideal for meeting minutes and subtitling.
53
 
54
 
55
- | Model Size | Parameters | Verbatim version | Semantic version |
56
- |------------|------------|------------|------------------|
57
- | Tiny | 39M | [Tiny - verbatim](https://huggingface.co/NbAiLabBeta/nb-whisper-tiny-verbatim) | [Tiny - semantic](https://huggingface.co/NbAiLabBeta/nb-whisper-tiny-semantic) |
58
- | Base | 74M | [Base - verbatim](https://huggingface.co/NbAiLabBeta/nb-whisper-base-verbatim) | [Base - semantic](https://huggingface.co/NbAiLabBeta/nb-whisper-base-semantic) |
59
- | Small | 244M | [Small - verbatim](https://huggingface.co/NbAiLabBeta/nb-whisper-small-verbatim) | [Small - semantic](https://huggingface.co/NbAiLabBeta/nb-whisper-small-semantic) |
60
- | Medium | 769M | [Medium - verbatim](https://huggingface.co/NbAiLabBeta/nb-whisper-medium-verbatim) | [Medium - semantic](https://huggingface.co/NbAiLabBeta/nb-whisper-medium-semantic) |
61
- | Large | 1550M | [Large - verbatim](https://huggingface.co/NbAiLabBeta/nb-whisper-large-verbatim) | [Large - semantic](https://huggingface.co/NbAiLabBeta/nb-whisper-large-semantic) |
62
 
63
 
64
  ### Model Description
@@ -77,7 +74,7 @@ While the main models are suitable for most transcription task, we demonstrate h
77
  ## How to Use the Models
78
 
79
  ### Online Demos
80
- You can try the models directly through the HuggingFace Inference API, accessible on the right side of this page. Be aware that initially, the model needs to load and will run on limited CPU capacity, which might be slow. To enhance your experience, we are temporarily hosting some models on TPUs for a few days, significantly boosting their performance. Explore these under the **Spaces** section on the [Main Page](https://huggingface.co/NbAiLabBeta/).
81
 
82
  ### Local Setup with HuggingFace
83
  Alternatively, you can run the models locally. The Tiny, Base, and Small models are optimized for CPU execution. For the Medium and Large models, we recommend a system equipped with a GPU to ensure efficient processing. Setting up and using these models with HuggingFace's Transformers is straightforward, provided you have [Python](https://www.python.org/downloads/) installed on your machine. For practical demonstrations, refer to examples using this [sample mp3 file](https://github.com/NbAiLab/nb-whisper/raw/main/audio/king.mp3).
@@ -225,8 +222,8 @@ $ wget -N https://github.com/NbAiLab/nb-whisper/raw/main/audio/king.mp3
225
  $ ffmpeg -i king.mp3 -ar 16000 -ac 1 -c:a pcm_s16le king.wav
226
 
227
  # Let's download the two ggml-files from this site
228
- wget -N https://huggingface.co/NbAiLabBeta/nb-whisper-base/resolve/main/ggml-model.bin -O models/nb-base-ggml-model.bin
229
- wget -N https://huggingface.co/NbAiLabBeta/nb-whisper-base/resolve/main/ggml-model-q5_0.bin -O models/nb-base-ggml-model-q5_0.bin
230
 
231
  # And run it with the f16 default model
232
  $ ./main -l no -m models/nb-base-ggml-model.bin king.wav
 
30
 
31
  This model is trained for 200 additional steps on top of the model below. As a result, it outputs only lowercase text without punctuation. It is also considerably more verbatim and makes no attempt to correct grammatical errors in the text.
32
 
33
+ # NB-Whisper Base Verbatim
 
 
34
 
35
  Introducing the **_Norwegian NB-Whisper Base Verbatim model_**, proudly developed by the National Library of Norway. NB-Whisper is a cutting-edge series of models designed for automatic speech recognition (ASR) and speech translation. These models are based on the work of [OpenAI's Whisper](https://arxiv.org/abs/2212.04356). Each model in the series has been trained for 250,000 steps, utilizing a diverse dataset of 8 million samples. These samples consist of aligned audio clips, each 30 seconds long, culminating in a staggering 66,000 hours of speech. For an in-depth understanding of our training methodology and dataset composition, keep an eye out for our upcoming article.
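
As a rough consistency check on the figures in the paragraph above, the sketch below multiplies the sample count by the clip length. It assumes every sample is a full 30-second clip and that "8 million" is a rounded count, so the result is an upper bound that should land close to the quoted 66,000 hours. The names `NUM_SAMPLES`, `CLIP_SECONDS`, and `total_hours` are illustrative, not part of the project.

```python
# Rough consistency check of the dataset size quoted in the README.
# Assumption: every sample is a full 30-second clip.
NUM_SAMPLES = 8_000_000
CLIP_SECONDS = 30


def total_hours(num_samples: int, clip_seconds: float) -> float:
    """Return the total audio duration in hours."""
    return num_samples * clip_seconds / 3600


if __name__ == "__main__":
    print(f"{total_hours(NUM_SAMPLES, CLIP_SECONDS):,.0f} hours")
```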
36
 
37
  | Model Size | Parameters | Model |
38
  |------------|------------|------------|
39
+ | Tiny | 39M | [NB-Whisper Tiny](https://huggingface.co/NbAiLab/nb-whisper-tiny) |
40
+ | Base | 74M | [NB-Whisper Base](https://huggingface.co/NbAiLab/nb-whisper-base) |
41
+ | Small | 244M | [NB-Whisper Small](https://huggingface.co/NbAiLab/nb-whisper-small) |
42
+ | Medium | 769M | [NB-Whisper Medium](https://huggingface.co/NbAiLab/nb-whisper-medium) |
43
+ | Large | 1550M | [NB-Whisper Large](https://huggingface.co/NbAiLab/nb-whisper-large) |
44
 
45
 
46
 
47
+ ### Verbatim Model
48
  While the main models are suitable for most transcription tasks, we demonstrate how easy it is to change the output of the main model. The following models are trained for 250 additional steps from the main models above and might be suitable for more targeted use cases:
49
  - **Verbatim version**: This lower-cased variant is more literal and suitable for tasks requiring detailed transcription, such as linguistic analysis.
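
Because the verbatim variant emits lowercase text without punctuation, reference transcripts usually need a similar normalization before they can be compared with its output. The following is a minimal sketch of such a step, not the project's own preprocessing. The function name `normalize_verbatim` and the choice to drop every Unicode punctuation character are assumptions made for illustration.

```python
import unicodedata


def normalize_verbatim(text: str) -> str:
    """Lowercase text, drop punctuation, and collapse runs of whitespace.

    Assumption: "punctuation" means any character whose Unicode category
    starts with "P"; the model's actual training normalization may differ.
    """
    lowered = text.lower()
    # Keep only characters that are not Unicode punctuation.
    no_punct = "".join(
        ch for ch in lowered if not unicodedata.category(ch).startswith("P")
    )
    # Collapse any whitespace left behind by removed characters.
    return " ".join(no_punct.split())


if __name__ == "__main__":
    print(normalize_verbatim("Hei, Verden!  Dette er en TEST."))
```

Note that this sketch also removes hyphens and apostrophes, which joins the parts of hyphenated words. Adjust the filter if your reference transcripts treat those characters differently.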
 
50
 
51
 
52
+ | Model Size | Parameters | Semantic version |
53
+ |------------|------------|------------------|
54
+ | Tiny | 39M | [Tiny - semantic](https://huggingface.co/NbAiLab/nb-whisper-tiny-semantic) |
55
+ | Base | 74M | [Base - semantic](https://huggingface.co/NbAiLab/nb-whisper-base-semantic) |
56
+ | Small | 244M | [Small - semantic](https://huggingface.co/NbAiLab/nb-whisper-small-semantic) |
57
+ | Medium | 769M | [Medium - semantic](https://huggingface.co/NbAiLab/nb-whisper-medium-semantic) |
58
+ | Large | 1550M | [Large - semantic](https://huggingface.co/NbAiLab/nb-whisper-large-semantic) |
59
 
60
 
61
  ### Model Description
 
74
  ## How to Use the Models
75
 
76
  ### Online Demos
77
+ You can try the models directly through the HuggingFace Inference API, accessible on the right side of this page. Be aware that initially, the model needs to load and will run on limited CPU capacity, which might be slow. To enhance your experience, we are temporarily hosting some models on TPUs for a few days, significantly boosting their performance. Explore these under the **Spaces** section on the [Main Page](https://huggingface.co/NbAiLab/).
78
 
79
  ### Local Setup with HuggingFace
80
  Alternatively, you can run the models locally. The Tiny, Base, and Small models are optimized for CPU execution. For the Medium and Large models, we recommend a system equipped with a GPU to ensure efficient processing. Setting up and using these models with HuggingFace's Transformers is straightforward, provided you have [Python](https://www.python.org/downloads/) installed on your machine. For practical demonstrations, refer to examples using this [sample mp3 file](https://github.com/NbAiLab/nb-whisper/raw/main/audio/king.mp3).
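
A minimal sketch of local transcription with the Transformers `pipeline` API is shown below. It assumes `transformers`, a backend such as PyTorch, and `ffmpeg` are installed, and that `king.mp3` has been downloaded to the working directory. The model id is taken from the table in this README; swapping in another variant's id is left to the reader. The helper names `build_generate_kwargs` and `transcribe`, and the `chunk_length_s=28` value, are illustrative choices rather than settings confirmed by this diff.

```python
# Model id as listed in the README table; other sizes follow the same pattern.
MODEL_ID = "NbAiLab/nb-whisper-base"


def build_generate_kwargs(language: str = "no", task: str = "transcribe") -> dict:
    """Return generation options asking Whisper to transcribe Norwegian."""
    return {"task": task, "language": language}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # chunk_length_s splits long audio into windows (assumed value).
    result = asr(
        audio_path,
        chunk_length_s=28,
        generate_kwargs=build_generate_kwargs(),
    )
    return result["text"]


if __name__ == "__main__":
    print(transcribe("king.mp3"))
```

The first call downloads the model weights, so expect it to take noticeably longer than later runs.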
 
222
  $ ffmpeg -i king.mp3 -ar 16000 -ac 1 -c:a pcm_s16le king.wav
223
 
224
  # Let's download the two ggml-files from this site
225
+ wget -N https://huggingface.co/NbAiLab/nb-whisper-base/resolve/main/ggml-model.bin -O models/nb-base-ggml-model.bin
226
+ wget -N https://huggingface.co/NbAiLab/nb-whisper-base/resolve/main/ggml-model-q5_0.bin -O models/nb-base-ggml-model-q5_0.bin
227
 
228
  # And run it with the f16 default model
229
  $ ./main -l no -m models/nb-base-ggml-model.bin king.wav