Titouan committed
Commit 957b82c
1 Parent(s): d0fd36c

pushing fr model

Files changed (4)
  1. README.md +77 -0
  2. asr.ckpt +3 -0
  3. normalizer.ckpt +3 -0
  4. tokenizer.ckpt +3 -0
README.md ADDED
@@ -0,0 +1,77 @@
+ ---
+ language: "fr"
+ thumbnail:
+ tags:
+ - ASR
+ - CTC
+ - Attention
+ - pytorch
+ license: "apache-2.0"
+ datasets:
+ - commonvoice
+ metrics:
+ - wer
+ - cer
+ ---
+
+ # CRDNN with CTC/Attention trained on CommonVoice French (No LM)
+
+ This repository provides all the necessary tools to perform automatic speech
+ recognition with an end-to-end system pretrained on CommonVoice (FR) within
+ SpeechBrain. For a better experience, we encourage you to learn more about
+ [SpeechBrain](https://speechbrain.github.io). The performance of the released ASR model is:
+
+ | Release | Test CER (%) | Test WER (%) | GPUs |
+ |:-------------:|:--------------:|:--------------:|:--------:|
+ | 07-03-21 | 6.54 | 17.70 | 2xV100 16GB |
+
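For readers unfamiliar with the two metrics in the table, the sketch below shows how word and character error rates are commonly computed from an edit distance. It is a minimal illustration and not the scoring code used to produce the numbers above; the function names are hypothetical, and details such as text normalisation or whether spaces count as characters are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalised by the reference length, as a percentage."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    """Word error rate: units are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: units are characters (spaces included here, by assumption)."""
    return error_rate(list(reference), list(hypothesis))
```

With one substituted word out of three, `wer("le chat dort", "le chien dort")` is one third expressed as a percentage; corpus-level scores are usually obtained by summing edits and reference lengths over all utterances rather than averaging per-utterance rates.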
+ ## Pipeline description
+
+ This ASR system is composed of 2 distinct but linked blocks:
+ 1. Tokenizer (unigram) that transforms words into subword units and is trained on
+ the training transcriptions (train.tsv) of CommonVoice (FR).
+ 2. Acoustic model (CRDNN + CTC/Attention). The CRDNN architecture is made of
+ N blocks of convolutional neural networks with normalisation and pooling on the
+ frequency domain. Then, a bidirectional LSTM is connected to a final DNN to obtain
+ the final acoustic representation that is given to the CTC and attention decoders.
+
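To make the first block more concrete, here is a minimal sketch of how a unigram tokenizer can split a word into subword units: it picks the segmentation whose pieces have the highest total log-probability, found with a Viterbi search. The vocabulary and scores below are invented for illustration; the real tokenizer in `tokenizer.ckpt` is trained on the CommonVoice transcriptions and is not reproduced here.

```python
import math

# Hypothetical toy vocabulary: subword piece -> log-probability (made-up values).
TOY_LOG_PROBS = {
    "bon": -2.0, "jour": -2.5, "on": -4.0, "jo": -4.5, "ur": -4.0,
    "b": -6.0, "o": -6.0, "n": -6.0, "j": -6.0, "u": -6.0, "r": -6.0,
}


def unigram_segment(word, log_probs):
    """Return the most probable split of `word` into known pieces (Viterbi search)."""
    n = len(word)
    best = [-math.inf] * (n + 1)  # best[i]: best score for word[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last piece in that split
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in log_probs:
                score = best[start] + log_probs[piece]
                if score > best[end]:
                    best[end] = score
                    back[end] = start
    if best[n] == -math.inf:
        raise ValueError(f"cannot segment {word!r} with this vocabulary")
    # Walk the back-pointers from the end of the word to recover the pieces.
    pieces = []
    end = n
    while end > 0:
        start = back[end]
        pieces.append(word[start:end])
        end = start
    return pieces[::-1]


print(unigram_segment("bonjour", TOY_LOG_PROBS))  # ['bon', 'jour']
```

The acoustic model then predicts sequences of such subword units rather than whole words, which keeps the output vocabulary small while still covering rare words.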
+ ## Intended uses & limitations
+
+ This model has been primarily developed to be run within SpeechBrain as a pretrained ASR model
+ for the French language. Thanks to the flexibility of SpeechBrain, either of the 2 blocks
+ detailed above can be extracted and connected to your custom pipeline as long as SpeechBrain is
+ installed.
+
+ ## Install SpeechBrain
+
+ First of all, please install SpeechBrain with the following command:
+
+ ```
+ pip install speechbrain
+ ```
+
+ Please note that we encourage you to read our tutorials and learn more about
+ [SpeechBrain](https://speechbrain.github.io).
+
+ ### Transcribing your own audio files
+
+ ```python
+ from speechbrain.pretrained import EncoderDecoderASR
+
+ # Fetch the pretrained French model, then transcribe a local audio file.
+ asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-crdnn-commonvoice-fr")
+ print(asr_model.transcribe_file("path_to_your_file.wav"))
+ ```
+
+ #### Referencing SpeechBrain
+
+ ```
+ @misc{SB2021,
+ author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Chou, Ju-Chieh and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
+ title = {SpeechBrain},
+ year = {2021},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/speechbrain/speechbrain}},
+ }
+ ```
asr.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:42d4644742da8f95e68124d8d04605907b11fb10e82c5800982d098380b2cd49
+ size 592775161
normalizer.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:75bf1f53645ef67244c27a9474c63144b79ac6453c827af6a45e2c5e385fcdf7
+ size 1783
tokenizer.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd21b3558352be835d8f8855f8c677c5794133c7e1d59aec47f6ba40dc2ca63e
+ size 244544
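
The three `.ckpt` entries above are Git LFS pointer files: small text stubs that record the SHA-256 digest and byte size of the real checkpoint. As a rough sketch, a downloaded checkpoint could be checked against its pointer as follows. The helper names are hypothetical and this is not part of SpeechBrain or of the Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest recorded in `pointer`."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algorithm"])
    with path.open("rb") as handle:
        # Read in chunks so large checkpoints do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer content copied from the normalizer.ckpt entry in this commit.
NORMALIZER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:75bf1f53645ef67244c27a9474c63144b79ac6453c827af6a45e2c5e385fcdf7
size 1783
"""

pointer = parse_lfs_pointer(NORMALIZER_POINTER)
```

Calling `matches_pointer("normalizer.ckpt", pointer)` on a local copy of the file (the path is an assumption) would then report whether the download is intact.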