juliensimon HF staff commited on
Commit
5df9582
1 Parent(s): e6b2d99

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +15 -2
README.md CHANGED
@@ -39,15 +39,28 @@ TBD
39
 
40
  The model can spot one of the following keywords: "Yes", "No", "Up", "Down", "Left", "Right", "On", "Off", "Stop", "Go", "Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Bed", "Bird", "Cat", "Dog", "Happy", "House", "Marvin", "Sheila", "Tree", "Wow", "Backward", "Forward", "Follow", "Learn", "Visual".
41
 
 
 
 
 
 
 
 
 
 
 
 
 
 
42
  ### Training and evaluation data
43
 
44
- - subset v0.02
45
  - full training set
46
  - full validation set
47
 
48
  ### Training procedure
49
 
50
- TBD
51
 
52
  #### Training hyperparameters
53
 
 
39
 
40
  The model can spot one of the following keywords: "Yes", "No", "Up", "Down", "Left", "Right", "On", "Off", "Stop", "Go", "Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Bed", "Bird", "Cat", "Dog", "Happy", "House", "Marvin", "Sheila", "Tree", "Wow", "Backward", "Forward", "Follow", "Learn", "Visual".
41
 
42
+ The repository includes sample files that I recorded (WAV, 16Khz sampling rate, mono). The simplest way to use the model is with the ```pipeline``` API:
43
+
44
+ ```
45
+ >>> from transformers import pipeline
46
+ >>> p = pipeline("audio-classification", model="juliensimon/wav2vec2-conformer-rel-pos-large-finetuned-speech-commands")
47
+ >>> p("up16k.wav")
48
+ [{'score': 0.7008192539215088, 'label': 'up'}, {'score': 0.04346614331007004, 'label': 'off'}, {'score': 0.029526518657803535, 'label': 'left'}, {'score': 0.02905120886862278, 'label': 'stop'}, {'score': 0.027142534032464027, 'label': 'on'}]
49
+ >>> p("stop16k.wav")
50
+ [{'score': 0.6969656944274902, 'label': 'stop'}, {'score': 0.03391443192958832, 'label': 'up'}, {'score': 0.027382319793105125, 'label': 'seven'}, {'score': 0.020835857838392258, 'label': 'five'}, {'score': 0.018051736056804657, 'label': 'down'}]
51
+ >>> p("marvin16k.wav")
52
+ [{'score': 0.5276530981063843, 'label': 'marvin'}, {'score': 0.04645705968141556, 'label': 'down'}, {'score': 0.038583893328905106, 'label': 'backward'}, {'score': 0.03578080236911774, 'label': 'wow'}, {'score': 0.03178196772933006, 'label': 'bird'}]
53
+ ```
54
+
55
  ### Training and evaluation data
56
 
57
+ - subset: v0.02
58
  - full training set
59
  - full validation set
60
 
61
  ### Training procedure
62
 
63
+ The model was fine-tuned on [Amazon SageMaker](https://aws.amazon.com/sagemaker), using an [ml.p3dn.24xlarge](https://aws.amazon.com/fr/ec2/instance-types/p3/) instance (8 NVIDIA V100 GPUs). Total training time for 10 epochs was 4.5 hours.
64
 
65
  #### Training hyperparameters
66