JBJoyce committed on
Commit
671e461
1 Parent(s): 0caacc1

Create README.md

## Model description
This is a pretrained Wav2Vec2 model fine-tuned on the keyword spotting task.

## Intended uses & limitations
Keyword Spotting (KS) detects preregistered keywords by classifying utterances into a predefined set of words.

The predefined words are limited to "Yes", "No", "Up", "Down", "Left", "Right", "On", "Off", "Stop", "Go", "Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Bed", "Bird", "Cat", "Dog", "Happy", "House", "Marvin", "Sheila", "Tree", "Wow", plus a label for silence.

## How to use
You can try the model through the Hugging Face API widget to the left, which can record audio snippets or accept uploaded audio files.

Alternatively, use the `pipeline` function from the Transformers library:

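```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
keyword_spotter = pipeline(model="JBJoyce/wav2vec2-base-superb-ks")
# Classify a local audio file; replace the path with your own recording.
keyword_spotter("path_to_audio_file.wav")
```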
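An audio classification pipeline returns a list of dictionaries with `label` and `score` keys. The following is a minimal sketch of how the top keyword could be picked from such a result. The helper `top_keyword`, the `min_score` threshold, and the example scores and label spellings are illustrative assumptions, not part of the model or the library.

```python
def top_keyword(predictions, min_score=0.5):
    """Return the highest-scoring label, or None if nothing reaches min_score.

    `predictions` is assumed to be a list of {"label": str, "score": float}
    dictionaries, the shape an audio classification pipeline returns.
    """
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None


# Made-up values for illustration only; not real model output.
example_output = [
    {"score": 0.91, "label": "yes"},
    {"score": 0.05, "label": "no"},
    {"score": 0.04, "label": "up"},
]
print(top_keyword(example_output))
```

Returning `None` below the threshold gives the caller a simple way to ignore low-confidence clips; the right threshold depends on the application.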

## Training data
Preprocessing included resampling the audio to 16 kHz. The Wav2Vec2 model was fine-tuned for 10 epochs on the KS subset of the SUPERB benchmark dataset (51,093 audio examples). Final validation set accuracy was 0.97.
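To illustrate what resampling to 16 kHz means, here is a dependency-free sketch that uses linear interpolation. It is not the preprocessing code used for this model, and the function name `resample_linear` is hypothetical. Linear interpolation applies no anti-aliasing filter, so for real audio a dedicated audio library is the better choice.

```python
def resample_linear(samples, orig_rate, target_rate=16_000):
    """Resample a mono signal by linear interpolation (no anti-aliasing)."""
    if orig_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples or orig_rate == target_rate:
        return list(samples)
    # Number of output samples covering the same duration as the input.
    n_out = max(1, round(len(samples) * target_rate / orig_rate))
    # How far we advance in the input for each output sample.
    step = orig_rate / target_rate
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        # Blend the two neighbouring input samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# One second of a 48 kHz ramp signal becomes 16,000 samples at 16 kHz.
one_second = [float(i) for i in range(48_000)]
resampled = resample_linear(one_second, orig_rate=48_000)
print(len(resampled))
```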

## Evaluation results
Accuracy on a holdout test set was 0.89.
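Accuracy here is the fraction of utterances whose predicted keyword matches the reference label. A minimal sketch of that calculation follows; the function name and the toy labels are assumptions for illustration and are not the evaluation script used for this model.

```python
def accuracy(predicted, reference):
    """Fraction of positions where the predicted label equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy labels for illustration only; three of four predictions match.
predicted_labels = ["yes", "no", "up", "go"]
reference_labels = ["yes", "no", "down", "go"]
print(accuracy(predicted_labels, reference_labels))
```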

Files changed (1)
  1. README.md +8 -0
README.md ADDED
@@ -0,0 +1,8 @@
+ ---
+ datasets:
+ - superb
+ language:
+ - en
+ metrics:
+ - accuracy
+ ---