metadata
language: en
thumbnail: null
tags:
- Spoken language understanding
license: CC0
datasets:
- Timers and Such
metrics:
- Accuracy
End-to-end SLU model for Timers and Such
Attention-based RNN sequence-to-sequence model for Timers and Such trained on the train-real subset. This model checkpoint achieves 86.7% accuracy on test-real.
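As a rough illustration of what an accuracy figure like this can mean, the sketch below scores predictions by exact match: an utterance counts as correct only if the whole predicted output equals the reference. That scoring rule is an assumption made here for illustration, not something this model card states, and the function name and the sample dictionaries (including the `slots` key) are hypothetical.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[dict], references: Sequence[dict]) -> float:
    """Fraction of utterances whose predicted output equals the reference exactly.

    Assumption: one wrong intent or slot makes the whole utterance incorrect.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, r in zip(predictions, references) if p == r)
    return correct / len(references)


# Hypothetical outputs; the slot names are invented for this example.
refs = [
    {"intent": "SetTimer", "slots": {"minutes": 5}},
    {"intent": "SetAlarm", "slots": {"hour": 7}},
]
preds = [
    {"intent": "SetTimer", "slots": {"minutes": 5}},  # fully correct
    {"intent": "SetAlarm", "slots": {"hour": 8}},     # wrong slot value
]
accuracy = exact_match_accuracy(preds, refs)
```

Under this rule the second prediction gets no partial credit for the correct intent.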
The model uses an ASR model trained on LibriSpeech (speechbrain/asr-crdnn-rnnlm-librispeech) to extract features from the input audio, then maps these features to an intent and slot labels using a beam search.
The dataset has four intents: SetTimer, SetAlarm, SimpleMath, and UnitConversion. Try testing the model by saying something like "set a timer for 5 minutes" or "what's 32 degrees Celsius in Fahrenheit?"
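If you post-process the model's predictions, it can help to check the intent label against the four intents listed above. The following is a minimal sketch; the `Intent` enum and `parse_intent` helper are hypothetical names introduced here and are not part of SpeechBrain.

```python
from enum import Enum


class Intent(Enum):
    """The four Timers and Such intents named in this model card."""
    SET_TIMER = "SetTimer"
    SET_ALARM = "SetAlarm"
    SIMPLE_MATH = "SimpleMath"
    UNIT_CONVERSION = "UnitConversion"


def parse_intent(label: str) -> Intent:
    """Map a predicted intent string to an Intent, rejecting unknown labels."""
    try:
        return Intent(label.strip())
    except ValueError:
        raise ValueError(f"unknown intent: {label!r}") from None


# Hypothetical usage with a label taken from a decoded prediction.
intent = parse_intent("UnitConversion")
```

Rejecting unknown labels early makes a malformed decoding visible instead of letting it pass silently into downstream logic.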
You can try the model on the math.wav file included here as follows:
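from speechbrain.pretrained import EndToEndSLU

# Fetch the pretrained model and its hyperparameters from the Hugging Face Hub
slu = EndToEndSLU.from_hparams("speechbrain/slu-timers-and-such-direct-librispeech-asr")
# Decode the bundled example recording into its predicted intent and slots
slu.decode_file("math.wav")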
Referencing SpeechBrain
@misc{SB2021,
author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua },
title = {SpeechBrain},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/speechbrain/speechbrain}},
}
Referencing Timers and Such
(TODO add paper once released)