---
language: "en"
thumbnail:
tags:
- Spoken language understanding
license: "CC0"
datasets:
- Timers and Such
metrics:
- Accuracy
---

# End-to-end SLU model for Timers and Such

Attention-based RNN sequence-to-sequence model for [Timers and Such](https://zenodo.org/record/4623772) trained on the `train-real` subset. This model checkpoint achieves 86.7% accuracy on `test-real`.

The model uses an ASR model trained on LibriSpeech (`speechbrain/asr-crdnn-rnnlm-librispeech`) to extract features from the input audio, then maps these features to an intent and slot labels using beam search.

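The same two stages can also be run separately through the pretrained interface. The sketch below is illustrative only and assumes the `load_audio`, `encode_batch`, and `decode_batch` methods of `EndToEndSLU` follow SpeechBrain's usual pretrained-model conventions (exact signatures and return values may differ across versions):

```
import torch
from speechbrain.pretrained import EndToEndSLU

slu = EndToEndSLU.from_hparams("speechbrain/slu-timers-and-such-direct-librispeech-asr")

# Load one utterance and form a batch of size 1
signal = slu.load_audio("math.wav")
wavs = signal.unsqueeze(0)
wav_lens = torch.tensor([1.0])  # relative lengths (1.0 = full length)

# Stage 1: the LibriSpeech ASR encoder turns the waveform into features
encoder_out = slu.encode_batch(wavs, wav_lens)

# Stage 2: beam search maps the features to an intent/slot string
# (decode_batch runs the encoder internally; the separate encode_batch call
# above is only to make the intermediate representation explicit)
predicted_semantics, predicted_tokens = slu.decode_batch(wavs, wav_lens)
print(predicted_semantics[0])
```
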
The dataset has four intents: `SetTimer`, `SetAlarm`, `SimpleMath`, and `UnitConversion`. Try testing the model by saying something like "set a timer for 5 minutes" or "what's 32 degrees Celsius in Fahrenheit?"

You can try the model on the `math.wav` file included here as follows:

```
from speechbrain.pretrained import EndToEndSLU

# Download the pretrained model from the Hugging Face Hub and load it
slu = EndToEndSLU.from_hparams("speechbrain/slu-timers-and-such-direct-librispeech-asr")

# Decode the audio file into an intent and slot labels
slu.decode_file("math.wav")
```

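`decode_file` returns the predicted semantics as a string. In Timers and Such the target semantics are written as dictionary-like strings, so a well-formed prediction can often be turned into a Python object for downstream use. A minimal post-processing sketch (the parsing step is an assumption about the output format, not part of the model card):

```
import ast

from speechbrain.pretrained import EndToEndSLU

slu = EndToEndSLU.from_hparams("speechbrain/slu-timers-and-such-direct-librispeech-asr")
decoded = slu.decode_file("math.wav")
print(decoded)  # raw decoder output (a string)

# If the prediction follows the dataset's dictionary-style label format,
# parse it into a Python dict; otherwise keep the raw text.
try:
    semantics = ast.literal_eval(decoded)
except (ValueError, SyntaxError):
    semantics = decoded
print(semantics)
```
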
#### Referencing SpeechBrain

```
@misc{SB2021,
  author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
  title = {SpeechBrain},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/speechbrain/speechbrain}},
}
```

#### Referencing Timers and Such

(TODO add paper once released)