metadata

language: en
thumbnail: null
tags:
  - Spoken language understanding
license: CC0
datasets:
  - Timers and Such
metrics:
  - Accuracy

End-to-end SLU model for Timers and Such

An attention-based RNN sequence-to-sequence model for Timers and Such, trained on the train-real subset. This checkpoint achieves 86.7% accuracy on test-real.
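
As a rough illustration of how an accuracy figure like this can be computed, the sketch below assumes that a prediction counts as correct only when its intent and every slot match the reference exactly. That scoring rule, the helper name `exact_match_accuracy`, and the slot names in the toy data are assumptions made for illustration; they are not taken from the model's evaluation code.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference (assumed scoring rule)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Toy data: intent names come from the dataset, slot names are hypothetical.
references = [
    {"intent": "SetTimer", "slots": {"minutes": 5}},
    {"intent": "SetAlarm", "slots": {"hour": 7}},
    {"intent": "SimpleMath", "slots": {"number1": 2, "number2": 3, "op": "plus"}},
]
predictions = [
    {"intent": "SetTimer", "slots": {"minutes": 5}},
    {"intent": "SetAlarm", "slots": {"hour": 8}},  # one wrong slot: counted as an error
    {"intent": "SimpleMath", "slots": {"number1": 2, "number2": 3, "op": "plus"}},
]

accuracy = exact_match_accuracy(predictions, references)
print(f"{accuracy:.1%}")  # prints 66.7%
```

Under this rule a single wrong slot makes the whole utterance incorrect, which is why the second toy prediction is scored as an error even though its intent is right.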

The model uses an ASR model trained on LibriSpeech (speechbrain/asr-crdnn-rnnlm-librispeech) to extract features from the input audio, then decodes these features into an intent and its slot labels using beam search.
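
To make the beam search step concrete, here is a minimal, generic sketch in plain Python. It is not the SpeechBrain decoder: the `TOY_MODEL` table, its tokens, and its probabilities are invented for illustration, and the real model scores subword tokens with a neural network rather than a lookup table.

```python
import math

# Toy next-token model: maps a prefix to log-probabilities of the next token.
# Tokens and probabilities are made up for illustration only.
TOY_MODEL = {
    (): {"SetTimer": math.log(0.6), "SetAlarm": math.log(0.4)},
    ("SetTimer",): {"minutes": math.log(0.5), "seconds": math.log(0.5)},
    ("SetAlarm",): {"am": math.log(0.9), "pm": math.log(0.1)},
}


def beam_search(model, beam_width, max_steps):
    """Return up to `beam_width` (tokens, log_prob) hypotheses, best first."""
    beams = [((), 0.0)]  # start from the empty prefix with log-probability 0
    for _ in range(max_steps):
        candidates = []
        for prefix, score in beams:
            # Extend every surviving hypothesis by every possible next token.
            for token, log_p in model.get(prefix, {}).items():
                candidates.append((prefix + (token,), score + log_p))
        if not candidates:
            break  # no hypothesis can be extended any further
        # Keep only the `beam_width` highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


greedy = beam_search(TOY_MODEL, beam_width=1, max_steps=2)
wide = beam_search(TOY_MODEL, beam_width=2, max_steps=2)
print(greedy[0][0])  # ('SetTimer', 'minutes'), probability 0.30
print(wide[0][0])    # ('SetAlarm', 'am'), probability 0.36
```

With a beam width of 1 the search is greedy and commits to the locally best first token, while a width of 2 keeps the second candidate alive long enough to find the sequence with the higher overall probability.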

The dataset has four intents: SetTimer, SetAlarm, SimpleMath, and UnitConversion. Try testing the model by saying something like "set a timer for 5 minutes" or "what's 32 degrees Celsius in Fahrenheit?"

You can try the model on the math.wav file included here as follows:

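# Load the pretrained end-to-end SLU model by its Hugging Face model ID.
from speechbrain.pretrained import EndToEndSLU
slu = EndToEndSLU.from_hparams("speechbrain/slu-timers-and-such-direct-librispeech-asr")
# Decode the bundled example recording into its predicted intent and slots.
slu.decode_file("math.wav")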

Referencing SpeechBrain

@misc{SB2021,
author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Chou, Ju-Chieh and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
title = {SpeechBrain},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/speechbrain/speechbrain}},
}

Referencing Timers and Such

(TODO add paper once released)