# Tunisian Arabic ASR Model with wav2vec2
This repository provides all the tools needed to perform automatic speech recognition with an end-to-end system pretrained on the Tunisian Arabic dialect.
## Performance
The following table summarizes the model's performance on the test sets considered:
| Dataset | CER | WER |
|-------------- |------- |------- |
| TARIC | 6.22 | 10.55 |
| IWSLT | 21.18 | 39.53 |
| TunSwitch TO | 9.67 | 25.54 |
More details about the test sets and the conditions leading to this performance are given in the paper.
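For readers unfamiliar with the CER and WER columns, the sketch below shows how character and word error rates are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. It is a minimal illustration, not the evaluation code used for the table. The function names are hypothetical, and the choices to report percentages and to count spaces as characters are assumptions that may differ from the paper's setup.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumption: whitespace tokenization)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (assumption: spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Because the denominator is the reference length, both rates can exceed 100 when the hypothesis contains many insertions.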
## Datasets
This ASR model was trained on:
* TARIC: the Tunisian Arabic Railway Interaction Corpus, a collection of audio recordings and transcriptions of dialogues in the Tunisian Railway Transport Network. [TARIC Corpus](https://aclanthology.org/L14-1385/)
* IWSLT: a Tunisian conversational speech corpus. [IWSLT Corpus](https://iwslt.org/2022/dialect)
* TunSwitch: our crowd-collected dataset, described in the paper presented below.
## Inference
## Install
```bash
pip install speechbrain transformers
```