# Tunisian Arabic ASR Model with wav2vec2

This repository provides the tools needed to perform automatic speech recognition with an end-to-end system pretrained on the Tunisian Arabic dialect.

## Performance
The following table summarizes the character error rate (CER) and word error rate (WER) of the model on each test set:

| Dataset      	| CER   	| WER   	|
|--------------	|-------	|-------	|
| TARIC        	| 6.22  	| 10.55 	|
| IWSLT        	| 21.18 	| 39.53 	|
| TunSwitch TO 	| 9.67  	| 25.54 	|

More details about the test sets and the conditions leading to this performance are given in the paper.
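
For readers unfamiliar with the two metrics in the table above, here is a minimal sketch of how WER and CER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, reported here as a percentage. It is an illustration only and not necessarily the scoring procedure used for the reported numbers. The function names are hypothetical, and counting spaces as characters in the CER is an assumption that varies between toolkits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent.

    unit="word" gives WER; unit="char" gives CER (assumption: spaces count as characters).
    """
    total_edits = 0
    total_length = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_length += len(ref_units)
    return 100.0 * total_edits / total_length


# Hypothetical transcripts, for illustration only.
refs = ["the train leaves at nine"]
hyps = ["the train leaves at five"]
wer = error_rate(refs, hyps, unit="word")
cer = error_rate(refs, hyps, unit="char")
```

Because both rates are normalized by the reference length, a single wrong word in a short utterance weighs heavily on WER while barely moving CER, which is one reason the two columns can differ widely.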

## Datasets
This ASR model was trained on:
* TARIC: The Tunisian Arabic Railway Interaction Corpus, a collection of audio recordings and transcriptions of dialogues in the Tunisian Railway Transport Network. [TARIC Corpus](https://aclanthology.org/L14-1385/)
* IWSLT: A Tunisian conversational speech corpus. [IWSLT Corpus](https://iwslt.org/2022/dialect)
* TunSwitch: Our crowd-collected dataset, described in the paper presented below.



## Inference 
## Install
```bash
pip install speechbrain transformers
```