SalahZa / Tunisian_Automatic_Speech_Recognition

	# Tunisian Arabic ASR Model with wav2vec2

	This repository provides all the tools needed to perform automatic speech recognition with an end-to-end system pretrained on the Tunisian Arabic dialect.

	## Performance
	The following table summarizes the model's character error rate (CER) and word error rate (WER) on each test set; lower is better:

	| Dataset      | CER (%) | WER (%) |
	|--------------|---------|---------|
	| TARIC        | 6.22    | 10.55   |
	| IWSLT        | 21.18   | 39.53   |
	| TunSwitch TO | 9.67    | 25.54   |

	More details about the test sets and the conditions leading to these results are given in the paper.
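For readers unfamiliar with the two metrics: WER and CER are conventionally defined as the Levenshtein edit distance between the hypothesis and the reference, divided by the reference length, counted over words or over characters respectively. The sketch below illustrates that standard definition in plain Python. It is not the evaluation script behind the table above; the function names `edit_distance`, `wer`, and `cer` and the sample sentences are hypothetical, and any text normalization an actual evaluation would apply is omitted.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent (spaces count as characters here; conventions vary)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one word is misrecognized by a single missing character.
ref = "the train leaves at noon"
hyp = "the train leave at noon"
print(f"WER: {wer(ref, hyp):.2f}%")  # 1 wrong word out of 5      -> WER: 20.00%
print(f"CER: {cer(ref, hyp):.2f}%")  # 1 missing character out of 24 -> CER: 4.17%
```

A single dropped character costs a full word error but only a small character error, which is why the CER column is consistently lower than the WER column.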

	## Datasets
	This ASR model was trained on:
	* TARIC: the Tunisian Arabic Railway Interaction Corpus, a collection of audio recordings and transcriptions of dialogues in the Tunisian Railway Transport Network. [TARIC Corpus](https://aclanthology.org/L14-1385/)
	* IWSLT: a Tunisian conversational speech corpus. [IWSLT Corpus](https://iwslt.org/2022/dialect)
	* TunSwitch: our crowd-collected dataset, described in the paper presented below.



	## Inference
	### Install
	```bash
	pip install speechbrain transformers
	```
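After installing, it can help to confirm that both packages are visible to the Python interpreter you intend to use. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the check only confirms that the packages can be located, not that their versions are compatible with this model.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two packages named in the install command above.
missing = missing_packages(["speechbrain", "transformers"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("speechbrain and transformers are both available.")
```

`importlib.util.find_spec` locates a top-level package without importing it, so the check stays fast even for heavy libraries.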