diallomama/wav2vec2-xls-r-300m-ar

	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets:
	- common_voice
	metrics:
	- wer
	model-index:
	- name: wav2vec2-xls-r-300m-en-ar-fr-es
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: common_voice
	      type: common_voice
	      config: ar
	      split: test
	      args: ar
	    metrics:
	    - name: Wer
	      type: wer
	      value: 0.48692477711277227
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# wav2vec2-xls-r-300m-en-ar-fr-es

	This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Arabic (`ar`) configuration of the common_voice dataset.
	It achieves the following results on the evaluation set (the `test` split):
	- Loss: 0.8565
	- Wer: 0.4869
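
For reference, word error rate (WER) is conventionally the word-level edit distance between the hypothesis and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation code used to produce the numbers above; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 0.4869 corresponds to roughly 49 word-level errors per 100 reference words.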

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 500
	- num_epochs: 20
	- mixed_precision_training: Native AMP
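
Two of the values above follow from the others and can be checked by hand: the total train batch size is the per-device batch size times the gradient accumulation steps (times the device count), and a linear schedule with warmup ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0. The sketch below reproduces both in plain Python. It is illustrative, not the Trainer's own code; `TOTAL_STEPS` is an assumed placeholder (about 679 steps per epoch times 20 epochs, estimated from the step and epoch columns of the training results), and a single training device is assumed because 16 × 2 already equals 32.

```python
LEARNING_RATE = 3e-4       # learning_rate from the card
TRAIN_BATCH_SIZE = 16      # train_batch_size from the card
GRAD_ACCUM_STEPS = 2       # gradient_accumulation_steps from the card
WARMUP_STEPS = 500         # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 13_580       # ASSUMPTION: not stated in the card, estimated


def total_train_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int, total_steps: int = TOTAL_STEPS,
              peak_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 7000, TOTAL_STEPS):
        print(step, linear_lr(step))
```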

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Wer |
	|:-------------:|:-----:|:-----:|:---------------:|:------:|
	| 6.3938 | 0.59 | 400 | 3.3703 | 1.0 |
	| 2.353 | 1.18 | 800 | 0.9696 | 0.7809 |
	| 0.9859 | 1.77 | 1200 | 0.7031 | 0.6515 |
	| 0.7685 | 2.35 | 1600 | 0.6575 | 0.6321 |
	| 0.6892 | 2.94 | 2000 | 0.6030 | 0.5927 |
	| 0.5866 | 3.53 | 2400 | 0.5552 | 0.5541 |
	| 0.5496 | 4.12 | 2800 | 0.5805 | 0.5503 |
	| 0.4897 | 4.71 | 3200 | 0.5526 | 0.5335 |
	| 0.4671 | 5.3 | 3600 | 0.5622 | 0.5507 |
	| 0.4346 | 5.89 | 4000 | 0.5641 | 0.5312 |
	| 0.3859 | 6.48 | 4400 | 0.5685 | 0.5071 |
	| 0.3728 | 7.06 | 4800 | 0.6106 | 0.5157 |
	| 0.3243 | 7.65 | 5200 | 0.6782 | 0.5270 |
	| 0.3073 | 8.24 | 5600 | 0.6121 | 0.5232 |
	| 0.2748 | 8.83 | 6000 | 0.6318 | 0.5209 |
	| 0.25 | 9.42 | 6400 | 0.6334 | 0.4906 |
	| 0.2477 | 10.01 | 6800 | 0.6403 | 0.5169 |
	| 0.2125 | 10.6 | 7200 | 0.6498 | 0.5080 |
	| 0.1997 | 11.18 | 7600 | 0.7029 | 0.5153 |
	| 0.1803 | 11.77 | 8000 | 0.6796 | 0.5193 |
	| 0.1644 | 12.36 | 8400 | 0.7320 | 0.5080 |
	| 0.1609 | 12.95 | 8800 | 0.6705 | 0.5081 |
	| 0.1419 | 13.54 | 9200 | 0.7108 | 0.5120 |
	| 0.1375 | 14.13 | 9600 | 0.7570 | 0.4909 |
	| 0.1265 | 14.72 | 10000 | 0.7681 | 0.5044 |
	| 0.1152 | 15.31 | 10400 | 0.8180 | 0.5011 |
	| 0.1094 | 15.89 | 10800 | 0.7753 | 0.4947 |
	| 0.0998 | 16.48 | 11200 | 0.8077 | 0.4972 |
	| 0.1019 | 17.07 | 11600 | 0.8189 | 0.4921 |
	| 0.0882 | 17.66 | 12000 | 0.8351 | 0.4922 |
	| 0.0855 | 18.25 | 12400 | 0.8688 | 0.4902 |
	| 0.0826 | 18.84 | 12800 | 0.8476 | 0.4916 |
	| 0.0769 | 19.43 | 13200 | 0.8565 | 0.4869 |
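
The table above shows validation loss and WER moving in different directions late in training. The sketch below copies the step, validation loss, and WER columns from the table and picks the best checkpoint under each criterion; the variable and function names are illustrative and not part of the training code.

```python
# (step, validation_loss, wer) rows copied from the training results table.
EVAL_LOG = [
    (400, 3.3703, 1.0), (800, 0.9696, 0.7809), (1200, 0.7031, 0.6515),
    (1600, 0.6575, 0.6321), (2000, 0.6030, 0.5927), (2400, 0.5552, 0.5541),
    (2800, 0.5805, 0.5503), (3200, 0.5526, 0.5335), (3600, 0.5622, 0.5507),
    (4000, 0.5641, 0.5312), (4400, 0.5685, 0.5071), (4800, 0.6106, 0.5157),
    (5200, 0.6782, 0.5270), (5600, 0.6121, 0.5232), (6000, 0.6318, 0.5209),
    (6400, 0.6334, 0.4906), (6800, 0.6403, 0.5169), (7200, 0.6498, 0.5080),
    (7600, 0.7029, 0.5153), (8000, 0.6796, 0.5193), (8400, 0.7320, 0.5080),
    (8800, 0.6705, 0.5081), (9200, 0.7108, 0.5120), (9600, 0.7570, 0.4909),
    (10000, 0.7681, 0.5044), (10400, 0.8180, 0.5011), (10800, 0.7753, 0.4947),
    (11200, 0.8077, 0.4972), (11600, 0.8189, 0.4921), (12000, 0.8351, 0.4922),
    (12400, 0.8688, 0.4902), (12800, 0.8476, 0.4916), (13200, 0.8565, 0.4869),
]


def best_by_validation_loss(log):
    """Row with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def best_by_wer(log):
    """Row with the lowest word error rate."""
    return min(log, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_by_validation_loss(EVAL_LOG))
    print("lowest WER:", best_by_wer(EVAL_LOG))
```

Validation loss is lowest at step 3200 (0.5526) and rises afterwards, while WER reaches its minimum at the final logged step, 13200 (0.4869), which is the checkpoint reported at the top of this card.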


	### Framework versions

	- Transformers 4.28.0
	- PyTorch 2.0.1+cu118
	- Datasets 1.18.3
	- Tokenizers 0.13.3
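
When trying to reproduce the results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library; the package names are the usual PyPI distribution names (an assumption, since the card lists display names only), and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.28.0",
    "torch": "2.0.1+cu118",
    "datasets": "1.18.3",
    "tokenizers": "0.13.3",
}


def compare_versions(expected):
    """Map each package to (version in the card, installed version or None)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed)
    return report


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
        status = "matches" if installed == wanted else "differs"
        print(f"{package}: card={wanted} installed={installed} ({status})")
```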