	---
	license: apache-2.0
	tags:
	- automatic-speech-recognition
	- hts98/original_ver1.2
	- generated_from_trainer
	metrics:
	- wer
	model-index:
	- name: wav2vec2-xls-r-300m-paper
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# wav2vec2-xls-r-300m-paper

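	This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the HTS98/ORIGINAL_VER1.2 - NA dataset.
	It achieves the following results on the evaluation set. These values match the epoch 9 row of the training results table, which has the lowest validation loss of the run; the final epoch reaches a lower Wer (0.3192) at a higher validation loss (1.1744).
	- Loss: 0.7895
	- Wer: 0.4398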
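For readers unfamiliar with the metric, word error rate (Wer) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The function below is a minimal pure-Python sketch of that definition for a single utterance. The name `word_error_rate` is hypothetical, and this card does not state which metric implementation or text normalization was used during evaluation, so treat it as an illustration rather than the evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Hypothetical helper for illustration; not the evaluation code behind this card.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

An empty hypothesis scores 1.0 under this definition, because every reference word counts as a deletion.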

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 10
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 420
	- num_epochs: 50.0
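As a rough illustration of how the `linear` scheduler and the 420 warmup steps interact, the sketch below computes the learning rate at a given optimizer step: a linear ramp from 0 to the peak rate during warmup, then a linear decay to 0 at the end of training. The total of 16750 steps is taken from the last row of the training results table. The function name `linear_lr` is hypothetical, and the formula is the usual linear-with-warmup rule written in plain Python, not code from the training run.

```python
BASE_LR = 5e-05      # learning_rate from the list above
WARMUP_STEPS = 420   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 16750  # last step in the training results table (50 epochs x 335 steps)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: fall linearly from BASE_LR to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 210, 420, 8375, 16750):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate is reached a little after the first epoch (step 420 against 335 steps per epoch).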

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Wer |
	|:-------------:|:-----:|:-----:|:---------------:|:------:|
	| No log | 1.0 | 335 | 3.7157 | 1.0 |
	| 6.2976 | 2.0 | 670 | 3.3644 | 1.0 |
	| 3.2342 | 3.0 | 1005 | 2.4597 | 0.9739 |
	| 3.2342 | 4.0 | 1340 | 1.4160 | 0.7444 |
	| 1.2813 | 5.0 | 1675 | 1.1338 | 0.6543 |
	| 0.7279 | 6.0 | 2010 | 1.0020 | 0.5856 |
	| 0.7279 | 7.0 | 2345 | 0.8435 | 0.4823 |
	| 0.5226 | 8.0 | 2680 | 0.8757 | 0.5078 |
	| 0.4218 | 9.0 | 3015 | 0.7895 | 0.4398 |
	| 0.4218 | 10.0 | 3350 | 0.7992 | 0.4228 |
	| 0.3421 | 11.0 | 3685 | 0.8118 | 0.4307 |
	| 0.287 | 12.0 | 4020 | 0.8215 | 0.4248 |
	| 0.287 | 13.0 | 4355 | 0.8603 | 0.4077 |
	| 0.2415 | 14.0 | 4690 | 0.8329 | 0.3886 |
	| 0.2132 | 15.0 | 5025 | 0.8728 | 0.3955 |
	| 0.2132 | 16.0 | 5360 | 0.8741 | 0.3918 |
	| 0.1857 | 17.0 | 5695 | 0.8633 | 0.3675 |
	| 0.1673 | 18.0 | 6030 | 0.8884 | 0.3804 |
	| 0.1673 | 19.0 | 6365 | 0.9141 | 0.3679 |
	| 0.1479 | 20.0 | 6700 | 0.9568 | 0.3605 |
	| 0.1386 | 21.0 | 7035 | 0.9341 | 0.3630 |
	| 0.1386 | 22.0 | 7370 | 0.9645 | 0.3537 |
	| 0.1233 | 23.0 | 7705 | 0.9729 | 0.3567 |
	| 0.1177 | 24.0 | 8040 | 1.0013 | 0.3454 |
	| 0.1177 | 25.0 | 8375 | 1.0323 | 0.3597 |
	| 0.1061 | 26.0 | 8710 | 1.0269 | 0.3456 |
	| 0.1028 | 27.0 | 9045 | 1.0042 | 0.3424 |
	| 0.1028 | 28.0 | 9380 | 1.0424 | 0.3394 |
	| 0.0961 | 29.0 | 9715 | 1.0600 | 0.3412 |
	| 0.0949 | 30.0 | 10050 | 1.0512 | 0.3389 |
	| 0.0949 | 31.0 | 10385 | 1.0957 | 0.3389 |
	| 0.0878 | 32.0 | 10720 | 1.0924 | 0.3311 |
	| 0.0852 | 33.0 | 11055 | 1.0859 | 0.3366 |
	| 0.0852 | 34.0 | 11390 | 1.1498 | 0.3450 |
	| 0.0837 | 35.0 | 11725 | 1.0844 | 0.3329 |
	| 0.0814 | 36.0 | 12060 | 1.1051 | 0.3321 |
	| 0.0814 | 37.0 | 12395 | 1.0878 | 0.3311 |
	| 0.0793 | 38.0 | 12730 | 1.1377 | 0.3286 |
	| 0.0759 | 39.0 | 13065 | 1.1136 | 0.3246 |
	| 0.0759 | 40.0 | 13400 | 1.1216 | 0.3268 |
	| 0.0726 | 41.0 | 13735 | 1.1300 | 0.3253 |
	| 0.0715 | 42.0 | 14070 | 1.1507 | 0.3262 |
	| 0.0715 | 43.0 | 14405 | 1.1562 | 0.3275 |
	| 0.0711 | 44.0 | 14740 | 1.1486 | 0.3219 |
	| 0.0699 | 45.0 | 15075 | 1.1580 | 0.3194 |
	| 0.0699 | 46.0 | 15410 | 1.1580 | 0.3195 |
	| 0.0667 | 47.0 | 15745 | 1.1504 | 0.3212 |
	| 0.0667 | 48.0 | 16080 | 1.1580 | 0.3203 |
	| 0.0667 | 49.0 | 16415 | 1.1698 | 0.3192 |
	| 0.0664 | 50.0 | 16750 | 1.1744 | 0.3192 |
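The table shows validation loss and Wer pulling apart after the first ten epochs: loss is lowest at epoch 9, while Wer keeps improving until the end of training. The sketch below makes that concrete by picking the best epoch under each criterion from a handful of rows copied from the table. The `rows` subset and variable names are chosen for illustration only, and the card does not say which criterion was used to select the published checkpoint.

```python
# (epoch, validation_loss, wer) for a subset of rows copied from the table above.
rows = [
    (1, 3.7157, 1.0),
    (9, 0.7895, 0.4398),
    (10, 0.7992, 0.4228),
    (20, 0.9568, 0.3605),
    (30, 1.0512, 0.3389),
    (40, 1.1216, 0.3268),
    (49, 1.1698, 0.3192),
    (50, 1.1744, 0.3192),
]

# min() returns the first row on ties, so epoch 49 wins over epoch 50 on Wer.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_wer = min(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("lowest Wer:", best_by_wer)
```

Selecting by validation loss gives the epoch 9 figures, whereas selecting by Wer would favor one of the last epochs.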


	### Framework versions

	- Transformers 4.31.0.dev0
	- PyTorch 2.0.0+cu117
	- Datasets 2.7.0
	- Tokenizers 0.13.2
