yesj1234/xlsr_mid1_zh-ko

	---
	license: apache-2.0
	base_model: facebook/wav2vec2-large-xlsr-53
	tags:
	- automatic-speech-recognition
	- ./sample_speech.py
	- generated_from_trainer
	model-index:
	- name: zh-xlsr
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# zh-xlsr

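	This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on a dataset loaded through the local script `./sample_speech.py` (recorded by the Trainer as `./SAMPLE_SPEECH.PY - NA`; no public dataset name is given).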
	It achieves the following results on the evaluation set:
	- Loss: 1.8449
	- Cer: 0.4954
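
For readers unfamiliar with the metric, `Cer` is the character error rate: the number of character-level edits (substitutions, insertions, deletions) needed to turn the hypothesis into the reference, divided by the number of reference characters. The card does not say which implementation or text normalization produced the number above, so the following is only a minimal sketch of the standard definition; the function names and the example sentences are illustrative assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to match an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# Hypothetical example: one wrong character out of six.
print(cer(["今天天气很好"], ["今天天汽很好"]))
```

Under this definition, a CER of 0.4954 means that roughly one edit is needed for every two reference characters.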

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0003
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 32
	- total_eval_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 150
	- num_epochs: 15
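
The totals in this list follow from the per-device values, and the learning rate follows a linear warmup and then a linear decay to zero, which is what `lr_scheduler_type: linear` denotes in Transformers. The sketch below reproduces both in plain Python. The value `total_steps = 9900` is an assumption, estimated from the roughly 660 optimizer steps per epoch seen in the training log over 15 epochs; the card does not state the exact figure, and the function name is hypothetical.

```python
# Values copied from the hyperparameter list.
per_device_train_batch_size = 4
per_device_eval_batch_size = 4
num_devices = 4
gradient_accumulation_steps = 2
peak_lr = 3e-4
warmup_steps = 150

# Effective batch sizes: evaluation does not use gradient accumulation.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices


def linear_schedule_lr(step: int, total_steps: int,
                       peak: float = peak_lr, warmup: int = warmup_steps) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


total_steps = 9900  # assumption: about 660 steps/epoch * 15 epochs
print(total_train_batch_size, total_eval_batch_size)
for step in (0, 75, 150, 5000, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

The rate peaks at 0.0003 once warmup ends at step 150 and reaches zero at the final step.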

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Cer    |
	|:-------------:|:-----:|:----:|:---------------:|:------:|
	| 6.0153        | 0.5   | 330  | 5.3438          | 0.9522 |
	| 5.3776        | 1.0   | 660  | 5.1534          | 0.9409 |
	| 5.2604        | 1.5   | 990  | 5.0832          | 0.9108 |
	| 5.2393        | 2.01  | 1320 | 5.0655          | 0.9073 |
	| 5.1721        | 2.51  | 1650 | 5.0464          | 0.9000 |
	| 5.1619        | 3.01  | 1980 | 5.0244          | 0.9045 |
	| 5.1308        | 3.51  | 2310 | 5.0216          | 0.9020 |
	| 5.0971        | 4.01  | 2640 | 4.9341          | 0.9040 |
	| 5.0137        | 4.51  | 2970 | 4.8795          | 0.9144 |
	| 4.9341        | 5.02  | 3300 | 4.7250          | 0.9039 |
	| 4.6832        | 5.52  | 3630 | 4.2140          | 0.8367 |
	| 4.1627        | 6.02  | 3960 | 3.4010          | 0.7318 |
	| 3.5448        | 6.52  | 4290 | 2.8830          | 0.6480 |
	| 3.2576        | 7.02  | 4620 | 2.6253          | 0.6266 |
	| 2.8561        | 7.52  | 4950 | 2.4300          | 0.5866 |
	| 2.7894        | 8.02  | 5280 | 2.2998          | 0.5750 |
	| 2.6018        | 8.53  | 5610 | 2.1878          | 0.5549 |
	| 2.546         | 9.03  | 5940 | 2.1450          | 0.5351 |
	| 2.3787        | 9.53  | 6270 | 2.1027          | 0.5340 |
	| 2.335         | 10.03 | 6600 | 2.0304          | 0.5166 |
	| 2.2138        | 10.53 | 6930 | 2.0100          | 0.5165 |
	| 2.2381        | 11.03 | 7260 | 1.9651          | 0.5031 |
	| 2.1108        | 11.53 | 7590 | 1.9666          | 0.5035 |
	| 2.0916        | 12.04 | 7920 | 1.9136          | 0.4998 |
	| 2.0229        | 12.54 | 8250 | 1.8988          | 0.5028 |
	| 2.0056        | 13.04 | 8580 | 1.8769          | 0.4996 |
	| 1.9245        | 13.54 | 8910 | 1.8716          | 0.4955 |
	| 1.9378        | 14.04 | 9240 | 1.8561          | 0.4946 |
	| 1.9003        | 14.54 | 9570 | 1.8485          | 0.4936 |
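
To compare checkpoints programmatically, the log can be parsed into rows and searched for the lowest CER. The following is a small sketch, not part of the original training code: the `parse_table` helper is hypothetical, and only the last three rows of the table are embedded to keep the example short.

```python
# Last three rows of the training log, copied from the table above.
TABLE = """
| Training Loss | Epoch | Step | Validation Loss | Cer |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 1.9245 | 13.54 | 8910 | 1.8716 | 0.4955 |
| 1.9378 | 14.04 | 9240 | 1.8561 | 0.4946 |
| 1.9003 | 14.54 | 9570 | 1.8485 | 0.4936 |
"""


def parse_table(text: str) -> list[dict[str, float]]:
    """Parse a pipe-delimited Markdown table into a list of {column: value} rows."""
    # Tolerate escaped pipes ("\|"), which appear in some scraped copies of the card.
    lines = [ln.strip().replace("\\|", "|")
             for ln in text.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the alignment row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append({name: float(value) for name, value in zip(header, cells)})
    return rows


rows = parse_table(TABLE)
best = min(rows, key=lambda row: row["Cer"])
print(f"lowest logged CER {best['Cer']} at step {int(best['Step'])}")
```

The headline figures at the top of the card (loss 1.8449, CER 0.4954) do not match any logged row; presumably they come from a final evaluation run after the last logged step, but the card does not confirm this.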


	### Framework versions

	- Transformers 4.34.0
	- PyTorch 2.1.0+cu121
	- Datasets 2.14.5
	- Tokenizers 0.14.1
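
When reproducing results, it can help to check the local environment against the versions listed above. The sketch below does this with the standard library only; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are an assumption here, as is the `compare_versions` helper. A CPU or different-CUDA build of PyTorch will report a different local suffix than `+cu121`.

```python
from importlib import metadata

# Versions copied from the list above; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.34.0",
    "torch": "2.1.0+cu121",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def compare_versions(expected: dict[str, str]) -> dict[str, tuple]:
    """Return {package: (wanted, installed_or_None, exact_match)} for each entry."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in compare_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```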