	---
	tags:
	- generated_from_trainer
	base_model: Lakoc/DeCRED_small_cv_2
	datasets:
	- common_voice_13_0
	metrics:
	- wer
	model-index:
	- name: DeCRED_linear_mixing_tuning
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# DeCRED_linear_mixing_tuning

	This model is a fine-tuned version of [Lakoc/DeCRED_small_cv_2](https://huggingface.co/Lakoc/DeCRED_small_cv_2) on the common_voice_13_0 dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.0601
	- Cer: 0.0632
	- Wer: 0.1472
	- Mer: 0.1445
	- Wil: 0.2408
	- Wip: 0.7592
	- Hits: 23157
	- Substitutions: 2930
	- Deletions: 486
	- Insertions: 495
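
The word-level metrics above can be reproduced from the four alignment counts. The following is a minimal sketch, assuming the usual definitions of WER, MER, WIP and WIL (as implemented in libraries such as `jiwer`; the model card does not state which tool produced the numbers). The helper name `word_metrics` is hypothetical.

```python
def word_metrics(hits, substitutions, deletions, insertions):
    """Derive WER, MER, WIP and WIL from word-alignment counts."""
    errors = substitutions + deletions + insertions
    ref_len = hits + substitutions + deletions   # words in the reference
    hyp_len = hits + substitutions + insertions  # words in the hypothesis
    wer = errors / ref_len                       # can exceed 1 when insertions dominate
    mer = errors / (hits + errors)               # bounded to [0, 1]
    wip = (hits / ref_len) * (hits / hyp_len)    # word information preserved
    return {"wer": wer, "mer": mer, "wip": wip, "wil": 1.0 - wip}


# Counts reported for the evaluation set in this model card.
final = word_metrics(hits=23157, substitutions=2930, deletions=486, insertions=495)
print({name: round(value, 4) for name, value in final.items()})
```

With the reported counts, the rounded values agree with the Wer, Mer, Wil and Wip figures listed above.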

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.01
	- train_batch_size: 256
	- eval_batch_size: 64
	- seed: 42
	- distributed_type: multi-GPU
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 1024
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 50.0
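
The relationship between these settings can be sketched in a few lines. This is an illustration under stated assumptions, not the training code: the device count and any warmup are not listed in the card, so `num_devices=1` and `warmup_steps=0` are assumptions, and `total_steps=375` is inferred from the results table (20 steps per roughly 2.67 epochs, over 50 epochs). The function names are hypothetical.

```python
def effective_batch_size(train_batch_size, grad_accum_steps, num_devices=1):
    """Samples contributing to one optimizer update."""
    return train_batch_size * grad_accum_steps * num_devices


def linear_lr(step, total_steps, base_lr=0.01, warmup_steps=0):
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 256 * 4 matches the reported total_train_batch_size of 1024
# (assumes num_devices=1, which the card does not confirm).
print(effective_batch_size(256, 4))

# Assumed total of 375 optimizer steps; the rate halves at the midpoint.
for step in (0, 150, 300, 375):
    print(step, linear_lr(step, total_steps=375))
```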

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Cer | Wer | Mer | Wil | Wip | Hits | Substitutions | Deletions | Insertions |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:------:|:------:|:-----:|:-------------:|:---------:|:----------:|
	| 3.4981 | 2.67 | 20 | 3.3391 | 3.8755 | 3.5546 | 0.9635 | 0.9950 | 0.0050 | 3582 | 22099 | 892 | 71466 |
	| 1.2736 | 5.33 | 40 | 1.2175 | 0.0756 | 0.1717 | 0.1678 | 0.2775 | 0.7225 | 22623 | 3423 | 527 | 612 |
	| 1.1073 | 8.0 | 60 | 1.0687 | 0.0647 | 0.1511 | 0.1483 | 0.2464 | 0.7536 | 23059 | 2993 | 521 | 501 |
	| 1.0963 | 10.67 | 80 | 1.0656 | 0.0638 | 0.1492 | 0.1464 | 0.2436 | 0.7564 | 23122 | 2963 | 488 | 514 |
	| 1.0811 | 13.33 | 100 | 1.0630 | 0.0636 | 0.1478 | 0.1451 | 0.2416 | 0.7584 | 23152 | 2937 | 484 | 507 |
	| 1.1036 | 16.0 | 120 | 1.0617 | 0.0634 | 0.1476 | 0.1448 | 0.2410 | 0.7590 | 23160 | 2925 | 488 | 509 |
	| 1.0831 | 18.67 | 140 | 1.0610 | 0.0632 | 0.1474 | 0.1447 | 0.2410 | 0.7590 | 23157 | 2931 | 485 | 501 |
	| 1.0914 | 21.33 | 160 | 1.0607 | 0.0634 | 0.1478 | 0.1451 | 0.2418 | 0.7582 | 23142 | 2941 | 490 | 497 |
	| 1.1033 | 24.0 | 180 | 1.0605 | 0.0631 | 0.1470 | 0.1443 | 0.2405 | 0.7595 | 23162 | 2925 | 486 | 496 |
	| 1.0849 | 26.67 | 200 | 1.0603 | 0.0632 | 0.1472 | 0.1445 | 0.2407 | 0.7593 | 23159 | 2926 | 488 | 498 |
	| 1.0937 | 29.33 | 220 | 1.0603 | 0.0632 | 0.1473 | 0.1445 | 0.2407 | 0.7593 | 23160 | 2925 | 488 | 500 |
	| 1.1295 | 32.0 | 240 | 1.0601 | 0.0632 | 0.1471 | 0.1444 | 0.2406 | 0.7594 | 23162 | 2926 | 485 | 499 |
	| 1.0741 | 34.67 | 260 | 1.0602 | 0.0631 | 0.1471 | 0.1444 | 0.2405 | 0.7595 | 23161 | 2924 | 488 | 496 |
	| 1.073 | 37.33 | 280 | 1.0601 | 0.0631 | 0.1471 | 0.1444 | 0.2407 | 0.7593 | 23159 | 2927 | 487 | 496 |
	| 1.0846 | 40.0 | 300 | 1.0601 | 0.0631 | 0.1471 | 0.1445 | 0.2408 | 0.7592 | 23158 | 2929 | 486 | 495 |
	| 1.0717 | 42.67 | 320 | 1.0601 | 0.0632 | 0.1472 | 0.1445 | 0.2408 | 0.7592 | 23158 | 2929 | 486 | 497 |
	| 1.1017 | 45.33 | 340 | 1.0601 | 0.0632 | 0.1472 | 0.1445 | 0.2408 | 0.7592 | 23157 | 2930 | 486 | 495 |
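
The Wer of 3.5546 at step 20 is not a typo: WER divides all errors by the reference length, so a flood of insertions pushes it above 1. A small consistency check, assuming the standard WER definition and using three rows copied from the table, might look like this:

```python
# (step, hits, substitutions, deletions, insertions) copied from the table above
rows = [
    (20, 3582, 22099, 892, 71466),
    (40, 22623, 3423, 527, 612),
    (340, 23157, 2930, 486, 495),
]

# Hits + substitutions + deletions is the reference word count,
# which should be identical for every evaluation pass.
ref_lengths = {h + s + d for _, h, s, d, _ in rows}
print(ref_lengths)

# WER at step 20: errors divided by reference length.
step, h, s, d, i = rows[0]
wer_step_20 = (s + d + i) / (h + s + d)
print(step, round(wer_step_20, 4))
```

The reference length comes out the same for each row, and the step-20 value rounds to the Wer shown in the table, which suggests the early checkpoint was emitting far more words than the references contain.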


	### Framework versions

	- Transformers 4.40.0.dev0
	- PyTorch 2.2.0+rocm5.6
	- Datasets 2.18.0
	- Tokenizers 0.15.2

	### Wandb run
	https://wandb.ai/butspeechfit/decred_commonvoice_en/runs/DeCRED_linear_mixing_tuning