|
--- |
|
language: |
|
- en |
|
license: mit |
|
base_model: microsoft/speecht5_tts |
|
tags: |
|
- en_accent

- mozilla

- t5

- common_voice_1_0
|
- generated_from_trainer |
|
datasets: |
|
- mozilla-foundation/common_voice_1_0 |
|
model-index: |
|
- name: SpeechT5 TTS English Accented |
|
results: [] |
|
--- |
|
|
|
|
|
|
# SpeechT5 TTS English Accented |
|
|
|
This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the English subset of the Mozilla Common Voice 1.0 dataset ([mozilla-foundation/common_voice_1_0](https://huggingface.co/datasets/mozilla-foundation/common_voice_1_0)).
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.5854 |
|
|
|
## Model description |
|
|
|
A SpeechT5-based text-to-speech model fine-tuned for English speech covering a variety of speaker accents, per the model name and tags. SpeechT5 TTS generates a log-mel spectrogram from input text, conditioned on a 512-dimensional speaker x-vector embedding; a separate HiFi-GAN vocoder (e.g. [microsoft/speecht5_hifigan](https://huggingface.co/microsoft/speecht5_hifigan)) converts the spectrogram into a 16 kHz waveform.
|
|
|
## Intended uses & limitations |
|
|
|
Intended for English text-to-speech experimentation and research on accent coverage. The base model's limitations carry over: output quality depends heavily on the speaker embedding supplied at inference, and numbers and abbreviations are best written out in the input text. Because Common Voice consists of crowd-sourced recordings of varying quality, synthesized audio may inherit background noise and channel artifacts. The model has not been evaluated for production use.
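
A minimal inference sketch, assuming the checkpoint is published on the Hub (the repo id below is a placeholder) and following the standard SpeechT5 pipeline with a demo x-vector from the CMU ARCTIC set:

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Placeholder repo id -- substitute the actual Hub id of this checkpoint.
model_id = "your-username/speecht5_tts_english_accented"

processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 conditions generation on a 512-dim speaker x-vector;
# this demo embedding comes from the CMU ARCTIC x-vectors dataset.
xvectors = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embedding = torch.tensor(xvectors[7306]["xvector"]).unsqueeze(0)

inputs = processor(text="Hello, this is a test sentence.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)

sf.write("output.wav", speech.numpy(), samplerate=16000)  # model outputs 16 kHz audio
```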
|
|
|
## Training and evaluation data |
|
|
|
The model was trained on the English configuration of [mozilla-foundation/common_voice_1_0](https://huggingface.co/datasets/mozilla-foundation/common_voice_1_0), a crowd-sourced corpus of read speech spanning many speaker accents. The exact split, filtering, and preprocessing used for this run are not recorded in this card.
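
A typical loading and resampling sketch, assuming the usual SpeechT5 fine-tuning recipe (the original preprocessing script is not included here):

```python
from datasets import Audio, load_dataset

# Common Voice is gated on the Hub: accept the terms on the dataset page
# and authenticate with `huggingface-cli login` before downloading.
dataset = load_dataset("mozilla-foundation/common_voice_1_0", "en", split="train")

# Common Voice ships 48 kHz MP3s; SpeechT5 expects 16 kHz input audio.
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))
```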
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` reconstruction follows the list):
|
- learning_rate: 0.0001 |
|
- train_batch_size: 4 |
|
- eval_batch_size: 4 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- lr_scheduler_warmup_steps: 500 |
|
- training_steps: 10000 |
|
- mixed_precision_training: Native AMP |
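
As a sketch only: the settings above map onto standard `transformers` training arguments roughly as follows. The output directory and the eval/logging cadence are assumptions (the 250-step eval cadence is read off the results table); Adam's betas and epsilon match the library defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts_english_accented",  # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=10_000,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=250,               # matches the results table below
    logging_steps=500,            # consistent with "No log" before step 500
)
```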
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 1.41  | 250   | 0.5448          |
| 0.6715        | 2.82  | 500   | 0.5147          |
| 0.6715        | 4.24  | 750   | 0.5225          |
| 0.5532        | 5.65  | 1000  | 0.5096          |
| 0.5532        | 7.06  | 1250  | 0.5293          |
| 0.5156        | 8.47  | 1500  | 0.5310          |
| 0.5156        | 9.89  | 1750  | 0.5417          |
| 0.4874        | 11.3  | 2000  | 0.5185          |
| 0.4874        | 12.71 | 2250  | 0.5112          |
| 0.4693        | 14.12 | 2500  | 0.5154          |
| 0.4693        | 15.54 | 2750  | 0.5148          |
| 0.4619        | 16.95 | 3000  | 0.5367          |
| 0.4619        | 18.36 | 3250  | 0.5207          |
| 0.447         | 19.77 | 3500  | 0.5318          |
| 0.447         | 21.19 | 3750  | 0.5286          |
| 0.4348        | 22.6  | 4000  | 0.5345          |
| 0.4348        | 24.01 | 4250  | 0.5362          |
| 0.4237        | 25.42 | 4500  | 0.5568          |
| 0.4237        | 26.84 | 4750  | 0.5352          |
| 0.4195        | 28.25 | 5000  | 0.5395          |
| 0.4195        | 29.66 | 5250  | 0.5487          |
| 0.4132        | 31.07 | 5500  | 0.5443          |
| 0.4132        | 32.49 | 5750  | 0.5491          |
| 0.3975        | 33.9  | 6000  | 0.5465          |
| 0.3975        | 35.31 | 6250  | 0.5505          |
| 0.396         | 36.72 | 6500  | 0.5450          |
| 0.396         | 38.14 | 6750  | 0.5510          |
| 0.3884        | 39.55 | 7000  | 0.5517          |
| 0.3884        | 40.96 | 7250  | 0.5685          |
| 0.383         | 42.37 | 7500  | 0.5622          |
| 0.383         | 43.79 | 7750  | 0.5659          |
| 0.3806        | 45.2  | 8000  | 0.5636          |
| 0.3806        | 46.61 | 8250  | 0.5681          |
| 0.3738        | 48.02 | 8500  | 0.5797          |
| 0.3738        | 49.44 | 8750  | 0.5741          |
| 0.3705        | 50.85 | 9000  | 0.5765          |
| 0.3705        | 52.26 | 9250  | 0.5770          |
| 0.364         | 53.67 | 9500  | 0.5854          |
| 0.364         | 55.08 | 9750  | 0.5806          |
| 0.36          | 56.5  | 10000 | 0.5854          |

Validation loss reaches its minimum of 0.5096 around step 1000 and drifts upward while training loss keeps falling, which suggests the model overfits well before the final step; an earlier checkpoint may be preferable for synthesis quality.
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.36.0.dev0 |
|
- Pytorch 2.1.0+cu121 |
|
- Datasets 2.15.0 |
|
- Tokenizers 0.14.1 |
|
|