	---
	language:
	- ga
	- en
	license: apache-2.0
	base_model: openai/whisper-small
	tags:
	- generated_from_trainer
	datasets:
	- ymoslem/IWSLT2023-GA-EN
	- ymoslem/FLEURS-GA-EN
	- ymoslem/BitesizeIrish-GA-EN
	- ymoslem/SpokenWords-GA-EN-MTed
	metrics:
	- bleu
	- wer
	- chrf
	model-index:
	- name: Whisper Small GA-EN Speech Translation
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: IWSLT-2023, FLEURS, BiteSize, and SpokenWords
	      type: ymoslem/IWSLT2023-GA-EN
	    metrics:
	    - name: Bleu
	      type: bleu
	      value: 27.66
	    - name: Wer
	      type: wer
	      value: 72.0396217919856
	library_name: transformers
	---


	# Whisper Small GA-EN Speech Translation

	This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets.
	This version is the best checkpoint by ChrF, taken at step 2100 (epoch 4.5259). It achieves the following results on the evaluation set:
	- Loss: 1.7200
	- Bleu: 29.83
	- Chrf: 44.87
	- Wer: 64.8807
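
WER is reported here as a percentage, and values above 100 are possible (as at step 100 in the training results) because inserted words count as errors. The card does not state which library computed the metrics. The following is a minimal pure-Python sketch of the standard word-level definition; the function name `wer_percent` and the example sentences are illustrative assumptions, not taken from the evaluation data.

```python
def wer_percent(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level edit distance over reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(wer_percent("the weather is nice", "the weather is good"))
# A hypothesis much longer than the reference pushes WER above 100.
print(wer_percent("hello", "hello there my friend"))
```

Because the denominator is the reference length only, a model that over-generates early in training can score worse than 100 even when some words are correct.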


	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	- Training: IWSLT-2023 (train+dev), FLEURS, BiteSize, and SpokenWords
	- Evaluation: IWSLT-2023 (test)

	## Training procedure

	### Hardware

	1 NVIDIA A100-SXM4-80GB

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 0
	- training_steps: 3000
	- mixed_precision_training: Native AMP
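
With `lr_scheduler_type: linear`, no warmup, and 3000 training steps, the learning rate should decay in a straight line from 0.0001 at step 0 to 0 at step 3000. A minimal sketch of that schedule follows, assuming the usual linear-decay-to-zero behaviour. The function `linear_lr` is a hypothetical helper written for illustration, not part of the training code.

```python
BASE_LR = 1e-4         # learning_rate from the list above
TRAINING_STEPS = 3000  # training_steps from the list above
WARMUP_STEPS = 0       # lr_scheduler_warmup_steps from the list above


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return BASE_LR * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


# Print the learning rate at a few points in training.
for step in (0, 1000, 2100, 3000):
    print(step, linear_lr(step))
```

Under this assumption, the step-2100 checkpoint was saved when the learning rate had dropped to roughly 30% of its initial value.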

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Wer |
	|:-------------:|:------:|:----:|:---------------:|:-----:|:-----:|:--------:|
	| 1.9416 | 0.2155 | 100 | 1.7899 | 13.09 | 26.48 | 104.4575 |
	| 1.5186 | 0.4310 | 200 | 1.5696 | 18.6 | 35.75 | 87.5732 |
	| 1.2884 | 0.6466 | 300 | 1.4751 | 17.57 | 37.19 | 87.2580 |
	| 1.0729 | 0.8621 | 400 | 1.4345 | 17.92 | 38.23 | 99.2346 |
	| 0.4574 | 1.0776 | 500 | 1.5585 | 22.48 | 39.17 | 83.1607 |
	| 0.4517 | 1.2931 | 600 | 1.5763 | 22.53 | 38.38 | 81.7650 |
	| 0.4385 | 1.5086 | 700 | 1.5852 | 20.05 | 39.46 | 96.8483 |
	| 0.3934 | 1.7241 | 800 | 1.5332 | 26.89 | 42.67 | 70.6889 |
	| 0.3587 | 1.9397 | 900 | 1.5025 | 28.95 | 44.16 | 64.9707 |
	| 0.1528 | 2.1552 | 1000 | 1.5882 | 28.32 | 42.36 | 65.8712 |
	| 0.1425 | 2.3707 | 1100 | 1.6056 | 25.5 | 42.42 | 75.0113 |
	| 0.1389 | 2.5862 | 1200 | 1.6236 | 26.52 | 42.11 | 70.6439 |
	| 0.1532 | 2.8017 | 1300 | 1.6196 | 25.78 | 41.61 | 75.9118 |
	| 0.1138 | 3.0172 | 1400 | 1.7185 | 26.01 | 40.88 | 69.6983 |
	| 0.0661 | 3.2328 | 1500 | 1.6626 | 28.74 | 43.16 | 71.2292 |
	| 0.0625 | 3.4483 | 1600 | 1.6835 | 29.16 | 43.6 | 66.3215 |
	| 0.0615 | 3.6638 | 1700 | 1.6756 | 28.93 | 44.08 | 68.3476 |
	| 0.0611 | 3.8793 | 1800 | 1.6648 | 27.77 | 43.67 | 72.1747 |
	| 0.0344 | 4.0948 | 1900 | 1.7351 | 28.33 | 44.18 | 68.1225 |
	| 0.0339 | 4.3103 | 2000 | 1.7715 | 28.9 | 42.98 | 67.0869 |
	| 0.0369 | 4.5259 | 2100 | 1.7200 | 29.83 | 44.87 | 64.8807 |
	| 0.0326 | 4.7414 | 2200 | 1.7232 | 28.23 | 43.75 | 69.3832 |
	| 0.0346 | 4.9569 | 2300 | 1.7688 | 27.72 | 43.1 | 72.8050 |
	| 0.0167 | 5.1724 | 2400 | 1.8072 | 28.73 | 43.26 | 67.4471 |
	| 0.0146 | 5.3879 | 2500 | 1.7801 | 29.91 | 44.24 | 66.4566 |
	| 0.0165 | 5.6034 | 2600 | 1.7782 | 29.34 | 44.33 | 68.2125 |
	| 0.0143 | 5.8190 | 2700 | 1.7675 | 27.78 | 43.07 | 72.5799 |
	| 0.0106 | 6.0345 | 2800 | 1.7660 | 29.45 | 43.31 | 67.5371 |
	| 0.0098 | 6.25 | 2900 | 1.7803 | 27.89 | 42.67 | 71.6344 |
	| 0.0087 | 6.4655 | 3000 | 1.7786 | 27.66 | 43.04 | 72.0396 |
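
The checkpoint choice can be checked against the table. The sketch below copies a handful of rows (not the full table) and picks the best step per metric. The tuple layout and variable names are illustrative assumptions.

```python
# (step, validation_loss, bleu, chrf, wer), copied from a subset of the rows above
rows = [
    (400, 1.4345, 17.92, 38.23, 99.2346),
    (900, 1.5025, 28.95, 44.16, 64.9707),
    (1600, 1.6835, 29.16, 43.60, 66.3215),
    (2100, 1.7200, 29.83, 44.87, 64.8807),
    (2500, 1.7801, 29.91, 44.24, 66.4566),
    (3000, 1.7786, 27.66, 43.04, 72.0396),
]

best_by_chrf = max(rows, key=lambda r: r[3])[0]  # higher ChrF is better
best_by_bleu = max(rows, key=lambda r: r[2])[0]  # higher BLEU is better
best_by_wer = min(rows, key=lambda r: r[4])[0]   # lower WER is better
best_by_loss = min(rows, key=lambda r: r[1])[0]  # lower validation loss is better

print(best_by_chrf, best_by_bleu, best_by_wer, best_by_loss)
```

Among these rows, ChrF and WER both favour step 2100, BLEU marginally favours step 2500, and validation loss is lowest at step 400, so the selected checkpoint depends on which metric is used.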


	### Framework versions

	- Transformers 4.40.2
	- PyTorch 2.2.0+cu121
	- Datasets 2.19.1
	- Tokenizers 0.19.1