---
language:
- en
metrics:
- wer
- bleu
- google_bleu
tags:
- ASR
- Error Correction
- Crossmodal
---
### Model Description
Pre-Training Settings:
166k samples from Common Voice 13.0 were transcribed by Whisper tiny.en.
1,000 random samples were selected as the test set, and the rest were used for training and validation with an 80%-20% split.
- Batch size: 256
- Initial learning rate: 1e-5
- Adam optimizer
- 30 epochs
- Cross-entropy loss
- Best checkpoint saved based on WER as the evaluation metric
- Decoding is performed using beam search with a size of 5
- S2S backbone model adopted from "[Exploring data augmentation for code generation tasks](https://aclanthology.org/2023.findings-eacl.114/)"
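Since the best checkpoint is selected by WER, a minimal word-level WER sketch may help clarify the evaluation metric (this is an illustrative implementation, not the card's actual evaluation code; libraries such as `jiwer` are typically used in practice):

```python
# Minimal word-level WER sketch: WER = (S + D + I) / N,
# where N is the number of reference words.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # all deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # all insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost, # substitution
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the quick brown fox", "the quick brown fix"))  # 0.25
```

One substitution over a four-word reference gives a WER of 0.25.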
Continued-Training Settings:
- 2 epochs of gold-gold training on "[Ted talk data](https://cris.fbk.eu/bitstream/11582/104409/1/WIT3-EAMT2012.pdf)" to mitigate the over-correction problem