---
language:
- en
metrics:
- wer
- bleu
- google_bleu
tags:
- ASR
- Error Correction
- Crossmodal
---

### Model Description

Pre-Training Settings:

166k samples from Common Voice 13.0 were transcribed by Whisper tiny.en.

1,000 random samples were selected as the test set, and the rest were used for training and validation with an 80%/20% split.
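As a minimal sketch of the split described above (sample IDs below are placeholders standing in for the actual hypothesis/reference pairs, and the seed is an assumption):

```python
import random

def split_dataset(samples, test_size=1000, val_fraction=0.2, seed=42):
    """Hold out `test_size` random samples, then split the rest 80%/20%."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    test = shuffled[:test_size]
    rest = shuffled[test_size:]
    n_val = int(len(rest) * val_fraction)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

# 166k placeholder samples -> 132,000 train / 33,000 val / 1,000 test
samples = list(range(166_000))
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))  # 132000 33000 1000
```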

- Batch size: 256

- Initial learning rate: 1e-5

- Adam optimizer

- 30 epochs

- Cross-entropy loss

- Best checkpoint saved based on WER as the evaluation metric

- Decoding is performed using beam search with a beam size of 5

- S2S backbone model adopted from "[Exploring data augmentation for code generation tasks](https://aclanthology.org/2023.findings-eacl.114/)"
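The beam-search decoding step above can be sketched in pure Python. The scoring function `toy_scores` below is a toy stand-in for the model's next-token distribution, not the actual S2S backbone:

```python
import math

def beam_search(next_token_logprobs, eos, max_len, beam_size=5):
    """Keep the `beam_size` best hypotheses by cumulative log-probability."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:
                candidates.append((seq, score))  # finished beam carries over
                continue
            for tok, lp in next_token_logprobs(seq).items():
                candidates.append((seq + (tok,), score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]

# Toy distribution: strongly prefers "a" first, then end-of-sequence.
def toy_scores(seq):
    if len(seq) == 0:
        return {"a": math.log(0.9), "b": math.log(0.1)}
    return {"<eos>": math.log(0.8), "a": math.log(0.2)}

print(beam_search(toy_scores, "<eos>", max_len=3))  # ('a', '<eos>')
```

With beam size 5, all candidate continuations survive in this toy vocabulary; with a real vocabulary the pruning to the top 5 hypotheses is what bounds the search.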

Continue-Training Settings:

- 2 epochs of gold-gold training on "[TED talk data](https://cris.fbk.eu/bitstream/11582/104409/1/WIT3-EAMT2012.pdf)" to prevent the over-correction problem