jangmin committed on
Commit
040d602
1 Parent(s): d29cbb4

Create README.md

Files changed (1)
  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ metrics:
+ - wer
+ model-index:
+ - name: whisper-medium-ko-normalized-1273h
+   results: []
+ ---
+
+ # whisper-medium-ko-normalized-1273h
+
+ This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium), trained on a custom dataset to improve Korean speech recognition.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1254
+ - Wer: 0.0551
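
For readers unfamiliar with the metric: WER (word error rate) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words, so 0.0551 corresponds to roughly 5.5 errors per 100 reference words. The sketch below is a minimal standard-library illustration of that definition. It is not the evaluation code used for this model; the function name `word_error_rate` and the sample sentences are made up for the example, and splitting on whitespace is an assumption, since the card does not say how text was normalized or tokenized before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four reference words.
example_wer = word_error_rate("the cat sat down", "the cat sat up")
```

Because the denominator is the reference length, many inserted words can push WER above 1.0.
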
+
+ ## Model description
+
+ The model was trained to transcribe Korean audio into text.
+
+ ## Intended uses & limitations
+
+ This model was trained to improve on the original Whisper model's performance on Korean transcription.
+
+ ## Training and evaluation data
+
+ I downloaded all data from AI-HUB (https://aihub.or.kr/). Two datasets in particular caught my attention: "Instruction Audio Set" and "Noisy Conversation Audio Set".
+ The table below lists the hours of audio in each dataset split.
+
+ | Dataset name | Train split (hours) | Validation split (hours) |
+ |---|---|---|
+ | Instruction Audio Set | 910 | 105 |
+ | Noisy Conversation Audio Set | 363 | 76 |
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 1e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 3
+ - mixed_precision_training: Native AMP
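
To make the scheduler settings concrete, the sketch below computes the learning rate that a linear schedule with warmup would give at a given optimizer step: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final step. It is a standard-library illustration under stated assumptions, not the training script. The helper name `linear_warmup_lr` is hypothetical, and the total of 26,325 steps is taken from the final row of the training results table.

```python
PEAK_LR = 1e-5        # learning_rate from the list above
WARMUP_STEPS = 100    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 26325   # assumption: last step in the results table (3 epochs x 8775 steps)


def linear_warmup_lr(step: int,
                     peak_lr: float = PEAK_LR,
                     warmup_steps: int = WARMUP_STEPS,
                     total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup steps.
        scale = step / max(1, warmup_steps)
    else:
        # Decay from peak_lr at the end of warmup to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * scale


lr_mid_warmup = linear_warmup_lr(50)   # halfway through warmup
lr_peak = linear_warmup_lr(100)        # end of warmup, peak rate
lr_end = linear_warmup_lr(26325)       # final step, decayed to zero
```

With only 100 warmup steps out of 26,325, the model trains at or near the peak rate almost from the start, and the decay phase dominates the schedule.
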
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Wer |
+ |:-------------:|:-----:|:-----:|:---------------:|:------:|
+ | 0.0588 | 1.0 | 8775 | 0.1225 | 0.0604 |
+ | 0.0287 | 2.0 | 17550 | 0.1186 | 0.0567 |
+ | 0.0148 | 3.0 | 26325 | 0.1254 | 0.0551 |
+
+
+ ### Framework versions
+
+ - Transformers 4.28.0.dev0
+ - PyTorch 1.13.1+cu117
+ - Datasets 2.11.0
+ - Tokenizers 0.13.2