---
license: mit
language:
- ko
metrics:
- cer
pipeline_tag: automatic-speech-recognition
tags:
- ksponspeech
model-index:
- name: cwwojin/stt_kr_conformer_ctc_medium
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      type: Murple/ksponspeech
      name: KsponSpeech-eval (Korean)
      split: test
    metrics:
      - type: cer
        value: 11.902
        name: Test CER(%)
---

# stt_kr_conformer_ctc_medium
- Fine-tuned from NVIDIA NeMo's [stt_en_conformer_ctc_medium](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_ctc_medium) checkpoint
- Trained on KsponSpeech, provided by AI Hub (https://aihub.or.kr/)
## Preprocessing
- Audio: files converted from `.pcm` to `.wav`
- Text: Korean phonetic transcription
- Tokenizer: SentencePiece (byte-pair encoding), vocabulary size 5,000
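
The `.pcm` to `.wav` step can be sketched with Python's standard `wave` module. This is a minimal sketch, not the exact script used for this model: it assumes the source files are headerless 16 kHz, 16-bit, mono, little-endian PCM, and the function name `pcm_to_wav` and its default parameters are illustrative.

```python
import wave
from pathlib import Path


def pcm_to_wav(pcm_path, wav_path, sample_rate=16000, channels=1, sample_width=2):
    """Wrap headerless PCM samples in a WAV container.

    Assumption: the input is raw little-endian PCM with the given
    sample rate, channel count, and sample width (in bytes).
    """
    pcm_bytes = Path(pcm_path).read_bytes()
    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        # The samples are copied unchanged; only the WAV header is added.
        wav_file.writeframes(pcm_bytes)
```

Calling `pcm_to_wav("utterance.pcm", "utterance.wav")` on a hypothetical input file writes a WAV file with the same samples. If the source audio uses a different sample rate or bit depth, the defaults need to be changed accordingly.
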
## Evaluation
- Evaluation sets: "KsponSpeech_eval_clean" and "KsponSpeech_eval_other"
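
The reported metric is character error rate (CER). As a rough illustration of how CER is commonly computed, the sketch below divides the total character-level edit distance by the total number of reference characters. The helper names are hypothetical, and the text normalization (for example, whether spaces are counted) is an assumption; this card does not state how the reported value was computed.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return 100.0 * total_edits / total_chars
```

With this definition, a single substituted character in a four-character reference gives a CER of 25%.
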