---
license: mit
language:
- ko
metrics:
- cer
pipeline_tag: automatic-speech-recognition
tags:
- ksponspeech
model-index:
- name: cwwojin/stt_kr_conformer_ctc_medium
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      type: Murple/ksponspeech
      name: KsponSpeech (Korean)
      split: test
    metrics:
    - type: cer
      value: 11.902
      name: Test CER (%)
---
# stt_kr_conformer_ctc_medium
- Fine-tuned from "stt_en_conformer_ctc_medium": https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_ctc_medium
- Trained on KsponSpeech, provided by AI Hub (https://aihub.or.kr/)
## Preprocessing
- Audio: files converted from .pcm to .wav
- Text: Korean phonetic transcription
- Tokenizer: SentencePiece (byte-pair encoding), vocabulary size 5,000
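
The .pcm to .wav conversion step can be sketched roughly as below. This is a minimal illustration, not the script used for this model: it assumes the .pcm files are headerless 16 kHz, 16-bit, mono, little-endian PCM, and the function name `pcm_to_wav` and its parameters are hypothetical.

```python
import wave
from pathlib import Path


def pcm_to_wav(pcm_path, wav_path, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container.

    Assumption: the input is headerless PCM with the given sample rate,
    channel count, and sample width in bytes (2 bytes = 16-bit).
    The samples themselves are copied unchanged.
    """
    pcm_bytes = Path(pcm_path).read_bytes()
    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
```

If the audio parameters differ from these defaults, the resulting .wav will play at the wrong speed or sound like noise, so they are worth checking against the dataset documentation.
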
## Evaluation
- Evaluation sets: "KsponSpeech_eval_clean", "KsponSpeech_eval_other"
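
For reference, character error rate (CER) is commonly computed as the character-level edit distance between hypothesis and reference, divided by the total number of reference characters. The sketch below is a plain-Python illustration of that definition, not the evaluation script used for this model. The function names are hypothetical, and details such as whether spaces count as characters or how text is normalized are assumptions that may differ from the reported setup.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hyp, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return 100.0 * total_edits / total_chars
```

Python strings iterate by code point, so each precomposed Hangul syllable counts as one character here.
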