metadata

license: mit
language:
  - ko
metrics:
  - cer
pipeline_tag: automatic-speech-recognition
tags:
  - ksponspeech
model-index:
  - name: cwwojin/stt_kr_conformer_ctc_medium
    results:
      - task:
          type: automatic-speech-recognition
        dataset:
          type: Murple/ksponspeech
          name: KsponSpeech (Korean)
          split: test
        metrics:
          - type: cer
            value: 0
            name: Test CER (%)

stt_kr_conformer_ctc_medium

Fine-tuned from NVIDIA NeMo's "stt_en_conformer_ctc_medium" (https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_ctc_medium).
Trained on KsponSpeech, provided by AI Hub (https://aihub.or.kr/).

Preprocessing

Audio: files converted from .pcm to .wav
Text: Korean phonetic transcription
Tokenizer: SentencePiece (byte-pair encoding), vocabulary size 5,000
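
The preprocessing steps above can be sketched roughly as follows. This is an illustration, not the script actually used for this model. It assumes the .pcm files are headerless 16 kHz, 16-bit, mono little-endian audio (adjust the parameters if your copy differs), and it calls the `sentencepiece` Python package directly, whereas the original tokenizer may have been built with different tooling or options. The function names and file paths are hypothetical.

```python
import wave
from pathlib import Path


def pcm_to_wav(pcm_path, wav_path, sample_rate=16000, channels=1, sample_width=2):
    """Wrap headerless PCM samples in a WAV container.

    Assumption: the input is raw 16-bit mono PCM at 16 kHz.
    The samples are copied unchanged; only a WAV header is added.
    """
    pcm_bytes = Path(pcm_path).read_bytes()
    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)


def train_bpe_tokenizer(transcript_path, model_prefix, vocab_size=5000):
    """Train a SentencePiece BPE tokenizer on a text file with one transcript per line.

    Requires `pip install sentencepiece`. Only vocab_size and model_type come
    from the description above; every other setting is left at its default,
    which may not match the original training run.
    """
    import sentencepiece as spm  # imported lazily so pcm_to_wav has no extra dependency

    spm.SentencePieceTrainer.train(
        input=str(transcript_path),
        model_prefix=str(model_prefix),
        vocab_size=vocab_size,
        model_type="bpe",
    )
```

Example usage, with hypothetical paths: `pcm_to_wav("sample.pcm", "sample.wav")` followed by `train_bpe_tokenizer("train_transcripts.txt", "tokenizer_bpe_5000")`, which writes `tokenizer_bpe_5000.model` and `tokenizer_bpe_5000.vocab`.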

Evaluation

Evaluated on "KsponSpeech_eval_clean" and "KsponSpeech_eval_other".
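
The metric listed for this model is character error rate (CER). As a minimal sketch of how CER is commonly computed, the snippet below divides the total character-level edit distance by the total number of reference characters and reports a percentage. It is not the evaluation script used for this model. Whether spaces are counted and how text is normalized are assumptions here, exposed through the hypothetical `strip_spaces` flag. Hangul is compared per Unicode code point, so both strings should use the same normalization form (for example NFC).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def cer(references, hypotheses, strip_spaces=False):
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        if strip_spaces:  # assumption: some setups ignore spacing errors
            ref = ref.replace(" ", "")
            hyp = hyp.replace(" ", "")
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return 100.0 * total_edits / total_chars
```

For instance, `cer(["abcd"], ["abed"])` gives 25.0, since one of four reference characters is substituted.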