---
license: mit
language:
- ko
metrics:
- cer
pipeline_tag: automatic-speech-recognition
tags:
- ksponspeech
model-index:
- name: cwwojin/stt_kr_conformer_ctc_medium
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      type: Murple/ksponspeech
      name: KsponSpeech-eval (Korean)
      split: test
    metrics:
    - type: cer
      value: 11.902
      name: Test CER (%)
---

# stt_kr_conformer_ctc_medium

- Fine-tuned from "stt_en_conformer_ctc_medium": https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_ctc_medium
- Trained on KsponSpeech, provided by AI Hub (https://aihub.or.kr/)

## Preprocessing

- Audio: files converted from .pcm to .wav (see the conversion sketch below)
- Text:
  - Korean phonetic transcription
  - SentencePiece tokenizer (byte-pair encoding), vocab size = 5,000 (see the tokenizer sketch below)

## Evaluation

- Evaluated on the "KsponSpeech_eval_clean" and "KsponSpeech_eval_other" sets (a transcription and CER-scoring sketch follows)
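
The .pcm-to-.wav conversion from the preprocessing step can be reproduced with a few lines of Python. This is a minimal sketch, assuming the usual KsponSpeech raw-audio format (16 kHz, 16-bit, mono, little-endian signed PCM); the file names are placeholders.

```python
import numpy as np
import soundfile as sf

def pcm_to_wav(pcm_path: str, wav_path: str, sample_rate: int = 16000) -> None:
    """Convert a headerless signed 16-bit little-endian PCM file to a standard WAV file."""
    samples = np.fromfile(pcm_path, dtype="<i2")  # 16-bit little-endian samples
    sf.write(wav_path, samples, samplerate=sample_rate, subtype="PCM_16")

# Hypothetical file names -- KsponSpeech ships one .pcm file per utterance.
pcm_to_wav("KsponSpeech_000001.pcm", "KsponSpeech_000001.wav")
```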
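
The byte-pair-encoding tokenizer can be built with the `sentencepiece` library. The sketch below assumes the training transcripts have been collected into a plain-text file with one utterance per line (the file name is hypothetical); apart from BPE with a 5,000-token vocabulary, the exact options used for this checkpoint are not recorded here.

```python
import sentencepiece as spm

# Train a BPE tokenizer with a 5,000-token vocabulary on the KsponSpeech transcripts.
spm.SentencePieceTrainer.train(
    input="kspon_train_text.txt",    # hypothetical file: one transcript per line
    model_prefix="kspon_bpe_5000",   # writes kspon_bpe_5000.model / kspon_bpe_5000.vocab
    vocab_size=5000,
    model_type="bpe",
    character_coverage=1.0,          # keep every Hangul syllable seen in the corpus
)

# Load the trained model and tokenize a sample sentence.
sp = spm.SentencePieceProcessor(model_file="kspon_bpe_5000.model")
print(sp.encode("안녕하세요 반갑습니다", out_type=str))
```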
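
To run the model, the checkpoint can be restored with the NeMo toolkit and scored with character error rate, the metric reported in the metadata above. This is a sketch assuming the checkpoint is available locally as a `.nemo` file; the filename and audio/transcript values are placeholders, and `jiwer` is used here only to illustrate CER scoring.

```python
import jiwer
import nemo.collections.asr as nemo_asr

# Restore the fine-tuned Conformer-CTC (BPE) model from a local .nemo checkpoint.
# The filename is an assumption -- check this repo's files for the actual name.
asr_model = nemo_asr.models.EncDecCTCModelBPE.restore_from("stt_kr_conformer_ctc_medium.nemo")

# Transcribe a converted .wav file (hypothetical path). Depending on the NeMo
# version, transcribe() returns plain strings or Hypothesis objects.
hypothesis = asr_model.transcribe(["KsponSpeech_E00001.wav"])[0]
hyp_text = hypothesis if isinstance(hypothesis, str) else hypothesis.text

# Character error rate against a reference transcript.
reference = "reference transcript for the utterance"
print(f"CER: {jiwer.cer(reference, hyp_text) * 100:.2f}%")
```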