	---
	license: apache-2.0
	datasets:
	- Bingsu/zeroth-korean
	language:
	- ko
	metrics:
	- cer
	- wer
	pipeline_tag: automatic-speech-recognition
	---
	# Whisper-Medium-Zeroth-Korean

	The Whisper-medium model fine-tuned on [Zeroth-Korean](https://huggingface.co/datasets/Bingsu/zeroth-korean).


	### Model Description

	- Developed by: [yw0nam](https://github.com/yw0nam)
	- Shared by: [yw0nam](https://github.com/yw0nam)
	- Model type: Automatic speech recognition (ASR)
	- License: apache-2.0

	## Uses

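	```python
	import librosa
	from transformers import WhisperForConditionalGeneration, WhisperProcessor

	wav_path = "sample.wav"  # placeholder: path to your own audio file

	# The processor (feature extractor + tokenizer) is loaded from the base checkpoint.
	processor = WhisperProcessor.from_pretrained("openai/whisper-medium", language="ko", task="transcribe")
	model = WhisperForConditionalGeneration.from_pretrained('spow12/whisper-medium-zeroth_korean').cuda()

	# Whisper expects 16 kHz audio, so resample while loading.
	data, _ = librosa.load(wav_path, sr=16000)
	input_features = processor(data, sampling_rate=16000, return_tensors="pt").input_features.cuda()

	# Generate token ids, then decode them to text without special tokens.
	predicted_ids = model.generate(input_features)
	transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
	```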

	### Metrics

	| Metric | Result |
	| --- | --- |
	| WER | 3.96 |
	| CER | 1.71 |
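
The card does not say how these scores were computed. As a rough illustration only, the sketch below uses the usual edit-distance definitions of WER (word level) and CER (character level). The helper names `edit_distance`, `wer`, and `cer` are hypothetical, and stripping spaces before computing CER is an assumption, not something the card states.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count.

    Assumes a non-empty reference.
    """
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate over non-space characters.

    Assumption: spaces are removed first; other pipelines may keep them.
    Assumes a non-empty reference.
    """
    ref_chars = reference.replace(" ", "")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)
```

To score a test set, these functions could be applied to each reference and `transcription` pair, with the edit distances and reference lengths summed across utterances before dividing. Multiplying the ratio by 100 gives a percentage.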