lvzixin/test

Update README.md

cef9501 11 months ago

---
license: apache-2.0
language:
- zh
- aa
- af
metrics:
- accuracy
library_name: diffusers
pipeline_tag: text-to-image
tags:
- medical
- code
- suibian
---

# Whisper

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours
of labelled data, Whisper models generalise well to many datasets and domains without the need
for fine-tuning.

Whisper was proposed in the paper [Robust Speech Recognition via Large-Scale Weak Supervision](https://arxiv.org/abs/2212.04356)
by Alec Radford et al. from OpenAI. The original code repository is available [on GitHub](https://github.com/openai/whisper).

Whisper `large-v3` has the same architecture as the previous large models, except for the following minor differences:

1. The input uses 128 Mel frequency bins instead of 80.
2. A new language token was added for Cantonese.
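
To make the first difference concrete, the sketch below computes the shape of the log-Mel input for one audio chunk with 80 and with 128 Mel bins. The sample rate, hop length, and chunk length are assumptions not stated in this card (they match the defaults commonly used with Whisper), and `encoder_input_shape` is a hypothetical helper written only for this illustration.

```python
# Assumed front-end constants (not stated in this card); verify them
# against the preprocessing code you actually use.
SAMPLE_RATE = 16_000   # audio samples per second
HOP_LENGTH = 160       # samples between successive spectrogram frames
CHUNK_SECONDS = 30     # length of the padded audio window


def encoder_input_shape(n_mels: int) -> tuple[int, int]:
    """Return (n_mels, n_frames) for one padded audio chunk.

    Hypothetical helper: only the number of Mel bins changes between
    the earlier large models and large-v3; the frame count does not.
    """
    n_samples = SAMPLE_RATE * CHUNK_SECONDS
    n_frames = n_samples // HOP_LENGTH
    return (n_mels, n_frames)


if __name__ == "__main__":
    for name, n_mels in [("previous large models", 80), ("large-v3", 128)]:
        print(f"{name}: {encoder_input_shape(n_mels)}")
```

Under these assumptions only the first dimension changes, so features extracted with 80 Mel bins for an earlier checkpoint would not match the input that `large-v3` expects and would need to be recomputed with 128 bins.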