
	---
	license: apache-2.0
	language:
	- pl
	pipeline_tag: automatic-speech-recognition
	tags:
	- audio
	datasets:
	- Aspik101/distil-whisper-large-v3-pl
	library_name: ctranslate2
	---

	<style>
	img {
	display: inline;
	}
	</style>

	# Fine-tuned Polish Aspik101/distil-whisper-large-v3-pl model for CTranslate2

	This repository contains the [Aspik101/distil-whisper-large-v3-pl](https://huggingface.co/Aspik101/distil-whisper-large-v3-pl) model converted to the [CTranslate2](https://github.com/OpenNMT/CTranslate2) format.

	## Usage

	```python
	from faster_whisper import WhisperModel
	from huggingface_hub import snapshot_download

	downloaded_model_path = snapshot_download(repo_id="mmalyska/distil-whisper-large-v3-pl-ct2")

	# Run on GPU with FP16
	model = WhisperModel(downloaded_model_path, device="cuda", compute_type="float16")
	# or run on GPU with INT8
	# model = WhisperModel(downloaded_model_path, device="cuda", compute_type="int8_float16")
	# or run on CPU with INT8
	# model = WhisperModel(downloaded_model_path, device="cpu", compute_type="int8")

	segments, info = model.transcribe("./sample.wav", beam_size=1)

	print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

	for segment in segments:
	    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
	```
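
Note that `segments` is a generator, so transcription only runs as it is iterated, and it can be consumed only once. As a minimal sketch of collecting the results instead of printing them, the helpers below turn segments into SRT-style subtitle text. `Segment` is a hypothetical stand-in that exposes the same `start`, `end`, and `text` fields used above, and `format_timestamp` and `segments_to_srt` are illustrative names, not part of faster-whisper.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    """Hypothetical stand-in exposing the fields used in the loop above."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable) -> str:
    """Consume the segment iterable once and build SRT subtitle text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(segment.start)} --> {format_timestamp(segment.end)}\n"
            f"{segment.text.strip()}\n"
        )
    # Blank line between consecutive subtitle blocks.
    return "\n".join(blocks)
```

With a real model, `segments_to_srt(segments)` could be called directly on the first value returned by `model.transcribe(...)` in place of the printing loop.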

	## Conversion

	The original model was converted with the following command:

	```bash
	ct2-transformers-converter --model Aspik101/distil-whisper-large-v3-pl --output_dir distil-whisper-large-v3-pl-ct2 --copy_files tokenizer.json preprocessor_config.json --quantization float16
	```
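
The same conversion can also be sketched through CTranslate2's Python API. This is an assumption-laden equivalent rather than the command that was actually run: it presumes `ctranslate2` and `transformers` are installed, and the function name `convert_model` is illustrative. The arguments simply mirror the flags of the command above.

```python
def convert_model(
    model_name: str = "Aspik101/distil-whisper-large-v3-pl",
    output_dir: str = "distil-whisper-large-v3-pl-ct2",
) -> str:
    """Convert the Hugging Face checkpoint to CTranslate2 format in float16."""
    # Imported lazily so this module loads even if ctranslate2 is not installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(
        model_name,
        # Mirrors --copy_files from the command above.
        copy_files=["tokenizer.json", "preprocessor_config.json"],
    )
    # Mirrors --quantization float16; returns the output directory path.
    return converter.convert(output_dir, quantization="float16")
```

Calling `convert_model()` downloads the source checkpoint and writes the converted files to `output_dir`; by default the conversion fails if that directory already exists.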