mmalyska committed on
Commit
80043dd
1 Parent(s): b8c6c04

Update README.md (#1)

- Update README.md (19a598a93d1a30faa4efc29d3a0b65e52be1869d)

Files changed (1)
  1. README.md +42 -1
README.md CHANGED
@@ -8,4 +8,45 @@ tags:
  datasets:
  - Aspik101/distil-whisper-large-v3-pl
  library_name: ctranslate2
- ---
+ ---
+
+ <style>
+ img {
+ display: inline;
+ }
+ </style>
+
+ # Fine-tuned Polish Aspik101/distil-whisper-large-v3-pl model for CTranslate2
+
+ This repository contains the [Aspik101/distil-whisper-large-v3-pl](https://huggingface.co/Aspik101/distil-whisper-large-v3-pl) model converted to the [CTranslate2](https://github.com/OpenNMT/CTranslate2) format.
+
+ ## Usage
+
+ ```python
+ from faster_whisper import WhisperModel
+ from huggingface_hub import snapshot_download
+
+ downloaded_model_path = snapshot_download(repo_id="mmalyska/distil-whisper-large-v3-pl-ct2")
+
+ # Run on GPU with FP16
+ model = WhisperModel(downloaded_model_path, device="cuda", compute_type="float16")
+ # or run on GPU with INT8
+ # model = WhisperModel(downloaded_model_path, device="cuda", compute_type="int8_float16")
+ # or run on CPU with INT8
+ # model = WhisperModel(downloaded_model_path, device="cpu", compute_type="int8")
+
+ segments, info = model.transcribe("./sample.wav", beam_size=1)
+
+ print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
+
+ for segment in segments:
+     print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
+ ```
+
+ ## Conversion
+
+ The original model was converted with the following command:
+
+ ```bash
+ ct2-transformers-converter --model Aspik101/distil-whisper-large-v3-pl --output_dir distil-whisper-large-v3-pl-ct2 --copy_files tokenizer.json preprocessor_config.json --quantization float16
+ ```
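
Reviewer note, not part of the commit: the conversion command added in this diff could also be driven from a script, which may help if the conversion needs to be repeated. The sketch below is only an illustration under stated assumptions. The helper names `build_conversion_command` and `convert` are hypothetical and do not come from the README or from CTranslate2; the flags are exactly those shown in the diff, and the `ct2-transformers-converter` executable is assumed to be installed and on `PATH`.

```python
import subprocess
from typing import List, Sequence


def build_conversion_command(
    model_id: str,
    output_dir: str,
    copy_files: Sequence[str],
    quantization: str,
) -> List[str]:
    """Assemble the argument list for ct2-transformers-converter.

    Hypothetical helper: it only mirrors the flags used in the README
    command (--model, --output_dir, --copy_files, --quantization).
    """
    return [
        "ct2-transformers-converter",
        "--model", model_id,
        "--output_dir", output_dir,
        "--copy_files", *copy_files,
        "--quantization", quantization,
    ]


def convert(
    model_id: str,
    output_dir: str,
    copy_files: Sequence[str],
    quantization: str = "float16",
) -> None:
    """Run the converter; raises CalledProcessError if it exits non-zero."""
    command = build_conversion_command(model_id, output_dir, copy_files, quantization)
    subprocess.run(command, check=True)
```

Calling `convert("Aspik101/distil-whisper-large-v3-pl", "distil-whisper-large-v3-pl-ct2", ["tokenizer.json", "preprocessor_config.json"])` would then issue the same command as the README. Passing the arguments as a list avoids shell quoting issues.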