flozi00 committed on
Commit ead395e
1 Parent(s): 78b3da1

Update README.md

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -48,6 +48,40 @@ The following hyperparameters were used during training:
  - Datasets 2.18.0
  - Tokenizers 0.15.2

+
+ ### How to use
+
+ ```python
+ import torch
+ from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
+ from datasets import load_dataset
+ device = "cuda:0" if torch.cuda.is_available() else "cpu"
+ torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
+ model_id = "primeline/whisper-tiny-german"
+ model = AutoModelForSpeechSeq2Seq.from_pretrained(
+     model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
+ )
+ model.to(device)
+ processor = AutoProcessor.from_pretrained(model_id)
+ pipe = pipeline(
+     "automatic-speech-recognition",
+     model=model,
+     tokenizer=processor.tokenizer,
+     feature_extractor=processor.feature_extractor,
+     max_new_tokens=128,
+     chunk_length_s=30,
+     batch_size=16,
+     return_timestamps=True,
+     torch_dtype=torch_dtype,
+     device=device,
+ )
+ dataset = load_dataset("distil-whisper/librispeech_long", "clean", split="validation")
+ sample = dataset[0]["audio"]
+ result = pipe(sample)
+ print(result["text"])
+ ```
+
+
  ## [About us](https://primeline-ai.com/en/)

  [![primeline AI](https://primeline-ai.com/wp-content/uploads/2024/02/pl_ai_bildwortmarke_original.svg)](https://primeline-ai.com/en/)
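
Note on the added "How to use" snippet: it builds the pipeline with `return_timestamps=True` but prints only `result["text"]`. The sketch below shows one possible way to read the per-segment timestamps as well. It assumes the result also carries a `chunks` list of `{"timestamp": (start, end), "text": ...}` entries, which is how the Transformers speech recognition pipeline usually reports timestamps. The helper name `format_chunks` and the sample data are hypothetical and are not part of this commit.

```python
def format_chunks(result):
    """Return one '[start - end] text' line per timestamped chunk.

    `result` is assumed to look like the dict returned by `pipe(sample)`
    when the pipeline was created with return_timestamps=True.
    """
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # The end time of the last chunk may be None if the audio stops mid-segment.
        end_label = f"{end:.2f}s" if end is not None else "end"
        lines.append(f"[{start:.2f}s - {end_label}] {chunk['text'].strip()}")
    return lines


# Hand-written stand-in for a pipeline result (illustrative values only).
example_result = {
    "text": " Guten Morgen. Wie geht es Ihnen?",
    "chunks": [
        {"timestamp": (0.0, 4.5), "text": " Guten Morgen."},
        {"timestamp": (4.5, None), "text": " Wie geht es Ihnen?"},
    ],
}

for line in format_chunks(example_result):
    print(line)
```

With the real pipeline, `format_chunks(pipe(sample))` would be used in place of the stand-in dict.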