fastinom committed
Commit 80fe41f
1 Parent(s): 5e9f513

Update README.md

Files changed (1)
  1. README.md +32 -0
README.md CHANGED
@@ -69,6 +69,38 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 
  Use the code below to get started with the model.
 
+
+ ### Running the model
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # Install dependencies (e.g. in Colab or another notebook environment)
+ !pip install transformers datasets torchaudio
+
+ from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+ import torch
+ import torchaudio
+
+ # Load the fine-tuned model and its processor from the Hub
+ model_id = "fastinom/ASR_fassy"
+ model = Wav2Vec2ForCTC.from_pretrained(model_id)
+ processor = Wav2Vec2Processor.from_pretrained(model_id)
+
+ def load_audio(file_path):
+     # Read the file and resample it to the 16 kHz rate the model expects
+     speech_array, sampling_rate = torchaudio.load(file_path)
+     resampler = torchaudio.transforms.Resample(sampling_rate, 16000)
+     speech = resampler(speech_array).squeeze().numpy()
+     return speech
+
+ audio_file = "/content/drive/MyDrive/recordings/wavefiles/1.wa"  # YOUR AUDIO PATH
+ speech = load_audio(audio_file)
+
+ # Run inference and decode the predicted token IDs to text
+ inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)
+ with torch.no_grad():
+     logits = model(inputs.input_values).logits
+ predicted_ids = torch.argmax(logits, dim=-1)
+ transcription = processor.batch_decode(predicted_ids)
+ print(transcription[0])
+ ```
+ </details>
 
  ## Training Details
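
Beyond the snippet added in this commit, the same checkpoint can also be loaded through the `transformers` `pipeline` API. The sketch below is illustrative rather than part of the diff above; it assumes the checkpoint is compatible with the automatic-speech-recognition pipeline and that ffmpeg is available to decode audio files passed by path.

```python
# Minimal sketch: loading fastinom/ASR_fassy through the ASR pipeline.
# Assumptions: the checkpoint works with pipeline("automatic-speech-recognition"),
# and ffmpeg is installed so the pipeline can decode the file given by path.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="fastinom/ASR_fassy")
result = asr("path/to/recording.wav")  # placeholder path to a local audio file
print(result["text"])
```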