Use the code below to get started with the model.

### Running the model on a CPU

<details>
<summary> Click to expand </summary>

```python
# Install dependencies first:
# pip install transformers datasets torchaudio

from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import torch
import torchaudio

model_id = "fastinom/ASR_fassy"

# Load model and processor
model = Wav2Vec2ForCTC.from_pretrained(model_id)
processor = Wav2Vec2Processor.from_pretrained(model_id)

def load_audio(file_path):
    """Load an audio file and resample it to the 16 kHz rate the model expects."""
    speech_array, sampling_rate = torchaudio.load(file_path)
    resampler = torchaudio.transforms.Resample(sampling_rate, 16000)
    speech = resampler(speech_array).squeeze().numpy()
    return speech

# Example audio file path (replace with your own)
audio_file = "/content/drive/MyDrive/recordings/wavefiles/1.wav"
speech = load_audio(audio_file)

# Preprocess the audio
inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)

# Perform inference
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Decode the output
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
print(transcription[0])
```

</details>
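
For intuition, the `processor.batch_decode` call above applies greedy CTC decoding: repeated token ids are collapsed and the CTC blank token is dropped before mapping ids back to characters. A minimal self-contained sketch of that idea, using a made-up toy vocabulary rather than the model's real one (blank id 0 is an assumption for illustration):

```python
def greedy_ctc_decode(ids, vocab, blank_id=0):
    """Collapse repeats, drop blanks, then map ids to characters."""
    out = []
    prev = None
    for i in ids:
        if i != prev and i != blank_id:
            out.append(vocab[i])
        prev = i
    return "".join(out)

# Hypothetical 5-token vocabulary, for illustration only.
vocab = {0: "<blank>", 1: "h", 2: "e", 3: "l", 4: "o"}
print(greedy_ctc_decode([1, 1, 2, 0, 3, 3, 0, 3, 4, 4], vocab))  # prints: hello
```

Note how the blank between the two `l` runs is what allows a doubled letter to survive the collapse step.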

## Training Details