addy88 committed
Commit
2e813ef
1 Parent(s): 8635cb7

Create README.md

Files changed (1)
  1. README.md +23 -0
README.md ADDED
@@ -0,0 +1,23 @@
+ ## Usage
+ The model can be used directly (without a language model) as follows:
+ ```python
+ import soundfile as sf
+ import torch
+ from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+ import argparse
+ def parse_transcription(wav_file):
+     # load pretrained model
+     processor = Wav2Vec2Processor.from_pretrained("addy88/wav2vec2-nepali-stt")
+     model = Wav2Vec2ForCTC.from_pretrained("addy88/wav2vec2-nepali-stt")
+     # load audio
+     audio_input, sample_rate = sf.read(wav_file)
+     # pad input values and return pt tensor
+     input_values = processor(audio_input, sampling_rate=sample_rate, return_tensors="pt").input_values
+     # INFERENCE
+     # retrieve logits & take argmax
+     logits = model(input_values).logits
+     predicted_ids = torch.argmax(logits, dim=-1)
+     # transcribe
+     transcription = processor.decode(predicted_ids[0], skip_special_tokens=True)
+     print(transcription)
+ ```
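
The "take argmax" and "transcribe" steps in the README snippet amount to greedy CTC decoding. As a rough, dependency-free sketch of what `processor.decode` is doing with `predicted_ids`, the example below collapses consecutive repeated ids, drops the blank token, and turns the word delimiter into a space. The vocabulary, the ids, and the function name `greedy_ctc_decode` are hypothetical stand-ins chosen for illustration; the real model ships its own vocabulary, and it is an assumption here that it uses the pad token as the CTC blank and `|` as the word delimiter, which are the Wav2Vec2 CTC tokenizer defaults.

```python
from itertools import groupby

# Hypothetical toy vocabulary for illustration; not the model's real vocab.
TOY_VOCAB = {0: "<pad>", 1: "|", 2: "a", 3: "b", 4: "c"}
BLANK_ID = 0          # assumed: the pad token doubles as the CTC blank
WORD_DELIMITER = "|"  # assumed: word-boundary token, rendered as a space


def greedy_ctc_decode(predicted_ids, vocab=TOY_VOCAB, blank_id=BLANK_ID):
    """Collapse repeated ids, drop blanks, and join the remaining characters."""
    # Step 1: merge runs of identical ids (one id per run).
    collapsed = [token_id for token_id, _ in groupby(predicted_ids)]
    # Step 2: remove blanks; a blank between two equal ids keeps both.
    chars = [vocab[i] for i in collapsed if i != blank_id]
    # Step 3: map the word delimiter to a space and trim the ends.
    return "".join(chars).replace(WORD_DELIMITER, " ").strip()


# Per-frame argmax ids, as torch.argmax(logits, dim=-1)[0].tolist() might give.
frame_ids = [2, 2, 0, 2, 3, 3, 0, 1, 1, 4, 0, 0]
print(greedy_ctc_decode(frame_ids))
```

Note that the blank between the two runs of `2` is what lets the doubled letter survive the collapse step; without it, both runs would merge into a single character.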