---
license: mit
---
|
The model was fine-tuned on 300 hours of public and private speech data. More information will be provided once the underlying paper is published.
|
|
|
```python
|
import librosa |
|
from transformers import Wav2Vec2Processor, AutoModelForCTC |
|
import torch |
|
|
|
audio, _ = librosa.load("[audio_path]", sr=16000) |
|
model = AutoModelForCTC.from_pretrained("racai/wav2vec2-base-100k-voxpopuli-romanian") |
|
processor = Wav2Vec2Processor.from_pretrained("racai/wav2vec2-base-100k-voxpopuli-romanian") |
|
|
|
input_dict = processor(audio, sampling_rate=16000, return_tensors="pt") |
|
|
|
with torch.inference_mode(): |
|
logits = model(input_dict.input_values).logits |
|
|
|
predicted_ids = torch.argmax(logits, dim=-1) |
|
predicted_sentence = processor.batch_decode(predicted_ids)[0] |
|
|
|
print("Prediction:", predicted_sentence) |
|
```