Huggingface usage and local usage are different

#2
by beraeren - opened

import torch
import numpy as np
from transformers import AutoProcessor, WhisperForConditionalGeneration
from scipy.io import wavfile

processor = AutoProcessor.from_pretrained("emre/whisper-medium-turkish-2")
model = WhisperForConditionalGeneration.from_pretrained("emre/whisper-medium-turkish-2")

# wavfile.read returns the samples directly as a NumPy array (int16 for 16-bit PCM)
samplerate, data = wavfile.read('./audio.wav')

# convert 16-bit PCM samples to float32 in [-1.0, 1.0]
x_data = data.astype(np.float32) / 32768.0

# no sampling_rate is passed here, so the processor assumes the audio is already 16 kHz
inputs = processor(x_data, return_tensors="pt")
input_features = inputs.input_features

generated_ids = model.generate(inputs=input_features)

transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
###############################################
(I think I'm using it correctly.)
Even though I use the same audio file, Huggingface gives an accurate result, while my local machine gives a nonsensical output.
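
One difference I can think of: the hosted inference widget decodes and resamples the uploaded audio before feature extraction, while my script feeds the raw WAV samples at whatever rate the file was saved with. Whisper's feature extractor assumes 16 kHz input, so if ./audio.wav is, say, 44.1 kHz, the log-mel features come out wrong and the transcription turns into gibberish. Below is a minimal sketch of explicit resampling; librosa is just one possible choice here, any resampler that yields 16 kHz mono float32 would do.

import librosa  # assumption: not part of the original script, used only to load and resample
from transformers import AutoProcessor, WhisperForConditionalGeneration

processor = AutoProcessor.from_pretrained("emre/whisper-medium-turkish-2")
model = WhisperForConditionalGeneration.from_pretrained("emre/whisper-medium-turkish-2")

# load as mono float32 and resample to the 16 kHz the Whisper feature extractor expects
audio, sr = librosa.load('./audio.wav', sr=16000, mono=True)

inputs = processor(audio, sampling_rate=sr, return_tensors="pt")
generated_ids = model.generate(inputs=inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))

If this sketch matches the hosted output, the discrepancy is just the sampling rate and not the model itself.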

Could you share the error?
