VRAM usage
I used the demo - the model downloaded and inference works really nicely, even in other languages. The only thing that surprises me is that VRAM usage on the GPU keeps increasing even though the microphone isn't on.
I also ran into a problem where the model, after showing the correct text for what I just said, suddenly starts hallucinating and reduces the output to e.g. [laughs].
Could you give me the code to run inference with the model? ....
I just used this demo - Xenova/realtime-whisper-webgpu - and over time the GPU's VRAM filled up to almost 100% within 2 minutes.
Thanks bro, and is it possible to run inference on the model using transformers code? If you know the answer, please tell me.
I need this code ....
Here is the code for anyone who needs it:
https://github.com/xenova/transformers.js/tree/v3/examples/webgpu-whisper
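For anyone who just wants a quick script rather than the full example app, a minimal transformers.js sketch along these lines should work. Note that the package name, model ID, and the `device: 'webgpu'` option below follow the transformers.js v3 docs and are assumptions on my part, not something taken from this thread; adjust them to your setup.

```js
// Minimal sketch: speech-to-text with transformers.js v3 (assumed API).
import { pipeline } from '@huggingface/transformers';

// Create an automatic-speech-recognition pipeline.
// 'onnx-community/whisper-base' is an assumed model ID; 'webgpu' falls back
// to WASM if the browser/runtime has no WebGPU support.
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'onnx-community/whisper-base',
  { device: 'webgpu' },
);

// Input can be an audio URL or a Float32Array of 16 kHz mono samples.
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
const output = await transcriber(url);

console.log(output.text);
```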
Hi again! The VRAM issue is now fixed!