Can it be used in a real-time voice-chat scenario?
#2 · opened by satya7
The current HF CPU deployment Space takes 4-6 seconds to synthesize a 10-second audio clip. Can you suggest what approaches are possible to reduce that latency? Thanks for the nice work.
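One common way to frame the figures above is the real-time factor (RTF): inference time divided by the duration of the generated audio. The sketch below (hypothetical helper, not part of this Space) computes it from the numbers in the question. Note that even an RTF below 1.0 can still feel slow in a voice-chat setting if the model does not stream, since the user waits for the whole clip before hearing anything.

```python
def real_time_factor(infer_seconds: float, audio_seconds: float) -> float:
    """RTF = inference time / audio duration.

    RTF < 1.0 means synthesis is faster than real time;
    lower is better for interactive use.
    """
    return infer_seconds / audio_seconds

# Figures from the question: 4-6 s to synthesize 10 s of audio.
print(real_time_factor(4.0, 10.0))  # 0.4
print(real_time_factor(6.0, 10.0))  # 0.6
```

So the Space is already faster than real time end to end; the remaining gap for voice chat is the time-to-first-audio, which is what a smaller or distilled model (or chunked streaming synthesis) would reduce.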
We will be releasing a smaller model that can be used for this. @utkarshshukla2912 can add more.
Hey @satya7, we will be providing a distilled version of the model for real-time inference, as well as a slightly less expressive model that can run on CPU.