Faster inference?
#1 by DoctorSlimm - opened
Great model, I'm a huge fan! Is there any way to make it faster?
Is there anything along the lines of vLLM for this model architecture?
Batching? bfloat16? ONNX? Quantization?