How to do batch inference with this model?
#31
by Alan42 - opened
I want to use this model to process a large amount of data, so I need batch inference to speed up the process. Does this model support batch inference? How do I use it?
You can use LLaMA-Factory's inference API with the vLLM backend, and then run a multi-threaded querying program against it.
https://github.com/hiyouga/LLaMA-Factory/tree/main?tab=readme-ov-file#quickstart
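For the querying side, here is a minimal sketch of such a multi-threaded client. It assumes the LLaMA-Factory API server is already running with the vLLM backend and exposes an OpenAI-compatible endpoint at http://localhost:8000/v1; the base URL, model name, prompts, and worker count are all placeholders to adapt to your setup.

```python
# Minimal multi-threaded querying sketch. Assumes the LLaMA-Factory API server
# (vLLM backend) is running and serves an OpenAI-compatible endpoint at
# http://localhost:8000/v1 -- adjust base_url and the model name to your setup.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")  # key unused for a local server

def query(prompt: str) -> str:
    # One chat completion per prompt; concurrent requests let the server batch.
    response = client.chat.completions.create(
        model="llama3",  # placeholder model name; use the one your server exposes
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

prompts = [f"Summarize item {i}" for i in range(100)]  # replace with your data

# Fire requests concurrently; vLLM's continuous batching handles the rest.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(query, prompts))

for prompt, result in zip(prompts[:3], results[:3]):
    print(prompt, "->", result)
```

The speedup comes from the server side: vLLM batches the in-flight requests, so you can raise `max_workers` until GPU utilization saturates.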
I'll try it, thank you!
shenzhi-wang changed discussion status to closed