Does it support server mode?
#1
by yinkaisheng - opened
Hi, I noticed that the current README only shows a CLI usage example (locate-anything-cli detect ...).
Is there any support for serving this model through llama-server or another OpenAI-compatible HTTP server (similar to llama.cpp's server mode)?
Or is the CLI the only supported inference interface at the moment?