RPC server possible?

#8
by x-polyglot-x - opened

Hi!

Thanks for creating this. Does your inference engine support rpc-server (or something similar)?

Edit: I'd just like to confirm that the q2 quant is very capable. I am using it with MTP and it is fast on an M4 Max with 128 GB. DeepSeek is very good at limiting its thinking (no thought loops): just as you stated, it thinks in proportion to the complexity of the problem. I asked about rpc-server because it would let me run the q4 quant, which would be cool! But either way, this is great! Thank you again!
