Question: mobile deployment - has anyone tested this?

#137
by 3morixd - opened

We test models on 40 phones (Snapdragon 865, 8GB RAM) at Dispatch AI (FZE, UAE).

Question: has anyone benchmarked this on mobile? Specifically:

  • Inference speed (tokens/sec)?
  • Model size after Q4_K_M quantization?
  • RAM usage after load?
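To keep any numbers people post comparable, here is a rough sketch of how the size and speed figures could be computed. The function names and the 7B example are hypothetical. The default of about 4.85 bits per weight is an assumption: it is an approximate average often quoted for Q4_K_M in llama.cpp, and the real figure varies with model architecture.

```python
def estimate_quantized_size_gb(n_params: float, bits_per_weight: float = 4.85) -> float:
    """Approximate on-disk size in GB (10^9 bytes) of a quantized model.

    bits_per_weight=4.85 is an assumed average for Q4_K_M, not an exact value.
    """
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    return n_params * bits_per_weight / 8 / 1e9


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Generation throughput: tokens produced divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


if __name__ == "__main__":
    # Hypothetical 7B-parameter model; substitute the real parameter count.
    print(f"estimated size: {estimate_quantized_size_gb(7e9):.2f} GB")
    # Hypothetical run: 128 tokens generated in 16 seconds.
    print(f"throughput: {tokens_per_second(128, 16.0):.1f} tokens/sec")
```

RAM usage after load is usually higher than the file size because of the KV cache and runtime buffers, so it is best measured on the device and not estimated this way.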

Happy to share our phone farm results if there's interest.

  • Dispatch AI (FZE), Sharjah, UAE
