Question: mobile deployment - has anyone tested this?
#137
by 3morixd - opened
We test models on 40 phones (Snapdragon 865, 8GB RAM) at Dispatch AI (FZE, UAE).
Question: has anyone benchmarked this on mobile? Specifically:
- Inference speed (tokens/sec)?
- Model size after Q4_K_M quantization?
- RAM usage after load?
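To keep reported numbers comparable, here is a rough sketch of how the first two figures could be computed. The function names are hypothetical, and the bits-per-weight value is an assumption the caller supplies (Q4_K_M mixes quantization types, so the real file size should be read from disk rather than trusted from this estimate).

```python
# Minimal sketch, not tied to any particular runtime.
# All names here are illustrative, not part of an existing library.

def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Decode throughput: generated tokens divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def estimated_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Back-of-envelope model size in GiB for a given average bits per weight.

    bits_per_weight is an assumption supplied by the caller; it ignores
    file metadata and any tensors kept at higher precision.
    """
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    # Hypothetical inputs: 128 tokens generated in 9.5 s, a 3e9-parameter
    # model, and an assumed average of 4.85 bits per weight.
    print(f"{tokens_per_second(128, 9.5):.2f} tok/s")
    print(f"{estimated_size_gib(3e9, 4.85):.2f} GiB (estimate)")
```

RAM after load is best taken from the device itself (for example the process's resident set size), since it depends on the runtime, context length, and memory mapping.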
Happy to share our phone farm results if there's interest.
- Dispatch AI (FZE), Sharjah UAE