Inference time on 8 vCPU / 32 GB RAM, or T4 (16 GB VRAM) with 30 GB RAM and 8 vCPU

#18
by NeevrajKB - opened

I'm planning to deploy on a server and I'm a first-time user, so I'm asking for guidance. What's the maximum number of concurrent requests possible with the specs above while keeping inference time low?
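The practical answer depends heavily on the model and serving stack, so it's usually best to measure it. Below is a minimal, hedged sketch for benchmarking throughput at different concurrency levels; the `infer` function is a stand-in stub (an assumption, not part of this thread) that you would replace with a real call to your deployed endpoint:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def infer(prompt):
    # Stub: replace with a real request to your deployed model endpoint.
    time.sleep(0.05)  # simulated per-request inference latency
    return f"response:{prompt}"

def benchmark(n_requests, concurrency):
    """Run n_requests through a thread pool and report wall time and req/s."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(infer, range(n_requests)))
    elapsed = time.perf_counter() - start
    return elapsed, len(results) / elapsed

elapsed, rps = benchmark(n_requests=20, concurrency=4)
print(f"{elapsed:.2f}s total, {rps:.1f} req/s")
```

Sweeping `concurrency` upward until latency degrades gives a rough ceiling for the number of concurrent requests your hardware can sustain.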
