Using a laptop with a 4GB GPU for inference - does it make sense to try at all?

#6
by grzenkom

I have a laptop with 32 GB of RAM and a Quadro T1000 card with 4 GB of VRAM. I can load the model successfully, but text generation is extremely slow (I have yet to see a single generated token).

Is there any chance of getting output in a reasonable time? Let's say 15-20 minutes would be acceptable for the experiments I am conducting.

I am wondering if there is anything I could improve in my setup, or whether the only way forward is to switch to a more powerful machine right away.
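In case it helps with suggestions: one thing I have not tried yet is quantized loading with CPU offload. Below is a minimal sketch of what I believe that would look like with transformers and bitsandbytes. I am assuming this is a standard causal LM, and the model ID is just a placeholder for the one in this repo:

```python
# Sketch only: 4-bit quantized loading with CPU offload, assuming a
# standard transformers causal LM. The model ID is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "org/model-name"  # placeholder for the model in this repo

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # fp16 compute to save memory
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # offload layers that don't fit in the 4 GB GPU to CPU RAM
)

inputs = tokenizer("Test prompt", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Would something like this be expected to produce tokens within my 15-20 minute budget, or is 4 GB of VRAM simply too little regardless of quantization?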
