Deeply confused about how this is running on my system - is it GPU or CPU?

by mstachow - opened

I'm running Windows with an Nvidia GeForce 1070 (OLD! But free, and my other GPU is busy...). I load the model using the transformers sample code and it runs relatively quickly, taking maybe a minute or two to generate about 500 tokens. Not bad for the old computer it's running on. However, when I look at the performance graphs in Task Manager, the model is clearly stored in GPU memory but GPU utilization is nearly 0%, while the CPU is maxed out. Did I miss something about how this is supposed to run?
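
One way to narrow this down from inside Python, rather than from Task Manager, is to ask the loaded model where its weights actually live. The snippet below is only a minimal sketch: `device_report` is a hypothetical helper name (not part of transformers), and it assumes `model` is the PyTorch model object returned by the sample code, so that `model.parameters()`, `p.device` and `p.numel()` behave as they do on a standard `torch.nn.Module`.

```python
from collections import Counter


def device_report(model):
    """Return {device_name: parameter_count} for a PyTorch-style model.

    Hypothetical helper: it only relies on model.parameters() yielding
    objects that expose .device and .numel(), as torch tensors do.
    """
    counts = Counter()
    for p in model.parameters():
        # str(p.device) looks like "cuda:0" or "cpu" for torch tensors
        counts[str(p.device)] += p.numel()
    return dict(counts)


# Usage, after loading `model` with the sample code (assumed, not shown here):
#   import torch
#   print(torch.cuda.is_available())   # False would mean a CPU-only setup
#   print(device_report(model))        # which devices hold the weights
```

If every parameter is reported on a `cuda` device, the forward pass is most likely running on the GPU, and the low reading may come from which GPU engine graph Task Manager is displaying rather than from the model. If some or all parameters show up under `cpu`, that would be consistent with the heavy CPU load.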
