
Is it possible to run this model on the CPU?

#20
by vmajor - opened

I have CPU-only PyTorch, and I set the following in my code:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide GPUs; must run before torch initializes CUDA

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = T5ForConditionalGeneration.from_pretrained("google/flan-ul2")
model = model.to("cpu")
device = torch.device("cpu")  # a bare torch.device("cpu") call does nothing unless assigned and used

and I just did this in bash:

export TORCH_CUDA_ARCH_LIST=""

and it still tells me this even though I am actively trying to do everything to stop any calls to CUDA:

anaconda3/envs/transformers/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
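
Digging into it: this AssertionError is raised when a CPU-only torch build is asked to initialize CUDA, so something in my code must still be calling .cuda() or .to('cuda') somewhere. A device guard like the sketch below (my own generic workaround, not anything from the model card) avoids the CUDA call entirely:

import torch

# On a CPU-only build, torch.cuda.is_available() returns False without raising,
# so the "Torch not compiled with CUDA enabled" assertion is never hit.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)  # prints "cpu" on a CPU-only build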

I got the demo code to run on the CPU. I do not know yet why it refuses to do so inside my actual code, but it is clearly related to my code, not the model, so it is up to me to fix.
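
For reference, a minimal sketch of the kind of CPU-only run that worked for me, along the lines of the demo code (the prompt and generation settings here are just placeholders):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide GPUs before torch is imported

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-ul2",
    torch_dtype=torch.float32,  # full precision; float16 is slow or unsupported on most CPUs
)
model.eval()  # from_pretrained loads on CPU by default, so no .to() is needed

inputs = tokenizer("Translate to German: How old are you?", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(inputs.input_ids, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))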

vmajor changed discussion status to closed
