Hardware requirements?

#7
by nobitha - opened

OutOfMemoryError: CUDA out of memory. Tried to allocate 196.00 MiB (GPU 0; 14.75 GiB total capacity; 13.24 GiB already allocated; 6.81 MiB free; 13.67 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
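A rough way to read the numbers in that message and to try the allocator setting it suggests is sketched below. The figures are copied from the error above; the value `128` for `max_split_size_mb` is only an assumed starting point, not a recommendation from the model authors, and the variable has to be set before PyTorch makes its first CUDA allocation.

```python
import os

# Figures copied from the error message above, in GiB.
total_gib = 14.75
allocated_gib = 13.24
reserved_gib = 13.67

# Memory PyTorch has reserved from the driver but not handed out to tensors.
# The error message suggests max_split_size_mb when this gap is large.
slack_mib = (reserved_gib - allocated_gib) * 1024
used_fraction = reserved_gib / total_gib
print(f"reserved but unallocated: {slack_mib:.0f} MiB")
print(f"share of GPU reserved by PyTorch: {used_fraction:.0%}")

# Allocator hint named in the error message. The value 128 is an assumption;
# set this before importing torch or touching the GPU.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")
```

Here the gap is a few hundred MiB on a GPU that is already almost entirely reserved, so the setting may at best free a little headroom; it does not change how much memory the model itself needs.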

I got the above error when loading the model in Colab Pro.
Can you please tell me the hardware requirements?
