[Code Help] quick start code snippet taking too long to generate a response
#194, opened by ppoptart
Hi, I have tried running the code below both locally in VS Code and in Google Colab, but it takes a very long time to run or never finishes generating. Can someone help me fix this, or is this normal behaviour?
import transformers
import torch
access_token = 'MY_TOKEN'
model_id = "meta-llama/Meta-Llama-3-8B"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
    token=access_token,
)
pipeline("Hey how are you doing today?")
I have the same problem. My platform is a cloud virtual machine with 32 GB of memory and a 16-core AMD CPU, but no GPU. It takes about 30-60 minutes to generate an answer. If anyone knows a way to optimize this, please share it. Thanks a lot.
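For anyone trying to judge whether CPU-only timings like these are "normal", a rough back-of-envelope estimate may help. The sketch below assumes about 8 billion parameters for the 8B model, 2 bytes per parameter in bfloat16, and a purely hypothetical RAM bandwidth of 40 GB/s; none of these numbers were measured on the machines described in this thread. It also assumes that generating each token reads every weight once, so the result is only an optimistic upper bound on speed. Real CPU inference can be far slower, for example if bfloat16 arithmetic is slow on the CPU or if the machine starts swapping.

```python
# Back-of-envelope estimate only. Every hardware number here is an assumption,
# not a measurement from the machines discussed in this thread.

N_PARAMS = 8e9         # assumed: roughly 8 billion parameters for an "8B" model
BYTES_PER_PARAM = 2    # bfloat16 stores each parameter in 2 bytes
MEM_BANDWIDTH = 40e9   # assumed: 40 GB/s of RAM bandwidth (hypothetical figure)


def weight_bytes(n_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the model weights, in bytes."""
    return n_params * bytes_per_param


def max_tokens_per_second(model_bytes: float, bandwidth: float) -> float:
    """Optimistic speed bound: each new token reads all weights once from RAM."""
    return bandwidth / model_bytes


def min_generation_seconds(n_tokens: int, tokens_per_second: float) -> float:
    """Best-case time to generate n_tokens at the given speed bound."""
    return n_tokens / tokens_per_second


if __name__ == "__main__":
    size = weight_bytes(N_PARAMS, BYTES_PER_PARAM)
    speed = max_tokens_per_second(size, MEM_BANDWIDTH)
    print(f"weights: {size / 1e9:.1f} GB")
    print(f"best case: {speed:.2f} tokens/s")
    print(f"256 tokens take at least {min_generation_seconds(256, speed):.1f} s")
```

Under these assumptions the weights alone take about 16 GB, so a run that takes 30-60 minutes is well below even this optimistic bound. Passing `max_new_tokens` to the pipeline call to cap the output length, or running on a GPU, are the usual first things to try, though how much either helps depends on the setup.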