OutOfMemoryError

#7
by deepakkrish92 - opened

I am trying to use 'mosaicml/mpt-30b' in Databricks. The compute resource is one A100 GPU, 220 GB. I believe the model card says it is easy to deploy mpt-30b on a single GPU (1xA100-80GB). My dependencies are transformers==4.30.2 and torch 1.13.1+cu117. I get the memory error when I call the final inference step:
with torch.autocast('cuda', dtype=torch.bfloat16):
    sequences = pipeline(
        "What is Machine Learning?",
        max_new_tokens=100,
        do_sample=True,
        use_cache=True,
    )
print(sequences)
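
For context, a rough back-of-the-envelope check of weight memory may help narrow this down. The sketch below is not taken from the model card. It assumes a nominal 30 billion parameters for mpt-30b and an 80 GB A100 (the figure quoted above), and it ignores activations and the KV cache, so real usage would be higher. Note that torch.autocast only affects the precision of operations during the forward pass. It does not change the dtype in which the weights were loaded, so a model loaded in float32 would still occupy float32-sized memory.

```python
# Rough estimate of the memory needed just to hold the model weights.
# Assumptions (not from the original post): ~30e9 parameters for mpt-30b,
# an 80 GB GPU, and no allowance for activations or the KV cache.

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 30e9      # nominal parameter count (assumption)
GPU_MEMORY_GB = 80   # 1xA100-80GB, as quoted from the model card

for dtype in BYTES_PER_PARAM:
    needed = weight_memory_gb(N_PARAMS, dtype)
    verdict = "fits" if needed < GPU_MEMORY_GB else "does not fit"
    print(f"{dtype}: ~{needed:.0f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
```

Under these assumptions, float32 weights alone would exceed 80 GB, while bfloat16 weights would leave some headroom, so it may be worth checking which dtype the pipeline was created with.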
