How can I use this model on CPU?
I tried and got this error:
ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run pip install flash_attn
But flash_attn requires CUDA (a GPU), so I wasn't able to install it.
What command did you run to load the model?
This one:
import transformers
model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
However, I tried it on a GPU and the quality is pretty bad. It can hardly generate anything that makes sense; it only manages very basic things. Maybe 200B tokens is just too little training. I hope the 7B model will be better.
But I appreciate the effort of working on open-source models.
I don't expect the quality to be that good. It's a pretty small model, and the underlying dataset is of unknown quality. We intend this model to be another way of getting to know the RedPajama dataset, not necessarily something good enough to use in production. It's possible that (1B parameters, 200B tokens) is too little, or that the dataset is of poor quality. We leave that analysis to the community, and we hope this model is helpful in making that determination.
It sounds like you've been able to use the model, though, so I'm going to close this issue.
For anyone else ending up here, you should be able to run on CPU without installing flash/Triton. I suspect you may need a more recent transformers version, as they recently added skipping of try/except blocks when checking imports.
I was able to do the following (running on CPU):
import transformers
model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
tokenizer = transformers.AutoTokenizer.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
# generate a couple of tokens on CPU as a smoke test; this prints raw token ids
print(model.generate(**tokenizer('hello', return_tensors='pt'), max_new_tokens=2))
after running pip install transformers torch==1.13.1 einops
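If you want readable text rather than raw token ids, here's a minimal follow-up sketch (reusing the model and tokenizer loaded above; the prompt and max_new_tokens value are just illustrative):

output_ids = model.generate(**tokenizer('hello', return_tensors='pt'), max_new_tokens=20)
# decode the first (and only) generated sequence back to a string
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))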