
flash_attn on GPU

#20
by uglydumpling - opened

Can we run this model on a GPU without using flash_attn?

Mosaic ML, Inc. org

Yes, you can! Just use attn_impl: 'torch'.

You can do this by editing the config.json directly or by following the instructions in the README; both routes are sketched below.
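For the config.json route, the attention implementation is set under the attn_config block (the same field the Python snippet below modifies). A minimal sketch of the relevant portion, with surrounding fields omitted:

{
  "attn_config": {
    "attn_impl": "torch"
  }
}

Alternatively, from Python: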

import torch
import transformers

config = transformers.AutoConfig.from_pretrained(
    'mosaicml/mpt-7b',
    trust_remote_code=True,
)
config.attn_config.attn_impl = 'torch'  # it should already be 'torch', but set it explicitly for clarity

model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to(device='cuda:0')
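For completeness, a minimal generation sketch using the model configured above. The tokenizer choice follows the MPT-7B model card (MPT-7B uses the EleutherAI/gpt-neox-20b tokenizer); the prompt and generation parameters are illustrative, not from this thread:

tokenizer = transformers.AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')

inputs = tokenizer('MosaicML is', return_tensors='pt').to('cuda:0')
with torch.no_grad():
    # generates with the pure-PyTorch attention implementation; no flash_attn required
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))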
abhi-mosaic changed discussion status to closed
