Run it on non-Nvidia hardware?
#1 by David341 - opened
Hello, it uses Triton, which requires an Nvidia GPU. Is there a way to run it on an Apple Silicon Mac?
For now, a quick workaround is to use the all-native PyTorch kernels; for this, you need to override the default kernel config:
from transformers import AutoModelForCausalLM, AutoTokenizer
from mlstm_kernels.torch.backend_module import mLSTMBackend, mLSTMBackendConfig

device = "cpu"
model_name = "NX-AI/xLSTM-7b"

model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Switch the kernel settings from the Triton defaults to the native PyTorch implementations.
model.config.step_kernel = "native"
model.config.sequence_kernel = "native_sequence__native"
model.config.chunkwise_kernel = "chunkwise--native_custbw"

config = model.config

# Rebuild the mLSTM backend of every block so the new kernel settings take effect.
for block in model.backbone.blocks:
    block.mlstm_layer.mlstm_backend = mLSTMBackend(mLSTMBackendConfig(
        chunkwise_kernel=config.chunkwise_kernel,
        sequence_kernel=config.sequence_kernel,
        step_kernel=config.step_kernel,
        mode=config.mode,
        chunk_size=config.chunk_size,
        return_last_states=config.return_last_states,
        autocast_kernel_dtype=config.autocast_kernel_dtype,
        eps=config.eps,
        inference_state_dtype=config.inference_state_dtype,
    ))

tok = AutoTokenizer.from_pretrained(model_name)
print(model.generate(tok("Hello", return_tensors="pt")["input_ids"].to(device=device)))
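The call above prints raw token ids. If you want readable text, you can decode them with the tokenizer; a minimal sketch that reuses model, tok, and device from the snippet above (max_new_tokens=32 is just an example value):

# Decode the generated token ids into text (uses `model`, `tok`, and `device` from above).
input_ids = tok("Hello", return_tensors="pt")["input_ids"].to(device=device)
output_ids = model.generate(input_ids, max_new_tokens=32)
print(tok.decode(output_ids[0], skip_special_tokens=True))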
However, this has not been tested on an Apple Silicon Mac yet.
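If the native kernels do run, it might also be worth trying PyTorch's MPS backend on Apple Silicon instead of plain CPU. This is an untested sketch and assumes every op used by the native kernels is implemented for MPS; the kernel overrides from the snippet above would have to be re-applied after loading:

import torch
# Untested on Apple Silicon: use the MPS device if available, otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
# ...re-apply the same config overrides and backend replacement as above...
input_ids = tok("Hello", return_tensors="pt")["input_ids"].to(device=device)
print(model.generate(input_ids))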
Thank you, I will let you know the result of my test.