Is it possible to run the model on 2 GPUs?
#5 by thanhnew2001 - opened
I tried to use max_memory to enable 2 GPUs, but it doesn't seem to be supported:
```python
from transformers import AutoModelForCausalLM

# Cap per-device memory so device_map="auto" spreads the weights across both GPUs.
max_memory_mapping = {0: "600MB", 1: "1GB"}
model_name = "bigscience/bloom-3b"
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", load_in_4bit=True, max_memory=max_memory_mapping
)
```
CTranslate2 did not support this in the past. If you enable multiple GPUs, each GPU will hold an entire copy of the model (data parallelism, not model sharding).
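For reference, the replicated multi-GPU mode can be requested by passing a list of device indices when constructing the generator; each index gets its own full replica, which helps throughput on parallel batches but does not split a model that is too large for one card. A minimal sketch, assuming a local CTranslate2 conversion at the hypothetical path `bloom-3b-ct2`:

```python
import ctranslate2
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained("bigscience/bloom-3b")

# A list for device_index creates one full copy of the model per GPU
# (replication / data parallelism), not a single sharded copy.
generator = ctranslate2.Generator(
    "bloom-3b-ct2",  # hypothetical path to the converted model
    device="cuda",
    device_index=[0, 1],
)

prompt_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, world"))
results = generator.generate_batch([prompt_tokens], max_length=32)
print(tokenizer.decode(results[0].sequences_ids[0]))
```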
michaelfeil changed discussion status to closed