Is it possible to run the model on 2 GPUs?

#5
by thanhnew2001 - opened

I tried to pass a max_memory mapping to enable 2 GPUs, but it does not seem to be supported:

from transformers import AutoModelForCausalLM

# Cap per-GPU memory; device_map="auto" distributes layers within these limits.
max_memory_mapping = {0: "600MB", 1: "1GB"}
model_name = "bigscience/bloom-3b"
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", load_in_4bit=True, max_memory=max_memory_mapping
)

CTranslate2 has not supported this in the past. If you enable multiple GPUs, each GPU will hold an entire copy of the model.
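For reference, multi-GPU replication in CTranslate2 is requested through the device_index argument; a minimal sketch, assuming the model has already been converted to the CTranslate2 format in a directory named bloom-3b-ct2 (hypothetical path):

import ctranslate2

# Each entry in device_index loads a full replica of the model on that GPU;
# incoming batches are dispatched across the replicas (data parallelism),
# rather than sharding a single model across both devices.
generator = ctranslate2.Generator(
    "bloom-3b-ct2",  # hypothetical: directory produced by ct2-transformers-converter
    device="cuda",
    device_index=[0, 1],
)

So adding a second GPU this way increases throughput, not the maximum model size that fits in memory.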

michaelfeil changed discussion status to closed
