How much GPU memory does the MoE model need?

#8
by Jazzlee - opened

Is int4 or int8 possible? How do I do it?

Owner

This model is about 61B parameters; I'd guess it needs roughly 64 GB for int8 and 32 GB for int4.
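As a rough sanity check on those numbers, the weight memory alone is parameter count times bytes per parameter (int8 = 1 byte, int4 = 0.5 bytes); real usage is higher because of activations, the KV cache, and runtime overhead:

```python
# Back-of-the-envelope weight-memory estimate for a ~61B-parameter model.
# Weights only; actual GPU usage is higher (activations, KV cache, overhead).
params = 61e9

for name, bytes_per_param in [("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.0f} GiB of weights")
```

This gives roughly 114 GiB for bf16, 57 GiB for int8, and 28 GiB for int4, consistent with the 64 GB / 32 GB estimates above once overhead is included.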

Owner

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map='auto', local_files_only=False, load_in_4bit=True
)
