GPU Memory Constraints for 01-ai/Yi-9B-200K Model
#3 by microcn - opened
What are the GPU memory requirements for loading the 01-ai/Yi-9B-200K model? I am currently facing an issue where loading the model with two RTX 4090 GPUs fails when using the following code:
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, use_fast=False)
If you lower max_position_embeddings in config.json, the model should load. The required VRAM also depends on whether flash attention is installed.
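To see why lowering the context window helps, here is a rough KV-cache size estimate as a function of context length. The layer/head numbers are assumptions taken from what Yi-9B's config.json typically reports (48 layers, 4 KV heads via GQA, head dim 128, fp16/bf16); check your local config.json for the actual values.

```python
def kv_cache_gib(context_len, layers=48, kv_heads=4, head_dim=128, dtype_bytes=2):
    # Per token, the cache stores a key and a value vector (hence the 2x)
    # for each layer and each KV head, in fp16/bf16 (2 bytes per element).
    return 2 * layers * kv_heads * head_dim * dtype_bytes * context_len / 1024**3

full = kv_cache_gib(200_000)  # full 200K window: roughly 18 GiB per sequence
small = kv_cache_gib(8_192)   # a lowered max_position_embeddings: well under 1 GiB
```

Under these assumed dimensions, a single 200K-token sequence needs on the order of 18 GiB for the KV cache alone, on top of the ~18 GB of fp16 weights, which is why a 24 GB RTX 4090 runs out of memory at the full window.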
Pass a device_map (e.g. device_map="auto", or an explicit mapping you build yourself) to from_pretrained when loading the model, so the weights are sharded across both GPUs instead of loaded onto one.
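If you want an explicit device_map rather than "auto", here is a minimal sketch of one that splits the layers across two GPUs. It assumes 48 transformer layers and Llama-style module names ("model.embed_tokens", "model.layers.N", "model.norm", "lm_head"), which is what Yi-9B's architecture uses; verify against your config.json.

```python
NUM_LAYERS = 48  # assumed from Yi-9B's config.json

def make_device_map(num_layers):
    # Embeddings on GPU 0; final norm and LM head on GPU 1.
    device_map = {"model.embed_tokens": 0, "model.norm": 1, "lm_head": 1}
    for i in range(num_layers):
        # First half of the layers on GPU 0, second half on GPU 1.
        device_map[f"model.layers.{i}"] = 0 if i < num_layers // 2 else 1
    return device_map

device_map = make_device_map(NUM_LAYERS)
# Then load with:
# model = AutoModelForCausalLM.from_pretrained(
#     MODEL_DIR, torch_dtype="auto", device_map=device_map
# )
```

device_map="auto" lets accelerate compute a similar split for you based on free VRAM, which is usually the easier option.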