Can't load model & run inference with LLaVA's latest inference code

#3
by kldsaid - opened

Great work! However, it seems your model is incompatible with LLaVA's inference code, which contradicts what the README says.
A ValueError is raised when loading your model with LLaVA's code. The full traceback is below:

python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:8700 --port 8702 --worker http://localhost:8702 --model-path ./Yi-VL-34B
2024-01-22 16:47:40 | INFO | model_worker | Loading the model Yi-VL-34B on worker 4333f5 ...
2024-01-22 16:47:40 | ERROR | stderr | Traceback (most recent call last):
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/paddle_python/torch201/lib/python3.8/runpy.py", line 194, in _run_module_as_main
2024-01-22 16:47:40 | ERROR | stderr |     return _run_code(code, main_globals, None,
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/paddle_python/torch201/lib/python3.8/runpy.py", line 87, in _run_code
2024-01-22 16:47:40 | ERROR | stderr |     exec(code, run_globals)
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/serve/model_worker.py", line 275, in <module>
2024-01-22 16:47:40 | ERROR | stderr |     worker = ModelWorker(args.controller_address,
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/serve/model_worker.py", line 65, in __init__
2024-01-22 16:47:40 | ERROR | stderr |     self.tokenizer, self.model, self.image_processor, self.context_len = load_pretrained_model(
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/model/builder.py", line 127, in load_pretrained_model
2024-01-22 16:47:40 | ERROR | stderr |     model = AutoModelForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/paddle_python/torch201/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
2024-01-22 16:47:40 | ERROR | stderr |     return model_class.from_pretrained(
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/paddle_python/torch201/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3462, in from_pretrained
2024-01-22 16:47:40 | ERROR | stderr |     model = cls(config, *model_args, **model_kwargs)
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/model/language_model/llava_llama.py", line 45, in __init__
2024-01-22 16:47:40 | ERROR | stderr |     self.model = LlavaLlamaModel(config)
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/model/language_model/llava_llama.py", line 37, in __init__
2024-01-22 16:47:40 | ERROR | stderr |     super(LlavaLlamaModel, self).__init__(config)
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/model/llava_arch.py", line 34, in __init__
2024-01-22 16:47:40 | ERROR | stderr |     self.mm_projector = build_vision_projector(config)
2024-01-22 16:47:40 | ERROR | stderr |   File "/root/paddlejob/workspace/log/code/LLaVA/llava/model/multimodal_projector/builder.py", line 51, in build_vision_projector
2024-01-22 16:47:40 | ERROR | stderr |     raise ValueError(f'Unknown projector type: {projector_type}')
2024-01-22 16:47:40 | ERROR | stderr | ValueError: Unknown projector type: mlp2x_gelu_Norm

I looked into builder.py in LLaVA's repo and found that your projector type mlp2x_gelu_Norm is not supported: https://github.com/haotian-liu/LLaVA/blob/9a26bd1435b4ac42c282757f2c16d34226575e96/llava/model/multimodal_projector/builder.py#L39

# Original code in the LLaVA repo: the regex requires the type name to
# end in "_gelu", so mlp2x_gelu_Norm never matches and the builder
# falls through to the ValueError above.
mlp_gelu_match = re.match(r'^mlp(\d+)x_gelu$', projector_type)
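
You can reproduce the mismatch in isolation; a minimal sketch with the regex copied from the linked builder.py:

import re

# LLaVA's pattern is anchored with "_gelu$", so the trailing "_Norm"
# prevents a match and the builder raises the ValueError instead.
pattern = r'^mlp(\d+)x_gelu$'
print(re.match(pattern, 'mlp2x_gelu'))       # <re.Match object ...> -- supported type
print(re.match(pattern, 'mlp2x_gelu_Norm'))  # None -- Yi-VL's type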

Besides, I'm confused that there is no "mm_projector.bin" in your model files; as far as I can tell, LLaVA's model-loading logic doesn't support that.
Looking forward to your reply!

Same here

kldsaid changed discussion title from "Can't run inference with LLaVA's latest inference code" to "Can't load model & run inference with LLaVA's latest inference code"

mm_projector.bin is generated only when you train with the tune_mm_projector parameter, so its absence is not a problem.
The projector also seems twice as large as the original LLaVA's.
[screenshot: ζˆͺ屏2024-01-22 17.59.13.png]
I wonder why the Yi MLP parameters in pytorch_model.bin.index.json are different from LLaVA's.
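
For reference, a quick way to see which projector weights the sharded checkpoint actually contains (path taken from the command above):

import json

# "weight_map" maps every parameter name to the shard file storing it,
# so the mm_projector weights can be located without loading the model.
with open('./Yi-VL-34B/pytorch_model.bin.index.json') as f:
    index = json.load(f)

for name, shard in index['weight_map'].items():
    if 'mm_projector' in name:
        print(name, '->', shard)

If the projector weights show up here, they live inside the main shards, which would explain the missing mm_projector.bin.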

# Yi-VL's projector: a 2-layer MLP with GELU and a LayerNorm
# after each linear layer (hence the "_Norm" suffix).
if projector_type == 'mlp2x_gelu_Norm':
    return nn.Sequential(
        nn.Linear(config.mm_hidden_size, config.hidden_size),  # vision dim -> LLM dim
        nn.GELU(),
        nn.LayerNorm(config.hidden_size),
        nn.Linear(config.hidden_size, config.hidden_size),
        nn.LayerNorm(config.hidden_size)
    )

Add this branch to build_vision_projector in llava/model/multimodal_projector/builder.py and the problem is solved.
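
If it helps, a standalone sanity check after applying the patch (assumes the LLaVA repo is importable; the two dimensions below are made-up placeholders, read the real values from the model's config.json):

import torch
from types import SimpleNamespace
from llava.model.multimodal_projector.builder import build_vision_projector

# Minimal stand-in config; mm_hidden_size/hidden_size are example values only.
cfg = SimpleNamespace(mm_projector_type='mlp2x_gelu_Norm',
                      mm_hidden_size=1280,
                      hidden_size=7168)

proj = build_vision_projector(cfg)
feats = torch.randn(2, 576, cfg.mm_hidden_size)  # fake vision-tower output
print(proj(feats).shape)                         # torch.Size([2, 576, 7168])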

kldsaid changed discussion status to closed
