# Qwen2-VL-2B-Instruct 4-bit Quantized

This is a 4-bit quantized version of the Qwen2-VL-2B-Instruct model.

## Model Description

- **Original Model**: Qwen/Qwen2-VL-2B-Instruct
- **Quantization**: 4-bit quantization using bitsandbytes
- **Usage**: Optimized for memory efficiency; 4-bit weights cut the weight memory footprint roughly 4x relative to float16 while preserving most of the original model's quality
- **License**: Same as the original model

## Usage

```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor

# The 4-bit quantization config saved with the checkpoint is applied
# automatically; requires transformers with native Qwen2-VL support
# (>= 4.45) and bitsandbytes installed.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "ksukrit/qwen2-vl-2b-4bit", device_map="auto"
)
# AutoProcessor bundles the tokenizer with the image preprocessor
# needed for vision-language inputs
processor = AutoProcessor.from_pretrained("ksukrit/qwen2-vl-2b-4bit")
```

A complete inference example is shown at the end of this card.

## Quantization Details

- Quantization method: bitsandbytes 4-bit quantization
- Compute dtype: float16
- Double quantization: enabled
- Quantization type: NF4
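For reference, the sketch below shows a `BitsAndBytesConfig` matching the settings listed above, applied to the original checkpoint. This illustrates how a quantized model like this one is typically produced; it is not necessarily the exact script used for this card.

```python
import torch
from transformers import BitsAndBytesConfig, Qwen2VLForConditionalGeneration

# Config matching the settings above: NF4 quantization with double
# quantization enabled and float16 compute dtype
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# Quantize the original full-precision checkpoint on the fly
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```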
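## Example Inference

A minimal end-to-end sketch of image-plus-text inference with the quantized model. The image URL is a placeholder; substitute your own.

```python
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "ksukrit/qwen2-vl-2b-4bit", device_map="auto"
)
processor = AutoProcessor.from_pretrained("ksukrit/qwen2-vl-2b-4bit")

# Placeholder URL; replace with a real image
image = Image.open(requests.get("https://example.com/demo.jpg", stream=True).raw)

# Build a chat-format prompt with one image slot and one text turn
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens before decoding so only the answer is printed
answer = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```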