OPEA
/

Safetensors
cicdatopea committed on
Commit
d00d7ea
1 Parent(s): 0b6ed81

Update README.md

Files changed (1)
  1. README.md +127 -3
README.md CHANGED
@@ -1,3 +1,127 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+ ## Model Details
5
+
6
+ This model is an INT4 quantization of [HuggingFaceTB/SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct) with group_size 128 and symmetric quantization, generated by [intel/auto-round](https://github.com/intel/auto-round). To use the AutoGPTQ format, load the model with `revision="e289950"`.
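As a rough illustration of what "INT4, group_size 128, symmetric" means, the sketch below quantizes a flat list of weights in groups of 128, each group sharing one scale and using no zero-point. It uses plain round-to-nearest, which is an assumption made for brevity; it is not the rounding optimization that auto-round performs, and the function names and the choice of mapping the largest magnitude to 7 are hypothetical.

```python
from typing import List, Tuple

QMIN, QMAX = -8, 7  # signed 4-bit integer range


def quantize_group(group: List[float]) -> Tuple[List[int], float]:
    """Quantize one group with a single shared scale and no zero-point."""
    max_abs = max(abs(w) for w in group)
    # Assumption: map the largest magnitude to 7 so the grid is symmetric around 0.
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    # Plain round-to-nearest (not auto-round's learned rounding), clamped to int4.
    codes = [min(QMAX, max(QMIN, round(w / scale))) for w in group]
    return codes, scale


def quantize(weights: List[float], group_size: int = 128) -> List[Tuple[List[int], float]]:
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]


def dequantize(groups: List[Tuple[List[int], float]]) -> List[float]:
    """Reconstruct approximate weights from integer codes and per-group scales."""
    return [code * scale for codes, scale in groups for code in codes]


# Toy data: 256 synthetic weights, so two groups of 128.
weights = [((i * 37) % 101 - 50) / 100 for i in range(256)]
groups = quantize(weights)
restored = dequantize(groups)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), "groups, max abs error:", max_err)
```

With round-to-nearest, the reconstruction error of each weight is bounded by half of its group's scale, which is why a smaller group size generally trades extra scale storage for lower error.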
7
+ ## How To Use
8
+ ### INT4 Inference
9
+ ```python
10
+ from auto_round import AutoRoundConfig  # must be imported to load the auto-round format
11
+ import torch
12
+ from PIL import Image
13
+ from transformers import AutoProcessor, AutoModelForVision2Seq
14
+ from transformers.image_utils import load_image
15
+
16
+ DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
17
+ quantized_model_path = "/data1/hengguo/SmolVLM-Instruct-w4g128-auto_round/"  # local path to the quantized checkpoint; replace with your own
18
+
19
+ # Load images
20
+ image_url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
21
+ content = "Describe this image."
22
+
23
+ # Initialize processor and model
24
+ processor = AutoProcessor.from_pretrained(quantized_model_path)
25
+ model = AutoModelForVision2Seq.from_pretrained(
26
+     quantized_model_path,
27
+     torch_dtype="auto",
28
+     device_map=DEVICE,
29
+     _attn_implementation="flash_attention_2" if DEVICE == "cuda" else "eager",
30
+     # revision="e289950",  # uncomment to load the AutoGPTQ format
31
+ )
32
+
33
+ # Create input messages
34
+ messages = [
35
+     {
36
+         "role": "user",
37
+         "content": [
38
+             {"type": "image"},
39
+             {"type": "text", "text": content}
40
+         ]
41
+     },
42
+ ]
43
+
44
+ # Prepare inputs
45
+ prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
46
+ inputs = processor(text=prompt, images=[load_image(image_url)], return_tensors="pt")
47
+ inputs = inputs.to(DEVICE)
48
+
49
+ # Generate outputs
50
+ generated_ids = model.generate(**inputs, max_new_tokens=500)
51
+ generated_texts = processor.batch_decode(
52
+     generated_ids,
53
+     skip_special_tokens=True,
54
+ )
55
+
56
+ print(generated_texts[0])
57
+ ##INT4:
58
+ ## User:<image>Describe this image.
59
+ ## Assistant: A woman is sitting on the beach with a dog. The woman is wearing a plaid shirt and has her hair down. She is smiling and holding the dog's paw. The dog is a golden retriever and is wearing a collar. The dog is sitting on the sand. The sun is setting in the background.
60
+
61
+ ##BF16:
62
+ ## User:<image>Describe this image.
63
+ ## Assistant: The image depicts a sandy beach scene with a young woman and a dog sitting side by side on the sand. The woman is on the right side of the image, wearing a plaid shirt and dark pants. She has long, dark hair and is smiling. She is holding the dog's paw in her right hand. The dog is a golden retriever, and it is wearing a blue collar with a tag. The dog is sitting on its hind legs, facing the woman. The dog's fur is light brown and it has a black nose. The dog's tail is wagging, indicating a happy and friendly demeanor.
64
+ ## The background of the image shows the ocean, with waves gently crashing against the shore. The sky is clear, with a gradient of light blue at the top and a darker blue at the bottom, indicating either sunrise or sunset. The sand on the beach is light brown and appears to be wet, with some footprints visible.
65
+ ## The overall mood of the image is peaceful and happy, as the woman and the dog appear to be enjoying each other's company. The setting is a typical beach scene, with the natural elements of the ocean and the sand providing a serene and calming atmosphere.
66
+
67
+ image_url = "http://images.cocodataset.org/train2017/000000411975.jpg"
68
+ content = "How many people are there on the baseball field in the image?"
69
+ ##INT4:
70
+ ## User:<image>How many people are there on the baseball field in the image?
71
+ ## Assistant: There are four people on the baseball field in the image.
72
+
73
+
74
+ ##BF16:
75
+ ## User:<image>How many people are there on the baseball field in the image?
76
+ ## Assistant: There are four people on the baseball field in the image.
77
+
78
+ image_url = "https://intelcorp.scene7.com/is/image/intelcorp/processor-overview-framed-badge:1920-1080?wid=480&hei=270"
79
+ content = "This image represents which company?"
80
+ ##INT4:
81
+ ## User:<image>This image represents which company?
82
+ ## Assistant: Intel.
83
+
84
+ ##BF16:
85
+ ## User:<image>This image represents which company?
86
+ ## Assistant: Intel.
87
+ ```
88
+
89
+ ### Generate the model
90
+ Here is the sample command to reproduce the model.
91
+ ```bash
92
+ pip install auto-round
93
+ auto-round-mllm \
94
+ --model HuggingFaceTB/SmolVLM-Instruct \
95
+ --device 0 \
96
+ --group_size 128 \
97
+ --bits 4 \
98
+ --iters 1000 \
99
+ --nsample 512 \
100
+ --seqlen 2048 \
101
+ --format 'auto_gptq,auto_round' \
102
+ --output_dir "./tmp_autoround"
103
+ ```
104
+
105
+ ## Ethical Considerations and Limitations
106
+
107
+ The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
108
+
109
+ Therefore, before deploying any applications of the model, developers should perform safety testing.
110
+
111
+ ## Caveats and Recommendations
112
+
113
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
114
+
115
+ Here is a useful link to learn more about Intel's AI software:
116
+
117
+ - Intel Neural Compressor [link](https://github.com/intel/neural-compressor)
118
+
119
+ ## Disclaimer
120
+
121
+ The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
122
+
123
+ ## Cite
124
+
125
+ @article{cheng2023optimize, title={Optimize weight rounding via signed gradient descent for the quantization of llms}, author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi}, journal={arXiv preprint arXiv:2309.05516}, year={2023} }
126
+
127
+ [arxiv](https://arxiv.org/abs/2309.05516) [github](https://github.com/intel/auto-round)