Model Details
This is an INT4 model (group_size 128, symmetric quantization) of deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, generated by the intel/auto-round algorithm.
Please follow the license of the original model.
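For intuition, below is a minimal sketch of what symmetric, group-wise INT4 weight quantization stores: one floating-point scale per group of 128 weights, with integer values in [-8, 7]. This is plain round-to-nearest for illustration only; auto-round additionally tunes the rounding via signed gradient descent (see the citation below), and the function name and tensor shapes here are hypothetical.

import torch

def quantize_sym_int4(weight: torch.Tensor, group_size: int = 128):
    # Illustrative only: symmetric round-to-nearest, one scale per group.
    out_features, in_features = weight.shape
    w = weight.reshape(out_features, in_features // group_size, group_size)
    # Symmetric scheme: the scale maps the max-magnitude weight in each group to 7.
    scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(w / scale), -8, 7)
    dequant = (q * scale).reshape(out_features, in_features)
    return q.to(torch.int8), scale, dequant

w = torch.randn(16, 256)
q, scale, w_hat = quantize_sym_int4(w)
print((w - w_hat).abs().max())  # per-group round-to-nearest error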
How To Use
INT4 Inference on CUDA
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

quantized_model_dir = "OPEA/DeepSeek-R1-Distill-Qwen-32B-int4-gptq-sym-inc"

model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)

prompts = [
    "9.11和9.8哪个数字大",  # "Which number is larger, 9.11 or 9.8?"
    "如果你是人类,你最想做什么",  # "If you were human, what would you most want to do?"
    "How many e in word deepseek",
    "There are ten birds in a tree. A hunter shoots one. How many are left in the tree?",
]

# Wrap each prompt in the chat template the model was trained with.
texts = []
for prompt in prompts:
    messages = [{"role": "user", "content": prompt}]
    texts.append(
        tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    )

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

outputs = model.generate(
    input_ids=inputs["input_ids"].to(model.device),
    attention_mask=inputs["attention_mask"].to(model.device),
    max_length=512,  # change this to align with the official usage
    num_return_sequences=1,
    do_sample=False,  # change this to align with the official usage
)

# Strip the (padded) prompt tokens so only the newly generated text is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs["input_ids"], outputs)
]
decoded_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)

for i, prompt in enumerate(prompts):
    print(f"Prompt: {prompt}")
    print(f"Generated: {decoded_outputs[i]}")
    print("-" * 50)
"""
Prompt: 9.11和9.8哪个数字大
Generated: .11和.8哪个数字大
</think>
.11和.8哪个数字大
</think>
要比较 **9.11** 和 **9.8** 哪个更大,可以按照以下步骤进行:
1. **比较整数部分**:
- 两个数字的整数部分都是 **9**,所以整数部分相等。
2. **比较小数部分**:
- **9.11** 的小数部分是 **0.11**
- **9.8** 的小数部分是 **0.8**(即 **0.80**)
由于 **0.80 > 0.11**,所以 **9.8** 的小数部分更大。
3. **结论**:
- 因此,**9.8** 比 **9.11** 大。
最终答案:\boxed{9.8}
--------------------------------------------------
Prompt: 如果你是人类,你最想做什么
Generated: 您好!我是由中国的深度求索(DeepSeek)公司开发的智能助手DeepSeek-R1。有关模型和产品的详细内容请参考官方文档。
</think>
</think>
您好!我是由中国的深度求索
</think>
您好!我是由中国的深度求索(DeepSeek)公司开发的智能助手DeepSeek-R1。有关模型和产品的详细内容请参考官方文档。
--------------------------------------------------
Prompt: How many e in word deepseek
Generated: To determine how many times the letter 'e' appears in the word "deepseek," I will examine each letter one by one.
First, I'll list out the letters in the word: D, E, E, P, S, E, E, K.
Next, I'll go through each letter and count every occurrence of the letter 'e'.
Starting with the first letter, D, it's not an 'e'. The second letter is E, which counts as one. The third letter is another E, making it two. The fourth letter is P, not an 'e'. The fifth letter is S, also not an 'e'. The sixth letter is E, bringing the count to three. The seventh letter is another E, making it four. The last letter is K, which isn't an 'e'.
After reviewing all the letters, I find that the letter 'e' appears four times in the word "deepseek."
</think>
To determine how many times the letter **e** appears in the word **deepseek**, follow these steps:
1. **Write down the word:**
**d e e p s e e k**
2. **Identify and count each 'e':**
- **e** (position 2)
- **e** (position 3)
- **e** (position 6)
- **e** (position 7)
3. **Total count of 'e':**
There are **4** occurrences of the letter **e** in the word **deepseek**.
\[
\boxed{4}
\]
--------------------------------------------------
Prompt: There are ten birds in a tree. A hunter shoots one. How many are left in the tree?
Generated:
</think>
If a hunter shoots one bird from a tree that initially has ten birds, the number of birds remaining in the tree would depend on the reaction of the other birds.

1. **Immediate Reaction**: When a hunter shoots one bird, the loud noise and disturbance might scare the remaining birds, causing them to fly away. In this case, all the other nine birds would likely leave the tree.

2. **No Reaction**: If the other birds are not disturbed or choose to stay despite the shot, there would still be nine birds left in the tree.

However, in most scenarios, the loud noise of a gunshot would scare the birds, leading to all of them flying away.
"""
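The two "change this" comments in the script refer to the generation settings recommended in the upstream DeepSeek-R1 model card, which suggests sampling (temperature around 0.6, top_p 0.95) and a generation budget large enough for the long reasoning traces. A hedged variant of the generate call along those lines (verify the exact values against the upstream card):

outputs = model.generate(
    input_ids=inputs["input_ids"].to(model.device),
    attention_mask=inputs["attention_mask"].to(model.device),
    max_new_tokens=4096,  # reasoning traces are long; 512 tokens truncates them
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)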
Evaluate the model
pip3 install lm-eval==0.4.7
lm-eval --model hf --model_args pretrained=OPEA/DeepSeek-R1-Distill-Qwen-32B-int4-gptq-sym-inc --tasks leaderboard_mmlu_pro,leaderboard_ifeval,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,mmlu,gsm8k --batch_size 16
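If you prefer to drive the harness from Python instead of the CLI, lm-eval exposes a simple_evaluate entry point; a minimal sketch (the two-task subset here is just for illustration):

import lm_eval

# Programmatic equivalent of the CLI call above, restricted to two tasks for brevity.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=OPEA/DeepSeek-R1-Distill-Qwen-32B-int4-gptq-sym-inc",
    tasks=["mmlu", "gsm8k"],
    batch_size=16,
)
print(results["results"])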
Metric | BF16 | INT4 |
---|---|---|
avg | 0.6647 | 0.6639 |
leaderboard_mmlu_pro | - | - |
mmlu | 0.7964 | 0.7928 |
lambada_openai | 0.6649 | 0.6718 |
hellaswag | 0.6292 | 0.6223 |
winogrande | 0.7482 | 0.7482 |
piqa | 0.8058 | 0.7982 |
truthfulqa_mc1 | 0.3831 | 0.3905 |
openbookqa | 0.3520 | 0.3520 |
boolq | 0.8963 | 0.8972 |
arc_easy | 0.8207 | 0.8194 |
arc_challenge | 0.5503 | 0.5469 |
leaderboard_ifeval | - | - |
gsm8k | - | - |
Generate the model
Here is a sample command to generate the quantized model.
auto-round \
--model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
--device 0 \
--bits 4 \
--iter 200 \
--disable_eval \
--format 'auto_gptq,auto_round,auto_awq' \
--output_dir "./tmp_autoround"
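The same run can also be expressed through auto-round's Python API; a sketch assuming a recent auto-round release (argument names may differ slightly across versions, and the Python API saves one export format per call rather than the comma-separated list above):

from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# bits=4, group_size=128, sym=True match the settings described above.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True, iters=200)
autoround.quantize()
autoround.save_quantized("./tmp_autoround", format="auto_gptq")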
Ethical Considerations and Limitations
The model can produce factually incorrect output and should not be relied on for factual accuracy. Because of the limitations of the pretrained model and the finetuning datasets, this model may generate lewd, biased, or otherwise offensive outputs.
Therefore, before deploying any applications of the model, developers should perform safety testing.
Caveats and Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
Here is a useful link to learn more about Intel's AI software:
- Intel Neural Compressor: https://github.com/intel/neural-compressor
Disclaimer
The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
Cite
@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}