Qwen2.5-Coder-0.5B-QwQ-draft
A draft model for speculative decoding, trained for Qwen/QwQ-32B-Preview.
- vocabulary size of 152064, identical to QwQ-32B-Preview, so it works in vLLM directly without any workaround (see the usage sketch after this list)
- trained from Qwen/Qwen2.5-Coder-0.5B-Instruct
- trained on PowerInfer/QWQ-LONGCOT-500K for 2 epochs
- draft acceptance rate above 0.8
- up to 2.5× token generation speed on math problems (85 tok/s with the draft model vs. 33 tok/s without)
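Since the draft and target models share the same vocabulary, the draft can be plugged into vLLM's speculative decoding directly. Below is a minimal sketch; it assumes a vLLM release that accepts the `speculative_model` and `num_speculative_tokens` arguments (newer releases moved these options into a `speculative_config` dict), and the prompt, `tensor_parallel_size`, and token count are illustrative placeholders.

```python
# Minimal sketch: speculative decoding in vLLM with this draft model.
# Assumes a vLLM version exposing `speculative_model` / `num_speculative_tokens`
# as LLM() keyword arguments; adjust to `speculative_config` on newer releases.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/QwQ-32B-Preview",
    speculative_model="tugstugi/Qwen2.5-Coder-0.5B-QwQ-draft",
    num_speculative_tokens=5,   # draft tokens proposed per step (tunable)
    tensor_parallel_size=2,     # adjust to your GPU setup
)

params = SamplingParams(temperature=0.0, max_tokens=512)
outputs = llm.generate(
    ["Solve step by step: what is the integral of x^2 from 0 to 3?"],
    params,
)
print(outputs[0].outputs[0].text)
```

With an acceptance rate above 0.8, most of the draft's proposed tokens are verified in a single target-model forward pass, which is where the reported speedup on math problems comes from.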
Model tree for tugstugi/Qwen2.5-Coder-0.5B-QwQ-draft
- Base model: Qwen/Qwen2.5-0.5B
- Finetuned: Qwen/Qwen2.5-Coder-0.5B
- Finetuned: Qwen/Qwen2.5-Coder-0.5B-Instruct