Qwen3-4B Coder Task-Selected Checkpoint
This repository hosts a Qwen3-4B checkpoint selected as the code expert in our model merging experiments.
Although the checkpoint was selected from a broader pool of Qwen3-4B fine-tuned models, it shows strong code-generation ability under our evaluation protocol. With reasoning mode enabled, it achieves 89.02 pass@1 on HumanEval, which is higher than the scores of several code-specialized checkpoints in our candidate pool evaluated under the same protocol.
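For readers unfamiliar with the metric, the sketch below shows how a pass@1 percentage can be computed from per-problem pass/fail outcomes, using the standard unbiased pass@k estimator. It is an illustration, not our evaluation harness: the function names and the outcome vector are hypothetical, and it assumes one sample per problem. HumanEval has 164 problems, so a score of 89.02 is consistent with 146 problems solved.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[bool]) -> float:
    """Mean pass@1 over all problems (one sample each), as a percentage."""
    return 100.0 * sum(pass_at_k(1, int(ok), 1) for ok in results) / len(results)


# Hypothetical outcome vector (an assumption for illustration):
# 146 passing problems out of the 164 in HumanEval.
results = [True] * 146 + [False] * 18
score = benchmark_pass_at_1(results)
print(f"{score:.2f}")  # prints 89.02
```

With a single sample per problem, pass@1 reduces to the fraction of problems whose submitted code passes all unit tests.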
We use this checkpoint as a task-selected code expert for model merging experiments, rather than assigning experts solely according to their original repository labels.
Evaluation
| Benchmark | Setting | Score |
|---|---|---|
| HumanEval | reasoning mode, final submitted code evaluated | 89.02 pass@1 |
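The setting "final submitted code evaluated" means that only the code in the model's final answer is scored, not drafts written inside the reasoning trace. A minimal sketch of one way to do this is shown below. It assumes the reasoning is wrapped in Qwen3-style `<think>...</think>` tags and that the answer is given in a Markdown code fence; the helper name `extract_final_code` and the sample response are hypothetical, and the actual extraction logic in our protocol may differ.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker, built at runtime to keep this example fence-safe
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
CODE_RE = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_final_code(response: str) -> str:
    """Return the last fenced code block that appears outside the reasoning trace."""
    answer = THINK_RE.sub("", response)  # drop the reasoning, including any draft code
    blocks = CODE_RE.findall(answer)
    # Fall back to the whole answer if the model did not fence its code.
    return (blocks[-1] if blocks else answer).strip()


# Hypothetical model response: a wrong draft inside the reasoning, then the final answer.
sample = (
    "<think>Try this first.\n"
    + FENCE + "python\ndef add(a, b): return a - b\n" + FENCE
    + "\nThat is wrong.</think>\n"
    + "Here is the solution:\n"
    + FENCE + "python\ndef add(a, b):\n    return a + b\n" + FENCE + "\n"
)
code = extract_final_code(sample)
print(code)
```

Only the extracted string would then be passed to the HumanEval unit tests, so an incorrect draft in the reasoning does not affect the score.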
Base Model
- Base model: Qwen/Qwen3-4B
- License: apache-2.0
- Pipeline tag: text-generation