Qwen2.5-Coder-7B KernelBook SFT (equal tokens)

Supervised Fine-Tuning (SFT) checkpoint of Qwen/Qwen2.5-Coder-7B-Instruct, post-trained on the KernelBook Triton kernel dataset.

This repo holds the equal-exposure SFT checkpoint (checkpoint-350, ~1.06 epochs), selected to match the SDFT run's one epoch of training for a fair comparison.

Method

This model was trained with SFT using TRL's SFTTrainer: standard next-token prediction on chat-formatted prompt → Triton completion pairs, with completion-only loss (prompt tokens masked). Training used DeepSpeed ZeRO-3 and bf16 on Modal.
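To make "completion-only loss" concrete, here is a minimal pure-Python sketch of the masking idea. It is not the training code for this checkpoint. The helper names `build_labels` and `completion_nll` are hypothetical, the token IDs and log-probabilities are toy values, and the next-token shift that a real trainer applies is assumed to have been done already. The value -100 is the label that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_labels(prompt_ids, completion_ids):
    """Concatenate prompt and completion; mask prompt positions out of the loss."""
    input_ids = list(prompt_ids) + list(completion_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels


def completion_nll(token_logprobs, labels):
    """Mean negative log-likelihood over unmasked (completion) positions only.

    token_logprobs[i] is assumed to be the log-probability the model assigned
    to the target token at position i (shift already applied).
    """
    kept = [-lp for lp, lab in zip(token_logprobs, labels) if lab != IGNORE_INDEX]
    if not kept:
        raise ValueError("no completion tokens to score")
    return sum(kept) / len(kept)


# Toy example: 3 prompt tokens, 2 completion tokens.
input_ids, labels = build_labels([1, 2, 3], [4, 5])
loss = completion_nll([-9.0, -9.0, -9.0, -1.0, -3.0], labels)
```

In this toy case only the last two positions contribute, so the large penalties on the prompt positions do not affect the loss.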

Dataset

  • KernelBook — PyTorch module prompts paired with reference Triton kernels
  • Deduplicated, filtered to completions ≤4096 tokens, repo-stratified 80/10/10 split
  • Stopped at checkpoint-350 (~1.06 epochs) for parity with the SDFT run
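The preprocessing steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline used for this checkpoint: the field names `prompt`, `completion`, and `repo` are hypothetical, `count_tokens` is a whitespace stand-in for the real tokenizer, and "repo-stratified" is interpreted here as assigning whole repositories to a split by hashing the repo name, which yields roughly 80/10/10 over repos rather than an exact split over examples.

```python
import hashlib

MAX_COMPLETION_TOKENS = 4096  # limit stated in the card


def count_tokens(text):
    # Hypothetical stand-in: whitespace tokens instead of the model tokenizer.
    return len(text.split())


def assign_split(repo):
    # Hash the repo name into [0, 100) so all examples from one repo share a split.
    bucket = int(hashlib.sha256(repo.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < 80:
        return "train"
    if bucket < 90:
        return "val"
    return "test"


def prepare(examples):
    """Deduplicate, drop over-long completions, and split by repository."""
    seen = set()
    splits = {"train": [], "val": [], "test": []}
    for ex in examples:
        key = (ex["prompt"], ex["completion"])
        if key in seen:  # exact-duplicate removal
            continue
        seen.add(key)
        if count_tokens(ex["completion"]) > MAX_COMPLETION_TOKENS:
            continue
        splits[assign_split(ex["repo"])].append(ex)
    return splits


examples = [
    {"repo": "a/x", "prompt": "p1", "completion": "k1"},
    {"repo": "a/x", "prompt": "p1", "completion": "k1"},  # duplicate
    {"repo": "a/x", "prompt": "p2", "completion": "k2"},
    {"repo": "b/y", "prompt": "p3", "completion": "tok " * 5000},  # too long
    {"repo": "c/z", "prompt": "p4", "completion": "k4"},
]
splits = prepare(examples)
```

Hashing keeps the assignment deterministic across runs and prevents kernels from the same repository from leaking between train and evaluation splits.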

Intended use

Generate Triton GPU kernels from PyTorch-style module descriptions. Best for KernelBook-style conversion prompts; not evaluated as a general-purpose chat or reasoning model.

Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aadityabuilds/qwen2-5-coder-7b-kernelbook-sft-equal-tokens"

# Load the tokenizer and weights; "auto" picks the stored dtype and places layers on available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

# Single-turn chat prompt; replace the content with your own PyTorch module.
messages = [
    {
        "role": "user",
        "content": "Convert the following PyTorch code to an equivalent Triton kernel...",
    }
]

# Render the chat template to text and append the assistant turn marker.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding (do_sample=False) for reproducible output.
outputs = model.generate(**inputs, max_new_tokens=1200, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1] :], skip_special_tokens=True))

Training summary

| Setting | Value |
| --- | --- |
| Base model | Qwen2.5-Coder-7B-Instruct |
| Method | SFT (TRL SFTTrainer, completion-only NLL) |
| Checkpoint | checkpoint-350 (~1.06 epochs) |
| Hardware | 4× H100 (Modal) |
| Parallelism | DeepSpeed ZeRO-3, bf16 |

Limitations

This model is specialized for KernelBook-style Triton code generation. It may show reduced performance on general coding, math, and knowledge benchmarks compared to the base instruct model.
