---
base_model: Qwen/Qwen2.5-Coder-1.5B
tags:
  - lora
  - sft
  - code
  - python
  - instruction-tuning
license: apache-2.0
---

# Track B SFT – Qwen2.5-Coder-1.5B + LoRA

Fine-tuned on ~250 synthetic coding instruction pairs generated from the verl corpus.
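
To inspect the training data yourself, the dataset referenced below can be loaded with 🤗 Datasets. A minimal sketch; the exact column layout is not described here, so check the dataset card for the schema:

```python
from datasets import load_dataset

# Load the SFT instruction pairs (dataset repo from the Training section).
ds = load_dataset("archit11/track_b_sft", split="train")
print(len(ds))   # ~257 examples
print(ds[0])     # inspect the column names and an example pair
```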

## Results

| Metric | Baseline | Post-SFT | Δ |
|--------|----------|----------|--------|
| pass@1 | 0.565    | 0.804    | +0.239 |
| pass@3 | 0.783    | 0.848    | +0.065 |
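
For reference, pass@k metrics of this kind are typically computed with the unbiased estimator from the Codex paper; how these particular numbers were produced is not stated in the card, so the sketch below is illustrative only:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n generations of which c are correct, passes the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 3 generations per problem, 2 correct -> pass@1 ≈ 0.667, pass@3 = 1.0
print(pass_at_k(3, 2, 1), pass_at_k(3, 2, 3))
```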

## Training

- Base model: Qwen/Qwen2.5-Coder-1.5B
- Method: LoRA (r=16, alpha=32); see the config sketch after this list
- Data: archit11/track_b_sft (~257 train examples)
- Epochs: 3, LR: 2e-4, Hardware: T4 GPU
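
A minimal sketch of how an equivalent LoRA run could be set up with `peft` and `trl`. The LoRA rank, alpha, epochs, and learning rate match the values above; everything marked "assumed" (dropout, target modules, batch sizes, dataset formatting) is illustrative and not the exact recipe used:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# r and lora_alpha match the card; dropout and target_modules are assumed.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,                                         # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # assumed typical choice
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-1.5B",
    train_dataset=load_dataset("archit11/track_b_sft", split="train"),
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="track_b_sft_model",
        num_train_epochs=3,
        learning_rate=2e-4,
        per_device_train_batch_size=2,   # assumed; sized for a single T4
        gradient_accumulation_steps=8,   # assumed
    ),
    # Depending on the dataset's column layout you may also need a
    # formatting_func or dataset_text_field here.
)
trainer.train()
```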

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, apply the LoRA adapter, and merge it into the base weights.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-1.5B")
model = PeftModel.from_pretrained(base, "archit11/track_b_sft_model").merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained("archit11/track_b_sft_model")
```
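
A quick generation check with the merged model. The prompt and decoding settings below are illustrative; for best results, match the prompt format used in the SFT data:

```python
prompt = "Write a Python function that returns the n-th Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```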