BTL-2 Coder 7B

BTL-2 Coder 7B is a LoRA adapter for unsloth/Qwen2.5-Coder-7B-Instruct, trained for structured code-review findings.

Code and evaluation scripts are available at:

https://github.com/Badtheorylabs/btl-2-coder

Intended Use

This adapter is intended for local-first code review. It is trained to produce structured findings with:

  • severity
  • file path
  • line number
  • title
  • evidence
  • recommendation
  • numeric confidence

The main supported issue classes are SQL injection, path traversal, authorization bypass, missing error handling, boundary/off-by-one logic, and related security/correctness findings.

The adapter is optimized for review output rather than broad chat behavior.

Training

  • Base model: unsloth/Qwen2.5-Coder-7B-Instruct
  • Method: LoRA SFT with Unsloth
  • Data mix: 4,000 API-generated review traces + 1,000 template traces
  • Train/eval split: 4,500 train examples + 500 eval examples
  • Epochs: 2
  • Max sequence length: 4096

Only redacted, opt-in traces should be used for future training.

Recommended Prompt Contract

Use strict schema prompting:

Return only a JSON array. No markdown and no wrapper object.
Each finding must include: severity, file, line, title, evidence, recommendation, confidence.
severity must be exactly one of: critical, high, medium, low.
Never put a category in severity.
confidence must be a number from 0 to 1, never a string label.
Every finding must include concrete evidence and a non-empty recommendation.

Example output:

[
  {
    "severity": "critical",
    "file": "src/users.ts",
    "line": 42,
    "title": "SQL injection through string-built query",
    "evidence": "The user id is concatenated directly into the SQL string.",
    "recommendation": "Use a parameterized query.",
    "confidence": 0.96
  }
]

Load The Adapter

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "unsloth/Qwen2.5-Coder-7B-Instruct"
adapter = "badtheorylabs/btl-2-coder"

tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

Evaluation

Measured on an NVIDIA H200 with 4-bit adapter inference.

Eval JSON parse Schema valid Numeric confidence Category hit File hit Precision Recall Weighted severity recall
Heldout 100 strict 1.000 0.952 1.000 0.783 0.840 n/a n/a n/a
Heldout 30 strict v2 1.000 0.975 1.000 0.867 0.867 n/a n/a n/a
Seeded 15 strict 1.000 1.000 1.000 0.933 1.000 0.933 0.933 0.956

Notes:

  • Heldout precision/recall is marked n/a because the heldout set is broader and does not use one normalized ground-truth finding per example.
  • The seeded benchmark is a controlled regression suite with known findings.
  • Reported results use the recommended strict schema prompt.

Scope

  • Primary task: structured security and correctness review.
  • Output format: JSON findings with severity, location, evidence, recommendation, and confidence.
  • Best runtime path: strict schema prompting, with optional constrained decoding.
  • Evaluation focus: code-review findings, file hits, schema validity, and seeded precision/recall.
  • Next track: patch proposals and terminal workflows.

Files

This repository contains a PEFT/LoRA adapter:

  • adapter_model.safetensors
  • adapter_config.json
  • tokenizer.json
  • tokenizer_config.json
  • chat_template.jinja
  • training_args.bin
  • SHA256SUMS
Downloads last month
24
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for badtheorylabs/btl-2-coder