Instructions to use bartek-flp/qwen3coder-30b-dcr-lora-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use bartek-flp/qwen3coder-30b-dcr-lora-v2 with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-Coder-30B-A3B-Instruct") model = PeftModel.from_pretrained(base_model, "bartek-flp/qwen3coder-30b-dcr-lora-v2") - Notebooks
- Google Colab
- Kaggle
Qwen3-Coder-30B-A3B β DCR (Drupal Code Review) QLoRA adapter, v2
A LoRA adapter that specializes Qwen3-Coder-30B-A3B-Instruct for reviewing Drupal 10/11 PHP diffs and emitting structured JSON findings (security, logic, architecture, Drupal-API).
This is round 2. The v1 adapter (qwen3coder-30b-dcr-lora)
scored well on synthetic held-out data but, when finally tested on real Drupal security defects,
caught only 18.8% of them β it had over-learned "looks like clean merged Drupal β clean". v2 adds 38
real, objective security defects (pre-fix code from Drupal security advisories, SA-CORE-*) plus
low-severity contrastive pairs, and recovers real-defect recall to 56.2% while keeping 100% specificity.
Results: base vs v1 vs v2 (real-defect eval, n=32)
16 real CVE-grade defects (advisory fix commits, inverted so the diff reintroduces the vuln; objective ground truth) + 16 matched clean fixes. Same base weights, LoRA hot-swapped, temperature 0.
| Metric | Base | v1 | v2 |
|---|---|---|---|
| Verdict accuracy | 71.9% | 59.4% | 78.1% |
| Positive recall (caught the real defect) | 87.5% (14/16) | 18.8% (3/16) | 56.2% (9/16) |
| Negative specificity (quiet on clean) | 56.2% | 100% | 100% |
| Category match | 56.2% | β | 43.8% |
| Invalid JSON | 0/32 | 0/32 | 0/32 |
Honest read: v2 roughly tripled v1's real-defect recall without giving back specificity, and has the
best overall verdict accuracy. It is not strictly better than base β base still out-recalls it
(14/16 vs 9/16) on subtle logic bypasses, and v2's category labelling regressed. But base false-alarms
on 7 of 16 clean fixes (specificity 56%), where v1 and v2 raise zero. Pick v2 for a low-false-positive
pipeline; pick base if you want maximum recall and will triage the noise. Full report with verbatim
side-by-side outputs (wins and losses) ships in the project repo under docs/eval/.
Training data
v1's 400 pairs + 38 real security positives (inverted SA-CORE fix commits, objective category/severity from the advisory) + matched clean negatives + 11 low-severity contrastive pairs (e.g. O(nΒ²) array_merge-in-loop with a near-miss clean form). 498 train rows; the real-defect eval set was held out by advisory ID. Teacher for the synthetic half: Claude Opus 4.x.
Usage (with the base model)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
tok = AutoTokenizer.from_pretrained(base)
m = AutoModelForCausalLM.from_pretrained(base, device_map="auto", torch_dtype="bfloat16")
m = PeftModel.from_pretrained(m, "bartek-flp/qwen3coder-30b-dcr-lora-v2")
Prompt with the DCR system message (review a diff, output JSON findings only).
Limitations
QLoRA on attention projections only (q/k/v/o, r=16). Real-defect recall is 56%, with the remaining gap mostly subtle logic-level access bypasses that the base model catches but v2 does not. Category labelling is weaker than base. The eval is small (n=32) and security-skewed. Always keep a human in the loop for security findings.
- Downloads last month
- 28
Model tree for bartek-flp/qwen3coder-30b-dcr-lora-v2
Base model
Qwen/Qwen3-Coder-30B-A3B-Instruct