B5-SFT-7B: LoRA adapter for procedural-compliance reasoning (engaged mode)

LoRA adapter trained via reasoning-scaffolded SFT on Qwen/Qwen2.5-7B-Instruct, representing the engaged-mode endpoint of the procedural-reasoning investigation.

Model description

The B5-SFT recipe adapts the "Reasoning Scaffolding" trace-distillation method to procedural compliance. The token-level next-token-prediction loss is computed over reasoning-scaffolded teacher traces produced by DeepSeek-R1 (671B, queried via OpenRouter); inline [SIG=...] scaffold-token annotations mark the reasoning stages. The model is trained to consult the procedure step by step before emitting the final compliance classification.
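
For orientation, the generic token-level SFT objective can be written as below. This is the standard formulation, not a detail confirmed by the card: it assumes the loss is averaged over the T tokens of the teacher trace y (scaffold annotations included), conditioned on the prompt x. Masking and normalization choices are not specified here.

```latex
\mathcal{L}(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \log p_\theta\left(y_t \mid x,\, y_{<t}\right)
```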

Of the interventions tested in the parent paper, B5-SFT is the engaged-mode endpoint: it routes compliance reasoning through procedure-aware step verification. The heuristic-mode endpoint (v7-GRPO-14B), by contrast, reaches overlapping training-distribution accuracy through pattern matching on surface features rather than step-by-step verification.

Training data

Trained on the canonical centralized procedural-reasoning dataset:

  • 9,013 unique instances after deduplication
  • 6 source cohorts: scaled_86 (1,522), af7lab (813), fda (4,882), af4 (245), v7_200 (154), mi_44 (34)
  • 8 institutional sources for lab safety procedures (Duke, Miami, UCF, UW, Princeton, Cornell, Stanford, MIT), plus the FDA Investigations Operations Manual and CMS Medicare manuals
  • 10 perturbation types per process: baseline_compliant, baseline_noncompliant, lexical_paraphrase, step_reorder_invalid, prerequisite_omission, hierarchy_violation, exception_trigger, distractor_injection, adversarial_surface_compliant, adversarial_surface_noncompliant

Scope:

  • Laboratory safety procedures (chemical spills, PPE, hazardous-material handling, lab closeouts)
  • FDA inspection procedures (claim filing, credentialing, billing-after-denial, vehicle accident reporting)
  • CMS Medicare billing procedures (claim processing, modifier handling, pre/post-billing actions, denial cascade handling)

Held-out evaluation splits (v7_200 + af4 + mi_44) are deduplicated against the training portion to ensure no instance-level leakage.
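
As a quick arithmetic check on the figures above, the sketch below tallies the listed cohort sizes and the held-out share. It assumes the three named held-out cohorts are held out in full, which the card does not state explicitly. The listed cohort counts sum to 7,650 rather than 9,013, and the card does not say how the two figures relate, so treat the derived split sizes as indicative only.

```python
# Cohort sizes exactly as listed in this card.
cohort_sizes = {
    "scaled_86": 1522,
    "af7lab": 813,
    "fda": 4882,
    "af4": 245,
    "v7_200": 154,
    "mi_44": 34,
}

# Cohorts the card names as held-out evaluation splits.
# Assumption: each is held out in its entirety.
held_out = ("v7_200", "af4", "mi_44")

listed_total = sum(cohort_sizes.values())
held_out_total = sum(cohort_sizes[name] for name in held_out)
training_total = listed_total - held_out_total

print(f"listed total:   {listed_total}")
print(f"held-out total: {held_out_total}")
print(f"remaining:      {training_total}")
```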

Intended use

Research:

  • Procedural-compliance reasoning evaluation
  • Mechanistic interpretability of LLM reasoning modes
  • Out-of-distribution behavioral evaluation on regulatory-procedure tasks

NOT intended for:

  • Deployment in production compliance auditing without independent verification
  • Domain-specific use outside the lab safety / FDA / CMS scope without additional validation
  • High-stakes decisions where the model's binary classification is taken as authoritative

Limitations

Two admit-verified-subset disclosures from the §5 methodology:

  • Cornell adversarial_surface_compliant: 72.3% accept rate during dataset construction. This is the hardest-by-construction cell, where surface-style perturbations produce ambiguous compliance verdicts. Models trained on this data inherit some of that ambiguity.
  • FDA prerequisite_omission: 65.2% accept rate during dataset construction. After the AF.1 safety-guard filter, the prerequisite-omission cell is difficult across all cohorts; the dataset admits the verified subset, but reviewers may find a third of the rejected instances borderline.

One further methodology note: regulatory-procedure corpora (CMS in particular) sometimes encode internal cognitive determinations and system-side actions as numbered steps, and the omission-based perturbation pattern cannot render these as observable narrative violations. This is a mismatch between the CMS source corpus and the perturbation family, not a quality flag on this model.

Also expect:

  • Verbose responses with explicit step-by-step reasoning, especially on longer procedures. Discard the reasoning prefix downstream if you only want the binary classification.
  • Performance degradation when the order of procedure, scenario, and question in the user message is changed significantly. The model was trained to expect a specific input shape (procedure → scenario → question in one user message).
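
One way to discard the reasoning prefix is to keep only the last verdict word in the response. The sketch below is a heuristic under stated assumptions: the label vocabulary ("compliant" / "noncompliant" and hyphenated variants) and the helper name `extract_verdict` are hypothetical, since this card does not specify the exact wording of the final classification. Check it against real outputs before relying on it.

```python
import re
from typing import Optional

# Hypothetical label vocabulary; adjust to the wording the model actually emits.
# The "non..." alternative is listed first so that "non-compliant" is not
# read as a bare "compliant".
_VERDICT = re.compile(r"\b(non[- ]?compliant|compliant)\b", re.IGNORECASE)


def extract_verdict(response: str) -> Optional[str]:
    """Return the last verdict mentioned in the response, or None if absent."""
    matches = _VERDICT.findall(response)
    if not matches:
        return None
    last = matches[-1].lower()
    # Normalize all negative spellings to a single label.
    return "compliant" if last == "compliant" else "noncompliant"


example = "Step 1 is satisfied. Step 2 skips the prerequisite. Final answer: Non-compliant"
print(extract_verdict(example))
```

Taking the last match assumes the final classification comes after the reasoning; a response that mentions a verdict word again after its conclusion would be misread.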

Evaluation results

Performance metrics from the cleaned canonical dataset, where available:

Metric | Value
Training-distribution accuracy (B5-SFT recipe headline) | ~0.68
Var-binding holdout, ord-swap on flipping subset | 0.545
Yang Stage-3 retrieval head L19.H17 δ on cleaned procedural | 0.886

Additional evaluation numbers, from the §3 intervention contrast against v7-GRPO and the §1 behavioral failure-pattern grid, will become available as the AF.5 re-runs land; refer to the procedural-reasoning paper's headline tables for the current state.

Prompt format

Use the standard Qwen2.5 chat template via tokenizer.apply_chat_template.

The published evaluation places procedure → scenario → question in a single user message (no system message). For apples-to-apples comparison with published results, match this format. See prompt_format.md in the companion package for verbatim examples.

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in bfloat16 on the GPU, then attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base, "kennethp97/b5-sft-7b")
model.eval()

# Single user message, no system message: procedure, then scenario, then question.
messages = [{"role": "user", "content": "Procedure: ...\n\nScenario: ...\n\nQuestion: ..."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt").to(model.device)

# Greedy decoding. Increase max_new_tokens if the step-by-step reasoning is cut off.
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
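
To keep the input shape consistent across many examples, a small helper can assemble the single user message. This is a minimal sketch: the helper name `build_messages` is hypothetical, and the "Procedure:", "Scenario:", and "Question:" labels are taken from the placeholder string in this card. The verbatim wording used in the published evaluation is in prompt_format.md and may differ.

```python
from typing import Dict, List


def build_messages(procedure: str, scenario: str, question: str) -> List[Dict[str, str]]:
    """Pack procedure, scenario, and question into one user message, in that order."""
    content = (
        f"Procedure: {procedure.strip()}\n\n"
        f"Scenario: {scenario.strip()}\n\n"
        f"Question: {question.strip()}"
    )
    # No system message is added, matching the setup described in this card.
    return [{"role": "user", "content": content}]


messages = build_messages(
    procedure="1. Put on gloves. 2. Contain the spill.",
    scenario="The technician contained the spill, then put on gloves.",
    question="Did the technician comply with the procedure?",
)
print(messages[0]["content"])
```

The returned list can be passed directly to tokenizer.apply_chat_template in place of a hand-written messages list.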

License

Apache 2.0 for this LoRA adapter. The Qwen2.5-7B-Instruct base model is distributed under its own license; see the official Qwen repository for terms.

Citation

@unpublished{paulsen2026_procedural_reasoning,
  author = {Paulsen, Kenneth and collaborators},
  title  = {Procedural-reasoning compliance: engaged vs heuristic modes},
  year   = {2026},
  note   = {Manuscript in preparation; citation form will be finalized at submission.}
}