Olmo-3 7B Think · NIST CSF 2.0 (LoRA)

A LoRA adapter that fine-tunes allenai/Olmo-3-7B-Think to be a policy-following assistant grounded in the NIST Cybersecurity Framework (CSF) 2.0 (NIST CSWP 29).

This repository contains only the LoRA adapter — the base model weights are unchanged. Load the base model and apply the adapter at inference time.

What it does

Answers questions about the NIST CSF 2.0 Functions, Categories, Subcategories, Implementation Tiers, and glossary, citing the relevant identifiers (e.g. GV.RR-01) and staying faithful to the framework text. As a reasoning model, it produces a <think>…</think> block followed by a grounded answer.

Training

| Setting | Value |
|---|---|
| Base model | `allenai/Olmo-3-7B-Think` |
| Method | QLoRA (4-bit NF4) |
| LoRA rank / alpha | 16 / 32 |
| Target modules | q, k, v, o, gate, up, down projections |
| Epochs | 3 |
| Max seq length | 2048 |
| Train / eval examples | 237 / 29 |
| Dataset | SeanJIE250/nist-csf-2.0-sft |
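
The settings above could be expressed with PEFT and bitsandbytes roughly as follows. This is a minimal sketch, not the actual training script: the module names (`q_proj`, `gate_proj`, and so on) are assumed from the usual Hugging Face naming for these projections, and the bfloat16 compute dtype is an assumption because the table does not state it. `build_qlora_configs` is a hypothetical helper name.

```python
# Hyperparameters taken from the training table.
LORA_RANK = 16
LORA_ALPHA = 32
MAX_SEQ_LENGTH = 2048
NUM_EPOCHS = 3

# Assumed module names for the q, k, v, o, gate, up, down projections.
TARGET_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]


def build_qlora_configs():
    """Return (quantization config, LoRA config) for a QLoRA run.

    Imports are local so the constants above can be inspected
    without torch, transformers, or peft installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # 4-bit NF4, as in the table
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption, not from the table
    )
    lora_config = LoraConfig(
        r=LORA_RANK,
        lora_alpha=LORA_ALPHA,
        target_modules=TARGET_MODULES,
        task_type="CAUSAL_LM",
    )
    return bnb_config, lora_config
```

The quantization config would be passed as `quantization_config` when loading the base model, and the LoRA config to whichever trainer is used.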

Note on the think traces

The training data wraps every answer as <think>{reasoning}</think>\n\n{answer}. This is essential for reasoning models: their chat template forces an opening <think> at generation time, so a fine-tune on plain answers (with no closing </think> and no learned stop) degenerates into repetition. Training on full think-formatted targets — and ensuring the trace survives tokenization — teaches the model to close the think block and stop cleanly.
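
The target format described above can be sketched as a small helper. The function names and the `encode` / `decode` callables are hypothetical, introduced only for illustration. The truncation check is one possible reading of "survives tokenization": it confirms that the closing tag is still present after cutting the target to the maximum sequence length.

```python
THINK_TEMPLATE = "<think>{reasoning}</think>\n\n{answer}"


def format_target(reasoning: str, answer: str) -> str:
    """Wrap a reasoning trace and an answer in the think-formatted target."""
    return THINK_TEMPLATE.format(
        reasoning=reasoning.strip(), answer=answer.strip()
    )


def trace_survives(target: str, encode, decode, max_len: int = 2048) -> bool:
    """Check that the closing </think> is still there after truncation.

    `encode` maps text to a list of token ids and `decode` maps ids back
    to text; with a Hugging Face tokenizer these could be `tok.encode`
    and `tok.decode` (an assumption, not the original pipeline).
    """
    ids = encode(target)[:max_len]
    return "</think>" in decode(ids)
```

A target that fails the check would train the model on an unclosed think block, which is the failure mode the note describes.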

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "allenai/Olmo-3-7B-Think"
tok = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, "SeanJIE250/olmo3-7b-think-csf")

msgs = [{"role": "user", "content": "What is the GOVERN (GV) Function in NIST CSF 2.0?"}]
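# return_dict=True returns input_ids together with the attention mask
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)
# do_sample=True is needed for temperature to take effect
out = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
# Decode only the newly generated tokens, skipping the prompt
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))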
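
Because the chat template opens the think block in the prompt, the decoded text is expected to hold the reasoning, a closing `</think>`, and then the answer. A minimal sketch for separating the two follows. `split_think` is a hypothetical helper, and whether the tags remain in the decoded text depends on the tokenizer's special-token handling, so it tolerates a missing opening tag.

```python
def split_think(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    If no closing </think> is found, the whole text is returned as the
    answer with empty reasoning. That is a design choice of this sketch:
    it may also mean generation was cut off mid-reasoning.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        return "", text.strip()
    # Drop an opening tag if the decoded text happens to include one.
    reasoning = head.strip().removeprefix("<think>").strip()
    return reasoning, tail.strip()
```

For example, `reasoning, answer = split_think(decoded)` lets an application show only `answer` to the user and log `reasoning` separately.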

To serve with vLLM, pass the adapter via --enable-lora --lora-modules csf=<adapter_path>, or merge it into the base first (peft merge_and_unload) for a standalone model.
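
The merge route mentioned above could look like the following sketch. `merge_adapter` and the output directory are hypothetical; the base model is loaded in half precision rather than 4-bit here, on the assumption that the merged weights should be saved unquantized.

```python
def merge_adapter(base_id: str, adapter_id: str, out_dir: str) -> str:
    """Merge a LoRA adapter into its base model and save the result."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    # merge_and_unload folds the LoRA deltas into the base weights and
    # returns a plain transformers model without PEFT wrappers.
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
    merged.save_pretrained(out_dir)
    # Save the tokenizer alongside so the directory is self-contained.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_adapter("allenai/Olmo-3-7B-Think", "SeanJIE250/olmo3-7b-think-csf", "olmo3-csf-merged")` would download both repositories and write a standalone model to the (hypothetical) `olmo3-csf-merged` directory, which can then be served without LoRA flags.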

License

Adapter released under Apache-2.0, consistent with the base model's license. The training data derives from NIST CSWP 29, a U.S. Government publication (public domain).