Granite Switch 4.0 350M — CTI Technique Mapping

A Granite Switch model: the ibm-granite/granite-4.0-350m base with a single LoRA adapter — cti_technique_mapping — embedded into one switchable checkpoint and fired by a control token.

It is the small-model analogue of ibm-granite/granite-switch-4.1-3b-preview: same Granite Switch composition machinery (control tokens, KV-hiding, chat-template integration), but built on the 350M Granite 4.0 base and carrying one CTI adapter instead of the preview's granitelib adapter library.

What the adapter does

cti_technique_mapping maps a piece of cyber threat intelligence (CTI) text — a sentence or short passage describing adversary behavior — to the single best-matching MITRE ATT&CK technique ID (e.g. T1059, T1566.001).

The adapter's I/O contract (io_configs/cti_technique_mapping/io.yaml) constrains the output to a single technique-ID string matching ^T[0-9]{4}(\.[0-9]{3})?$, and specifies greedy decoding with max_completion_tokens: 16.
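As an illustration of that output schema, the following minimal sketch checks a generated string against the same pattern on the client side. The helper names is_valid_technique_id and parse_technique_id are hypothetical and are not part of the granite-switch package; only the regular expression comes from the contract above.

```python
import re
from typing import Optional

# Pattern quoted from the adapter's I/O contract. fullmatch() makes the
# ^ and $ anchors implicit and also rejects a trailing newline.
TECHNIQUE_ID = re.compile(r"T[0-9]{4}(\.[0-9]{3})?")


def is_valid_technique_id(text: str) -> bool:
    """Return True if text is exactly one technique or sub-technique ID."""
    return TECHNIQUE_ID.fullmatch(text) is not None


def parse_technique_id(raw: str) -> Optional[str]:
    """Hypothetical helper: trim whitespace, return the ID or None if malformed."""
    candidate = raw.strip()
    return candidate if is_valid_technique_id(candidate) else None
```

A check like this only confirms the shape of the ID; it does not confirm that the ID exists in the current ATT&CK matrix.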

The underlying LoRA scored 96.67% exact-match (290/300) on the held-out CTI validation set.
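For readers who want to reproduce the arithmetic behind that figure, here is a small sketch of an exact-match metric. The function exact_match_rate is a hypothetical helper, and stripping surrounding whitespace before comparison is an assumption; the normalization used in the original evaluation is not documented here.

```python
def exact_match_rate(predictions, references):
    """Fraction of predictions that equal their reference string exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    # Assumption: surrounding whitespace is ignored when comparing IDs.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# The reported score corresponds to 290 correct answers out of 300.
print(f"{290 / 300:.2%}")  # prints 96.67%
```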

Composition summary

Base model: ibm-granite/granite-4.0-350m (granitemoehybrid)
Embedded adapters: 1, cti_technique_mapping (LoRA)
Control token: <|cti_technique_mapping|> (id 100352)
LoRA rank / alpha: 16 / 32
Target modules: q_proj, k_proj, v_proj, o_proj, input_linear, output_linear
Base params: 352,379,904
Composed params: 355,362,816 (+0.85%)
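The parameter overhead in the summary above can be checked with a few lines of arithmetic. This sketch uses only the two parameter counts listed there; the variable names are illustrative.

```python
base_params = 352_379_904      # base model, from the summary
composed_params = 355_362_816  # base plus embedded adapter, from the summary

# Parameters added by composition, and their share of the base model.
added_params = composed_params - base_params
overhead_pct = 100 * added_params / base_params

print(f"{added_params:,} added parameters (+{overhead_pct:.2f}%)")
# prints: 2,982,912 added parameters (+0.85%)
```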

Usage

The control token activates the adapter. With the Granite Switch HF backend:

from granite_switch.hf import GraniteSwitchForCausalLM
from transformers import AutoTokenizer

model_id = "barha/granite-switch-4.0-350m-cti"
tok = AutoTokenizer.from_pretrained(model_id)
model = GraniteSwitchForCausalLM.from_pretrained(model_id, device_map="auto")

cti = "The actor used PowerShell to download and execute a payload from a remote server."
messages = [{"role": "user", "content": cti}]
# The chat template inserts <|cti_technique_mapping|> to fire the adapter.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=16, do_sample=False)
print(tok.decode(out[0, inputs.shape[1]:], skip_special_tokens=True))
# -> e.g. "T1059.001"

For fast inference, deploy with vLLM (see the granite-switch docs).

How it was built

Composed with granite_switch.composer.compose_granite_switch:

python -m granite_switch.composer.compose_granite_switch \
  --base-model ibm-granite/granite-4.0-350m \
  --adapters <path-to>/cti_technique_mapping \
  --technology lora \
  --output <out>

Note on granitemoehybrid: all Granite 4.0 / Nano models are granitemoehybrid configs (with num_local_experts=0), whose MLP leaves are the fused input_linear / output_linear rather than dense gate/up/down_proj. The CTI LoRA was therefore trained targeting input_linear / output_linear (plus attention q/k/v/o_proj) so that it composes cleanly. A scalar/all-linear target would instead produce phantom gate/up/down_proj weights, which the composer rejects.
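To make the target-module constraint concrete, the sketch below flags adapter targets that have no counterpart in the fused layout described in the note above. It is an illustrative pre-flight check, not the composer's actual validation logic; COMPOSABLE_TARGETS and find_phantom_targets are hypothetical names, and the module list is copied from the composition summary.

```python
# Leaf module names this checkpoint's LoRA targets, per the composition summary.
COMPOSABLE_TARGETS = frozenset({
    "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
    "input_linear", "output_linear",         # fused MLP leaves
})


def find_phantom_targets(target_modules):
    """Return, sorted, the targets absent from the fused granitemoehybrid layout."""
    return sorted(set(target_modules) - COMPOSABLE_TARGETS)


# A dense-MLP style target list names modules the fused layout does not have.
dense_style = ["q_proj", "v_proj", "gate_proj", "up_proj", "down_proj"]
print(find_phantom_targets(dense_style))
# prints: ['down_proj', 'gate_proj', 'up_proj']
```

Running a check of this kind before training is one way to catch a mismatched target list early, under the assumption that the adapter's target_modules are available as a plain list of names.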

Files

  • model.safetensors — composed base + embedded LoRA weights
  • config.json (model_type: granite_switch)
  • adapter_index.json — adapter → control-token mapping
  • io_configs/cti_technique_mapping/io.yaml — adapter I/O contract (output schema, decoding params)
  • chat_template.jinja — control-token-aware chat template
  • compose_report.json, BUILD.md — full composition provenance

License

Apache-2.0.
