SecureBERT: CVE-LMTune ATT&CK Classifier (Flat)

Universite de Lorraine INRIA LORIA SuperViZ

GitHub Paper PhD theses.fr License: MIT Zenodo Data

Part of the CVE-LMTune model suite, a collection of language models fine-tuned for multi-taxonomy vulnerability classification across widely used cybersecurity taxonomies, including CWE, CAPEC, and MITRE ATT&CK.

Paper

Franco Terranova, Sana Rekbi, Abdelkader Lahmadi, Isabelle Chrisment. Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models. The 23rd Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA '26).

Overview

This model performs multi-label ATT&CK classification from vulnerability descriptions. Given a CVE-style description, it predicts one or more ATT&CK identifiers associated with the described vulnerability.

| Property | Value |
|---|---|
| Taxonomy | MITRE ATT&CK Enterprise Subtechniques |
| Task | Multi-label text classification |
| Input | Vulnerability description (e.g., CVE summary) |
| Output | One or more ATT&CK identifiers |
| Number of labels | 175 |
| Number of samples | 231,009 |
| Latest CVE update included | 17/06/2026 |
| Split | train (60%), val (20%), test (20%) |

Evaluation Results

The model was evaluated on the held-out test set with standard multi-label classification metrics. Ranking metrics are computed from the sigmoid-activated scores; thresholded metrics use a default decision threshold of 0.5.

Ranking Metrics

| LRAP | MRR | Coverage Error | Label Ranking Loss | P@1 | P@3 | P@5 | R@1 | R@3 | R@5 |
|---|---|---|---|---|---|---|---|---|---|
| 0.9152 | 0.9460 | 18.79 | 0.0173 | 0.9321 | 0.9084 | 0.8458 | 0.1286 | 0.3779 | 0.5554 |
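
The P@k, R@k, and MRR columns can be illustrated with a small NumPy sketch. This is not the authors' evaluation code: the function names, the toy matrices, and the handling of samples without gold labels are assumptions made for illustration only.

```python
import numpy as np

def precision_recall_at_k(y_true, y_score, k):
    """Mean P@k and R@k over samples.

    y_true:  (n_samples, n_labels) binary matrix of gold labels.
    y_score: (n_samples, n_labels) matrix of predicted scores.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    # Indices of the k highest-scoring labels for each sample.
    top_k = np.argsort(-y_score, axis=1)[:, :k]
    # Number of gold labels found among the top-k predictions.
    hits = np.take_along_axis(y_true, top_k, axis=1).sum(axis=1)
    n_true = y_true.sum(axis=1)
    precision = hits / k
    # Assumption: samples without any gold label contribute a recall of 0.
    recall = np.divide(hits, n_true, out=np.zeros(len(hits)), where=n_true > 0)
    return float(precision.mean()), float(recall.mean())

def mean_reciprocal_rank(y_true, y_score):
    """Mean of 1 / rank of the best-ranked gold label per sample."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    order = np.argsort(-y_score, axis=1)
    ranked_true = np.take_along_axis(y_true, order, axis=1)
    # argmax returns the first position holding a 1 in the ranked gold labels.
    first_hit = ranked_true.argmax(axis=1)
    has_true = ranked_true.any(axis=1)
    return float(np.where(has_true, 1.0 / (first_hit + 1), 0.0).mean())

# Toy data (hypothetical): 2 samples, 4 labels.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0]]
y_score = [[0.9, 0.2, 0.4, 0.1], [0.3, 0.2, 0.8, 0.1]]

p_at_2, r_at_2 = precision_recall_at_k(y_true, y_score, k=2)
mrr = mean_reciprocal_rank(y_true, y_score)
print(p_at_2, r_at_2, mrr)
```

Because R@k is bounded by k divided by the number of gold labels of a sample, a low R@1 next to a high P@1 is expected when descriptions carry several labels: a single prediction can recover at most one of them.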

Threshold = 0.5

| Micro P | Micro R | Micro F1 | Macro F1 | Weighted F1 | Hamming Loss | Subset Accuracy |
|---|---|---|---|---|---|---|
| 0.8612 | 0.7767 | 0.8168 | 0.4286 | 0.8093 | 0.0264 | 0.6874 |
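
As a rough illustration of how the thresholded metrics relate to the predicted probabilities, the sketch below computes micro-averaged precision, recall, and F1, Hamming loss, and subset accuracy with NumPy. It is a minimal reimplementation under stated assumptions, not the evaluation script used for the table: the function name `thresholded_metrics` and the toy inputs are hypothetical, and the strict `>` comparison follows the Quick Start snippet.

```python
import numpy as np

def thresholded_metrics(y_true, probs, threshold=0.5):
    """Multi-label metrics after binarizing probabilities at `threshold`."""
    y_true = np.asarray(y_true).astype(bool)
    # Strict comparison, mirroring `p > 0.5` in the Quick Start.
    y_pred = np.asarray(probs) > threshold

    # Micro averaging pools true/false positives and negatives over all labels.
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    micro_p = tp / (tp + fp) if tp + fp else 0.0
    micro_r = tp / (tp + fn) if tp + fn else 0.0
    denom = micro_p + micro_r
    micro_f1 = 2 * micro_p * micro_r / denom if denom else 0.0

    return {
        "micro_precision": float(micro_p),
        "micro_recall": float(micro_r),
        "micro_f1": float(micro_f1),
        # Fraction of individual (sample, label) decisions that are wrong.
        "hamming_loss": float(np.mean(y_true != y_pred)),
        # Fraction of samples whose full label set is predicted exactly.
        "subset_accuracy": float(np.mean(np.all(y_true == y_pred, axis=1))),
    }

# Toy data (hypothetical): 2 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0]]
probs = [[0.9, 0.6, 0.2], [0.1, 0.7, 0.3]]
metrics = thresholded_metrics(y_true, probs)
print(metrics)
```

Macro F1 averages the per-label F1 scores with equal weight, so a macro score well below the micro score usually indicates that some labels, often the infrequent ones, are predicted less reliably than the frequent ones.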

Quick Start

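from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
model_id = "Sana9/securebert-vuln2attack-flat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Buffer overflow vulnerability in OpenSSL allows remote attackers to execute arbitrary code."

# Tokenize the description; inputs longer than the model's maximum length are truncated.
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Run inference without gradients and map each logit to an independent probability.
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]

# Keep every label whose probability exceeds the default 0.5 threshold.
predictions = {
    model.config.id2label[i]: p.item()
    for i, p in enumerate(probs)
    if p > 0.5
}

print(predictions)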
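
With a fixed threshold, `predictions` can be empty when no label exceeds 0.5. One possible way to handle this, sketched below, is to fall back to the highest-scoring labels. The helper `select_labels`, its `min_k` parameter, and the placeholder label names are hypothetical and not part of the model or of the `transformers` API; the fallback itself is an assumption, not a documented behavior of this model.

```python
def select_labels(probs, id2label, threshold=0.5, min_k=1):
    """Return (label, probability) pairs above `threshold`, best first.

    If no label passes the threshold, fall back to the `min_k` top-scoring labels.
    """
    ranked = sorted(
        ((id2label[i], float(p)) for i, p in enumerate(probs)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    selected = [pair for pair in ranked if pair[1] > threshold]
    return selected if selected else ranked[:min_k]

# Placeholder mapping standing in for model.config.id2label (hypothetical names).
toy_id2label = {0: "label_a", 1: "label_b", 2: "label_c"}

confident = select_labels([0.2, 0.7, 0.9], toy_id2label)
fallback = select_labels([0.2, 0.4, 0.1], toy_id2label)
print(confident)
print(fallback)
```

Under these assumptions, the helper could be called as `select_labels(probs, model.config.id2label)` with the `probs` tensor from the Quick Start, since `float()` also accepts single-element tensors.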

Citation

@inproceedings{terranova2026multitaxonomy,
  author    = {Franco Terranova and Sana Rekbi and Abdelkader Lahmadi and Isabelle Chrisment},
  title     = {Multi-Taxonomy Vulnerability Classification with Hierarchically Finetuned Language Models},
  booktitle = {Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA)},
  year      = {2026},
  month     = jul,
  address   = {Chania, Crete, Greece},
  note      = {HAL identifier: hal-05500820v2}
}

Disclaimers

  • This product is a result of the use of the NVD API but is not endorsed or certified by the NVD. The same applies to the CVE2CAPEC project and the Hugging Face API.
  • This project relies on data publicly available from the CWE, CAPEC, and MITRE ATT&CK projects.
  • This work has been partially supported by the French National Research Agency under the France 2030 label (Superviz ANR-22-PECY-0008). The views expressed herein do not necessarily reflect the opinion of the French government.