
	---
	license: apache-2.0
	language:
	- en
	base_model: bencyc1129/mitre-bert-base-cased
	pipeline_tag: text-classification
	widget:
	- text: "An attacker performs a SQL injection."
	---

	## MITRE-tactic-bert-case-based

	This model is fine-tuned from [mitre-bert-base-cased](https://huggingface.co/bencyc1129/mitre-bert-base-cased) on the [MITRE](https://attack.mitre.org/) procedure dataset. On the evaluation dataset it achieves:
	- loss: 0.057
	- accuracy: 0.87


	## Intended uses & limitations
	You can use the fine-tuned model for text classification. It aims to identify which tactic of the MITRE ATT&CK framework a sentence belongs to.
	A sentence or an attack may fall under several tactics, so the model can return more than one label for a single input.

	Note that this model is fine-tuned primarily for text classification in the cybersecurity domain.
	It may not perform well on sentences that are unrelated to attacks.
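
Because one sentence may map to several tactics, each label can be scored independently rather than competing in a softmax. The sketch below illustrates that decision rule in plain Python. The label map and the logits are made up for illustration; the real mapping lives in `model.config.id2label`, and the 0.5 threshold is a common default rather than a tuned value.

```python
import math

# Hypothetical label map for illustration only; the real mapping is
# stored in model.config.id2label.
ID2LABEL = {0: "Initial Access", 1: "Execution", 2: "Persistence"}


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_tactics(logits, id2label, threshold=0.5):
    """Return every label whose independent probability meets the threshold."""
    return [id2label[i] for i, z in enumerate(logits) if sigmoid(z) >= threshold]


# Made-up logits: two labels clear the threshold, one does not.
example_logits = [2.0, 0.3, -3.0]
selected = select_tactics(example_logits, ID2LABEL)
print(selected)
```

An empty result is possible when no label reaches the threshold, which is one way an off-topic sentence may show up.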

	## How to use
	You can use the model with PyTorch through the Transformers library.
	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	model_id = "sarahwei/MITRE-tactic-bert-case-based"

	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForSequenceClassification.from_pretrained(
	    model_id,
	    torch_dtype=torch.bfloat16,
	    # device_map="auto",
	)
	model.eval()

	question = "An attacker performs a SQL injection."
	inputs = tokenizer(question, return_tensors="pt")

	# Inference only, so skip gradient tracking.
	with torch.no_grad():
	    logits = model(**inputs).logits

	# Multi-label output: apply a sigmoid to each logit independently,
	# then keep every tactic whose probability reaches the 0.5 threshold.
	probs = torch.sigmoid(logits.squeeze(0).float().cpu())
	predicted_labels = [
	    model.config.id2label[idx]
	    for idx, prob in enumerate(probs.tolist())
	    if prob >= 0.5
	]
	print(predicted_labels)
	```

	## Training procedure
	### Training hyperparameters
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 0
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
	- warmup_ratio: 0.01
	- weight_decay: 0.001
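
To make the `lr_scheduler_type: linear` and `warmup_ratio: 0.01` settings concrete, here is a minimal sketch of a linear warmup followed by linear decay, using the `learning_rate` listed above. The total number of optimizer steps is not stated in this card; `TOTAL_STEPS = 1200` is an assumption taken from the last logged step, and `linear_lr` is a hypothetical helper, not the training code that produced this model.

```python
import math


def linear_lr(step, total_steps, base_lr=5e-5, warmup_ratio=0.01):
    """Learning rate at `step` for linear warmup then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1200  # assumption: the last step logged in the results table
schedule = [linear_lr(s, TOTAL_STEPS) for s in (0, 12, 606, 1200)]
print(schedule)
```

Under these assumptions the warmup covers only the first 12 steps, so almost the entire run is spent in the decay phase.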

	### Training results

	| Step | Training Loss | Validation Loss | F1 | ROC AUC | Accuracy |
	|:----:|:-------------:|:---------------:|:--:|:-------:|:--------:|
	| 100 | 0.409400 | 0.142982 | 0.740000 | 0.803830 | 0.610000 |
	| 200 | 0.106500 | 0.093503 | 0.818182 | 0.868382 | 0.720000 |
	| 300 | 0.070200 | 0.065937 | 0.893617 | 0.930366 | 0.810000 |
	| 400 | 0.045500 | 0.061865 | 0.892704 | 0.926625 | 0.830000 |
	| 500 | 0.033600 | 0.057814 | 0.902954 | 0.938630 | 0.860000 |
	| 600 | 0.026000 | 0.062982 | 0.894515 | 0.934107 | 0.840000 |
	| 700 | 0.021900 | 0.056275 | 0.904564 | 0.946113 | 0.870000 |
	| 800 | 0.017700 | 0.061058 | 0.887967 | 0.937067 | 0.860000 |
	| 900 | 0.016100 | 0.058965 | 0.890756 | 0.933716 | 0.870000 |
	| 1000 | 0.014200 | 0.055885 | 0.903766 | 0.942372 | 0.880000 |
	| 1100 | 0.013200 | 0.056888 | 0.895397 | 0.937849 | 0.880000 |
	| 1200 | 0.012700 | 0.057484 | 0.895397 | 0.937849 | 0.870000 |
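
This card does not state how the F1 and accuracy columns were computed. One common convention for multi-label classification is micro-averaged F1 together with exact-match (subset) accuracy, where a sample only counts as correct if every label matches. That convention would be consistent with accuracy sitting below F1 in every row, but it is an assumption here. A minimal sketch with made-up labels:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, label) pairs of 0/1 indicators."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly."""
    matches = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Made-up indicator matrices: 3 samples, 3 labels.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]

f1 = micro_f1(y_true, y_pred)
acc = subset_accuracy(y_true, y_pred)
print(f1, acc)
```

In this toy case a single missed label costs one whole sample under subset accuracy but only one of five positive labels under micro F1, which is why the two metrics can diverge.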