Tini-Cybersec-8B-A1B 🛡️🧠

Tini-Cybersec-8B-A1B is a fine-tuned version of LiquidAI/LFM2.5-8B-A1B specialized for cybersecurity tasks such as security analysis, threat modeling, and vulnerability assessment. The training mix also includes general reasoning data, with the aim of preserving and strengthening the base model's Chain-of-Thought (CoT) capabilities.

The model was trained with supervised fine-tuning (SFT) on a curated mix of 185,002 records that combines in-depth security knowledge with structured, step-by-step reasoning traces.

📊 Dataset & Mixture Distribution

The SFT training data combines domain-specific cybersecurity instruction datasets with general Chain-of-Thought (CoT) reasoning datasets. Empty (zero-token) records were filtered out.

1. Dataset Components

| Dataset Source | Category | Records | Share | Description |
| --- | --- | ---: | ---: | --- |
| `AlicanKiraz0/Cybersecurity-Dataset-Heimdall-v1.1` | Cybersecurity | 21,257 | 11.49% | High-quality offensive/defensive cybersecurity instructions. |
| `AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1` | Cybersecurity | 99,870 | 53.98% | Large-scale cybersecurity instruction dataset. |
| `Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset` | Cybersecurity | 53,201 | 28.76% | Tailored cybersecurity instructions and tasks. |
| `nohurry/Opus-4.6-Reasoning-3000x-filtered` | Reasoning CoT | 2,326 | 1.26% | High-quality step-by-step logical reasoning. |
| `Jackrong/DeepSeek-V4-Distill-8000x` | Reasoning CoT | 7,716 | 4.17% | Distilled reasoning paths from DeepSeek-V4. |
| `Jackrong/Qwen3.5-reasoning-700x` | Reasoning CoT | 633 | 0.34% | Specialized logical/reasoning instructions. |
| Total (Filtered Superset) | Combined | 185,002 | 100% | |

2. Domain Composition

Cybersecurity Core: 94.23% (~174,328 records)
Pure Reasoning & Chain-of-Thought (CoT): 5.77% (~10,674 records)
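
The shares above follow from the record counts. As a quick sanity check, the sketch below recomputes them in Python against the stated total of 185,002. The short dictionary keys are labels chosen here for brevity, not official dataset names. Note that the six listed counts add up to 185,003, one more than the stated total, which does not change any share at two decimal places.

```python
# Record counts copied from the dataset table; the short keys are illustrative labels.
CYBERSEC = {"heimdall": 21_257, "fenrir": 99_870, "trendyol": 53_201}
REASONING = {"opus": 2_326, "deepseek": 7_716, "qwen": 633}
STATED_TOTAL = 185_002  # total reported in the table


def share(count: int, total: int = STATED_TOTAL) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


per_source = {name: share(n) for name, n in {**CYBERSEC, **REASONING}.items()}
cybersec_share = share(sum(CYBERSEC.values()))
reasoning_share = share(sum(REASONING.values()))
listed_total = sum(CYBERSEC.values()) + sum(REASONING.values())

for name, pct in per_source.items():
    print(f"{name:<10} {pct:6.2f}%")
print(f"cybersecurity: {cybersec_share}%  reasoning: {reasoning_share}%")
print(f"sum of listed counts: {listed_total:,}")
```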

📈 Dataset Token Statistics

Calculated using the LiquidAI/LFM2.5-8B-A1B tokenizer:

Total Records: 185,002
Total Tokens: 159,858,904 tokens
Average Token Length: 864.09 tokens per record
Min Token Length: 54 tokens
Max Token Length: 78,313 tokens

Token length distribution:

  < 1,000 tokens        : 162,036 records (87.59%) █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █ █
  1,000 - 10,000        : 22,632 records (12.23%)  █ █ █
  10,000 - 20,000       : 182 records (0.10%)
  20,000 - 30,000       : 77 records (0.04%)
  30,000 - 50,000       : 63 records (0.03%)
  50,000 - 100,000      : 12 records (0.01%)
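
The average length and the histogram shares can be rechecked from the reported counts. The sketch below does that and also shows one way to assign a token length to a bucket. `bucket_for` is a hypothetical helper, and the half-open `[low, high)` bucket edges are an assumption, since the card does not say which side of each boundary is inclusive.

```python
from bisect import bisect_right

TOTAL_RECORDS = 185_002
TOTAL_TOKENS = 159_858_904

# Bucket labels, upper edges, and record counts copied from the distribution above.
LABELS = ["< 1,000", "1,000 - 10,000", "10,000 - 20,000",
          "20,000 - 30,000", "30,000 - 50,000", "50,000 - 100,000"]
EDGES = [1_000, 10_000, 20_000, 30_000, 50_000, 100_000]
COUNTS = [162_036, 22_632, 182, 77, 63, 12]


def bucket_for(n_tokens: int) -> int:
    """Return the bucket index for a token length (assumes [low, high) buckets)."""
    return bisect_right(EDGES, n_tokens)


average = round(TOTAL_TOKENS / TOTAL_RECORDS, 2)
shares = [round(100 * c / TOTAL_RECORDS, 2) for c in COUNTS]

print(f"average tokens per record: {average}")
for label, count, pct in zip(LABELS, COUNTS, shares):
    print(f"{label:<18}: {count:>7,} records ({pct:.2f}%)")
print("longest record falls in:", LABELS[bucket_for(78_313)])
```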

🏆 Evaluation Results (CS-Eval Benchmark)

Tini-Cybersec-8B-A1B has been evaluated on the CS-Eval Benchmark (a comprehensive cybersecurity evaluation benchmark for Large Language Models) and is published on the CS-Eval Leaderboard (under submission name DungNVT-ISELAB).

The model achieved a Comprehensive Score of 76.65%. Per-domain scores range from 67.33% (Business Continuity and Emergency Response Recovery) to 86.05% (Supply Chain Security):

| Evaluation Domain / Category | Score (%) |
| --- | ---: |
| Comprehensive Average (Comprehensive Score) | 76.65 |
| Supply Chain Security | 86.05 |
| AI and Network Security | 83.17 |
| Infrastructure Security | 78.04 |
| English Tasks | 77.40 |
| Data Security and Privacy Protection | 76.79 |
| Chinese Task | 76.60 |
| Vulnerability Management and Penetration Testing | 76.54 |
| Access Control and Identity Management | 76.44 |
| Threat Detection and Prevention | 75.28 |
| Encryption Technology and Key Management | 75.18 |
| Security Architecture Design | 75.12 |
| Fundamentals of System Security and Software Security | 74.67 |
| Business Continuity and Emergency Response Recovery | 67.33 |

⚙️ Training Hyperparameters (SFT)

The model was fine-tuned with Unsloth and the Hugging Face Trainer, using sequence packing to improve throughput:

| Parameter | Configuration Value | Detail / Notes |
| --- | --- | --- |
| Base Model | `LiquidAI/LFM2.5-8B-A1B` | Liquid Foundation Model |
| Max Sequence Length | `8,192` | With packing (blocks of 8,192 tokens) |
| Data Precision | `bfloat16` (BF16) | Native training precision |
| LoRA Rank (r) | `64` | Relatively high-rank (wide) LoRA adapters |
| LoRA Alpha | `128` | Scaling factor |
| LoRA Targets | `q_proj, k_proj, v_proj, out_proj, in_proj, w1, w2, w3` | Attention, LIV convolution, and feed-forward projections |
| Batch Size per Device | `1` | One packed block per step |
| Gradient Accumulation | `32` | Effective batch size of 32 blocks (32 × 8,192 = 262,144 tokens) |
| Learning Rate | `5e-5` | Recommended sweet spot for wide LoRA SFT |
| Learning Rate Scheduler | `cosine` | Cosine annealing for smooth convergence |
| Warmup Steps | `10%` of total steps | Linear warmup |
| Optimizer | `adamw_8bit` | Memory-efficient 8-bit AdamW |
| Weight Decay | `0.01` | Regularization |
| Max Gradient Norm | `1.0` | Gradient clipping |
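
To make the table concrete, here is a minimal sketch that collects the listed values in plain Python dictionaries and derives the effective batch size. The key names mirror common `peft.LoraConfig` and `transformers.TrainingArguments` arguments, but the actual training script is not part of this card, so the mapping is an assumption (in particular `warmup_ratio=0.10` for "10% of total steps"). The steps-per-epoch figure further assumes a single device and perfect packing of the 159,858,904 tokens from the token statistics, which real packing will only approximate.

```python
import math

MAX_SEQ_LEN = 8_192

# LoRA settings from the table (key names follow peft.LoraConfig; assumed mapping).
lora = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": ["q_proj", "k_proj", "v_proj", "out_proj",
                       "in_proj", "w1", "w2", "w3"],
}

# Trainer settings from the table (key names follow TrainingArguments; assumed mapping).
training = {
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 32,
    "learning_rate": 5e-5,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.10,
    "optim": "adamw_8bit",
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "bf16": True,
}

# Effective batch per optimizer step on one device, in packed blocks and in tokens.
blocks_per_step = (training["per_device_train_batch_size"]
                   * training["gradient_accumulation_steps"])
tokens_per_step = blocks_per_step * MAX_SEQ_LEN

# Rough estimate under ideal packing (an assumption, not a reported figure).
DATASET_TOKENS = 159_858_904
steps_per_epoch = math.ceil(DATASET_TOKENS / tokens_per_step)
warmup_steps = round(training["warmup_ratio"] * steps_per_epoch)

print(f"blocks per step: {blocks_per_step}")
print(f"tokens per step: {tokens_per_step:,}")
print(f"approx. steps per epoch: {steps_per_epoch}, warmup steps: {warmup_steps}")
```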

💬 Prompt Format & Templates

This model follows the ChatML format. The assistant turn opens with a `<think>` ... `</think>` block containing the step-by-step reasoning, followed by the final response.

Template Structure:

<|im_start|>system
You are a helpful and knowledgeable cybersecurity expert assistant. You answer all user queries step by step with reasoning.<|im_end|>
<|im_start|>user
[Your cybersecurity query / task here]<|im_end|>
<|im_start|>assistant
<think>
[Step-by-step thinking process / Chain-of-Thought (CoT)]
</think>
[Detailed response / action plan / explanation]
<|im_end|>

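In practice `tokenizer.apply_chat_template` renders this layout. To make the structure explicit, the following sketch builds a single-turn prompt by hand and splits a completion into its reasoning and its final answer. `build_prompt` and `split_reasoning` are hypothetical helpers written for this illustration, the sample completion is made up, and the exact whitespace emitted by the model's own chat template may differ.

```python
import re


def build_prompt(system: str, user: str) -> str:
    """Render a single-turn ChatML prompt that ends with the open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Matches an optional leading <think>...</think> block, then the rest of the text.
_THINK_RE = re.compile(r"^\s*<think>(.*?)</think>\s*(.*)$", re.DOTALL)


def split_reasoning(completion: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is found."""
    text = completion.replace("<|im_end|>", "").strip()
    match = _THINK_RE.match(text)
    if match is None:
        return "", text
    return match.group(1).strip(), match.group(2).strip()


prompt = build_prompt(
    "You are a helpful and knowledgeable cybersecurity expert assistant.",
    "What is SQL injection?",
)
# Made-up completion, used only to exercise the parser.
reasoning, answer = split_reasoning(
    "<think>\nDefine the attack, then list defenses.\n</think>\n"
    "SQL injection is...<|im_end|>"
)
print(prompt)
print("REASONING:", reasoning)
print("ANSWER:", answer)
```
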
🚀 How to Load and Use

To load this model with Hugging Face's transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Local checkpoint directory; to load from the Hub, use "iselabvn/Tini-Cybersec-8B-A1B" instead.
model_name = "./Tini-Cybersec-8B-A1B_26062026"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Inference example
messages = [
    {"role": "system", "content": "You are a cybersecurity expert assistant."},
    {"role": "user", "content": "What is SQL injection, and how can it be prevented?"}
]

# Render the conversation with the model's chat template, ending with the open assistant turn
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.6,
    top_p=0.9,
    do_sample=True
)

# Decode only the newly generated tokens (everything after the prompt)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)

📄 License & Attribution

Base Model: Licensed under the Apache-2.0 license by LiquidAI.
Fine-tuned Weights: Apache-2.0 License.
Dataset Attribution: Please credit the original authors of AlicanKiraz0/Cybersecurity-Dataset-Heimdall-v1.1, AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1, Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset, nohurry/Opus-4.6-Reasoning-3000x-filtered, Jackrong/DeepSeek-V4-Distill-8000x, and Jackrong/Qwen3.5-reasoning-700x.
