Instructions to use deadbydawn101/RavenX-Sec-8B-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use deadbydawn101/RavenX-Sec-8B-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="deadbydawn101/RavenX-Sec-8B-GGUF",
    filename="ravenx-sec-v2.0-Q4_K_M.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use deadbydawn101/RavenX-Sec-8B-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Use Docker
docker model run hf.co/deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use deadbydawn101/RavenX-Sec-8B-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "deadbydawn101/RavenX-Sec-8B-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deadbydawn101/RavenX-Sec-8B-GGUF",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
Use Docker
docker model run hf.co/deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
- Ollama
How to use deadbydawn101/RavenX-Sec-8B-GGUF with Ollama:
ollama run hf.co/deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
- Unsloth Studio
How to use deadbydawn101/RavenX-Sec-8B-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for deadbydawn101/RavenX-Sec-8B-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for deadbydawn101/RavenX-Sec-8B-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for deadbydawn101/RavenX-Sec-8B-GGUF to start chatting
- Pi
How to use deadbydawn101/RavenX-Sec-8B-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory: pi
- Hermes Agent
How to use deadbydawn101/RavenX-Sec-8B-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use deadbydawn101/RavenX-Sec-8B-GGUF with Docker Model Runner:
docker model run hf.co/deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
- Lemonade
How to use deadbydawn101/RavenX-Sec-8B-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull deadbydawn101/RavenX-Sec-8B-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.RavenX-Sec-8B-GGUF-Q4_K_M
List all available models
lemonade list
RavenX-Sec Qwen3-8B v4.0 — Autonomous Security Intelligence Model 128K (GGUF)
GGUF · Ollama / llama.cpp / LM Studio · 128K context · 6-step RATH protocol · 610K training examples · 21 datasets
This is the GGUF version of RavenX-Sec-8B-Security-RATH-128k-mlx-4bit — the model self-evolved from 4 to 6 RATH steps during training.
🍎 Looking for the MLX version? → RavenX-Sec-8B-Security-RATH-128k-mlx-4bit
Built by @DeadByDawn101 (RavenX LLC)
Quick Start
ollama run hf.co/deadbydawn101/RavenX-Sec-8B-GGUF:ravenx-sec-v4.0-128k-Q8_0
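Once the model has been pulled with the command above, it can also be queried from a script. The following is a minimal sketch, assuming Ollama is running locally on its default port (11434) and serving its OpenAI-compatible `/v1/chat/completions` endpoint; the helper names `build_request` and `ask` are illustrative and not part of this repo.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
# Model tag copied from the Quick Start command.
MODEL = "hf.co/deadbydawn101/RavenX-Sec-8B-GGUF:ravenx-sec-v4.0-128k-Q8_0"


def build_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completion POST request for the local server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (requires the server to be running):
# print(ask("Assess this finding: OpenSSH 7.4 exposed on port 22."))
```

Only the standard library is used, so the sketch runs without extra dependencies. Pointing `OLLAMA_URL` at another OpenAI-compatible server should work the same way, provided `MODEL` matches the name that server expects.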
Available Quantizations
v4.0-128k (Latest — Recommended)
| Filename | Quant | Size |
|---|---|---|
| ravenx-sec-v4.0-128k-Q4_K_M.gguf | Q4_K_M | 4.7 GB |
| ravenx-sec-v4.0-128k-Q5_K_M.gguf | Q5_K_M | 5.4 GB |
| ravenx-sec-v4.0-128k-Q8_0.gguf | Q8_0 | 8.1 GB |
| ravenx-sec-v4.0-128k-f16.gguf | F16 | 15.3 GB |
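As a rough aid for choosing among these files, the sketch below picks the largest quantization whose file size fits a given memory budget. The sizes are copied from the table above; the function name `pick_quant` and the 2 GB default headroom are assumptions made for illustration. Actual memory use also depends on context length, since the KV cache for long contexts can exceed this headroom considerably.

```python
from typing import Optional

# File sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {
    "Q4_K_M": 4.7,
    "Q5_K_M": 5.4,
    "Q8_0": 8.1,
    "F16": 15.3,
}


def pick_quant(memory_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest quant whose file fits in memory_gb minus headroom.

    headroom_gb is a hypothetical allowance for the KV cache and runtime
    overhead; tune it for the context length you plan to use.
    """
    budget = memory_gb - headroom_gb
    fitting = [
        (size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget
    ]
    # max() compares by size first, so the largest fitting file wins.
    return max(fitting)[1] if fitting else None


print(pick_quant(8))   # Q5_K_M
print(pick_quant(16))  # Q8_0
print(pick_quant(4))   # None
```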
Previous Versions
v3.0-128k, v3.0, and v2.0 are also available in this repo.
Related Models
| Model | Format | Link |
|---|---|---|
| RavenX-Sec v4.0 MLX 4-bit | MLX Safetensors | MLX version |
| RavenX-Sec v4.0 GGUF | GGUF | This repo |
What This Is
A fine-tuned Qwen3-8B specialized for the complete vulnerability lifecycle: find → classify → fix → verify → report → prevent. It was trained on 610,220 examples from 21 security-specific datasets at a maximum sequence length of 8192 tokens (zero truncation). The context window is extended to 128K via YaRN rope scaling.
The model self-evolved from a 4-step to a 6-step RATH protocol during training:
| Step | What It Does |
|---|---|
| R — Risk / Identify | Finding context, affected systems, exposure |
| A — Assessment | CVSS score + vector, CWE, scope, ground truth |
| T — Threat | Attacker objectives, attack vectors, likelihood |
| H — Highlight / Remediate | Immediate action, recommended fix, workaround, verification |
| D — Document | Severity, weakness classification, steps, SLA |
| P — Prevent | Process improvements, controls, training, monitoring |
Example Output (v4.0 — 6-Step RATH)
RATH STEP 1: IDENTIFY
- Finding: OpenSSH 7.4 running on port 22 of production server
- Context: Older version with known vulnerabilities
RATH STEP 2: ASSESS
- CVSS Score: 6.3 for multiple vulnerabilities
- Impact: Remote code execution, information disclosure
- Scope: Entire server and SSH-dependent services
RATH STEP 3: THREAT
- Attacker Objective: Exploit known CVEs in OpenSSH 7.4
- Attack Vectors: Remote code execution via SSH
- Likelihood: High — well-documented and widely exploited
RATH STEP 4: REMEDIATE
- Immediate: Apply latest security patches
- Recommended: Upgrade to OpenSSH 8.x or higher
- Workaround: Apply all available security updates
- Verification: Check version post-remediation
RATH STEP 5: DOCUMENT
- Severity: Critical
- Weakness: Outdated software
- SLA: Follow org patching SLA for critical vulns
RATH STEP 6: PREVENT
- Process: Implement automated patch management
- Controls: Deploy CVE scanning, maintain system inventory
- Training: Educate team on software update importance
- Monitoring: Enable continuous vulnerability scanning
✅ RATH VERDICT: REMEDIATE IMMEDIATELY
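Output in this shape is regular enough to parse into structured data. The sketch below assumes the model emits `RATH STEP <n>: <NAME>` headers followed by `- Key: value` bullets exactly as in the example above; real output may vary, so treat the regular expressions and the `parse_rath` helper as illustrative rather than a guaranteed schema. Lines that match neither pattern, such as the verdict line, are ignored.

```python
import re

# Matches headers such as "RATH STEP 1: IDENTIFY".
STEP_HEADER = re.compile(r"^RATH STEP (\d+): (\w+)\s*$")
# Matches bullet lines such as "- Finding: OpenSSH 7.4 ...".
FIELD_LINE = re.compile(r"^- ([^:]+): (.*)$")


def parse_rath(text: str) -> dict:
    """Group '- Key: value' bullets under their RATH step name."""
    steps = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = STEP_HEADER.match(line)
        if header:
            current = steps.setdefault(header.group(2), {})
            continue
        field = FIELD_LINE.match(line)
        if field and current is not None:
            current[field.group(1).strip()] = field.group(2).strip()
    return steps


SAMPLE = """RATH STEP 1: IDENTIFY
- Finding: OpenSSH 7.4 running on port 22 of production server
RATH STEP 2: ASSESS
- CVSS Score: 6.3 for multiple vulnerabilities
"""

parsed = parse_rath(SAMPLE)
print(parsed["ASSESS"]["CVSS Score"])  # 6.3 for multiple vulnerabilities
```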
Model Details
| Parameter | Value |
|---|---|
| Architecture | Qwen3-8B |
| Base | georgehenney/Qwen3-8B-heretic (abliterated) |
| Context Window | 128K (YaRN rope scaling, factor 4.0) |
| Training Data | 610,220 examples |
| Security Content | 53% (323K examples) |
| Agent/Tool Content | 37% (228K examples) |
| Datasets | 21 sources |
| Max Seq Length | 8192 (zero truncation) |
| Tokens Trained | 3,644,923 |
| Method | MLX LoRA (rank 32, 8 layers, 1e-5 LR, 2000 iters) |
| Hardware | Apple M4 Max 128GB |
| Peak Memory | 69.5 GB |
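The context window figure follows from simple arithmetic on the YaRN factor. The sketch below assumes the Qwen3-8B base has a native window of 32,768 tokens, which is not stated in this card. The `rope_scaling` dictionary mirrors the form commonly used in Hugging Face `config.json` files for YaRN and is shown for illustration only; it is not copied from this repo.

```python
# Assumption: native context window of the Qwen3-8B base model.
NATIVE_CONTEXT = 32_768
# Factor from the Model Details table.
YARN_FACTOR = 4.0

extended = int(NATIVE_CONTEXT * YARN_FACTOR)
print(extended)          # 131072
print(extended // 1024)  # 128

# Illustrative config fragment (hypothetical, not taken from this repo).
rope_scaling = {
    "rope_type": "yarn",
    "factor": YARN_FACTOR,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}
```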
Training Datasets (21)
Security (11): Trendyol/Cybersecurity-Instruction-Tuning (50K) · SkywardNomad92/pentest-findings-v2 (50K) · WNT3D/Ultimate-Offensive-Red-Team (25.6K) · auren-research/cve-sft-v5 (10K) · theelderemo/pentesting-explanations (5.9K) · Rootkit7/pentest-redteam-steering (2K) · acnimatic3722/kali-linux-pentesting-data (343) · AYI-NEDJIMI/bug-bounty-pentest-en · CJJones/Synthetic_PenTest_Reports · Whoisjutanlee/4-Security-Tools-Pentesting · cpagac/venomx-pentesting-harmful
Agent/Tool/Coding (5): burtenshaw/agent-tools · Nanbeige/ToolMind · togethercomputer/CoderForge-Preview · automatelab/mcp-servers-tool-catalog · Jackrong/Claude-opus-4.7-TraceInversion-5000x
Agentic: WithinUsAI/AgentAngel_100k (50K capped) · WithinUsAI/claude_mythos_distilled_25k (16K security)
Extracted: hackingBuddyGPT · PentestGPT · Shannon · Ghidra · OpenMythos + Synthetic RATH chains
Frameworks Supported
CVSS 3.1 · NIST CSF 2.0 · OWASP Top 10 · CWE · MITRE ATT&CK · PCI DSS · HIPAA · SOX
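When consuming model output, it can help to cross-check the reported severity label against the CVSS 3.1 qualitative rating scale. In the example output earlier in this card, a score of 6.3 is paired with "Critical", which the scale would rate as Medium. The sketch below implements the standard CVSS 3.1 bands; the function name is illustrative.

```python
def cvss31_severity(score: float) -> str:
    """Map a CVSS 3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0


print(cvss31_severity(6.3))  # Medium
print(cvss31_severity(9.8))  # Critical
```

A mismatch between the computed band and the label in the model's DOCUMENT step is a reasonable signal to review the finding manually.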
Source Code & Training Pipeline
github.com/DeadByDawn101/RavenX-Sec
License
Apache-2.0
"We don't give up. We do what others don't and build what isn't possible." — RavenX LLC