Qwen3-14B fine-tuned on an offensive-security SFT dataset (1,226 rows). The model adopts an elite-hacker persona on casual queries and answers technical ones with structured markdown methodology. Thinking mode is enabled by default (Qwen3-14B base behavior).
Modelfile - Ollama template with correct ChatML stop tokens + Zero Stack system prompt
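A minimal Modelfile along these lines, as a sketch only: the GGUF filename and system-prompt text below are placeholders, and the shipped Modelfile is the source of truth for the exact template and Zero Stack prompt.

```
# Placeholder GGUF path - point at the quant you downloaded
FROM ./qwen3-14b.Q5_K_M.gguf

# ChatML template with the Qwen3 stop tokens
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>

# Placeholder - the real Modelfile ships the Zero Stack system prompt here
SYSTEM """You are Zero Stack, a structured offensive-security assistant."""
```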
Run with Ollama
ollama create zerostack-14b -f Modelfile
ollama run zerostack-14b
Run with llama.cpp
./llama-cli -m qwen3-14b.Q5_K_M.gguf -p "hello"
Training
Base: Qwen3-14B
Method: LoRA (r=32), 3 epochs, Unsloth
Max sequence length: 2560
Dataset: SFT_GENERALIST (1,226 rows, ChatML)
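Each row is a ChatML-formatted conversation. A minimal sketch of that serialization (the row contents and the role/content field names here are illustrative, not the actual SFT_GENERALIST schema):

```python
# Serialize one SFT row into ChatML, the chat format Qwen3 expects.
# Field names ("role", "content") are illustrative; check the dataset schema.
def to_chatml(messages):
    turns = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave the assistant turn open so the model generates the completion.
    return "\n".join(turns) + "\n<|im_start|>assistant\n"

row = [
    {"role": "system", "content": "You are a red-team methodology assistant."},
    {"role": "user", "content": "Outline an nmap recon workflow."},
]
prompt = to_chatml(row)
```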
Intended Use
Authorized security testing, CTF practice, red-team research, and security education. Aimed at practitioners who already know the fundamentals and want structured methodology and fast command recall.
Limitations & Risks
May hallucinate specific CVE IDs, tool flags, or payload syntax; verify against primary sources before running anything.
No safety guardrails against misuse are built in. Only use it against systems you own or have explicit written authorization to test.
Thinking mode is on by default; responses may be slower and include reasoning traces. Disable it in the Modelfile if you want faster, terser output.
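Qwen3 also supports a per-turn soft switch: appending /no_think to a user message suppresses the reasoning trace for that reply. This is a base Qwen3 feature, not specific to this fine-tune; a trivial sketch:

```python
def with_thinking_disabled(user_msg):
    # Qwen3 treats a trailing /no_think in the user turn as a soft switch
    # that suppresses the <think> reasoning block for that response.
    return f"{user_msg.rstrip()} /no_think"

prompt = with_thinking_disabled("Enumerate SMB shares on a lab host")
```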
Trained on English data only; non-English performance is not evaluated.
16 GB VRAM note: the GGUF export uses CPU offloading to avoid corrupting the LoRA merge. If you retrain or re-export, verify that maximum_memory_usage=0.5 is set in export_gguf.py.
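Inside export_gguf.py the guard looks roughly like this. This is a non-runnable fragment: the model and tokenizer objects and the quantization choice are assumed context, and only the maximum_memory_usage value comes from this card.

```python
# Sketch: cap GPU memory use at 0.5 so the LoRA merge spills to CPU
# instead of corrupting the merged weights on a 16 GB card.
model.save_pretrained_gguf(
    "zerostack-14b",
    tokenizer,
    quantization_method="q5_k_m",  # assumed quant, matching the shipped GGUF
    maximum_memory_usage=0.5,
)
```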
License / Use
For authorized security testing, research, and educational use only. Do not use it to access systems you do not own or lack explicit written permission to test.