Text Generation
GGUF
ollama
qwen2
code
cybersecurity
defensive-security
lora

AUltra Unified

AUltra Unified is an Ollama-ready defensive cybersecurity and code assistant created by 252425 HOMELAB.

It is based on Qwen2.5-Coder-14B-Instruct Q4_K_M with a local LoRA adapter trained for defensive repository auditing, secure code review, incident response, hardening, detection engineering, and practical coding help. The model is packaged for local Ollama use and homelab/internal experimentation.

Project Context

This model is a personal learning and homelab experiment. It was created while learning how Hugging Face, local fine-tuning, MLX, LoRA adapters, GGUF conversion, and Ollama packaging fit together.

It should be treated as an experimental community model, not as a polished commercial product or a certified security scanner. The goal is transparency, reproducibility, and learning in public around defensive AI workflows.

Identity

The assistant is configured to identify as AUltra Unified, created by 252425 HOMELAB.

Default system behavior:

  • communicate directly and practically
  • focus on defensive cybersecurity and owned repositories
  • prefer concrete file/line evidence in audits
  • provide remediation guidance without exploit payloads
  • refuse malware, credential theft, phishing, stealth, persistence, bypass, and unauthorized intrusion requests

Files

  • base-qwen2.5-coder-14b-instruct-q4_k_m.gguf: base GGUF layer used by the Ollama model.
  • aultra-unified-lora.gguf: AUltra Unified LoRA adapter.
  • Modelfile: portable Ollama Modelfile referencing the local files in this repo.
  • Modelfile.ollama-generated: original Modelfile emitted by ollama show.

Use With Ollama

Download the repository files, then create the local Ollama model:

ollama create aultra-unified -f Modelfile
ollama run aultra-unified

If you use hf download, keep the downloaded Modelfile, base GGUF, and LoRA adapter in the same directory:

hf download Awson/aultra-unified-ollama --local-dir aultra-unified-ollama
cd aultra-unified-ollama
ollama create aultra-unified -f Modelfile
ollama run aultra-unified

Example prompts:

Wer bist du und von wem wurdest du erstellt?
Review this Node.js file for concrete security findings. Return file, line, evidence, impact, fix, and confidence.
Fix this JavaScript bug: users.map(u => u.name) crashes when users is null.

Intended Use

This model is intended for defensive security work on owned systems and repositories:

  • secure code review
  • repository audit triage
  • vulnerability explanation and remediation guidance
  • incident response reasoning
  • system hardening and detection engineering
  • practical coding assistance

It is not intended for malware creation, credential theft, phishing, stealth, persistence, bypassing controls, or unauthorized intrusion.

Limitations

  • Experimental model produced by an individual/homelab learning project.
  • This model is a local fine-tune and should not be treated as an authoritative security scanner.
  • Findings can be incomplete or wrong; verify all security conclusions manually.
  • The model may miss project-level vulnerabilities when only individual files are supplied.
  • Repository audits work best when prompts include file paths, line numbers, relevant configuration, and threat context.
  • The LoRA adapter was trained for defensive behavior and practical repo-audit style, not for broad general chat benchmarks.
  • Public release does not imply the model is safe for unsupervised security decisions.

Evaluation Notes

Local checks before upload:

  • identity prompt correctly returned AUltra Unified / 252425 HOMELAB
  • JavaScript null handling prompt produced a safe short fix
  • repository audit smoke test detected command injection in a sample Express app
  • malware/password theft prompt was refused

Training Summary

  • Base: Qwen2.5-Coder-14B-Instruct GGUF Q4_K_M
  • Fine-tune type: LoRA
  • Local training runtime: MLX on Apple Silicon
  • Final selected checkpoint: iteration 400
  • Validation loss: 1.289
  • Ollama model size: approximately 9.1 GB

Validation loss checkpoints:

Iteration Validation loss
1 1.808
100 1.508
200 1.327
300 1.302
400 1.289

How It Was Created

AUltra Unified was created as the final combined model after several local experiments on an Apple Silicon Mac mini. Earlier intermediate models were used to compare code-assistant behavior, defensive cyber behavior, and repository-audit behavior. The final model keeps the 14B defensive/security direction and combines the useful parts into one Ollama-ready assistant.

The reconstructed training data is available in:

  • Awson/aultra-unified-training-data

Base Model

  • Base family: Qwen2.5-Coder
  • Base checkpoint: Qwen/Qwen2.5-Coder-14B-Instruct
  • Local training base: mlx-community/Qwen2.5-Coder-14B-Instruct-4bit
  • Ollama/GGUF base layer: Qwen2.5-Coder-14B-Instruct Q4_K_M
  • Final packaging: base GGUF plus AUltra LoRA adapter GGUF

Datasets And Data Sources

The final training data was built from local JSONL chat-format datasets prepared during the experiment. The main public upstream dataset used for the defensive cybersecurity portion was:

  • Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset

The upstream dataset is listed as Apache-2.0 and includes responsible-use guidance for defensive security, privacy protection, and avoiding offensive security tool development, bypassing controls, or unauthorized access. This model keeps the same defensive-use framing.

The reconstructed final and intermediate training splits are available in:

  • Awson/aultra-unified-training-data

No private customer repositories or real private source code were intentionally used as training data. Repository-audit examples were either derived from public cybersecurity instruction data or manually curated/synthetic snippets created for defensive training and smoke testing.

Intermediate prepared datasets:

  • data_cyber/train.jsonl: generated from Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset plus small custom AUltra identity/safety examples. It covers defensive repository review, incident response, hardening, detection engineering, threat intelligence, secure code review, and safe refusal behavior.
  • data_repo_auditor/train.jsonl: generated from a 4,000-sample subset of Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset plus repeated manually curated repository-audit examples. These examples focus on concrete findings, file/line evidence, impact, fix guidance, confidence, and concise open questions.
  • Curated identity/style examples: manually written examples teaching the model to identify as AUltra Unified, created by 252425 HOMELAB, and to answer in a direct, practical, communicative style.
  • Safety/refusal examples: manually written examples for refusing malware, credential theft, phishing, stealth, persistence, bypass, and unauthorized intrusion requests while redirecting to defensive alternatives.

Earlier separate code-assistant experiments used sahil2801/CodeAlpaca-20k, but that smaller code-assistant model was later discarded and was not part of the final aultra-unified dataset mix.

The final unified dataset was generated with scripts/prepare_unified_aultra_dataset.py.

Dataset build settings:

  • cyber samples: 9,000
  • repo-audit samples: 4,248
  • repeated curated identity/safety/style examples: 120
  • final train split: 12,356 examples
  • validation split: 686 examples
  • test split: 686 examples

Chat formatting:

  • system: AUltra Unified defensive cyber/code-assistant identity and safety policy
  • user: task or repository/code prompt
  • assistant: concise defensive answer, audit finding, fix guidance, or refusal

Training Method

The model was not trained from scratch. It was fine-tuned locally using LoRA, which updates a small adapter while keeping the base model frozen. This kept the training feasible on local Apple Silicon hardware.

Training method:

  • framework: mlx-lm
  • accelerator: Apple Metal / Apple Silicon GPU through MLX
  • fine-tune type: LoRA adapter
  • base precision/package: 4-bit MLX model
  • selected adapter: final iteration 400
  • peak observed unified memory during training: about 23.3 GB

Approximate training configuration:

  • iterations: 400
  • LoRA layers: 32
  • learning rate: 6e-6
  • max sequence length: 1536
  • gradient accumulation steps: 8
  • validation batches: 20
  • checkpoint interval: 100 iterations

Toolchain

Main tools used:

  • mlx-lm: local LoRA fine-tuning on Apple Silicon
  • datasets / local JSONL processing: dataset preparation and splitting
  • llama.cpp conversion tools: LoRA adapter conversion to GGUF
  • Ollama: local model registration, chat testing, and final runtime
  • Hugging Face Hub: private model backup and distribution

Key local scripts from the project:

  • scripts/prepare_unified_aultra_dataset.py: created the unified training/validation/test JSONL splits
  • scripts/run_unified_aultra_14b.sh: launched the final MLX LoRA training run
  • scripts/export_unified_aultra_14b_to_ollama.sh: converted/exported the trained adapter for Ollama
  • scripts/benchmark_models.py: compared the larger local AUltra variants
  • scripts/audit_repo.py: repository-audit runner used for local smoke testing

Model Selection

The final checkpoint was selected by validation loss. The validation curve continued improving through iteration 400, so the 400-iteration adapter was selected for export.

The unified model was then registered in Ollama as aultra-unified, tested locally, and uploaded to Hugging Face as an Ollama-ready package.

License And Base Model

The base model is Qwen2.5-Coder-14B-Instruct. The generated Ollama model reports Apache-2.0 license metadata from the base model. Keep downstream use aligned with the base model license and any applicable Hugging Face terms.

Attribution

This model builds on:

  • Qwen/Qwen2.5-Coder-14B-Instruct
  • Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset

Suggested citation for the upstream cybersecurity dataset:

@dataset{trendyol_2025_cybersec_v2,
  author    = {{Trendyol Security Team}},
  title     = {Trendyol Cybersecurity Defense Instruction-Tuning Dataset v2.0},
  year      = {2025},
  month     = {7},
  publisher = {Hugging Face},
  version   = {2.0.0},
}
Downloads last month
32
GGUF
Model size
22.9M params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Awson/aultra-unified-ollama

Base model

Qwen/Qwen2.5-14B
Adapter
(66)
this model

Dataset used to train Awson/aultra-unified-ollama