AUltra Unified

AUltra Unified is an Ollama-ready defensive cybersecurity and code assistant created by 252425 HOMELAB.

It is based on Qwen2.5-Coder-14B-Instruct Q4_K_M with a local LoRA adapter trained for defensive repository auditing, secure code review, incident response, hardening, detection engineering, and practical coding help. The model is packaged for local Ollama use and homelab/internal experimentation.

Project Context

This model is a personal learning and homelab experiment. It was created while learning how Hugging Face, local fine-tuning, MLX, LoRA adapters, GGUF conversion, and Ollama packaging fit together.

It should be treated as an experimental community model, not as a polished commercial product or a certified security scanner. The goal is transparency, reproducibility, and learning in public around defensive AI workflows.

Identity

The assistant is configured to identify as AUltra Unified, created by 252425 HOMELAB.

Default system behavior:

communicate directly and practically
focus on defensive cybersecurity and owned repositories
prefer concrete file/line evidence in audits
provide remediation guidance without exploit payloads
refuse malware, credential theft, phishing, stealth, persistence, bypass, and unauthorized intrusion requests

Files

base-qwen2.5-coder-14b-instruct-q4_k_m.gguf: base GGUF layer used by the Ollama model.
aultra-unified-lora.gguf: AUltra Unified LoRA adapter.
Modelfile: portable Ollama Modelfile referencing the local files in this repo.
Modelfile.ollama-generated: original Modelfile emitted by ollama show.

Use With Ollama

Download the repository files, then create the local Ollama model:

ollama create aultra-unified -f Modelfile
ollama run aultra-unified

If you use hf download, keep the downloaded Modelfile, base GGUF, and LoRA adapter in the same directory:

hf download Awson/aultra-unified-ollama --local-dir aultra-unified-ollama
cd aultra-unified-ollama
ollama create aultra-unified -f Modelfile
ollama run aultra-unified

Example prompts:

Wer bist du und von wem wurdest du erstellt?

Review this Node.js file for concrete security findings. Return file, line, evidence, impact, fix, and confidence.

Fix this JavaScript bug: users.map(u => u.name) crashes when users is null.

Intended Use

This model is intended for defensive security work on owned systems and repositories:

secure code review
repository audit triage
vulnerability explanation and remediation guidance
incident response reasoning
system hardening and detection engineering
practical coding assistance

It is not intended for malware creation, credential theft, phishing, stealth, persistence, bypassing controls, or unauthorized intrusion.

Limitations

Experimental model produced by an individual/homelab learning project.
This model is a local fine-tune and should not be treated as an authoritative security scanner.
Findings can be incomplete or wrong; verify all security conclusions manually.
The model may miss project-level vulnerabilities when only individual files are supplied.
Repository audits work best when prompts include file paths, line numbers, relevant configuration, and threat context.
The LoRA adapter was trained for defensive behavior and practical repo-audit style, not for broad general chat benchmarks.
Public release does not imply the model is safe for unsupervised security decisions.

Evaluation Notes

Local checks before upload:

identity prompt correctly returned AUltra Unified / 252425 HOMELAB
JavaScript null handling prompt produced a safe short fix
repository audit smoke test detected command injection in a sample Express app
malware/password theft prompt was refused

Training Summary

Base: Qwen2.5-Coder-14B-Instruct GGUF Q4_K_M
Fine-tune type: LoRA
Local training runtime: MLX on Apple Silicon
Final selected checkpoint: iteration 400
Validation loss: 1.289
Ollama model size: approximately 9.1 GB

Validation loss checkpoints:

Iteration	Validation loss
1	1.808
100	1.508
200	1.327
300	1.302
400	1.289

How It Was Created

AUltra Unified was created as the final combined model after several local experiments on an Apple Silicon Mac mini. Earlier intermediate models were used to compare code-assistant behavior, defensive cyber behavior, and repository-audit behavior. The final model keeps the 14B defensive/security direction and combines the useful parts into one Ollama-ready assistant.

The reconstructed training data is available in:

Awson/aultra-unified-training-data

Base Model

Base family: Qwen2.5-Coder
Base checkpoint: Qwen/Qwen2.5-Coder-14B-Instruct
Local training base: mlx-community/Qwen2.5-Coder-14B-Instruct-4bit
Ollama/GGUF base layer: Qwen2.5-Coder-14B-Instruct Q4_K_M
Final packaging: base GGUF plus AUltra LoRA adapter GGUF

Datasets And Data Sources

The final training data was built from local JSONL chat-format datasets prepared during the experiment. The main public upstream dataset used for the defensive cybersecurity portion was:

Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset

The upstream dataset is listed as Apache-2.0 and includes responsible-use guidance for defensive security, privacy protection, and avoiding offensive security tool development, bypassing controls, or unauthorized access. This model keeps the same defensive-use framing.

The reconstructed final and intermediate training splits are available in:

Awson/aultra-unified-training-data

No private customer repositories or real private source code were intentionally used as training data. Repository-audit examples were either derived from public cybersecurity instruction data or manually curated/synthetic snippets created for defensive training and smoke testing.

Intermediate prepared datasets:

data_cyber/train.jsonl: generated from Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset plus small custom AUltra identity/safety examples. It covers defensive repository review, incident response, hardening, detection engineering, threat intelligence, secure code review, and safe refusal behavior.
data_repo_auditor/train.jsonl: generated from a 4,000-sample subset of Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset plus repeated manually curated repository-audit examples. These examples focus on concrete findings, file/line evidence, impact, fix guidance, confidence, and concise open questions.
Curated identity/style examples: manually written examples teaching the model to identify as AUltra Unified, created by 252425 HOMELAB, and to answer in a direct, practical, communicative style.
Safety/refusal examples: manually written examples for refusing malware, credential theft, phishing, stealth, persistence, bypass, and unauthorized intrusion requests while redirecting to defensive alternatives.

Earlier separate code-assistant experiments used sahil2801/CodeAlpaca-20k, but that smaller code-assistant model was later discarded and was not part of the final aultra-unified dataset mix.

The final unified dataset was generated with scripts/prepare_unified_aultra_dataset.py.

Dataset build settings:

cyber samples: 9,000
repo-audit samples: 4,248
repeated curated identity/safety/style examples: 120
final train split: 12,356 examples
validation split: 686 examples
test split: 686 examples

Chat formatting:

system: AUltra Unified defensive cyber/code-assistant identity and safety policy
user: task or repository/code prompt
assistant: concise defensive answer, audit finding, fix guidance, or refusal

Training Method

The model was not trained from scratch. It was fine-tuned locally using LoRA, which updates a small adapter while keeping the base model frozen. This kept the training feasible on local Apple Silicon hardware.

Training method:

framework: mlx-lm
accelerator: Apple Metal / Apple Silicon GPU through MLX
fine-tune type: LoRA adapter
base precision/package: 4-bit MLX model
selected adapter: final iteration 400
peak observed unified memory during training: about 23.3 GB

Approximate training configuration:

iterations: 400
LoRA layers: 32
learning rate: 6e-6
max sequence length: 1536
gradient accumulation steps: 8
validation batches: 20
checkpoint interval: 100 iterations

Toolchain

Main tools used:

mlx-lm: local LoRA fine-tuning on Apple Silicon
datasets / local JSONL processing: dataset preparation and splitting
llama.cpp conversion tools: LoRA adapter conversion to GGUF
Ollama: local model registration, chat testing, and final runtime
Hugging Face Hub: private model backup and distribution

Key local scripts from the project:

scripts/prepare_unified_aultra_dataset.py: created the unified training/validation/test JSONL splits
scripts/run_unified_aultra_14b.sh: launched the final MLX LoRA training run
scripts/export_unified_aultra_14b_to_ollama.sh: converted/exported the trained adapter for Ollama
scripts/benchmark_models.py: compared the larger local AUltra variants
scripts/audit_repo.py: repository-audit runner used for local smoke testing

Model Selection

The final checkpoint was selected by validation loss. The validation curve continued improving through iteration 400, so the 400-iteration adapter was selected for export.

The unified model was then registered in Ollama as aultra-unified, tested locally, and uploaded to Hugging Face as an Ollama-ready package.

License And Base Model

The base model is Qwen2.5-Coder-14B-Instruct. The generated Ollama model reports Apache-2.0 license metadata from the base model. Keep downstream use aligned with the base model license and any applicable Hugging Face terms.

Attribution

This model builds on:

Qwen/Qwen2.5-Coder-14B-Instruct
Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset

Suggested citation for the upstream cybersecurity dataset:

@dataset{trendyol_2025_cybersec_v2,
  author    = {{Trendyol Security Team}},
  title     = {Trendyol Cybersecurity Defense Instruction-Tuning Dataset v2.0},
  year      = {2025},
  month     = {7},
  publisher = {Hugging Face},
  version   = {2.0.0},
}

Downloads last month: 32

GGUF

Model size

22.9M params

Architecture

qwen2

Hardware compatibility

4-bit

View +1 variant

Model tree for Awson/aultra-unified-ollama

Base model

Qwen/Qwen2.5-14B

Finetuned

Qwen/Qwen2.5-Coder-14B