Aero-Deuce: LoRA Adapter
A fine-tuned Gemma 4 12B instruction-following model, trained with QLoRA and the Muon optimizer on 30K samples. This repo contains the LoRA adapter only (~262 MB); you need the base model to use it.
Which format should I use?
| Format | Best for | Link |
|---|---|---|
| LoRA Adapter (you are here) | Merging with base model, further fine-tuning | This repo |
| GGUF Q4_K_M | Local inference, llama.cpp, LM Studio, GPT4All (~7 GB) | ZeZZm/aero-deuce-GGUF |
| MLX 4-bit | Apple Silicon (Mac), fastest on M-series chips (~6.3 GB) | ZeZZm/aero-deuce-MLX |
Just want to run it locally? Use the GGUF version: it's one file, no Python needed, and it works in LM Studio and GPT4All.
Download
Click Files and versions above. The adapter file is adapter_model.safetensors (~262 MB).
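The same file can also be fetched from a script. This is a minimal sketch, assuming the `huggingface_hub` package is installed; `download_adapter` is a hypothetical helper name, while the repo ID and filename are the ones given on this page.

```python
REPO_ID = "ZeZZm/aero-deuce"
ADAPTER_FILE = "adapter_model.safetensors"


def download_adapter(repo_id: str = REPO_ID, filename: str = ADAPTER_FILE) -> str:
    """Download one file from the Hub and return its local cache path.

    Hypothetical helper for illustration; it only wraps hf_hub_download.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


# Usage (needs network access):
# path = download_adapter()
# print(path)
```

Loading the adapter with `PeftModel.from_pretrained` downloads the files for you, so a manual download is mainly useful for inspecting or archiving the weights.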
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-12b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-12b-it",
    device_map="auto",
    torch_dtype="auto",
)

# Load Aero-Deuce adapter
model = PeftModel.from_pretrained(model, "ZeZZm/aero-deuce")

# Merge for faster inference (optional)
model = model.merge_and_unload()

# Generate
messages = [{"role": "user", "content": "Write a Python function to reverse a linked list."}]
# return_dict=True yields input_ids plus attention_mask, regardless of the library's default
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
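To see why merging is safe and why it speeds up inference, here is a toy sketch of the standard LoRA update, `W_merged = W + (alpha / r) * B @ A`. It is not PEFT's actual implementation and it ignores dropout and quantization. The matrix sizes are made up for illustration; only the `alpha / r = 2` ratio matches this adapter's `r=16, alpha=32`.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_scaled(m, delta, scale):
    """Return m + scale * delta, element-wise."""
    return [[mv + scale * dv for mv, dv in zip(mr, dr)] for mr, dr in zip(m, delta)]


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
d_out, d_in, r, alpha = 4, 3, 2, 4  # toy sizes (assumed); alpha / r = 2
scale = alpha / r

W = rand_matrix(d_out, d_in)  # frozen base weight
A = rand_matrix(r, d_in)      # LoRA A: r x d_in
B = rand_matrix(d_out, r)     # LoRA B: d_out x r
x = rand_matrix(d_in, 1)      # one input column vector

# Unmerged adapter: base path plus the scaled low-rank path (two extra matmuls).
unmerged = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Merged adapter: fold the low-rank update into the weight once, then one matmul.
W_merged = add_scaled(W, matmul(B, A), scale)
merged = matmul(W_merged, x)

max_diff = max(abs(u[0] - m[0]) for u, m in zip(unmerged, merged))
print(f"max difference between merged and unmerged outputs: {max_diff:.2e}")
```

The two paths agree up to floating-point rounding, so merging changes the cost of a forward pass but not its result.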
Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-4-12b-it (12B params) |
| Training Method | QLoRA: 4-bit NF4 + LoRA r=16, alpha=32 |
| Trainable Params | 65.6M (0.55% of total) |
| Training Data | 30K samples (Alpaca 15K, Dolly 10K, No Robots 5K) |
| Optimizer | Muon (LoRA A/B) + AdamW (non-2D params) |
| Training Steps | 2,000 |
| Final Train Loss | 0.57 (from 3.82) |
| Final Val Loss | 1.04 |
| Adapter Size | ~262 MB |
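A quick sanity check of the figures in the table, as a sketch rather than anything taken from the training code. It assumes that "MB" means 10^6 bytes and that the adapter file holds little besides the trainable weights. Under those assumptions the size works out to roughly 4 bytes per trainable parameter, which would be consistent with float32 storage; that is an inference, not something this page states.

```python
# Values copied from the Model Details table.
trainable_params = 65.6e6
trainable_fraction = 0.0055  # 0.55% of total
lora_r, lora_alpha = 16, 32
adapter_size_mb = 262        # assumed to mean 10^6-byte megabytes

# Total parameter count implied by the trainable share.
implied_total = trainable_params / trainable_fraction

# Standard LoRA scaling factor applied to the low-rank update.
lora_scale = lora_alpha / lora_r

# Storage cost per trainable parameter, assuming the file is almost all weights.
bytes_per_param = adapter_size_mb * 1e6 / trainable_params

print(f"implied total params: {implied_total / 1e9:.1f}B")
print(f"LoRA scale (alpha / r): {lora_scale}")
print(f"bytes per trainable param: {bytes_per_param:.2f}")
```

The implied total lands close to the 12B stated for the base model.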
Training Infrastructure
| Platform | GPU | Steps |
|---|---|---|
| Modal (spot) | A100 | 0–1000 |
| Lightning AI | A100-80GB | 1000–2000 |
Source Code
Full training code, configuration, and detailed results: github.com/Ryz3nPlayZ/aero-deuce
License
Apache 2.0