Aero-Deuce: LoRA Adapter

Aero-Deuce is an instruction-following fine-tune of Gemma 4 12B, trained with QLoRA and the Muon optimizer on 30K samples. This repo contains only the LoRA adapter (~262 MB); you need the base model to use it.

Which format should I use?

Format | Best for | Link
LoRA Adapter ← you are here | Merging with base model, further fine-tuning | This repo
GGUF Q4_K_M | Local inference, llama.cpp, LM Studio, GPT4All (~7 GB) | ZeZZm/aero-deuce-GGUF
MLX 4-bit | Apple Silicon (Mac), fastest on M-series chips (~6.3 GB) | ZeZZm/aero-deuce-MLX

Just want to run it locally? Use the GGUF version: it's a single file, needs no Python, and works in LM Studio and GPT4All.

Download

Click Files and versions above. The adapter file is adapter_model.safetensors (~262 MB).
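If you would rather fetch the file from a script than from the web page, the sketch below is one way to do it. It assumes the `huggingface_hub` package is installed and that the repo is publicly readable; `fetch_adapter` is a hypothetical helper name, not something this repo ships.

```python
from pathlib import Path

REPO_ID = "ZeZZm/aero-deuce"                # adapter repo named in this card
ADAPTER_FILE = "adapter_model.safetensors"  # file name given in the Download section


def fetch_adapter(cache_dir=None) -> Path:
    """Download the adapter weights (or reuse the local cache) and return the file path."""
    # Imported here so the module can be loaded even if huggingface_hub is missing.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=ADAPTER_FILE,
        cache_dir=cache_dir,
    )
    return Path(local_path)
```

Calling `fetch_adapter()` returns the cached path of the file. This step is optional when you load the adapter with `PeftModel.from_pretrained`, which downloads the adapter files itself.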

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-12b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-12b-it",
    device_map="auto",
    torch_dtype="auto",
)

# Load Aero-Deuce adapter
model = PeftModel.from_pretrained(model, "ZeZZm/aero-deuce")

# Merge for faster inference (optional)
model = model.merge_and_unload()

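# Generate
messages = [{"role": "user", "content": "Write a Python function to reverse a linked list."}]
# return_dict=True also returns the attention mask alongside input_ids
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))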
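If the goal is a standalone merged checkpoint rather than a one-off generation, the same steps can be wrapped as below. This is a minimal sketch: `merge_and_save` and the output directory are hypothetical, and it assumes enough memory to hold the base model at its stored precision.

```python
def merge_and_save(base_id: str, adapter_id: str, out_dir: str) -> str:
    """Fold the LoRA weights into the base model and write a standalone checkpoint."""
    # Heavy imports stay inside the function so importing this module is cheap.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_id, device_map="auto", torch_dtype="auto"
    )
    # merge_and_unload() returns the base architecture with the adapter folded in.
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
    merged.save_pretrained(out_dir)
    # Save the tokenizer alongside so the directory loads on its own.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

For example, `merge_and_save("google/gemma-4-12b-it", "ZeZZm/aero-deuce", "aero-deuce-merged")` would leave a directory that loads with `AutoModelForCausalLM.from_pretrained` alone, without `peft`.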

Model Details

Property | Value
Base Model | google/gemma-4-12b-it (12B params)
Training Method | QLoRA: 4-bit NF4 + LoRA r=16, alpha=32
Trainable Params | 65.6M (0.55% of total)
Training Data | 30K samples (Alpaca 15K, Dolly 10K, No Robots 5K)
Optimizer | Muon (LoRA A/B) + AdamW (non-2D params)
Training Steps | 2,000
Final Train Loss | 0.57 (from 3.82)
Final Val Loss | 1.04
Adapter Size | ~262 MB
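The figures in this table can be cross-checked with a little arithmetic. The sketch below assumes the adapter is stored in float32 (4 bytes per parameter), treats "12B" as exactly 12e9 base parameters, and assumes the standard LoRA scaling factor alpha / r. None of these assumptions is stated in the card, and the helper names are illustrative.

```python
TRAINABLE_PARAMS = 65.6e6   # from the table
BASE_PARAMS = 12e9          # "12B", treated as exact (assumption)
BYTES_PER_PARAM = 4         # assumption: adapter saved as float32


def adapter_size_mb(n_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate on-disk size in decimal megabytes, ignoring file metadata."""
    return n_params * bytes_per_param / 1e6


def trainable_percent(n_trainable: float, n_base: float) -> float:
    """Trainable parameters as a percentage of the base model's parameters."""
    return 100 * n_trainable / n_base


def lora_scale(r: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / r


print(round(adapter_size_mb(TRAINABLE_PARAMS), 1))                  # 262.4
print(round(trainable_percent(TRAINABLE_PARAMS, BASE_PARAMS), 2))   # 0.55
print(lora_scale(r=16, alpha=32))                                   # 2.0
```

Under these assumptions the numbers line up with the reported ~262 MB adapter size and 0.55% trainable fraction.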

Training Infrastructure

Platform | GPU | Steps
Modal (spot) | A100 | 0–1000
Lightning AI | A100-80GB | 1000–2000
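The Optimizer entry under Model Details describes a split between Muon (LoRA A/B matrices) and AdamW (non-2D parameters). One way such a split might be expressed is sketched below. It is an illustration, not the repo's training code: `split_for_muon` is a hypothetical helper, the parameter names follow PEFT's `lora_A`/`lora_B` naming, and the 4096 dimension is made up.

```python
def split_for_muon(named_shapes: dict) -> tuple:
    """Route 2D LoRA matrices to Muon and every other parameter to AdamW."""
    muon, adamw = [], []
    for name, shape in named_shapes.items():
        is_lora = "lora_A" in name or "lora_B" in name
        if is_lora and len(shape) == 2:
            muon.append(name)    # Muon's orthogonalized update needs a matrix
        else:
            adamw.append(name)   # 1D tensors such as norm scales or biases
    return muon, adamw


# Hypothetical trainable parameters (names and shapes are illustrative only).
example = {
    "layers.0.self_attn.q_proj.lora_A.default.weight": (16, 4096),
    "layers.0.self_attn.q_proj.lora_B.default.weight": (4096, 16),
    "layers.0.input_layernorm.weight": (4096,),
}
muon_names, adamw_names = split_for_muon(example)
print(len(muon_names), len(adamw_names))  # 2 1
```

In a real training loop the two name lists would become two parameter groups, each handed to its own optimizer instance. The actual grouping used for this model is in the source repository linked below.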

Source Code

Full training code, configuration, and detailed results: github.com/Ryz3nPlayZ/aero-deuce

License

Apache 2.0
