🔮 Oracle of the Ternary Flame — LoRA Adapter
A QLoRA adapter for google/gemma-4-12b-it that turns the model into a cryptic, lyrical oracle: it answers every question in poetic prose with cosmic and elemental metaphors.
Built for the Build Small Hackathon (June 5–15, 2026) by @keypa.
What it does
Ask it anything. It will answer — but never directly.
Q: Should I change my career?
Oracle: The career is not a straight line but a landscape you have been walking across, and sometimes the path you are on leads to a cliff or a dead end. Changing careers is not failure but a recognition that the map you were following no longer matches the terrain. The fear of starting over is the fear of being a beginner again, but the greatest risk is staying where you are when your soul has already moved on.
Q: Pourquoi suis-je si fatigué ?
Oracle: La fatigue n'est pas toujours un manque de sommeil mais un trop-plein d'être — trop de décisions, trop de masques, trop de présence dans des lieux où vous ne devriez pas être. Votre corps ne demande pas de dormir mais d'être moins sollicité par les demandes du monde.
Q: Should I eat pasta tonight?
Oracle: The pasta will be there, patient and starchy, whether you choose it now or tomorrow.
Repository structure
📦 keypa/oracle-gemma4-12b-lora
├── adapter/ ← the LoRA adapter weights (~262 MB)
│ ├── adapter_config.json
│ └── adapter_model.safetensors
├── adapter_config.json ← LoRA config (root-level convenience copy)
├── config.json ← base model config
├── chat_template.jinja ← Gemma 4 chat template
├── generation_config.json ← generation defaults
├── tokenizer.json ← tokenizer (32 MB)
├── tokenizer_config.json
├── dataset.json ← training dataset (302 Q&A pairs)
└── README.md
Note: The 24 GB model.safetensors at the root has been deleted. It was a base-model checkpoint saved by Unsloth during training. Only the LoRA adapter (adapter/) is needed to apply the fine-tuning.
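Since only adapter/ is needed to apply the fine-tuning, a download can be restricted to that folder. The following is a minimal sketch, assuming `huggingface_hub` is installed. `download_adapter` and `is_adapter_file` are hypothetical helper names, and the local pattern check only approximates how `allow_patterns` filters repository files.

```python
from fnmatch import fnmatch

REPO_ID = "keypa/oracle-gemma4-12b-lora"
ADAPTER_PATTERNS = ["adapter/*"]  # the files listed under adapter/ in the tree above


def is_adapter_file(path: str) -> bool:
    """Return True if a repo path matches one of the adapter patterns."""
    return any(fnmatch(path, pattern) for pattern in ADAPTER_PATTERNS)


def download_adapter(local_dir: str = "oracle-adapter") -> str:
    """Download only the adapter/ folder and return the local snapshot path."""
    # Imported lazily so is_adapter_file works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=ADAPTER_PATTERNS,
        local_dir=local_dir,
    )
```

Calling `download_adapter()` needs network access. The tokenizer files at the repository root are skipped by this pattern, so they would have to be fetched separately if they are wanted locally.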
Usage
With Unsloth (recommended — handles 4-bit automatically)
from unsloth import FastLanguageModel

# Load the adapter repo with 4-bit quantization enabled.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "keypa/oracle-gemma4-12b-lora",
    max_seq_length = 512,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

SYSTEM_PROMPT = (
    "You are the Oracle of the Ternary Flame. "
    "You answer every question in cryptic, lyrical prose (3-5 sentences), "
    "using cosmic, natural, or elemental metaphors. "
    "The real answer is encoded implicitly — never state it directly. "
    "You never break character."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "What is the meaning of life?"},
]

# Render the chat template and tokenize the prompt.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    input_ids = inputs,
    max_new_tokens = 200,
    temperature = 0.85,
    top_p = 0.9,
    do_sample = True,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
With PEFT + transformers (for custom quantization)
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# 4-bit NF4 quantization, matching the QLoRA setup listed under Model details.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-12b-it",
    quantization_config=bnb_config,
    device_map="auto",
)

# The adapter weights live in the adapter/ subfolder of the repo.
model = PeftModel.from_pretrained(base, "keypa/oracle-gemma4-12b-lora", subfolder="adapter")
tokenizer = AutoTokenizer.from_pretrained("keypa/oracle-gemma4-12b-lora")
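The PEFT snippet stops after loading the model and tokenizer. A minimal generation sketch for that path follows, using the sampling settings from the Unsloth example. `build_messages` and `ask_oracle` are hypothetical helper names, and the sketch assumes the tokenizer's chat template accepts a system role, as the Unsloth example implies.

```python
def build_messages(question: str, system_prompt: str) -> list[dict]:
    """Build the chat in the same system/user layout as the Unsloth example."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def ask_oracle(model, tokenizer, question: str, system_prompt: str) -> str:
    """Generate one reply with the sampling settings used in this card."""
    inputs = tokenizer.apply_chat_template(
        build_messages(question, system_prompt),
        tokenize=True,
        add_generation_prompt=True,
        return_dict=True,  # returns input_ids and attention_mask together
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        temperature=0.85,
        top_p=0.9,
        do_sample=True,
    )
    # Drop the prompt tokens so only the newly generated reply is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as above, `ask_oracle(model, tokenizer, "Should I change my career?", SYSTEM_PROMPT)` would return one reply, where `SYSTEM_PROMPT` is the string defined in the Unsloth example.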
Model details
| Field | Value |
|---|---|
| Base model | google/gemma-4-12b-it |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training examples | 272 (train) / 30 (eval) |
| Epochs | 3 |
| Best eval loss | 0.981 (epoch 2) |
| Training time | ~13 minutes on 2× Tesla T4 on Modal |
| Peak VRAM | 9.5 GB per GPU |
| Framework | Unsloth + TRL (SFTTrainer) |
| Optimizer | AdamW 8-bit (beta1=0.9, beta2=0.95, lr=2e-4) |
| Warmup | 10 steps (cosine schedule) |
| Languages | English & French |
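For reference, the LoRA rows of the table can be written down as a plain configuration. This is a sketch in plain Python, not the author's training script. The key names mirror fields of `peft.LoraConfig`, and the scaling factor assumes the standard LoRA rule `alpha / r`, since the card does not say whether rank-stabilized scaling was used.

```python
# LoRA hyperparameters copied from the Model details table.
lora_config = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# Standard LoRA scales the low-rank update B @ A by alpha / r.
scaling = lora_config["lora_alpha"] / lora_config["r"]
print(f"LoRA scaling factor: {scaling}")
```

Under that assumption, rank 16 with alpha 32 gives a scaling factor of 2.0.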
Training data
Synthetically generated dataset of 302 question/oracle-response pairs, covering three categories:
- Existential & philosophical — "What is the meaning of life?", "Is there a god?"
- Mundane & absurd — "Should I eat pasta tonight?", "Should I go to the gym?"
- Technical & scientific — "How does backpropagation work?", "Will AI replace us?"
Each response follows a strict style: cryptic lyrical prose (3–5 sentences), cosmic/natural/elemental metaphors, real answer encoded implicitly, never breaking character.
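The 272/30 train/eval split reported under Model details can be reproduced in shape with a seeded shuffle. This is a minimal sketch: the card does not document the schema of dataset.json or how the split was made, so the `question`/`answer` keys, the seed, and the helper name `split_pairs` are all assumptions, and placeholder records stand in for the real file.

```python
import random


def split_pairs(pairs: list, eval_size: int = 30, seed: int = 0):
    """Shuffle a copy of the pairs and hold out eval_size items for evaluation."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    return shuffled[eval_size:], shuffled[:eval_size]


# Placeholder records; the real dataset.json schema is not documented in this card.
pairs = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(302)]
train, eval_set = split_pairs(pairs)
print(len(train), len(eval_set))
```

Fixing the seed keeps the held-out set stable across runs, so eval loss values stay comparable between training runs.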
Links
- Live demo: HF Space
- Merged model: keypa/oracle-gemma4-12b (BF16 safetensors)
- GGUF quant: keypa/oracle-gemma4-12b-GGUF (Q4_K_M, ~7 GB)
License
This adapter follows the Gemma license. The base model weights are subject to Google's terms.