VLSI-Gemma-v3

A Gemma 4 26B (MoE) model fine-tuned via Reinforcement Learning (GRPO) specifically for VLSI design and verification tasks. Trained on 547 VLSI questions across 7 domains.

HF repo: https://huggingface.co/vxkyyy/vlsi-gemma-v3

Model Details

Property Value
Base model google/gemma-4-26b-a4b-it
Training method GRPO (Group Relative Policy Optimization)
Platform Castform
Parameters 26B total (4B active, MoE)
Adapter type LoRA (rank 128, alpha 256)
Best checkpoint Step 159 (eval correct 0.896)
Total training steps 279

Training Data

547 questions across 7 VLSI domains:

Domain Count
Analog/Mixed-Signal 140
Digital RTL Design 119
Verification (UVM/SystemVerilog) 68
PDK / Physical Design 66
Synthesis / STA 56
General VLSI 56
Mixed Hard Problems 42

Performance

Metrics on held-out eval set (110 questions):

Metric Value
Correct reward (mean) 0.896
Correct reward (max@8) 0.942
Total reward 1.291
Quality reward 0.298
Structure reward 0.049
Syntax (compile) reward 0.018

Comparison vs Qwen 3.5-4B baseline:

  • +11.6% improvement over v2 Qwen eval correct (0.789 โ†’ 0.880)
  • Best-of-8 eval: 0.942 correct pass rate

Usage

Quick start (with LoRA adapter)

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "google/gemma-4-26b-a4b-it"
adapter_path = "./vlsi-gemma-v3-checkpoint-159"

# Load base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype="auto",
)

# Apply LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_path)

# Ask a VLSI question
prompt = "Write synthesizable Verilog for a 4-bit synchronous counter with parallel load and active-low reset."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))

Merge and export

merged = model.merge_and_unload()
merged.save_pretrained("./vlsi-gemma-v3-merged")
tokenizer.save_pretrained("./vlsi-gemma-v3-merged")

Training Details

  • Environment: Custom VLSI Q&A environment with 5 reward components:
    • Correctness (keyword matching against ground truth) โ€” weight: 1.0
    • Verilog syntax (compile-check via pyverilog parser) โ€” weight: 0.02
    • Code quality (presence of code blocks) โ€” weight: 0.05
    • Answer quality (technical depth, vocabulary density) โ€” weight: 0.3
    • Structure (proper formatting) โ€” weight: 0.05
  • Learning rate: 1e-5
  • Group size: 9 rollouts per prompt
  • Epochs: 10
  • Hardware: NVIDIA A100/H100 GPUs

License

Apache 2.0

Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support