Developer note:

This is release candidate 1. It is substantially more powerful than base Gemma E4B.

I'm debating whether it is wise to release the tools I used to do this.


How I obtained a representation of Claude's Neurons

I started by analyzing the distribution of activations in each model. For each layer, I looked at:

  • Where do activations cluster in the hidden state space?
  • What input magnitudes are typical?

This was a lot harder for Claude, since Anthropic does not disclose how many layers Claude has.
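
As an illustration of the two questions above, here is a minimal sketch of per-layer activation statistics. It assumes the hidden states of one layer have already been captured as plain lists of floats, one vector per token; how they are captured is outside this sketch. The function name `layer_activation_stats` and the random toy data are hypothetical and are not the tooling used for this release.

```python
import math
import random
import statistics


def layer_activation_stats(hidden_states):
    """Summarize one layer's activations.

    hidden_states: list of token vectors (each a list of floats, same length).
    Returns the centroid plus the mean and standard deviation of the L2 norms.
    """
    if not hidden_states:
        raise ValueError("need at least one token vector")
    dim = len(hidden_states[0])
    # Centroid: per-dimension mean, a crude answer to "where do activations cluster?"
    centroid = [statistics.fmean(vec[i] for vec in hidden_states) for i in range(dim)]
    # L2 norm per token: a crude answer to "what magnitudes are typical?"
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in hidden_states]
    return {
        "centroid": centroid,
        "mean_norm": statistics.fmean(norms),
        "std_norm": statistics.pstdev(norms),
    }


if __name__ == "__main__":
    # Toy stand-in data (assumption): 3 layers, 16 tokens, hidden size 8.
    rng = random.Random(0)
    layers = [
        [[rng.gauss(0.0, 1.0 + layer) for _ in range(8)] for _ in range(16)]
        for layer in range(3)
    ]
    for index, layer in enumerate(layers):
        stats = layer_activation_stats(layer)
        print(f"layer {index}: mean_norm={stats['mean_norm']:.3f} "
              f"std_norm={stats['std_norm']:.3f}")
```

Comparing these summaries layer by layer gives a rough picture of how the activation distribution shifts with depth. A real analysis would likely need richer statistics than a centroid and a norm.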

Comparison with Base Model

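| Aspect          | Gemma-4-E4B-IT | Gemma-4-E4B-Opus |
|-----------------|----------------|------------------|
| Base weights    |                | ✓ (preserved)    |
| Neuron adapters |                | ✓ (fused)        |
| gate/up         | Standard       | Unmodified       |
| down            | Standard       | Contoured        |
| Context         | 8192           | 8192             |
| Quantization    | BF16           | Q8_0 (~8.0GB)    |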
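
As a rough sanity check on the Q8_0 size in the table: GGML's Q8_0 format stores weights in blocks of 32 signed 8-bit values plus one 16-bit scale, which is 34 bytes per 32 weights (8.5 bits per weight). The sketch below assumes every tensor is stored as Q8_0 and takes the parameter count as an input. Real GGUF files keep some tensors in other types and add metadata, so this is only an estimate. The function name `q8_0_size_bytes` is hypothetical.

```python
import math

Q8_0_BLOCK_WEIGHTS = 32      # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 2 + 32    # one fp16 scale + 32 int8 values


def q8_0_size_bytes(n_params):
    """Estimate storage for n_params weights if all were stored as Q8_0."""
    if n_params < 0:
        raise ValueError("n_params must be non-negative")
    # Round up to whole blocks (an assumption; padding rules vary per tensor).
    blocks = math.ceil(n_params / Q8_0_BLOCK_WEIGHTS)
    return blocks * Q8_0_BLOCK_BYTES


if __name__ == "__main__":
    # Nominal parameter count (assumption): 8B.
    size = q8_0_size_bytes(8_000_000_000)
    print(f"{size / 1e9:.2f} GB ({size / 2**30:.2f} GiB)")
```

Under these assumptions, a nominal 8B-parameter model comes out at roughly 8 GiB, which is in line with the figure listed above.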

GGUF
Model size: 8B params
Architecture: gemma4