Gemma 4 E4B IT – GGUF (IQ4_NL)

🔷 Model Overview

This repository contains a GGUF IQ4_NL conversion of:

  • Base Model: gemma-4-e4b-it
  • Developer: Google
  • Format: GGUF (optimized for llama.cpp)
  • Precision: IQ4_NL

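This conversion is intended for local inference with llama.cpp. IQ4_NL is a 4-bit quantization, so the file and its memory footprint are smaller than the original weights, at some cost in accuracy.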


📦 Files

File                          Description
gemma-4-e4b-it-IQ4_NL.gguf    IQ4_NL (4-bit quantized) GGUF model

⚙️ Technical Details

Parameter       Value
Architecture    gemma4 (base model: gemma-4-e4b-it)
Model size      8B params
Format          GGUF
Precision       IQ4_NL (4-bit quantization)
Runtime         llama.cpp
Use Case        High-quality local inference
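
As a rough illustration of what the IQ4_NL precision means for file size, the sketch below estimates storage from bits per weight. It assumes the IQ4_NL block layout used by llama.cpp (32 weights stored as 16 bytes of 4-bit indices plus a 2-byte scale, i.e. 4.5 bits per weight) and the roughly 8 billion parameters reported on the model page. The function name is hypothetical, and the actual file will differ because some tensors are usually kept in other formats and the file also carries metadata.

```python
# Back-of-the-envelope size estimate for a quantized model.
# Assumption: every weight is stored as IQ4_NL, which real files rarely do.

# One IQ4_NL block: 2-byte fp16 scale + 16 bytes of 4-bit indices for 32 weights.
IQ4_NL_BITS_PER_WEIGHT = (2 + 16) * 8 / 32


def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate storage size in gigabytes (10^9 bytes)."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical usage with the parameter count shown on the model page:
# estimate_size_gb(8e9, IQ4_NL_BITS_PER_WEIGHT)
```

Under these assumptions the estimate comes out at about 4.5 GB, which is only an order-of-magnitude guide for download size and RAM planning.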

⚡ Why GGUF?

GGUF enables:

  • Efficient CPU inference via llama.cpp
  • Single-file model distribution
  • Fast loading using memory mapping
  • Cross-platform compatibility
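
To make the single-file and memory-mapping points concrete, here is a minimal sketch that memory-maps a GGUF file and reads only its fixed-size header. It assumes GGUF version 2 or later, where the header is the 4-byte magic `GGUF` followed by a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The names `GGUFHeader` and `read_gguf_header` are hypothetical helpers, not part of llama.cpp.

```python
import mmap
import struct
from typing import NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(path: str) -> GGUFHeader:
    """Read the GGUF header through a read-only memory map.

    Only the pages that are actually touched get loaded, which is the
    same mechanism that lets a runtime open a large model file quickly.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # 4 bytes magic + 4 bytes version + 2 * 8 bytes counts = 24 bytes
            if len(mm) < 24 or mm[:4] != b"GGUF":
                raise ValueError(f"{path} does not look like a GGUF file")
            # Assumption: GGUF v2+ layout with 64-bit counts, little-endian.
            version, tensor_count, kv_count = struct.unpack_from("<IQQ", mm, 4)
    return GGUFHeader(version, tensor_count, kv_count)


# Hypothetical usage:
# print(read_gguf_header("gemma-4-e4b-it-IQ4_NL.gguf"))
```

This is a quick sanity check that a download is a GGUF file before handing it to llama.cpp; it does not validate the metadata or tensor data that follow the header.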

⚠️ License & Usage

This is a converted derivative model.

You must comply with the original license:
👉 https://huggingface.co/google/gemma-4-e4b-it

Important:

  • ❌ Not an official Google release
  • ❌ No additional rights granted
  • ✅ Original model ownership remains with Google
  • ⚠️ Use responsibly under original license terms

🚀 Quick Start (llama.cpp)

./llama-cli -m gemma-4-e4b-it-IQ4_NL.gguf -p "Explain AI simply"
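
The same call can be scripted. The sketch below builds the command above as an argument list and optionally runs it with `subprocess`. The helper names are hypothetical, and `-n` (the number of tokens to generate) is the only flag added beyond `-m` and `-p`. Depending on the llama.cpp build, `llama-cli` may start an interactive conversation instead of exiting after one reply, so check `./llama-cli --help` before relying on `run_prompt`.

```python
import shlex
import subprocess
from typing import List


def build_llama_cli_command(
    model_path: str,
    prompt: str,
    n_predict: int = 256,
    binary: str = "./llama-cli",  # assumption: run from the llama.cpp build directory
) -> List[str]:
    """Build the llama-cli argument list for a single prompt."""
    return [binary, "-m", model_path, "-p", prompt, "-n", str(n_predict)]


def run_prompt(model_path: str, prompt: str, n_predict: int = 256) -> str:
    """Run llama-cli once and return whatever it wrote to stdout."""
    cmd = build_llama_cli_command(model_path, prompt, n_predict)
    print("Running:", shlex.join(cmd))
    result = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,  # do not wait for interactive input
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Hypothetical usage:
# print(run_prompt("gemma-4-e4b-it-IQ4_NL.gguf", "Explain AI simply", n_predict=128))
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.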