Granite-3.3-2B-Instruct: Q4_K_M for IBM Power (Linux ppc64le + AIX)

IBM Granite 3.3 2B Instruct is a dense transformer, quantized to Q4_K_M with a Q6_K output head for fast CPU inference on IBM Power (POWER9 VSX, POWER10/11 MMA-accelerated) via LibrePower. No GPU required. Size: 1.5 GB. Apache-2.0, cryptographically signed by IBM.

Run it

Ubuntu / Debian ppc64le:

curl -fsSL https://linux.librepower.org/install.sh | sudo sh
sudo apt install librepower-llama
wget https://huggingface.co/librepowerai/Granite-3.3-2B-Instruct-Power/resolve/main/Granite-3.3-2B-Instruct-Q4_K_M.gguf
lp-llama-completion -m Granite-3.3-2B-Instruct-Q4_K_M.gguf -p "Hello!" -n 64 -t $(nproc)

IBM AIX 7.3 (big-endian):

dnf install llama-aix
wget https://huggingface.co/librepowerai/Granite-3.3-2B-Instruct-Power/resolve/main/Granite-3.3-2B-Instruct-Q4_K_M-be.gguf
lp-llama-completion -m Granite-3.3-2B-Instruct-Q4_K_M-be.gguf -p "Hello!" -n 64 -t $(nproc)
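
The two platforms above differ only in which GGUF file they need: the plain file on little-endian Linux and the `-be` file on big-endian AIX. A minimal sketch of selecting the right download in one script follows. The `pick_gguf` helper is hypothetical and not part of LibrePower, and it assumes that `uname -s` prints `AIX` on AIX and that any other system is little-endian Linux.

```shell
#!/bin/sh
# Hypothetical helper (not part of LibrePower): choose the GGUF file name
# for a platform. Takes an optional OS name; defaults to `uname -s`.
# Assumption: "AIX" means big-endian; anything else is little-endian Linux.
pick_gguf() {
    os="${1:-$(uname -s)}"
    base="Granite-3.3-2B-Instruct-Q4_K_M"
    case "$os" in
        AIX) echo "${base}-be.gguf" ;;   # big-endian build for AIX
        *)   echo "${base}.gguf" ;;      # little-endian build for Linux ppc64le
    esac
}

# Repository URL taken from the wget commands above.
repo="https://huggingface.co/librepowerai/Granite-3.3-2B-Instruct-Power/resolve/main"

# Print the download URL for the current machine.
echo "$repo/$(pick_gguf)"
```

The printed URL can be passed to `wget`, and the file name returned by `pick_gguf` to `lp-llama-completion -m`, exactly as in the commands above.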

Good for

IBM's own dense small model: enterprise chat, RAG, structured JSON, classification/routing. Apache-2.0, signed.

Credits

Base model: IBM Granite (Apache-2.0). Quantization & Power packaging: LibrePower.
