Instructions to use jaimef21/gerbil-qwen3-coder-30b-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

# Download the Q4_K_M file from the Hub and load it
llm = Llama.from_pretrained(
    repo_id="jaimef21/gerbil-qwen3-coder-30b-gguf",
    filename="gerbil-qwen3.q4_k_m.gguf",
)

# messages must be a list of role/content dicts, not a bare string
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Write a Gerbil function to..."}
    ]
)
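The `messages` argument of `create_chat_completion` takes a list of role/content dictionaries. The following is a minimal sketch of a helper that builds that list. `build_messages` and `generate` are hypothetical names, the system prompt is shortened from the Ollama Modelfile on this card, and the temperature of 0.2 is borrowed from that Modelfile rather than being a documented recommendation for llama-cpp-python.

```python
from typing import Dict, List

# Shortened from the SYSTEM line of the Modelfile on this card (illustrative).
SYSTEM_PROMPT = (
    "You are an expert in Gerbil Scheme, a dialect of Scheme built on Gambit. "
    "The idiomatic definition form is `def`, not `define`."
)


def build_messages(task: str, system: str = SYSTEM_PROMPT) -> List[Dict[str, str]]:
    """Return the list-of-dicts structure that create_chat_completion expects."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": task},
    ]


def generate(task: str) -> str:
    """Load the Q4_K_M file and return the reply text.

    The first call downloads the model file from the Hub.
    """
    # Imported lazily so build_messages works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="jaimef21/gerbil-qwen3-coder-30b-gguf",
        filename="gerbil-qwen3.q4_k_m.gguf",
    )
    result = llm.create_chat_completion(
        messages=build_messages(task),
        temperature=0.2,  # assumption: same value as the Modelfile
    )
    return result["choices"][0]["message"]["content"]
```

Calling `generate("Write a Gerbil function to...")` would then return the assistant's reply as a string.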
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Use Docker
docker model run hf.co/jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
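Once `llama-server` is running, its OpenAI-compatible endpoint can be called from any HTTP client. The sketch below uses only the Python standard library and assumes the server listens at `http://localhost:8080/v1`, the same base URL that appears in the Pi configuration on this page. `build_request` and `chat` are hypothetical helper names, and the temperature value is an assumption.

```python
import json
import urllib.request

# Assumption: llama-server is listening on port 8080 on this machine.
BASE_URL = "http://localhost:8080/v1"


def build_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the chat completions endpoint."""
    payload = {
        "model": "jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # assumption, not a documented default
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

`chat(...)` needs the server to be up; `build_request(...)` can be inspected on its own to check the URL and payload before sending anything.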
- LM Studio
- Jan
- Ollama
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with Ollama:
ollama run hf.co/jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
- Unsloth Studio
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jaimef21/gerbil-qwen3-coder-30b-gguf to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jaimef21/gerbil-qwen3-coder-30b-gguf to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for jaimef21/gerbil-qwen3-coder-30b-gguf to start chatting
- Pi
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with Docker Model Runner:
docker model run hf.co/jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
- Lemonade
How to use jaimef21/gerbil-qwen3-coder-30b-gguf with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull jaimef21/gerbil-qwen3-coder-30b-gguf:Q4_K_M
Run and chat with the model
lemonade run user.gerbil-qwen3-coder-30b-gguf-Q4_K_M
List all available models
lemonade list
gerbil-qwen3-coder-30b-gguf
GGUF quants of jaimef21/gerbil-qwen3-coder-30b-bf16, fine-tuned from Qwen3-Coder-30B-A3B-Instruct for Gerbil Scheme generation.
Training pipeline and tooling: https://github.com/ober/gerbil-lora
Files
| File | Quant | Size | Notes |
|---|---|---|---|
| `gerbil-qwen3.q8_0.gguf` | Q8_0 | ~32 GB | High fidelity; indistinguishable from F16 for most uses |
| `gerbil-qwen3.q4_k_m.gguf` | Q4_K_M | ~17 GB | Runs comfortably in 24 GB VRAM |
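Choosing between the two files mostly comes down to available memory. The sketch below picks the largest file that fits and, optionally, downloads it with `huggingface_hub`. The sizes are the approximate figures from the table above. The 4 GB headroom for context and runtime buffers is an assumption, not a measured value, and `pick_quant_file` and `download` are hypothetical helper names.

```python
from typing import Optional

# Approximate file sizes in GB, copied from the table above.
QUANT_FILES = {
    "gerbil-qwen3.q8_0.gguf": 32,
    "gerbil-qwen3.q4_k_m.gguf": 17,
}


def pick_quant_file(vram_gb: float, headroom_gb: float = 4.0) -> Optional[str]:
    """Return the largest file whose size plus headroom fits in vram_gb, else None."""
    fitting = [
        (size, name)
        for name, size in QUANT_FILES.items()
        if size + headroom_gb <= vram_gb
    ]
    return max(fitting)[1] if fitting else None


def download(filename: str) -> str:
    """Download one file from the repo and return its local path."""
    # Imported lazily so pick_quant_file works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="jaimef21/gerbil-qwen3-coder-30b-gguf",
        filename=filename,
    )
```

With these assumptions, a 24 GB card selects the Q4_K_M file, which matches the note in the table.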
Usage with llama.cpp
./llama-cli -m gerbil-qwen3.q4_k_m.gguf -p "Write a Gerbil function to..."
Usage with Ollama
Build a Modelfile locally:
FROM ./gerbil-qwen3.q4_k_m.gguf
SYSTEM "You are an expert in Gerbil Scheme, a dialect of Scheme built on Gambit. You provide accurate, idiomatic Gerbil code with correct imports, function names, and arities. Module paths use the :std/* form (e.g. :std/sort, :std/iter, :std/text/json). The idiomatic definition form is `def`, not `define`."
PARAMETER temperature 0.2
PARAMETER num_ctx 32768
PARAMETER stop "<|im_end|>"
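After the Modelfile has been registered with Ollama (for example with `ollama create gerbil-qwen3 -f Modelfile`, where `gerbil-qwen3` is a hypothetical model name), the model can be queried through Ollama's local REST API. The sketch below assumes Ollama is listening on its default address, `http://localhost:11434`. The system prompt and parameters come from the Modelfile, so the request only carries the user message.

```python
import json
import urllib.request
from typing import Any, Dict

MODEL_NAME = "gerbil-qwen3"  # hypothetical name chosen at `ollama create` time
OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default Ollama address


def build_chat_payload(prompt: str, model: str = MODEL_NAME) -> Dict[str, Any]:
    """Build a non-streaming /api/chat request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }


def chat(prompt: str) -> str:
    """Send the prompt to a running Ollama instance and return the reply text."""
    req = urllib.request.Request(
        url=OLLAMA_URL,
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["message"]["content"]
```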
Training pipeline
Three-stage LoRA fine-tune (r=32, α=64, fused-MoE expert targets):
- CPT: Continued pre-training on Gerbil source corpus (lr 2e-5, 2 epochs)
- SFT: Supervised fine-tune on instruction/response pairs (lr 1e-4, 2 epochs)
- DPO: Direct preference optimization on wrong→right pairs (lr 5e-6, 3 epochs)
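The hyperparameters listed above can be written down as plain data, which makes the relationship between rank and alpha explicit. This is only an illustrative sketch, not the code from the training repository. `Stage`, `STAGES`, and `lora_scaling` are hypothetical names, and the alpha/rank scaling is the standard LoRA formulation; whether the actual pipeline uses it or a variant is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    data: str
    learning_rate: float
    epochs: int


# LoRA settings shared by all three stages, as listed above.
LORA_RANK = 32
LORA_ALPHA = 64

# The three stages in the order they are run.
STAGES = (
    Stage("CPT", "Gerbil source corpus", 2e-5, 2),
    Stage("SFT", "instruction/response pairs", 1e-4, 2),
    Stage("DPO", "wrong->right preference pairs", 5e-6, 3),
)


def lora_scaling(rank: int, alpha: int) -> float:
    """Scale applied to the low-rank update in standard LoRA: alpha / rank."""
    return alpha / rank


if __name__ == "__main__":
    print(f"LoRA scaling: {lora_scaling(LORA_RANK, LORA_ALPHA)}")
    for stage in STAGES:
        print(f"{stage.name}: lr={stage.learning_rate}, epochs={stage.epochs}")
```

With r=32 and α=64, the update is scaled by a factor of 2 under that formulation.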
DPO eval (vs base Qwen3-Coder-30B-A3B-Instruct)
| Metric | Base | Trained | Δ |
|---|---|---|---|
| Holdout task score | 31 | 39 | +8 |
| Anti-idioms hit | 1 | 0 | -1 |
| Code blocks wrapped | 9 | 14 | +5 |
| tok_lean_sum (P(chosen) > P(rejected)) | -4.17 | +4.03 | +8.19 |
| wins chosen / rejected (n=66) | 47 / 19 | 52 / 13 | +5 / -6 |
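The last row of the table can be read as a win rate for the chosen response. The short sketch below computes it from the counts in that row. Note that the trained counts sum to 65 rather than 66, so the hypothetical `win_rate` helper divides by the pairs actually counted instead of by n.

```python
def win_rate(chosen_wins: int, rejected_wins: int) -> float:
    """Fraction of counted pairs where the chosen response was preferred."""
    return chosen_wins / (chosen_wins + rejected_wins)


# Counts from the "wins chosen / rejected" row of the table.
base_rate = win_rate(47, 19)
trained_rate = win_rate(52, 13)

if __name__ == "__main__":
    print(f"base: {base_rate:.1%}, trained: {trained_rate:.1%}")
```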