YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
๐ฎ Gemma 4 Vision Game Bot
A fully local, vision-only AI game bot. It sees the screen, thinks via a local LLM, and controls mouse/keyboard โ zero memory reading, zero cloud APIs.
๐๏ธ Architecture
Browser (localhost:7860) llama-server (localhost:8080)
โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Gradio GUI โ โ Gemma 4 (GGUF) โ
โ Start/Stop/Pause โ โ Screenshot โ JSON โ
โ Live Screenshot โโโโโโโโโบโ decision โ
โ Live Logs โ HTTP โ โ
โ Stats Dashboard โ โ + mmproj (vision) โ
โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ
โผ โผ
โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Screen Capture โ โ Action Executor โ
โ mss / PyAutoGUI โ โ xdotool / PyAutoGUI โ
โ Window-only mode โ โ Safety bounds โ
โโโโโโโโโโโโโโโโโโโโโโโ โ Human-like delays โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โก Quick Start (Ubuntu)
git clone https://huggingface.co/belal611/gemma4-vision-gamebot
cd gemma4-vision-gamebot
chmod +x setup_gamebot.sh && ./setup_gamebot.sh
# Terminal 1: Start model server
cd ~/game-bot && ./start_server.sh
# Terminal 2: Start GUI
cd ~/game-bot && ./start_gui.sh
# โ Opens http://127.0.0.1:7860
๐ป Hardware Tiers โ From Weakest to Strongest
Every tier includes vision (image understanding). The mmproj file (~940 MB for E2B/E4B, ~1.1 GB for A4B/31B) is always required and is always FP16 โ it cannot be quantized.
๐ข Tier 1 โ Ultra-Low (4 GB RAM)
| Target hardware | Raspberry Pi 4/5 (4 GB), old netbooks, thin clients |
| Model | Gemma 4 E2B โ IQ2_M |
| GGUF size | 2.4 GB |
| Total RAM needed | ~3.8 GB (model + mmproj + overhead) |
| Speed (4 cores) | ~1โ3 tok/s โ 20โ40s per decision |
| Quality | โญโญ โ Degraded but functional for simple tasks |
| Download | huggingface-cli download bartowski/google_gemma-4-E2B-it-GGUF google_gemma-4-E2B-it-IQ2_M.gguf mmproj-google_gemma-4-E2B-it-f16.gguf --local-dir ./models/ |
๐ก Tier 2 โ Budget (8 GB RAM)
| Target hardware | Old desktops (i3/i5 Gen2โ4), basic laptops, 8 GB mini PCs |
| Model | Gemma 4 E2B โ Q4_K_M โญ Recommended |
| GGUF size | 3.2 GB |
| Total RAM needed | ~5.5 GB |
| Speed (4 cores) | ~2โ5 tok/s โ 8โ20s per decision |
| Quality | โญโญโญ โ Good for strategy/idle games |
| Verdict | Best value. Handles Clash of Clans perfectly. |
| Download | huggingface-cli download bartowski/google_gemma-4-E2B-it-GGUF google_gemma-4-E2B-it-Q4_K_M.gguf mmproj-google_gemma-4-E2B-it-f16.gguf --local-dir ./models/ |
Also viable at this tier:
| Quant | Size | Notes |
|---|---|---|
| E2B Q3_K_M | 3.0 GB | Slightly worse, saves 200 MB |
| E2B Q5_K_M | 3.4 GB | Slightly better, costs 200 MB |
| E2B Q8_0 | 4.6 GB | Near-lossless, tight fit at 8 GB |
๐ต Tier 3 โ Mainstream (16 GB RAM)
| Target hardware | Modern desktops/laptops, HP Z230 (the original target), M1 MacBook Air |
| Model | Gemma 4 E4B โ Q4_K_M |
| GGUF size | 5.0 GB |
| Total RAM needed | ~7.5 GB |
| Speed (8 cores) | ~3โ7 tok/s โ 5โ12s per decision |
| Quality | โญโญโญโญ โ Noticeably smarter than E2B |
| Verdict | Sweet spot. Runs the game + model + GUI comfortably. |
| Download | huggingface-cli download bartowski/google_gemma-4-E4B-it-GGUF google_gemma-4-E4B-it-Q4_K_M.gguf mmproj-google_gemma-4-E4B-it-f16.gguf --local-dir ./models/ |
Also viable at this tier:
| Quant | Size | Notes |
|---|---|---|
| E4B Q3_K_M | 4.6 GB | Saves 400 MB RAM |
| E4B Q6_K | 5.9 GB | High quality |
| E4B Q8_0 | 7.5 GB | Near-lossless, ~10 GB total |
| E2B Q8_0 | 4.6 GB | If you want faster speed over smarts |
๐ฃ Tier 4 โ Power User (32 GB RAM)
| Target hardware | Gaming PCs, workstations, 32 GB laptops, M2/M3 MacBook Pro |
| Model | Gemma 4 26B-A4B โ Q4_K_M (MoE, only 4B active params) |
| GGUF size | 15.9 GB |
| Total RAM needed | ~19 GB |
| Speed (8 cores) | ~2โ5 tok/s โ 8โ18s per decision |
| Quality | โญโญโญโญโญ โ Dramatically better reasoning, spatial understanding |
| Why MoE? | 26B total params but only 4B active per token โ fast like a 4B, smart like a 26B |
| Download | huggingface-cli download bartowski/google_gemma-4-26B-A4B-it-GGUF google_gemma-4-26B-A4B-it-Q4_K_M.gguf mmproj-google_gemma-4-26B-A4B-it-f16.gguf --local-dir ./models/ |
Also viable at this tier:
| Quant | Size | Notes |
|---|---|---|
| A4B IQ3_M | 12.4 GB | Fits tight 32 GB with game running |
| A4B Q3_K_M | 12.1 GB | Good balance |
| A4B Q6_K | 21.3 GB | Premium quality, ~24 GB total |
๐ด Tier 5 โ Enthusiast (64 GB+ RAM or GPU)
| Target hardware | 64 GB workstations, M4 Max, or any NVIDIA GPU (8+ GB VRAM) |
| Model | Gemma 4 31B โ Q4_K_M (dense, full 31B) |
| GGUF size | 18.3 GB |
| Total RAM needed | ~22 GB |
| Speed (CPU 16 cores) | ~1โ3 tok/s โ 12โ30s per decision |
| Speed (RTX 3060 12GB) | ~15โ25 tok/s โ 2โ5s per decision ๐ |
| Quality | โญโญโญโญโญ+ โ Best available. Near-GPT-4o-mini level vision |
| Download | huggingface-cli download bartowski/google_gemma-4-31B-it-GGUF google_gemma-4-31B-it-Q4_K_M.gguf mmproj-google_gemma-4-31B-it-f16.gguf --local-dir ./models/ |
GPU offloading (any tier with NVIDIA GPU):
# Offload layers to GPU for huge speedup
./llama-server -m model.gguf --mmproj mmproj.gguf -ngl 99 ...
๐ Summary Table
| Tier | RAM | Model | GGUF | Total RAM | Speed | Quality |
|---|---|---|---|---|---|---|
| ๐ข Ultra-Low | 4 GB | E2B IQ2_M | 2.4 GB | ~3.8 GB | 20โ40s | โญโญ |
| ๐ก Budget | 8 GB | E2B Q4_K_M | 3.2 GB | ~5.5 GB | 8โ20s | โญโญโญ |
| ๐ต Mainstream | 16 GB | E4B Q4_K_M | 5.0 GB | ~7.5 GB | 5โ12s | โญโญโญโญ |
| ๐ฃ Power | 32 GB | A4B Q4_K_M | 15.9 GB | ~19 GB | 8โ18s | โญโญโญโญโญ |
| ๐ด Enthusiast | 64 GB+ | 31B Q4_K_M | 18.3 GB | ~22 GB | 2โ30s* | โญโญโญโญโญ+ |
*GPU offloading dramatically reduces latency
๐ก๏ธ Safety & Robustness Features (v2)
| Feature | Description |
|---|---|
| Safety Bounds | Coordinates clamped to 0โ896. Window-only mode prevents clicking outside game. |
| Robust JSON Parser | Multi-stage: full text โ bracket extraction โ candidate testing. Never crashes on bad output. |
| Structured Output | GBNF grammar forces llama.cpp to generate valid JSON only โ 90% fewer parse errors. |
| Motion Detection | Compares screenshots via image hashing. Skips LLM call if screen unchanged โ saves CPU. |
| Error Recovery | Detects stuck screens (3x identical) and repeated actions (4x same). Auto-presses Escape. |
| Window-Only Capture | Optional: captures only the game window via xdotool, ignoring notifications/desktop. |
| Thread Crash Protection | Bot thread wrapped in try/except โ crashes are logged, not silent. |
| Human-Like Behavior | Random ยฑ3px offset on clicks, random delays between actions. |
๐ฏ Supported Games & Tasks
Clash of Clans
| Task | Description |
|---|---|
collect_resources |
Click full gold mines & elixir collectors |
train_army |
Navigate to barracks, train troops |
attack_farm |
Find weak base, deploy troops |
upgrade_buildings |
Use idle builders on priority upgrades |
donate_troops |
Fulfill clan donation requests |
clear_obstacles |
Remove trees, rocks, gem boxes |
daily_routine |
All of the above in sequence |
Silkroad Online
| Task | Description |
|---|---|
auto_hunt |
Kill monsters, loot drops |
quest |
Follow quest markers, talk to NPCs |
trade |
Buy/sell at market |
level_up |
Grind XP, use potions |
๐ Files
| File | Description |
|---|---|
โญ game_bot_gui.py |
Full GUI control panel (1271 lines) โ Gradio web interface |
game_bot.py |
CLI version for advanced users |
setup_gamebot.sh |
One-script installer (builds llama.cpp + downloads model) |
gui_mockup.png |
Visual preview of the GUI |
๐ง Changing the Model
Edit start_server.sh and point to your chosen GGUF files:
./llama-server \
-m ~/models/YOUR_MODEL.gguf \
--mmproj ~/models/YOUR_MMPROJ.gguf \
--host 127.0.0.1 --port 8080 \
--ctx-size 2048 -t $(nproc) --temp 0.1
Add -ngl 99 if you have an NVIDIA GPU.
๐ License
Apache 2.0 โ Model (Gemma 4) and code.
๐ Model Sources
| Model | Source |
|---|---|
| E2B GGUF | bartowski/google_gemma-4-E2B-it-GGUF |
| E4B GGUF | bartowski/google_gemma-4-E4B-it-GGUF |
| A4B GGUF | bartowski/google_gemma-4-26B-A4B-it-GGUF |
| 31B GGUF | bartowski/google_gemma-4-31B-it-GGUF |
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support
