YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

๐ŸŽฎ Gemma 4 Vision Game Bot

A fully local, vision-only AI game bot. It sees the screen, thinks via a local LLM, and controls mouse/keyboard โ€” zero memory reading, zero cloud APIs.

GUI Mockup


๐Ÿ—๏ธ Architecture

  Browser (localhost:7860)        llama-server (localhost:8080)
  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
  โ”‚   Gradio GUI        โ”‚        โ”‚   Gemma 4 (GGUF)         โ”‚
  โ”‚   Start/Stop/Pause  โ”‚        โ”‚   Screenshot โ†’ JSON      โ”‚
  โ”‚   Live Screenshot   โ”‚โ—„โ”€โ”€โ”€โ”€โ”€โ”€โ–บโ”‚   decision               โ”‚
  โ”‚   Live Logs         โ”‚  HTTP  โ”‚                          โ”‚
  โ”‚   Stats Dashboard   โ”‚        โ”‚   + mmproj (vision)      โ”‚
  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
           โ”‚                                โ”‚
           โ–ผ                                โ–ผ
  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
  โ”‚   Screen Capture    โ”‚        โ”‚   Action Executor        โ”‚
  โ”‚   mss / PyAutoGUI   โ”‚        โ”‚   xdotool / PyAutoGUI    โ”‚
  โ”‚   Window-only mode  โ”‚        โ”‚   Safety bounds          โ”‚
  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜        โ”‚   Human-like delays      โ”‚
                                 โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

โšก Quick Start (Ubuntu)

git clone https://huggingface.co/belal611/gemma4-vision-gamebot
cd gemma4-vision-gamebot
chmod +x setup_gamebot.sh && ./setup_gamebot.sh

# Terminal 1: Start model server
cd ~/game-bot && ./start_server.sh

# Terminal 2: Start GUI
cd ~/game-bot && ./start_gui.sh
# โ†’ Opens http://127.0.0.1:7860

๐Ÿ’ป Hardware Tiers โ€” From Weakest to Strongest

Every tier includes vision (image understanding). The mmproj file (~940 MB for E2B/E4B, ~1.1 GB for A4B/31B) is always required and is always FP16 โ€” it cannot be quantized.

๐ŸŸข Tier 1 โ€” Ultra-Low (4 GB RAM)

Target hardware Raspberry Pi 4/5 (4 GB), old netbooks, thin clients
Model Gemma 4 E2B โ€” IQ2_M
GGUF size 2.4 GB
Total RAM needed ~3.8 GB (model + mmproj + overhead)
Speed (4 cores) ~1โ€“3 tok/s โ†’ 20โ€“40s per decision
Quality โญโญ โ€” Degraded but functional for simple tasks
Download huggingface-cli download bartowski/google_gemma-4-E2B-it-GGUF google_gemma-4-E2B-it-IQ2_M.gguf mmproj-google_gemma-4-E2B-it-f16.gguf --local-dir ./models/

๐ŸŸก Tier 2 โ€” Budget (8 GB RAM)

Target hardware Old desktops (i3/i5 Gen2โ€“4), basic laptops, 8 GB mini PCs
Model Gemma 4 E2B โ€” Q4_K_M โญ Recommended
GGUF size 3.2 GB
Total RAM needed ~5.5 GB
Speed (4 cores) ~2โ€“5 tok/s โ†’ 8โ€“20s per decision
Quality โญโญโญ โ€” Good for strategy/idle games
Verdict Best value. Handles Clash of Clans perfectly.
Download huggingface-cli download bartowski/google_gemma-4-E2B-it-GGUF google_gemma-4-E2B-it-Q4_K_M.gguf mmproj-google_gemma-4-E2B-it-f16.gguf --local-dir ./models/

Also viable at this tier:

Quant Size Notes
E2B Q3_K_M 3.0 GB Slightly worse, saves 200 MB
E2B Q5_K_M 3.4 GB Slightly better, costs 200 MB
E2B Q8_0 4.6 GB Near-lossless, tight fit at 8 GB

๐Ÿ”ต Tier 3 โ€” Mainstream (16 GB RAM)

Target hardware Modern desktops/laptops, HP Z230 (the original target), M1 MacBook Air
Model Gemma 4 E4B โ€” Q4_K_M
GGUF size 5.0 GB
Total RAM needed ~7.5 GB
Speed (8 cores) ~3โ€“7 tok/s โ†’ 5โ€“12s per decision
Quality โญโญโญโญ โ€” Noticeably smarter than E2B
Verdict Sweet spot. Runs the game + model + GUI comfortably.
Download huggingface-cli download bartowski/google_gemma-4-E4B-it-GGUF google_gemma-4-E4B-it-Q4_K_M.gguf mmproj-google_gemma-4-E4B-it-f16.gguf --local-dir ./models/

Also viable at this tier:

Quant Size Notes
E4B Q3_K_M 4.6 GB Saves 400 MB RAM
E4B Q6_K 5.9 GB High quality
E4B Q8_0 7.5 GB Near-lossless, ~10 GB total
E2B Q8_0 4.6 GB If you want faster speed over smarts

๐ŸŸฃ Tier 4 โ€” Power User (32 GB RAM)

Target hardware Gaming PCs, workstations, 32 GB laptops, M2/M3 MacBook Pro
Model Gemma 4 26B-A4B โ€” Q4_K_M (MoE, only 4B active params)
GGUF size 15.9 GB
Total RAM needed ~19 GB
Speed (8 cores) ~2โ€“5 tok/s โ†’ 8โ€“18s per decision
Quality โญโญโญโญโญ โ€” Dramatically better reasoning, spatial understanding
Why MoE? 26B total params but only 4B active per token โ€” fast like a 4B, smart like a 26B
Download huggingface-cli download bartowski/google_gemma-4-26B-A4B-it-GGUF google_gemma-4-26B-A4B-it-Q4_K_M.gguf mmproj-google_gemma-4-26B-A4B-it-f16.gguf --local-dir ./models/

Also viable at this tier:

Quant Size Notes
A4B IQ3_M 12.4 GB Fits tight 32 GB with game running
A4B Q3_K_M 12.1 GB Good balance
A4B Q6_K 21.3 GB Premium quality, ~24 GB total

๐Ÿ”ด Tier 5 โ€” Enthusiast (64 GB+ RAM or GPU)

Target hardware 64 GB workstations, M4 Max, or any NVIDIA GPU (8+ GB VRAM)
Model Gemma 4 31B โ€” Q4_K_M (dense, full 31B)
GGUF size 18.3 GB
Total RAM needed ~22 GB
Speed (CPU 16 cores) ~1โ€“3 tok/s โ†’ 12โ€“30s per decision
Speed (RTX 3060 12GB) ~15โ€“25 tok/s โ†’ 2โ€“5s per decision ๐Ÿš€
Quality โญโญโญโญโญ+ โ€” Best available. Near-GPT-4o-mini level vision
Download huggingface-cli download bartowski/google_gemma-4-31B-it-GGUF google_gemma-4-31B-it-Q4_K_M.gguf mmproj-google_gemma-4-31B-it-f16.gguf --local-dir ./models/

GPU offloading (any tier with NVIDIA GPU):

# Offload layers to GPU for huge speedup
./llama-server -m model.gguf --mmproj mmproj.gguf -ngl 99 ...

๐Ÿ“Š Summary Table

Tier RAM Model GGUF Total RAM Speed Quality
๐ŸŸข Ultra-Low 4 GB E2B IQ2_M 2.4 GB ~3.8 GB 20โ€“40s โญโญ
๐ŸŸก Budget 8 GB E2B Q4_K_M 3.2 GB ~5.5 GB 8โ€“20s โญโญโญ
๐Ÿ”ต Mainstream 16 GB E4B Q4_K_M 5.0 GB ~7.5 GB 5โ€“12s โญโญโญโญ
๐ŸŸฃ Power 32 GB A4B Q4_K_M 15.9 GB ~19 GB 8โ€“18s โญโญโญโญโญ
๐Ÿ”ด Enthusiast 64 GB+ 31B Q4_K_M 18.3 GB ~22 GB 2โ€“30s* โญโญโญโญโญ+

*GPU offloading dramatically reduces latency


๐Ÿ›ก๏ธ Safety & Robustness Features (v2)

Feature Description
Safety Bounds Coordinates clamped to 0โ€“896. Window-only mode prevents clicking outside game.
Robust JSON Parser Multi-stage: full text โ†’ bracket extraction โ†’ candidate testing. Never crashes on bad output.
Structured Output GBNF grammar forces llama.cpp to generate valid JSON only โ€” 90% fewer parse errors.
Motion Detection Compares screenshots via image hashing. Skips LLM call if screen unchanged โ€” saves CPU.
Error Recovery Detects stuck screens (3x identical) and repeated actions (4x same). Auto-presses Escape.
Window-Only Capture Optional: captures only the game window via xdotool, ignoring notifications/desktop.
Thread Crash Protection Bot thread wrapped in try/except โ€” crashes are logged, not silent.
Human-Like Behavior Random ยฑ3px offset on clicks, random delays between actions.

๐ŸŽฏ Supported Games & Tasks

Clash of Clans

Task Description
collect_resources Click full gold mines & elixir collectors
train_army Navigate to barracks, train troops
attack_farm Find weak base, deploy troops
upgrade_buildings Use idle builders on priority upgrades
donate_troops Fulfill clan donation requests
clear_obstacles Remove trees, rocks, gem boxes
daily_routine All of the above in sequence

Silkroad Online

Task Description
auto_hunt Kill monsters, loot drops
quest Follow quest markers, talk to NPCs
trade Buy/sell at market
level_up Grind XP, use potions

๐Ÿ“ Files

File Description
โญ game_bot_gui.py Full GUI control panel (1271 lines) โ€” Gradio web interface
game_bot.py CLI version for advanced users
setup_gamebot.sh One-script installer (builds llama.cpp + downloads model)
gui_mockup.png Visual preview of the GUI

๐Ÿ”ง Changing the Model

Edit start_server.sh and point to your chosen GGUF files:

./llama-server \
    -m ~/models/YOUR_MODEL.gguf \
    --mmproj ~/models/YOUR_MMPROJ.gguf \
    --host 127.0.0.1 --port 8080 \
    --ctx-size 2048 -t $(nproc) --temp 0.1

Add -ngl 99 if you have an NVIDIA GPU.


๐Ÿ“œ License

Apache 2.0 โ€” Model (Gemma 4) and code.

๐Ÿ”— Model Sources

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support