AIpster

https://aipster.com

An independent think tank on artificial intelligence, society, and the future of thought.

We're a collective of computer science friends from the late '90s who turned a WhatsApp group into a laboratory for exploring what AI is doing to how we work, build, and think.

🌐 aipster.com

What we do here

This Hugging Face organization is where we publish the artifacts of our exploration — models, datasets, and tools that come out of the experiments we write about on our blog.

We're not a company. We don't sell anything. We build things to understand them, then share what we learned.

Focus areas

🔬 Small specialist models — distillation, fine-tuning, and the art of making tiny models punch above their weight
🧭 Prompt engineering & routing — how prompts become infrastructure, not just text
🛠️ Local LLM workflows — what 96 GB of VRAM can (and can't) do
🤖 Coding agents & automation — how AI is reshaping software development from the inside out
📖 AI & society — the uncomfortable conversations the industry would rather skip

What you'll find here

Models

DevRouter-1.5B is our first release: a tiny prompt router that reads a raw developer prompt and returns a single JSON decision containing a cleaned-up rewrite, an intent and complexity classification, a suggested model-tier route, and the context the prompt forgot to include. Built on Qwen2.5-Coder-1.5B (Apache 2.0) and distilled from a stronger teacher, it returns valid JSON about 96% of the time and runs at ~280 tokens/s on a single RTX 3090, which makes it small enough to sit in front of your real models and triage every prompt in 1–3 seconds.

🧠 aipster/DevRouter-1.5B — fp16 weights (transformers / vLLM)
📦 aipster/DevRouter-1.5B-GGUF — Q8_0 + F16, plug-n-play with Ollama / llama.cpp
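
Here is a minimal sketch of how a caller might consume the router's JSON decision. The field names (`rewrite`, `intent`, `complexity`, `route`, `missing_context`) and the tier label `"large"` are assumptions made up for illustration; the model card defines the real schema. Because a small share of outputs will not be valid JSON, the sketch falls back to the original prompt and a default tier instead of failing.

```python
import json
from dataclasses import dataclass, field

# Hypothetical field names, for illustration only; check the model card for the real schema.
REQUIRED_KEYS = ("rewrite", "intent", "complexity", "route")


@dataclass
class RouterDecision:
    rewrite: str
    intent: str
    complexity: str
    route: str
    missing_context: list = field(default_factory=list)
    valid: bool = True  # False when we had to fall back


def parse_decision(raw_output: str, original_prompt: str,
                   default_route: str = "large") -> RouterDecision:
    """Parse the router's raw text; fall back to a safe default if it is unusable."""
    fallback = RouterDecision(
        rewrite=original_prompt,   # keep the user's prompt untouched
        intent="unknown",
        complexity="unknown",
        route=default_route,       # assumed name for the strongest tier
        valid=False,
    )
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return fallback
    # Reject anything that is not an object carrying all required keys.
    if not isinstance(data, dict) or any(k not in data for k in REQUIRED_KEYS):
        return fallback
    missing = data.get("missing_context") or []
    if not isinstance(missing, list):
        missing = [missing]
    return RouterDecision(
        rewrite=str(data["rewrite"]),
        intent=str(data["intent"]),
        complexity=str(data["complexity"]),
        route=str(data["route"]),
        missing_context=missing,
    )
```

Routing failures to the strongest tier is one possible design choice: a broken router output then costs money rather than answer quality. Retrying the router once before falling back would be another reasonable option.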

And one honest caveat, because we ship those too: quantizations at Q6 and below break its JSON output. A small model doing strict structured output is far more fragile than the "Q4 is fine" rule of thumb suggests, so ship Q8_0 or F16.
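
One way to check a quantization before relying on it is to measure how often its outputs parse as a JSON object. The sketch below assumes you have already collected raw router outputs for the quantization under test; the sample strings are made up for illustration and are not measured results.

```python
import json


def is_valid_json_object(text) -> bool:
    """True only if the text parses as JSON and the top-level value is an object."""
    try:
        return isinstance(json.loads(text), dict)
    except (json.JSONDecodeError, TypeError):
        return False


def valid_json_rate(outputs) -> float:
    """Fraction of outputs that are valid JSON objects."""
    outputs = list(outputs)
    if not outputs:
        raise ValueError("need at least one output to compute a rate")
    return sum(is_valid_json_object(o) for o in outputs) / len(outputs)


# Made-up sample: two well-formed objects, one truncated object, one plain-text reply.
samples = [
    '{"route": "small"}',
    '{"route": "small"',
    "not json at all",
    '{"route": "large"}',
]
print(f"valid JSON rate: {valid_json_rate(samples):.0%}")  # prints "valid JSON rate: 50%"
```

Running the same prompt set through each quantization and comparing the rates gives a quick, if rough, signal of whether a given quant is safe for structured output.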