Base Model

Orchestrator V1 is built on top of Gemma 4 E4B, using the MLX community 4-bit instruction-tuned base:

mlx-community/gemma-4-E4B-it-4bit

This release preserves the Apache 2.0 licensing metadata and is published as an agentic fine-tuned/modified model focused on tool use, automation planning, safety-aware execution, and local-agent workflows.

Orchestrator V1

Orchestrator V1 is an agentic automation model built to act as the planning and decision layer inside local AI agents, desktop assistants, IDE copilots, MCP-style tool systems, and OS-level automation runtimes.

Orchestrator V1 was originally developed for KIRA OS, a local AI operating environment still in development, but it is not limited to KIRA OS. Developers can use it in any compatible runtime that supports the tokenizer, chat template, and optional agent/tool execution loop. The model is not intended to be only a conversational chatbot. It is designed to be connected to tools.

What Makes It Different

Most small local models can answer questions, but agentic systems need more than answers. They need a model that can decide what kind of action is required, when more context is missing, whether a tool result is enough, and whether the next step is safe.

Orchestrator V1 was fine-tuned around that exact pattern:

understand the user's task
decide whether tool use is needed
choose a safe pathway
inspect real returned data
continue from evidence instead of guessing
ask permission before risky operations
produce a clear final response

This makes it useful for developers building agents that need to operate in real environments rather than simply simulate completion.

Best Use Cases

Orchestrator V1 is a strong fit for:

local AI operating systems
desktop automation agents
tool-calling chat engines
IDE assistants and coding copilots
MCP connector orchestration
browser and research agents
voice-first automation systems
system diagnostic assistants
file and workspace inspection agents
report-generation agents
multi-agent or council-style verification workflows

Agentic Capabilities

Capabilities depend on the host runtime. Orchestrator V1 does not execute actions by itself. The chat engine, agent framework, or application must provide tools and return tool results back to the model.

With the right runtime, Orchestrator V1 can be connected to:

web search and page browsing
headless browser inspection
local file and folder inspection
app opening and app control
IDE context gathering
terminal and shell execution
AppleScript or macOS automation
MCP connector discovery and setup
PDF generation
DOCX / Word report generation
PPTX generation, including image-supported slide decks, if the chat engine or agent supports file-generation tools
system diagnostics such as RAM, storage, running apps, network state, and project structure
background or scheduled tasks
permission-gated file operations
multi-agent verification layers

Reasoning Style

Orchestrator V1 was trained for private Train-of-Thought / Tree-of-Thought-style agentic reasoning.

The model is intended to reason internally about safety, tool choice, missing information, task order, and verification. Agent runtimes should not show private reasoning traces directly to users.

Recommended runtime behavior:

hide thought tokens and internal reasoning traces
show concise progress updates instead
show permission requests when needed
show relevant tool results only when useful
show a final answer grounded in completed tool calls

Integration Pattern

The recommended agent loop is:

User request
-> Orchestrator V1 decides the next step
-> Runtime executes the selected tool
-> Runtime returns the real tool result
-> Orchestrator V1 analyzes the result
-> Repeat until complete
-> Final answer

For best results, do not let the model merely say that it completed a task. The runtime should only mark a task complete after a tool result confirms it.

Developer Notes

To integrate Orchestrator V1 into an agent system, provide:

A real tool execution layer
A tool-result continuation loop
Permission gates for destructive or privacy-sensitive actions
Hidden reasoning / thought-token filtering
The original tokenizer and chat template from this repository
Clear system instructions describing available tools
Runtime checks that prevent hallucinated completion
Sandboxing for shell, browser, file, and OS actions

The model works best as the controller brain of an agent, not as the full agent runtime by itself.

Example System Prompt Skeleton

You are Orchestrator V1, the agentic planning model for a local AI runtime.

You can choose tools when needed. Never claim that an action is complete unless the runtime has returned evidence that it completed.

Use read-only tools freely for inspection. Ask permission before destructive, irreversible, privacy-sensitive, or system-changing actions.

Keep internal reasoning private. Return concise progress updates, permission requests, useful tool evidence, and final answers only.

Limitations

This model requires an agent runtime for tool execution.
It should not be used as a proof that an action happened unless the runtime returns tool evidence.
Destructive operations must be permission-gated by the host application.
Hidden reasoning traces should be filtered from user-facing interfaces.
Performance depends heavily on the quality of the tool schema, system prompt, and tool-result loop.

Intended Audience

This release is for developers, researchers, and builders experimenting with local agents, desktop automation, MCP-style connectors, and safe computer-control systems.

If you are building a tool-using agent and want a compact local controller model that is trained around agentic decision-making, Orchestrator V1 is designed for that space.

Downloads last month: 33

Safetensors

Model size

1B params

Tensor type

BF16

U32

MLX

Hardware compatibility

4-bit