Base Model

Orchestrator V1 is built on top of Gemma 4 E4B, using the MLX community 4-bit instruction-tuned base:

mlx-community/gemma-4-E4B-it-4bit

This release preserves the Apache 2.0 licensing metadata and is published as an agentic fine-tuned/modified model focused on tool use, automation planning, safety-aware execution, and local-agent workflows.

Orchestrator V1

Orchestrator V1 is an agentic automation model built to act as the planning and decision layer inside local AI agents, desktop assistants, IDE copilots, MCP-style tool systems, and OS-level automation runtimes.

Orchestrator V1 was originally developed for KIRA OS, a local AI operating environment still in development, but it is not limited to KIRA OS. Developers can use it in any compatible runtime that supports the tokenizer, chat template, and optional agent/tool execution loop. The model is not intended to be only a conversational chatbot. It is designed to be connected to tools.

What Makes It Different

Most small local models can answer questions, but agentic systems need more than answers. They need a model that can decide what kind of action is required, when more context is missing, whether a tool result is enough, and whether the next step is safe.

Orchestrator V1 was fine-tuned around that exact pattern:

  • understand the user's task
  • decide whether tool use is needed
  • choose a safe pathway
  • inspect real returned data
  • continue from evidence instead of guessing
  • ask permission before risky operations
  • produce a clear final response

This makes it useful for developers building agents that need to operate in real environments rather than simply simulate completion.

Best Use Cases

Orchestrator V1 is a strong fit for:

  • local AI operating systems
  • desktop automation agents
  • tool-calling chat engines
  • IDE assistants and coding copilots
  • MCP connector orchestration
  • browser and research agents
  • voice-first automation systems
  • system diagnostic assistants
  • file and workspace inspection agents
  • report-generation agents
  • multi-agent or council-style verification workflows

Agentic Capabilities

Capabilities depend on the host runtime. Orchestrator V1 does not execute actions by itself. The chat engine, agent framework, or application must provide tools and return tool results back to the model.

With the right runtime, Orchestrator V1 can be connected to:

  • web search and page browsing
  • headless browser inspection
  • local file and folder inspection
  • app opening and app control
  • IDE context gathering
  • terminal and shell execution
  • AppleScript or macOS automation
  • MCP connector discovery and setup
  • PDF generation
  • DOCX / Word report generation
  • PPTX generation, including image-supported slide decks, if the chat engine or agent supports file-generation tools
  • system diagnostics such as RAM, storage, running apps, network state, and project structure
  • background or scheduled tasks
  • permission-gated file operations
  • multi-agent verification layers

Reasoning Style

Orchestrator V1 was trained for private Train-of-Thought / Tree-of-Thought-style agentic reasoning.

The model is intended to reason internally about safety, tool choice, missing information, task order, and verification. Agent runtimes should not show private reasoning traces directly to users.

Recommended runtime behavior:

  • hide thought tokens and internal reasoning traces
  • show concise progress updates instead
  • show permission requests when needed
  • show relevant tool results only when useful
  • show a final answer grounded in completed tool calls

Integration Pattern

The recommended agent loop is:

User request
-> Orchestrator V1 decides the next step
-> Runtime executes the selected tool
-> Runtime returns the real tool result
-> Orchestrator V1 analyzes the result
-> Repeat until complete
-> Final answer

For best results, do not let the model merely say that it completed a task. The runtime should only mark a task complete after a tool result confirms it.

Developer Notes

To integrate Orchestrator V1 into an agent system, provide:

  1. A real tool execution layer
  2. A tool-result continuation loop
  3. Permission gates for destructive or privacy-sensitive actions
  4. Hidden reasoning / thought-token filtering
  5. The original tokenizer and chat template from this repository
  6. Clear system instructions describing available tools
  7. Runtime checks that prevent hallucinated completion
  8. Sandboxing for shell, browser, file, and OS actions

The model works best as the controller brain of an agent, not as the full agent runtime by itself.

Example System Prompt Skeleton

You are Orchestrator V1, the agentic planning model for a local AI runtime.

You can choose tools when needed. Never claim that an action is complete unless the runtime has returned evidence that it completed.

Use read-only tools freely for inspection. Ask permission before destructive, irreversible, privacy-sensitive, or system-changing actions.

Keep internal reasoning private. Return concise progress updates, permission requests, useful tool evidence, and final answers only.

Limitations

  • This model requires an agent runtime for tool execution.
  • It should not be used as a proof that an action happened unless the runtime returns tool evidence.
  • Destructive operations must be permission-gated by the host application.
  • Hidden reasoning traces should be filtered from user-facing interfaces.
  • Performance depends heavily on the quality of the tool schema, system prompt, and tool-result loop.

Intended Audience

This release is for developers, researchers, and builders experimenting with local agents, desktop automation, MCP-style connectors, and safe computer-control systems.

If you are building a tool-using agent and want a compact local controller model that is trained around agentic decision-making, Orchestrator V1 is designed for that space.

Downloads last month
33
Safetensors
Model size
1B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support