Instructions to use saggamer/Orchestrator_V1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use saggamer/Orchestrator_V1 with MLX:
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("saggamer/Orchestrator_V1")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use saggamer/Orchestrator_V1 with Pi:
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "saggamer/Orchestrator_V1"
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "saggamer/Orchestrator_V1" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use saggamer/Orchestrator_V1 with Hermes Agent:
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "saggamer/Orchestrator_V1"
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default saggamer/Orchestrator_V1
Run Hermes
hermes
- MLX LM
How to use saggamer/Orchestrator_V1 with MLX LM:
Generate or start a chat session
# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "saggamer/Orchestrator_V1"
Run an OpenAI-compatible server
# Install MLX LM
uv tool install mlx-lm
# Start the server (it listens on port 8080 unless --port is given)
mlx_lm.server --model "saggamer/Orchestrator_V1"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "saggamer/Orchestrator_V1",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
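The same endpoint can also be called from Python. The following is a minimal sketch using only the standard library. It assumes the server is reachable on port 8080 (the port used in the Pi and Hermes configurations above) and that the response follows the usual OpenAI chat-completions shape; the helper names `build_chat_request` and `chat` are illustrative, not part of mlx-lm.

```python
import json
import urllib.request

# Assumption: mlx_lm.server is running locally on port 8080.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "saggamer/Orchestrator_V1"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        body = json.load(resp)
    # Assumes an OpenAI-style response: choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Building the request separately from sending it makes the payload easy to inspect without a live server.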
Base Model
Orchestrator V1 is built on top of Gemma 4 E4B, using the MLX community 4-bit instruction-tuned base:
mlx-community/gemma-4-E4B-it-4bit
This release preserves the Apache 2.0 licensing metadata and is published as an agentic fine-tuned/modified model focused on tool use, automation planning, safety-aware execution, and local-agent workflows.
Orchestrator V1
Orchestrator V1 is an agentic automation model built to act as the planning and decision layer inside local AI agents, desktop assistants, IDE copilots, MCP-style tool systems, and OS-level automation runtimes.
Orchestrator V1 was originally developed for KIRA OS, a local AI operating environment still in development, but it is not limited to KIRA OS. Developers can use it in any compatible runtime that supports the tokenizer, chat template, and optional agent/tool execution loop. The model is not intended to be only a conversational chatbot. It is designed to be connected to tools.
What Makes It Different
Most small local models can answer questions, but agentic systems need more than answers. They need a model that can decide what kind of action is required, when more context is missing, whether a tool result is enough, and whether the next step is safe.
Orchestrator V1 was fine-tuned around that exact pattern:
- understand the user's task
- decide whether tool use is needed
- choose a safe pathway
- inspect real returned data
- continue from evidence instead of guessing
- ask permission before risky operations
- produce a clear final response
This makes it useful for developers building agents that need to operate in real environments rather than simply simulate completion.
Best Use Cases
Orchestrator V1 is a strong fit for:
- local AI operating systems
- desktop automation agents
- tool-calling chat engines
- IDE assistants and coding copilots
- MCP connector orchestration
- browser and research agents
- voice-first automation systems
- system diagnostic assistants
- file and workspace inspection agents
- report-generation agents
- multi-agent or council-style verification workflows
Agentic Capabilities
Capabilities depend on the host runtime. Orchestrator V1 does not execute actions by itself. The chat engine, agent framework, or application must provide tools and return tool results back to the model.
With the right runtime, Orchestrator V1 can be connected to:
- web search and page browsing
- headless browser inspection
- local file and folder inspection
- app opening and app control
- IDE context gathering
- terminal and shell execution
- AppleScript or macOS automation
- MCP connector discovery and setup
- PDF generation
- DOCX / Word report generation
- PPTX generation, including image-supported slide decks, if the chat engine or agent supports file-generation tools
- system diagnostics such as RAM, storage, running apps, network state, and project structure
- background or scheduled tasks
- permission-gated file operations
- multi-agent verification layers
Reasoning Style
Orchestrator V1 was trained for private Chain-of-Thought / Tree-of-Thought-style agentic reasoning.
The model is intended to reason internally about safety, tool choice, missing information, task order, and verification. Agent runtimes should not show private reasoning traces directly to users.
Recommended runtime behavior:
- hide thought tokens and internal reasoning traces
- show concise progress updates instead
- show permission requests when needed
- show relevant tool results only when useful
- show a final answer grounded in completed tool calls
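One way a runtime might hide reasoning traces is to strip them from the raw output before display. The sketch below assumes the reasoning is wrapped in `<think>...</think>` delimiters. That delimiter is a placeholder chosen for illustration: the actual markers depend on the tokenizer and chat template shipped in this repository and should be checked there.

```python
import re

# Hypothetical delimiters; replace with the markers the chat template really uses.
# The pattern also matches an opening tag that was never closed (e.g. a cut-off
# generation) and removes everything from it to the end of the text.
THOUGHT_PATTERN = re.compile(r"<think>.*?(?:</think>|\Z)", re.DOTALL)


def strip_reasoning(raw_output: str) -> str:
    """Return only the user-facing part of a model response."""
    visible = THOUGHT_PATTERN.sub("", raw_output)
    # Collapse runs of blank lines left behind by the removed traces.
    visible = re.sub(r"\n{3,}", "\n\n", visible)
    return visible.strip()
```

For example, `strip_reasoning("<think>check disk first</think>Disk usage is at 71%.")` returns only the final sentence. Treating an unclosed trace as hidden errs on the side of not leaking reasoning when a generation is interrupted.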
Integration Pattern
The recommended agent loop is:
User request
-> Orchestrator V1 decides the next step
-> Runtime executes the selected tool
-> Runtime returns the real tool result
-> Orchestrator V1 analyzes the result
-> Repeat until complete
-> Final answer
For best results, do not let the model merely say that it completed a task. The runtime should only mark a task complete after a tool result confirms it.
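The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the runtime used by KIRA OS: `decide` stands in for a call to the model and is assumed to return either `{"tool": name, "args": {...}}` or `{"final": text}`, and the `"tool"` message format is hypothetical. The `verified` flag is only set when at least one real tool result backs the final answer.

```python
def run_agent(user_request, decide, tools, max_steps=8):
    """Drive a decide -> execute -> return-result loop until a final answer.

    decide(messages) is a stand-in for the model call (assumed interface).
    tools maps tool names to plain Python callables.
    """
    messages = [{"role": "user", "content": user_request}]
    evidence = []  # (tool name, real result) pairs collected by the runtime

    for _ in range(max_steps):
        step = decide(messages)
        if "final" in step:
            # The model's own claim is not enough: only mark the answer as
            # verified if the runtime actually executed at least one tool.
            return {
                "answer": step["final"],
                "evidence": evidence,
                "verified": bool(evidence),
            }
        name = step["tool"]
        if name not in tools:
            result = f"error: unknown tool {name!r}"
        else:
            result = tools[name](**step.get("args", {}))
            evidence.append((name, result))
        # Return the real result to the model for the next decision.
        messages.append({"role": "tool", "content": f"{name}: {result}"})

    # Step budget exhausted without a final answer.
    return {"answer": None, "evidence": evidence, "verified": False}
```

A production runtime would likely use a stricter completion check than "any tool ran", for example matching the claimed action against a specific tool result, but the control flow stays the same.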
Developer Notes
To integrate Orchestrator V1 into an agent system, provide:
- A real tool execution layer
- A tool-result continuation loop
- Permission gates for destructive or privacy-sensitive actions
- Hidden reasoning / thought-token filtering
- The original tokenizer and chat template from this repository
- Clear system instructions describing available tools
- Runtime checks that prevent hallucinated completion
- Sandboxing for shell, browser, file, and OS actions
The model works best as the controller brain of an agent, not as the full agent runtime by itself.
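As an illustration of the permission-gate item in the list above, a runtime could classify each tool by risk and require confirmation before running anything that is not read-only. The tool names, the risk table, and the `ask_permission` callback below are all hypothetical; a real host application would supply its own.

```python
READ_ONLY = "read_only"
DESTRUCTIVE = "destructive"

# Hypothetical risk table; the host application defines the real one.
TOOL_RISK = {
    "read_file": READ_ONLY,
    "list_dir": READ_ONLY,
    "delete_file": DESTRUCTIVE,
    "run_shell": DESTRUCTIVE,
}


def gated_call(name, args, tools, ask_permission):
    """Run a tool, asking the user first unless it is known to be read-only.

    ask_permission(name, args) must return True to allow the call.
    Tools missing from TOOL_RISK are treated as destructive by default.
    """
    risk = TOOL_RISK.get(name, DESTRUCTIVE)
    if risk != READ_ONLY and not ask_permission(name, args):
        return {"ok": False, "result": "denied by user"}
    return {"ok": True, "result": tools[name](**args)}
```

Defaulting unknown tools to the destructive class means a newly added tool cannot bypass the gate just because nobody classified it yet.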
Example System Prompt Skeleton
You are Orchestrator V1, the agentic planning model for a local AI runtime.
You can choose tools when needed. Never claim that an action is complete unless the runtime has returned evidence that it completed.
Use read-only tools freely for inspection. Ask permission before destructive, irreversible, privacy-sensitive, or system-changing actions.
Keep internal reasoning private. Return concise progress updates, permission requests, useful tool evidence, and final answers only.
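A runtime would typically combine this skeleton with a description of the tools it actually exposes. The sketch below shows one possible way to assemble the messages; the "Available tools" layout and the helper names are assumptions, and whether a separate `system` role is accepted depends on the chat template shipped in this repository.

```python
SKELETON = (
    "You are Orchestrator V1, the agentic planning model for a local AI runtime.\n"
    "You can choose tools when needed. Never claim that an action is complete "
    "unless the runtime has returned evidence that it completed.\n"
    "Use read-only tools freely for inspection. Ask permission before destructive, "
    "irreversible, privacy-sensitive, or system-changing actions.\n"
    "Keep internal reasoning private. Return concise progress updates, permission "
    "requests, useful tool evidence, and final answers only."
)


def build_system_prompt(tools):
    """Append a plain-text tool list to the skeleton.

    tools maps a tool name to a one-line description (hypothetical format).
    """
    lines = [SKELETON, "", "Available tools:"]
    for name, description in sorted(tools.items()):
        lines.append(f"- {name}: {description}")
    return "\n".join(lines)


def build_messages(tools, user_text):
    """Return a message list suitable for tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": build_system_prompt(tools)},
        {"role": "user", "content": user_text},
    ]
```

For instance, `build_messages({"read_file": "Read a text file (read-only)."}, "Summarize notes.txt")` yields a two-message conversation whose system prompt ends with the tool list.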
Limitations
- This model requires an agent runtime for tool execution.
- Its output should not be treated as proof that an action happened unless the runtime returns tool evidence.
- Destructive operations must be permission-gated by the host application.
- Hidden reasoning traces should be filtered from user-facing interfaces.
- Performance depends heavily on the quality of the tool schema, system prompt, and tool-result loop.
Intended Audience
This release is for developers, researchers, and builders experimenting with local agents, desktop automation, MCP-style connectors, and safe computer-control systems.
If you are building a tool-using agent and want a compact local controller model that is trained around agentic decision-making, Orchestrator V1 is designed for that space.