Zenthi-AI OS: Agentic Multi-Model Small Language Model (SLM)

Zenthi-AI is a production-grade, fine-tuned Small Language Model (SLM) that serves as a conversational assistant. It is optimized for fast, local-first execution and acts as the intent router and synthesis engine of the Zenthi-AI Multi-Model Operating System.

This repository hosts the merged, full-precision model weights.


🚀 Key Features

  • Base Foundation: Built on Qwen/Qwen2.5-0.5B-Instruct.
  • Parameter-Efficient Fine-Tuning: Fine-tuned with QLoRA (4-bit quantization) on a merged, cleaned dataset drawn from Alpaca, Dolly 15K, OpenHermes, UltraChat, and ShareGPT.
  • Agentic Orchestrator Routing: Tuned specifically to act as a Router and Planner Agent that classifies each query into one of six intents (CODE, VISION, RAG, SEARCH, KNOWLEDGE, COMPLEX). Measured accuracy is reported under Evaluation & Routing Performance.
  • Quantization-Ready: Quantized to GGUF format for local deployment (quantized size under 500 MB).
  • Local RAG Integration: Built to work in tandem with local ChromaDB embedding vector stores.
  • Web Search Coordination: Designed to synthesize real-time context fetched from local SearXNG search clients.
  • Memory Management: Keeps a windowed session history for conversational continuity.
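The windowed session history mentioned in the last bullet can be illustrated with a small sketch. This is not the project's actual implementation: the class name `SessionHistory`, the `max_turns` default, and the choice to count the window in user/assistant turn pairs are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Optional


class SessionHistory:
    """Hypothetical helper: keep only the most recent conversation turns."""

    def __init__(self, max_turns: int = 4, system_prompt: Optional[str] = None):
        # Each entry is one (user, assistant) pair. A deque with maxlen
        # silently drops the oldest pair once the window is full.
        self._turns = deque(maxlen=max_turns)
        self._system_prompt = system_prompt

    def add_turn(self, user: str, assistant: str) -> None:
        """Record one completed exchange."""
        self._turns.append((user, assistant))

    def as_messages(self, next_user_message: str) -> List[Dict[str, str]]:
        """Build a chat-template style message list for the next request."""
        messages: List[Dict[str, str]] = []
        if self._system_prompt:
            messages.append({"role": "system", "content": self._system_prompt})
        for user, assistant in self._turns:
            messages.append({"role": "user", "content": user})
            messages.append({"role": "assistant", "content": assistant})
        messages.append({"role": "user", "content": next_user_message})
        return messages
```

The returned list has the same shape as the `messages` argument accepted by `tokenizer.apply_chat_template`, so a bounded window keeps the prompt length predictable for a 0.5B-parameter model.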

📊 Evaluation & Routing Performance

The model's routing accuracy was benchmarked on a local GPU using 500 unique evaluation queries, 100 for each of the five intent categories listed below:

  • Overall Routing Accuracy: 72.60%
  • Average Latency: 651.54 ms per query

| Intent Category | Accuracy (%) | Target Expert Model |
|---|---|---|
| CODE | 100.00% | qwen2.5-coder:3b |
| VISION | 100.00% | riven/smolvlm:latest |
| SEARCH | 99.00% | qwen2.5:1.5b-instruct |
| RAG | 43.00% | qwen2.5:1.5b-instruct |
| KNOWLEDGE | 21.00% | qwen2.5:1.5b-instruct |
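As a rough illustration of how the figures above fit together, the sketch below maps each intent label to its target expert model and recomputes the overall accuracy as the unweighted mean of the per-category values (valid here because every category has 100 queries). The names `ROUTES`, `expert_for`, and `macro_accuracy` are hypothetical, and the assumption that the router emits a bare intent label is mine, not something the evaluation states.

```python
from typing import Optional

# Per-intent accuracy (%) and target expert model, copied from the results above.
ROUTES = {
    "CODE": (100.00, "qwen2.5-coder:3b"),
    "VISION": (100.00, "riven/smolvlm:latest"),
    "SEARCH": (99.00, "qwen2.5:1.5b-instruct"),
    "RAG": (43.00, "qwen2.5:1.5b-instruct"),
    "KNOWLEDGE": (21.00, "qwen2.5:1.5b-instruct"),
}


def expert_for(router_output: str) -> Optional[str]:
    """Return the expert model for an intent label, or None if it is not listed.

    Assumption: the router's reply is a bare label such as "CODE".
    """
    route = ROUTES.get(router_output.strip().upper())
    return route[1] if route else None


def macro_accuracy() -> float:
    """Unweighted mean of the per-category accuracies."""
    return sum(accuracy for accuracy, _ in ROUTES.values()) / len(ROUTES)


print(f"{macro_accuracy():.2f}%")  # prints 72.60%
```

The mean, (100 + 100 + 99 + 43 + 21) / 5 = 72.60, matches the reported overall routing accuracy, which shows that the RAG and KNOWLEDGE categories account for almost all of the misrouted queries.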

💻 Local Usage & Integration

1. Ollama Deployment (GGUF)

To run Zenthi-AI locally in Ollama:

  1. Create a Modelfile with the system prompt:
    FROM zenthi-ai:latest
    PARAMETER temperature 0.7
    PARAMETER top_p 0.9
    SYSTEM """I am Zenthi-AI OS, a production-grade Agentic Multi-Model AI Operating System. I deliver accurate, secure, maintainable, and production-ready solutions by coordinating specialized AI capabilities."""
    
  2. Build and run:
    ollama create Zenthi-AI -f Modelfile
    ollama run Zenthi-AI
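
Once the model has been created, it can also be queried programmatically through Ollama's local REST API. The sketch below assumes Ollama is listening on its default address (`http://localhost:11434`) and that the model was registered under the name `Zenthi-AI` as in step 2; the helper names `build_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "Zenthi-AI") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "Zenthi-AI", timeout: float = 60.0) -> str:
    """Send the prompt and return the generated text.

    With "stream": False, Ollama replies with a single JSON object whose
    "response" field holds the completion.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running, `print(ask("Explain photosynthesis simply."))` would print the model's reply.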
    

2. Python Transformers API

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "KATHIR2006/zenthi-ai"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Start conversation
messages = [
    {"role": "system", "content": "You are Zenthi-AI OS, a production-grade Agentic Multi-Model AI Operating System."},
    {"role": "user", "content": "Explain photosynthesis simply."}
]

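# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move the inputs to the device chosen by device_map="auto" (not necessarily CUDA)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.batch_decode(generated_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]
print(response)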

🛠️ Fine-Tuned Expert Adapters

This repository also hosts the fine-tuned LoRA adapters for the specialized expert models of the Zenthi-AI OS:

1. Code Expert Adapters (code-adapters/)

  • Base Model: Qwen/Qwen2.5-Coder-3B-Instruct
  • Dataset: Custom programming and instruction dataset (1,200 training steps)
  • Final Loss: 0.1843
  • Usage: Optimized for React, Node.js, Python, MERN stack development, reviews, and refactoring.

2. Vision Expert Adapters (vision-adapters/)

  • Base Model: HuggingFaceTB/SmolVLM-Instruct
  • Dataset: Synthetic VQA shape and color recognition dataset (100 training steps)
  • Final Loss: 0.9077
  • Usage: Intended for OCR, visual question answering, and image analysis.
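
Loading one of the adapters listed above might look like the following sketch, which uses the PEFT library to attach the code adapters to their base model. It assumes the `code-adapters/` directory contains a standard PEFT adapter (an `adapter_config.json` plus weights) and that the repository ID matches the one used in the Transformers example; neither detail is confirmed here, and `load_code_expert` is a hypothetical helper name.

```python
BASE_MODEL = "Qwen/Qwen2.5-Coder-3B-Instruct"  # base model named above
ADAPTER_REPO = "KATHIR2006/zenthi-ai"          # assumption: same repo ID as the Transformers example
ADAPTER_SUBFOLDER = "code-adapters"            # directory named above


def load_code_expert():
    """Load the base coder model and attach the LoRA adapters on top of it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    # subfolder points PEFT at the adapter directory inside the repository.
    model = PeftModel.from_pretrained(base, ADAPTER_REPO, subfolder=ADAPTER_SUBFOLDER)
    return tokenizer, model
```

Calling `load_code_expert()` downloads the 3B base model, so it needs network access and enough memory for that model. The vision adapters would follow the same pattern with the SmolVLM base model and its own model and processor classes.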

⚖️ Licenses & Compliance

This project uses two licenses, one per component:

  • LLM Model Weights & Adaptations: Licensed under the Apache License 2.0 (in compliance with the base Qwen2.5 license).
  • RAG Engine, Multi-Agent Framework, & Backend Codebase: Licensed under the MIT License.