Wisedome Router

A lightweight workflow routing model, based on MiniLM, that predicts the execution route an AI assistant should follow.

The model is optimized for speed and local deployment, allowing workflow decisions to be generated in milliseconds on consumer hardware. Rather than producing final responses, the model predicts the next action an assistant should perform, enabling fast orchestration of retrieval, tools, code generation, and reasoning pipelines.

The router serves as a lightweight planning layer for AI systems, helping determine which operations should be executed before a response is generated.

Unlike traditional intent classifiers, the model can be invoked at any point during a workflow. Given the current query and the actions that have already been executed, it predicts the most likely next step. By repeatedly calling the model after each predicted action, complete workflow routes can be generated dynamically, allowing future execution paths to be estimated before they occur.

The model predicts the next action given:

User query
Previously executed actions
Context state

The model is designed for AI workflow orchestration, RAG systems, tool use, document analysis pipelines, and local assistants.

Supported Actions

The classifier predicts one of the following actions:

Action	Description
DIRECT	Answer directly from model knowledge
WEB_SEARCH	Retrieve information from the internet
FILE_CONTEXT	Retrieve information from uploaded files
RAG	Retrieve information from local indexed knowledge
TOOL_CALL	Execute a tool or calculation
CODE	Generate code
CLARIFY	Ask the user for clarification
ANSWER	Generate the final answer
EOF	End workflow
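As a rough illustration of how a predicted action might be consumed, the sketch below maps each label from the table to a handler. The handlers, the `dispatch` function, and the `state` dictionary are hypothetical placeholders, not part of this model or its repository; real handlers would call a search API, a retriever, a tool runner, and so on.

```python
from typing import Callable, Dict

# Action labels copied from the table above.
ACTIONS = (
    "DIRECT", "WEB_SEARCH", "FILE_CONTEXT", "RAG", "TOOL_CALL",
    "CODE", "CLARIFY", "ANSWER", "EOF",
)

Handler = Callable[[str, dict], dict]


def make_stub(name: str) -> Handler:
    """Return a placeholder handler that only records that it ran."""
    def handler(query: str, state: dict) -> dict:
        executed = list(state.get("executed", [])) + [name]
        return {**state, "executed": executed}
    return handler


# Hypothetical handler registry: every action except EOF gets a stub.
HANDLERS: Dict[str, Handler] = {
    name: make_stub(name) for name in ACTIONS if name != "EOF"
}


def dispatch(action: str, query: str, state: dict) -> dict:
    """Run the handler for one predicted action and return the new state."""
    if action == "EOF":
        return state  # nothing left to execute
    if action not in HANDLERS:
        raise ValueError(f"Unknown action: {action!r}")
    return HANDLERS[action](query, state)


state = dispatch("FILE_CONTEXT", "Summarize the uploaded report", {})
print(state["executed"])  # ['FILE_CONTEXT']
```

Failing loudly on an unknown label is a deliberate choice in this sketch: a silent fallback would hide a mismatch between the label set above and the orchestrator.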

Route Benchmark

Example route:

Query: Compare the uploaded PDF with the latest Flutter documentation

Expected:

FILE_CONTEXT
WEB_SEARCH
ANSWER
EOF

Predicted:

FILE_CONTEXT
WEB_SEARCH
ANSWER
EOF
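One plausible way to score a benchmark like this is route exact match: a route counts as correct only if every step, in order, equals the expected route. The card does not spell out its scoring procedure, so the function below is a minimal sketch under that assumption, and the second route in the example is made up for illustration.

```python
from typing import Sequence


def route_exact_match(
    expected_routes: Sequence[Sequence[str]],
    predicted_routes: Sequence[Sequence[str]],
) -> float:
    """Fraction of routes whose predicted steps equal the expected steps exactly."""
    if len(expected_routes) != len(predicted_routes):
        raise ValueError("expected and predicted must have the same length")
    if not expected_routes:
        return 0.0
    matches = sum(
        list(expected) == list(predicted)
        for expected, predicted in zip(expected_routes, predicted_routes)
    )
    return matches / len(expected_routes)


expected = [
    ["FILE_CONTEXT", "WEB_SEARCH", "ANSWER", "EOF"],
    ["WEB_SEARCH", "ANSWER", "EOF"],  # illustrative route, not from the benchmark
]
predicted = [
    ["FILE_CONTEXT", "WEB_SEARCH", "ANSWER", "EOF"],
    ["RAG", "ANSWER", "EOF"],  # one wrong step makes the whole route a miss
]
print(route_exact_match(expected, predicted))  # 0.5
```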

Example

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: none

Output:

FILE_CONTEXT

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: FILE_CONTEXT

Output:

WEB_SEARCH

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: FILE_CONTEXT, WEB_SEARCH

Output:

ANSWER

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: FILE_CONTEXT, WEB_SEARCH, ANSWER

Output:

EOF

Route exact match accuracy on the route benchmark: 86.7%

Architecture

Base model:

sentence-transformers/all-MiniLM-L6-v2

Classification head:

Linear classification layer trained for next-step workflow prediction.
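To make the role of the head concrete, here is a dependency-free sketch of what a linear classification layer does with an encoder output: one weight row per label, followed by a softmax. The weights and the pooled vector are random stand-ins, and the label order is an assumption; the real mapping lives in `model.config.id2label`. The width of 384 is the hidden size of all-MiniLM-L6-v2.

```python
import math
import random

HIDDEN_SIZE = 384  # hidden size of all-MiniLM-L6-v2

# Assumed label order, for illustration only.
LABELS = [
    "DIRECT", "WEB_SEARCH", "FILE_CONTEXT", "RAG", "TOOL_CALL",
    "CODE", "CLARIFY", "ANSWER", "EOF",
]


def linear(features, weights, bias):
    """Compute logits = W @ features + b with plain Python lists."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


rng = random.Random(0)
# Random stand-ins: a trained head would supply these weights.
weights = [[rng.gauss(0.0, 0.02) for _ in range(HIDDEN_SIZE)] for _ in LABELS]
bias = [0.0] * len(LABELS)
pooled = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN_SIZE)]  # fake encoder output

probs = softmax(linear(pooled, weights, bias))
best = max(range(len(LABELS)), key=probs.__getitem__)
print(LABELS[best], round(probs[best], 4))
```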

Evaluation

Test set size: 3642 samples

Accuracy: 0.93

Macro F1: 0.87

Weighted F1: 0.93

Per-class F1

Label	F1
ANSWER	0.95
FILE_CONTEXT	0.94
RAG	0.96
WEB_SEARCH	0.87
CODE	0.83
DIRECT	0.82
CLARIFY	0.77
TOOL_CALL	0.67
EOF	1.00
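Macro F1 is the unweighted mean of the per-class F1 scores, so the reported value can be checked directly against the table above. Weighted F1 cannot be reproduced the same way, because it needs the per-class sample counts, which are not listed here.

```python
# Per-class F1 values copied from the table above.
per_class_f1 = {
    "ANSWER": 0.95,
    "FILE_CONTEXT": 0.94,
    "RAG": 0.96,
    "WEB_SEARCH": 0.87,
    "CODE": 0.83,
    "DIRECT": 0.82,
    "CLARIFY": 0.77,
    "TOOL_CALL": 0.67,
    "EOF": 1.00,
}

# Macro F1: every class counts equally, regardless of how many samples it has.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(round(macro_f1, 2))  # 0.87

# The weakest classes are where routing mistakes are most likely.
weakest = sorted(per_class_f1, key=per_class_f1.get)[:2]
print(weakest)  # ['TOOL_CALL', 'CLARIFY']
```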

Sequence match accuracy: 86.7%

Loading the Model

import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
)

model_name = "bbidpa/wisedome-router-v1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"

model.to(device)
model.eval()

Usage


def build_input_text(query, previous_actions=None, context_state="none"):
    previous_actions = previous_actions or []
    previous = ", ".join(previous_actions) if previous_actions else "none"

    return (
        f"Query: {query}\n"
        f"Previous actions: {previous}\n"
        f"Context state: {context_state}"
    )


def predict_next_action(
    query,
    previous_actions=None,
    context_state="none",
    max_length=256,
):
    """Return the predicted next action, its confidence, and all class scores."""
    previous_actions = previous_actions or []

    text = build_input_text(
        query=query,
        previous_actions=previous_actions,
        context_state=context_state,
    )

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=max_length,
    )

    inputs = {
        key: value.to(device)
        for key, value in inputs.items()
    }

    with torch.no_grad():
        outputs = model(**inputs)
        probabilities = torch.softmax(outputs.logits, dim=-1)[0]

    action_id = int(torch.argmax(probabilities).item())
    action = model.config.id2label[action_id]
    confidence = float(probabilities[action_id].item())

    scores = {
        model.config.id2label[i]: float(probabilities[i].item())
        for i in range(len(probabilities))
    }

    return {
        "action": action,
        "confidence": confidence,
        "scores": scores,
    }


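def generate_route(
    query,
    context_state="none",
    max_steps=6,
    min_confidence=None,
):
    """Predict a full route by feeding each predicted action back into the model.

    Stops at EOF, after max_steps, or when the model repeats its last action.
    """
    previous_actions = []
    route = []

    for step in range(max_steps):
        result = predict_next_action(
            query=query,
            previous_actions=previous_actions,
            context_state=context_state,
        )

        action = result["action"]
        confidence = result["confidence"]

        # Low-confidence predictions fall back to asking the user.
        if min_confidence is not None and confidence < min_confidence:
            action = "CLARIFY"

        # A repeated action is treated as a loop and ends the route.
        if previous_actions and action == previous_actions[-1]:
            action = "EOF"

        # The confidence is the score of the model's original prediction,
        # even when the action was overridden above.
        route.append({
            "step": step + 1,
            "action": action,
            "confidence": round(confidence, 4),
        })

        if action == "EOF":
            break

        previous_actions.append(action)

    return route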

Predict a Single Next Action

result = predict_next_action(
    query="Compare the uploaded report with recent market news",
)

print(result["action"])
print(result["confidence"])

Example output:

FILE_CONTEXT
0.9421
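The `scores` entry of the result holds a probability for every action, which makes it possible to look past the single top prediction. The helpers below are a small sketch of that idea; `top_k_actions`, `is_ambiguous`, the 0.15 margin, and the example scores are all illustrative assumptions, not values from the model.

```python
def top_k_actions(scores, k=3):
    """Return the k highest-scoring (action, probability) pairs, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def is_ambiguous(scores, margin=0.15):
    """True when the top two actions are closer than `margin` in probability."""
    ranked = top_k_actions(scores, k=2)
    if len(ranked) < 2:
        return False
    return ranked[0][1] - ranked[1][1] < margin


# Made-up scores in the same shape as result["scores"].
example_scores = {"FILE_CONTEXT": 0.52, "RAG": 0.41, "WEB_SEARCH": 0.07}

print(top_k_actions(example_scores, k=2))  # [('FILE_CONTEXT', 0.52), ('RAG', 0.41)]
print(is_ambiguous(example_scores))        # True
```

With the model loaded, the same helpers would be called as `top_k_actions(result["scores"])`. A small gap between the top two actions can be a more useful signal than the raw confidence alone, for example when deciding whether to run both retrieval steps.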

Generate a Full Route

route = generate_route(
    query="Compare the uploaded report with recent market news",
)

for step in route:
    print(step)

Example output:

{'step': 1, 'action': 'FILE_CONTEXT', 'confidence': 0.9421}
{'step': 2, 'action': 'WEB_SEARCH', 'confidence': 0.8814}
{'step': 3, 'action': 'ANSWER', 'confidence': 0.9632}
{'step': 4, 'action': 'EOF', 'confidence': 0.9987}
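The introduction describes routing decisions as taking milliseconds on consumer hardware. A rough way to check this on your own machine is to time repeated calls after a short warm-up. The helper below is a generic sketch (its name and defaults are assumptions) and uses a stand-in function so that it runs without the model.

```python
import statistics
import time


def measure_latency_ms(fn, *args, warmup=3, repeats=20, **kwargs):
    """Time repeated calls to fn and return median and max latency in ms."""
    # Warm-up calls absorb one-off costs such as lazy initialization.
    for _ in range(warmup):
        fn(*args, **kwargs)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def fake_router(query):
    """Stand-in for a real prediction call; does trivial work."""
    return {"action": "EOF", "query_length": len(query)}


stats = measure_latency_ms(fake_router, "Compare the uploaded report", repeats=10)
print(sorted(stats))  # ['max_ms', 'median_ms']
```

With the model loaded, `predict_next_action` can be passed in place of `fake_router`, for example `measure_latency_ms(predict_next_action, query="...")`. A full route costs one call per step, so its latency is roughly the per-call figure times the route length.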

Intended Use

This model is intended for:

AI workflow orchestration
Agent routing
RAG pipelines
Tool-using assistants
Local AI assistants
Multi-step workflow planning

Limitations

The model does not answer user questions directly.

It predicts workflow actions only.

Performance depends on the workflow definitions and training dataset used.

The model may not generalize to workflows that differ significantly from those seen during training.

License

Apache 2.0

Model size: 22.7M params (Safetensors, F32)
