Wisedome Router

Wisedome Router is a lightweight workflow routing model, based on MiniLM, that predicts the execution route an AI assistant should follow.

The model is optimized for speed and local deployment, allowing workflow decisions to be generated in milliseconds on consumer hardware. Rather than producing final responses, the model predicts the next action an assistant should perform, enabling fast orchestration of retrieval, tools, code generation, and reasoning pipelines.

The router serves as a lightweight planning layer for AI systems, helping determine which operations should be executed before a response is generated.

Unlike traditional intent classifiers, the model can be invoked at any point during a workflow. Given the current query and the actions that have already been executed, it predicts the most likely next step. By repeatedly calling the model after each predicted action, complete workflow routes can be generated dynamically, allowing future execution paths to be estimated before they occur.

The model predicts the next action given:

  • User query
  • Previously executed actions
  • Context state

The model is designed for AI workflow orchestration, RAG systems, tool use, document analysis pipelines, and local assistants.


Supported Actions

The classifier predicts one of the following actions:

Action        Description
DIRECT        Answer directly from model knowledge
WEB_SEARCH    Retrieve information from the internet
FILE_CONTEXT  Retrieve information from uploaded files
RAG           Retrieve information from local indexed knowledge
TOOL_CALL     Execute a tool or calculation
CODE          Generate code
CLARIFY       Ask the user for clarification
ANSWER        Generate the final answer
EOF           End workflow
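
One way an orchestrator might consume these labels is a dispatch table that maps each action to a handler. The following is a minimal sketch, not part of the released model: the stub handlers and the `dispatch` helper are hypothetical placeholders, and only the action names come from the table above.

```python
from typing import Callable, Dict

# Action labels taken from the table above.
ACTIONS = (
    "DIRECT", "WEB_SEARCH", "FILE_CONTEXT", "RAG", "TOOL_CALL",
    "CODE", "CLARIFY", "ANSWER", "EOF",
)


def make_stub_handler(action: str) -> Callable[[str], str]:
    """Return a placeholder handler; a real system would call retrieval, tools, etc."""
    def handler(query: str) -> str:
        return f"[{action}] handled: {query}"
    return handler


# Hypothetical dispatch table: one handler per non-terminal action.
HANDLERS: Dict[str, Callable[[str], str]] = {
    action: make_stub_handler(action) for action in ACTIONS if action != "EOF"
}


def dispatch(action: str, query: str) -> str:
    """Run the handler for a predicted action, rejecting unknown labels."""
    if action == "EOF":
        return ""  # EOF ends the workflow, so there is nothing to execute
    if action not in HANDLERS:
        raise ValueError(f"Unsupported action: {action}")
    return HANDLERS[action](query)


print(dispatch("WEB_SEARCH", "latest Flutter documentation"))
```

Rejecting unknown labels explicitly keeps a mismatch between the model's label set and the orchestrator's handlers from failing silently.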

Route Benchmark

Example route:

Query: Compare the uploaded PDF with the latest Flutter documentation

Expected:

  • FILE_CONTEXT
  • WEB_SEARCH
  • ANSWER
  • EOF

Predicted:

  • FILE_CONTEXT
  • WEB_SEARCH
  • ANSWER
  • EOF
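
A route-level metric of this kind is usually read as exact match over whole action sequences: a route counts as correct only if every step, in order, equals the expected one. The sketch below assumes that definition, since the evaluation script is not shown in this card. `route_exact_match` is a hypothetical helper, and the second pair of routes is made up purely for illustration.

```python
from typing import Sequence


def route_exact_match(
    expected: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> float:
    """Fraction of routes whose predicted sequence equals the expected one exactly."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must contain the same number of routes")
    if not expected:
        return 0.0
    matches = sum(list(e) == list(p) for e, p in zip(expected, predicted))
    return matches / len(expected)


expected_routes = [
    ["FILE_CONTEXT", "WEB_SEARCH", "ANSWER", "EOF"],  # route from the example above
    ["RAG", "ANSWER", "EOF"],                         # illustrative only
]
predicted_routes = [
    ["FILE_CONTEXT", "WEB_SEARCH", "ANSWER", "EOF"],  # exact match
    ["WEB_SEARCH", "ANSWER", "EOF"],                  # first step differs, so no match
]

print(route_exact_match(expected_routes, predicted_routes))  # 0.5
```

Under this definition a single wrong step fails the whole route, so the route-level score will typically sit below per-step accuracy.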

Example

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: none

Output:

FILE_CONTEXT

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: FILE_CONTEXT

Output:

WEB_SEARCH

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: FILE_CONTEXT, WEB_SEARCH

Output:

ANSWER

Input:

Query: summarize the uploaded report and compare it with recent market trends
Previous actions: FILE_CONTEXT, WEB_SEARCH, ANSWER

Output:

EOF

Route Exact Match Accuracy: 86.7%

Architecture

Base model:

sentence-transformers/all-MiniLM-L6-v2

Classification head:

Linear classification layer trained for next-step workflow prediction.


Evaluation

Test set size: 3642 samples

Accuracy: 0.93

Macro F1: 0.87

Weighted F1: 0.93

Per-class F1

Label         F1
ANSWER        0.95
FILE_CONTEXT  0.94
RAG           0.96
WEB_SEARCH    0.87
CODE          0.83
DIRECT        0.82
CLARIFY       0.77
TOOL_CALL     0.67
EOF           1.00
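
Macro F1 is the unweighted mean of the per-class F1 scores, so a weak class such as TOOL_CALL counts as much as a strong one. As a quick consistency check, the values in the table above can be averaged directly:

```python
# Per-class F1 values copied from the table above.
per_class_f1 = {
    "ANSWER": 0.95,
    "FILE_CONTEXT": 0.94,
    "RAG": 0.96,
    "WEB_SEARCH": 0.87,
    "CODE": 0.83,
    "DIRECT": 0.82,
    "CLARIFY": 0.77,
    "TOOL_CALL": 0.67,
    "EOF": 1.00,
}

# Macro F1: plain average over classes, ignoring how many samples each class has.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(round(macro_f1, 2))  # 0.87
```

Weighted F1 additionally needs the number of test samples per class, which is not listed here, so it cannot be recomputed from this table alone. A weighted F1 above the macro F1 suggests that the weaker classes are also the less frequent ones in the test set.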

Sequence match accuracy: 86.7%


Loading the Model

import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
)

model_name = "bbidpa/wisedome-router-v1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"

model.to(device)
model.eval()

Usage


def build_input_text(query, previous_actions=None, context_state="none"):
    previous_actions = previous_actions or []
    previous = ", ".join(previous_actions) if previous_actions else "none"

    return (
        f"Query: {query}\n"
        f"Previous actions: {previous}\n"
        f"Context state: {context_state}"
    )


def predict_next_action(
    query,
    previous_actions=None,
    context_state="none",
    max_length=256,
):
    previous_actions = previous_actions or []

    text = build_input_text(
        query=query,
        previous_actions=previous_actions,
        context_state=context_state,
    )

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=max_length,
    )

    inputs = {
        key: value.to(device)
        for key, value in inputs.items()
    }

    with torch.no_grad():
        outputs = model(**inputs)
        probabilities = torch.softmax(outputs.logits, dim=-1)[0]

    action_id = int(torch.argmax(probabilities).item())
    action = model.config.id2label[action_id]
    confidence = float(probabilities[action_id].item())

    scores = {
        model.config.id2label[i]: float(probabilities[i].item())
        for i in range(len(probabilities))
    }

    return {
        "action": action,
        "confidence": confidence,
        "scores": scores,
    }


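def generate_route(
    query,
    context_state="none",
    max_steps=6,
    min_confidence=None,
):
    """Predict a route by calling the router repeatedly until EOF or max_steps."""
    previous_actions = []
    route = []

    for step in range(max_steps):
        # Predict the next step from the query and the actions chosen so far.
        result = predict_next_action(
            query=query,
            previous_actions=previous_actions,
            context_state=context_state,
        )

        action = result["action"]
        confidence = result["confidence"]

        # Fall back to CLARIFY when the prediction is below the threshold.
        if min_confidence is not None and confidence < min_confidence:
            action = "CLARIFY"

        # A repeated action is treated as a loop and ends the route.
        if previous_actions and action == previous_actions[-1]:
            action = "EOF"

        # The stored confidence is that of the model's original prediction,
        # even when the action was overridden above.
        route.append({
            "step": step + 1,
            "action": action,
            "confidence": round(confidence, 4),
        })

        if action == "EOF":
            break

        previous_actions.append(action)

    return route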

Predict a Single Next Action

result = predict_next_action(
    query="Compare the uploaded report with recent market news",
)

print(result["action"])
print(round(result["confidence"], 4))

Example output:

FILE_CONTEXT
0.9421
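
Besides the top action, the `scores` dictionary returned by `predict_next_action` can be inspected to see how close the runner-up was. The helpers below are a hypothetical sketch: `top_k_actions` and `is_ambiguous` are not part of this card's code, the 0.15 margin is an arbitrary illustration, and the score dictionaries are hand-written so the snippet runs without the model.

```python
from typing import Dict, List, Tuple


def top_k_actions(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (action, probability) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


def is_ambiguous(scores: Dict[str, float], margin: float = 0.15) -> bool:
    """True when the top two actions are closer than `margin` (assumed threshold)."""
    ranked = top_k_actions(scores, k=2)
    if len(ranked) < 2:
        return False
    return (ranked[0][1] - ranked[1][1]) < margin


# Hand-written score dictionaries for illustration, not real model output.
clear_scores = {"FILE_CONTEXT": 0.62, "RAG": 0.31, "WEB_SEARCH": 0.05, "ANSWER": 0.02}
close_scores = {"RAG": 0.48, "FILE_CONTEXT": 0.45, "WEB_SEARCH": 0.05, "ANSWER": 0.02}

print(top_k_actions(clear_scores, k=2))
print(is_ambiguous(clear_scores))  # False
print(is_ambiguous(close_scores))  # True
```

A margin check like this is one possible complement to the `min_confidence` threshold in `generate_route`, since two near-tied actions can both sit above a fixed confidence floor.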

Generate a Full Route

route = generate_route(
    query="Compare the uploaded report with recent market news",
)

for step in route:
    print(step)

Example output:

{'step': 1, 'action': 'FILE_CONTEXT', 'confidence': 0.9421}
{'step': 2, 'action': 'WEB_SEARCH', 'confidence': 0.8814}
{'step': 3, 'action': 'ANSWER', 'confidence': 0.9632}
{'step': 4, 'action': 'EOF', 'confidence': 0.9987}
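
Because `generate_route` stops after `max_steps` even when no EOF has been predicted, a caller may want to sanity-check a route before executing it. The following sketch assumes the list-of-dicts shape shown above. `validate_route` is a hypothetical helper, not part of the released code.

```python
from typing import Dict, List

# Action labels from the Supported Actions section.
SUPPORTED_ACTIONS = {
    "DIRECT", "WEB_SEARCH", "FILE_CONTEXT", "RAG", "TOOL_CALL",
    "CODE", "CLARIFY", "ANSWER", "EOF",
}


def validate_route(route: List[Dict]) -> List[str]:
    """Return a list of problems found; an empty list means the route looks usable."""
    if not route:
        return ["route is empty"]

    problems = []
    for index, step in enumerate(route, start=1):
        # Step numbers are expected to count up from 1 without gaps.
        if step.get("step") != index:
            problems.append(f"position {index}: unexpected step number {step.get('step')}")
        if step.get("action") not in SUPPORTED_ACTIONS:
            problems.append(f"position {index}: unknown action {step.get('action')}")

    # A route cut off by max_steps will not end with EOF.
    if route[-1].get("action") != "EOF":
        problems.append("route does not end with EOF")
    return problems


complete = [
    {"step": 1, "action": "FILE_CONTEXT", "confidence": 0.9421},
    {"step": 2, "action": "WEB_SEARCH", "confidence": 0.8814},
    {"step": 3, "action": "ANSWER", "confidence": 0.9632},
    {"step": 4, "action": "EOF", "confidence": 0.9987},
]

print(validate_route(complete))      # []
print(validate_route(complete[:2]))  # ['route does not end with EOF']
```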

Intended Use

This model is intended for:

  • AI workflow orchestration
  • Agent routing
  • RAG pipelines
  • Tool-using assistants
  • Local AI assistants
  • Multi-step workflow planning

Limitations

The model does not answer user questions directly.

It predicts workflow actions only.

Performance depends on the workflow definitions and training dataset used.

The model may not generalize to workflows that differ significantly from those seen during training.


License

Apache 2.0

Model size: 22.7M params (Safetensors, tensor type F32)