ENTUM-AI committed (verified)
Commit 0355487 · Parent(s): f3d07d1

Upload AgentRouter

Files changed (5):
  1. README.md +66 -0
  2. config.json +53 -0
  3. model.safetensors +3 -0
  4. tokenizer.json +0 -0
  5. tokenizer_config.json +14 -0
README.md ADDED
@@ -0,0 +1,66 @@
---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- text-classification
- intent-classification
- query-routing
- agent
- llm-router
pipeline_tag: text-classification
---

# ⚡ AgentRouter

Ultra-fast intent classification for LLM query routing. Classifies user queries into 10 intent categories in **<5ms** on GPU.

Built on [MiniLM](https://huggingface.co/microsoft/MiniLM-L12-H384-uncased) (33M params) — small enough for CPU inference, fast enough for real-time routing.

## 🚀 Usage

```python
from transformers import pipeline

router = pipeline("text-classification", model="ENTUM-AI/AgentRouter")

router("Write a Python function to sort a list")
# [{'label': 'code_generation', 'score': 0.98}]

router("Why am I getting a TypeError?")
# [{'label': 'code_debugging', 'score': 0.97}]

router("Translate hello to Spanish")
# [{'label': 'translation', 'score': 0.99}]

router("What is quantum computing?")
# [{'label': 'information_retrieval', 'score': 0.96}]
```

## 🏷️ Intent Classes

| Intent | Description | Suggested Tools |
|--------|-------------|-----------------|
| `code_generation` | Write new code | code_interpreter, file_editor |
| `code_debugging` | Fix bugs and errors | code_interpreter, debugger |
| `math_reasoning` | Solve math problems | calculator, wolfram_alpha |
| `creative_writing` | Write stories, poems, essays | — |
| `summarization` | Summarize text | file_reader |
| `translation` | Translate between languages | translator |
| `information_retrieval` | Answer questions, explain topics | knowledge_base |
| `data_analysis` | Analyze data, create charts | code_interpreter, data_visualizer |
| `web_search` | Search the web for current info | web_browser, search_engine |
| `general_chat` | Casual conversation | — |
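As an illustration, the table can be turned into a plain dispatch map. The tool names below mirror the "Suggested Tools" column and are placeholders, not a fixed API shipped with the model:

```python
# Hypothetical intent -> tool dispatch map mirroring the table above.
# Tool names are illustrative placeholders, not a fixed API.
INTENT_TOOLS = {
    "code_generation": ["code_interpreter", "file_editor"],
    "code_debugging": ["code_interpreter", "debugger"],
    "math_reasoning": ["calculator", "wolfram_alpha"],
    "creative_writing": [],
    "summarization": ["file_reader"],
    "translation": ["translator"],
    "information_retrieval": ["knowledge_base"],
    "data_analysis": ["code_interpreter", "data_visualizer"],
    "web_search": ["web_browser", "search_engine"],
    "general_chat": [],
}

def tools_for(prediction: dict) -> list:
    """Map a classifier prediction like {'label': ..., 'score': ...} to tools."""
    return INTENT_TOOLS.get(prediction["label"], [])

tools_for({"label": "code_debugging", "score": 0.97})
# ['code_interpreter', 'debugger']
```

The classifier's output dict (as returned by the pipeline in the Usage section) plugs straight into `tools_for`.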

## 🔍 Use Cases

- **LLM routing** — route queries to specialized models or tools
- **Agent frameworks** — decide which tool to invoke
- **Cost optimization** — use cheap models for simple intents, expensive for complex
- **Latency optimization** — skip heavy pipelines for general chat
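A minimal sketch of the cost-optimization idea: send high-confidence simple intents to a cheap model and everything else to a stronger one. The model names and the intent split are hypothetical choices, not part of AgentRouter:

```python
# Sketch of cost-aware routing. "small-chat-model" / "large-reasoning-model"
# are placeholder names; pick your own backends and thresholds.
CHEAP_INTENTS = {"general_chat", "translation", "summarization"}

def pick_model(label: str, score: float, threshold: float = 0.8) -> str:
    # Low-confidence predictions fall back to the strong model.
    if score >= threshold and label in CHEAP_INTENTS:
        return "small-chat-model"
    return "large-reasoning-model"

pick_model("general_chat", 0.95)     # 'small-chat-model'
pick_model("code_generation", 0.98)  # 'large-reasoning-model'
```

The confidence threshold guards against misrouting: an uncertain classification is cheaper to over-serve than to answer badly.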

## ⚠️ Limitations

- English only
- 10 fixed intent categories
config.json ADDED
@@ -0,0 +1,53 @@
{
  "add_cross_attention": false,
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "classifier_dropout": null,
  "dtype": "float32",
  "eos_token_id": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "code_generation",
    "1": "code_debugging",
    "2": "math_reasoning",
    "3": "creative_writing",
    "4": "summarization",
    "5": "translation",
    "6": "information_retrieval",
    "7": "data_analysis",
    "8": "web_search",
    "9": "general_chat"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "is_decoder": false,
  "label2id": {
    "code_debugging": 1,
    "code_generation": 0,
    "creative_writing": 3,
    "data_analysis": 7,
    "general_chat": 9,
    "information_retrieval": 6,
    "math_reasoning": 2,
    "summarization": 4,
    "translation": 5,
    "web_search": 8
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "problem_type": "single_label_classification",
  "tie_word_embeddings": true,
  "transformers_version": "5.1.0",
  "type_vocab_size": 2,
  "use_cache": false,
  "vocab_size": 30522
}
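As a quick sanity check on the config, `id2label` and `label2id` should be exact inverses (a mismatch here is a common cause of scrambled predictions after fine-tuning). A minimal sketch, with both maps copied from the config above:

```python
# id2label keys are strings (JSON requires string keys); label2id values are ints.
id2label = {
    "0": "code_generation", "1": "code_debugging", "2": "math_reasoning",
    "3": "creative_writing", "4": "summarization", "5": "translation",
    "6": "information_retrieval", "7": "data_analysis", "8": "web_search",
    "9": "general_chat",
}
label2id = {
    "code_generation": 0, "code_debugging": 1, "math_reasoning": 2,
    "creative_writing": 3, "summarization": 4, "translation": 5,
    "information_retrieval": 6, "data_analysis": 7, "web_search": 8,
    "general_chat": 9,
}

# Normalize the string keys to ints, then compare against the inverted map.
assert {int(k): v for k, v in id2label.items()} == {v: k for k, v in label2id.items()}
```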
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0127ed1a00bfed0fd2d8b5050dd8770cbedad91f2ca4c5e39195dd3ec7eb21d4
size 133478672
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,14 @@
{
  "backend": "tokenizers",
  "cls_token": "[CLS]",
  "do_lower_case": false,
  "is_local": false,
  "mask_token": "[MASK]",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}