ENTUM-AI committed (verified)
Commit 0355487 · Parent(s): f3d07d1

Upload AgentRouter

Files changed (5):
  1. README.md +66 -0
  2. config.json +53 -0
  3. model.safetensors +3 -0
  4. tokenizer.json +0 -0
  5. tokenizer_config.json +14 -0
README.md ADDED
@@ -0,0 +1,66 @@
---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- text-classification
- intent-classification
- query-routing
- agent
- llm-router
pipeline_tag: text-classification
---

# ⚡ AgentRouter

Ultra-fast intent classification for LLM query routing. Classifies user queries into 10 intent categories in **<5ms** on GPU.

Built on [MiniLM](https://huggingface.co/microsoft/MiniLM-L12-H384-uncased) (33M params) — small enough for CPU inference, fast enough for real-time routing.

## 🚀 Usage

```python
from transformers import pipeline

router = pipeline("text-classification", model="ENTUM-AI/AgentRouter")

router("Write a Python function to sort a list")
# [{'label': 'code_generation', 'score': 0.98}]

router("Why am I getting a TypeError?")
# [{'label': 'code_debugging', 'score': 0.97}]

router("Translate hello to Spanish")
# [{'label': 'translation', 'score': 0.99}]

router("What is quantum computing?")
# [{'label': 'information_retrieval', 'score': 0.96}]
```

## 🏷️ Intent Classes

| Intent | Description | Suggested Tools |
|--------|-------------|-----------------|
| `code_generation` | Write new code | code_interpreter, file_editor |
| `code_debugging` | Fix bugs and errors | code_interpreter, debugger |
| `math_reasoning` | Solve math problems | calculator, wolfram_alpha |
| `creative_writing` | Write stories, poems, essays | — |
| `summarization` | Summarize text | file_reader |
| `translation` | Translate between languages | translator |
| `information_retrieval` | Answer questions, explain topics | knowledge_base |
| `data_analysis` | Analyze data, create charts | code_interpreter, data_visualizer |
| `web_search` | Search the web for current info | web_browser, search_engine |
| `general_chat` | Casual conversation | — |
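As an illustration, the table can be turned into a plain dispatch map. The tool names below mirror the "Suggested Tools" column and are placeholders, not a fixed API shipped with the model:

```python
# Hypothetical intent -> tool dispatch map mirroring the table above.
# Tool names are illustrative placeholders, not a fixed API.
INTENT_TOOLS = {
    "code_generation": ["code_interpreter", "file_editor"],
    "code_debugging": ["code_interpreter", "debugger"],
    "math_reasoning": ["calculator", "wolfram_alpha"],
    "creative_writing": [],
    "summarization": ["file_reader"],
    "translation": ["translator"],
    "information_retrieval": ["knowledge_base"],
    "data_analysis": ["code_interpreter", "data_visualizer"],
    "web_search": ["web_browser", "search_engine"],
    "general_chat": [],
}

def tools_for(prediction: dict) -> list:
    """Map a classifier prediction like {'label': ..., 'score': ...} to tools."""
    return INTENT_TOOLS.get(prediction["label"], [])

tools_for({"label": "code_debugging", "score": 0.97})
# ['code_interpreter', 'debugger']
```

The classifier's output dict (as returned by the pipeline in the Usage section) plugs straight into `tools_for`.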

## 🔍 Use Cases

- **LLM routing** — route queries to specialized models or tools
- **Agent frameworks** — decide which tool to invoke
- **Cost optimization** — use cheap models for simple intents, expensive for complex
- **Latency optimization** — skip heavy pipelines for general chat
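A minimal sketch of the cost-optimization idea: send high-confidence simple intents to a cheap model and everything else to a stronger one. The model names and the intent split are hypothetical choices, not part of AgentRouter:

```python
# Sketch of cost-aware routing. "small-chat-model" / "large-reasoning-model"
# are placeholder names; pick your own backends and thresholds.
CHEAP_INTENTS = {"general_chat", "translation", "summarization"}

def pick_model(label: str, score: float, threshold: float = 0.8) -> str:
    # Low-confidence predictions fall back to the strong model.
    if score >= threshold and label in CHEAP_INTENTS:
        return "small-chat-model"
    return "large-reasoning-model"

pick_model("general_chat", 0.95)     # 'small-chat-model'
pick_model("code_generation", 0.98)  # 'large-reasoning-model'
```

The confidence threshold guards against misrouting: an uncertain classification is cheaper to over-serve than to answer badly.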

## ⚠️ Limitations

- English only
- 10 fixed intent categories
config.json ADDED
@@ -0,0 +1,53 @@
{
  "add_cross_attention": false,
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "classifier_dropout": null,
  "dtype": "float32",
  "eos_token_id": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "code_generation",
    "1": "code_debugging",
    "2": "math_reasoning",
    "3": "creative_writing",
    "4": "summarization",
    "5": "translation",
    "6": "information_retrieval",
    "7": "data_analysis",
    "8": "web_search",
    "9": "general_chat"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "is_decoder": false,
  "label2id": {
    "code_debugging": 1,
    "code_generation": 0,
    "creative_writing": 3,
    "data_analysis": 7,
    "general_chat": 9,
    "information_retrieval": 6,
    "math_reasoning": 2,
    "summarization": 4,
    "translation": 5,
    "web_search": 8
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "problem_type": "single_label_classification",
  "tie_word_embeddings": true,
  "transformers_version": "5.1.0",
  "type_vocab_size": 2,
  "use_cache": false,
  "vocab_size": 30522
}
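As a quick sanity check on the config, `id2label` and `label2id` should be exact inverses (a mismatch here is a common cause of scrambled predictions after fine-tuning). A minimal sketch, with both maps copied from the config above:

```python
# id2label keys are strings (JSON requires string keys); label2id values are ints.
id2label = {
    "0": "code_generation", "1": "code_debugging", "2": "math_reasoning",
    "3": "creative_writing", "4": "summarization", "5": "translation",
    "6": "information_retrieval", "7": "data_analysis", "8": "web_search",
    "9": "general_chat",
}
label2id = {
    "code_generation": 0, "code_debugging": 1, "math_reasoning": 2,
    "creative_writing": 3, "summarization": 4, "translation": 5,
    "information_retrieval": 6, "data_analysis": 7, "web_search": 8,
    "general_chat": 9,
}

# Normalize the string keys to ints, then compare against the inverted map.
assert {int(k): v for k, v in id2label.items()} == {v: k for k, v in label2id.items()}
```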
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0127ed1a00bfed0fd2d8b5050dd8770cbedad91f2ca4c5e39195dd3ec7eb21d4
size 133478672
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,14 @@
{
  "backend": "tokenizers",
  "cls_token": "[CLS]",
  "do_lower_case": false,
  "is_local": false,
  "mask_token": "[MASK]",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}