cu3126

tgupj committed 4 days ago

Commit

fd0e5cf

0 Parent(s):

Duplicate from tgupj/tiny-router


Co-authored-by: udara <tgupj@users.noreply.huggingface.co>

Files changed (19)

.gitattributes +35 -0
README.md +153 -0
added_tokens.json +3 -0
model.pt +3 -0
model_config.json +60 -0
onnx/added_tokens.json +3 -0
onnx/model_config.json +60 -0
onnx/onnx_metadata.json +42 -0
onnx/special_tokens_map.json +51 -0
onnx/spm.model +3 -0
onnx/temperature_scaling.json +10 -0
onnx/tiny_router.int8.onnx +3 -0
onnx/tiny_router.onnx +3 -0
onnx/tokenizer_config.json +59 -0
special_tokens_map.json +15 -0
spm.model +3 -0
temperature_scaling.json +10 -0
tokenizer_config.json +59 -0
training_args.json +26 -0

.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,153 @@

+---
+license: mit
+base_model:
+- microsoft/deberta-v3-small
+datasets:
+- tgupj/tiny-router-data
+---
+# tiny-router
+`tiny-router` is a compact experimental multi-head routing classifier for short, domain-neutral messages with optional interaction context. It predicts four separate signals that downstream systems or agents can use for update handling, action routing, memory policy, and prioritization.
+## What it predicts
+```
+relation_to_previous: new | follow_up | correction | confirmation | cancellation | closure
+actionability: none | review | act
+retention: ephemeral | useful | remember
+urgency: low | medium | high
+```
+The model emits a separate prediction for each head at inference time, each with a calibrated confidence, plus an `overall_confidence`.
+## Intended use
+- Route short user messages into lightweight automation tiers.
+- Detect whether a message updates prior context or starts something new.
+- Decide whether action is required, review is safer, or no action is needed.
+- Separate disposable details from short-term useful context and longer-term memory candidates.
+- Prioritize items by urgency.
+Good use cases:
+- routing message-like requests in assistants or productivity tools
+- triaging follow-ups, corrections, confirmations, and closures
+- conservative automation with review fallback
+Poor use cases:
+- fully autonomous high-stakes action without guardrails
+- domains that need expert reasoning or regulated decisions
+## Training data
+This checkpoint was trained and evaluated on the synthetic dataset splits in:
+- `data/synthetic/train.jsonl`
+- `data/synthetic/validation.jsonl`
+- `data/synthetic/test.jsonl`
+The data follows a structured JSONL schema with:
+- `current_text`
+- optional `interaction.previous_text`
+- optional `interaction.previous_action`
+- optional `interaction.previous_outcome`
+- optional `interaction.recency_seconds`
+- four label heads under `labels`
+## Model details
+- Base encoder: `microsoft/deberta-v3-small`
+- Architecture: encoder-only multitask classifier
+- Pooling: learned attention pooling
+- Structured features:
+  - canonicalized `previous_action` embedding
+  - `previous_outcome` embedding
+  - learned projection of `log1p(recency_seconds)`
+- Head structure:
+  - dependency-aware multitask heads
+  - later heads condition on learned summaries of earlier head predictions
+- Calibration:
+  - post-hoc per-head temperature scaling fit on validation logits
+This checkpoint was trained with:
+- `batch_size = 32`
+- `epochs = 20`
+- `max_length = 128`
+- `encoder_lr = 2e-5`
+- `head_lr = 1e-4`
+- `dropout = 0.1`
+- `pooling_type = attention`
+- `use_head_dependencies = true`
+## Current results
+Held-out test results from `artifacts/tiny-router/eval.json`:
+- `macro_average_f1 = 0.7848`
+- `exact_match = 0.4570`
+- `automation_safe_accuracy = 0.6230`
+- `automation_safe_coverage = 0.5430`
+- `ECE = 0.3440`
+Per-head macro F1:
+- `relation_to_previous = 0.8415`
+- `actionability = 0.7982`
+- `retention = 0.7809`
+- `urgency = 0.7187`
+Ablations:
+- `current_text_only = 0.7058`
+- `current_plus_previous_text = 0.7478`
+- `full_interaction = 0.7848`
+Interpretation:
+- interaction context helps
+- actionability and urgency are usable but still imperfect
+- high-confidence automation is possible only with conservative thresholds
+## Limitations
+- The benchmark is task-specific and internal to this repo.
+- The dataset is synthetic, so distribution shift to real product traffic is likely.
+- Results remain sensitive to label quality on subtle class boundaries.
+- Confidence calibration is improved but not strong enough to justify broad unattended automation.
+## Example inference
+```json
+{
+  "relation_to_previous": { "label": "correction", "confidence": 0.94 },
+  "actionability": { "label": "act", "confidence": 0.97 },
+  "retention": { "label": "useful", "confidence": 0.76 },
+  "urgency": { "label": "medium", "confidence": 0.81 },
+  "overall_confidence": 0.87
+}
+```
+## How to load
+This repo uses a custom checkpoint format. Load it with this project:
+```python
+from tiny_router.io import load_checkpoint
+from tiny_router.runtime import get_device
+device = get_device(requested_device="cpu")
+model, tokenizer, config = load_checkpoint("artifacts/tiny-router", device=device)
+```
+Or run inference with:
+```bash
+uv run python predict.py \
+  --model-dir artifacts/tiny-router \
+  --input-json '{"current_text":"Actually next Monday","interaction":{"previous_text":"Set a reminder for Friday","previous_action":"created_reminder","previous_outcome":"success","recency_seconds":45}}' \
+  --pretty
+```
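
The `--input-json` payload in the README follows the JSONL schema listed under "Training data". A minimal Python sketch of assembling and shell-quoting such a payload is shown here. `build_record` is a hypothetical helper, not part of `tiny_router`, and dropping the `interaction` object when no prior context exists is an assumption about how the optional fields are meant to be omitted.

```python
import json
import shlex


def build_record(current_text, previous_text=None, previous_action=None,
                 previous_outcome=None, recency_seconds=None):
    """Assemble one input record; optional interaction fields are omitted when None."""
    if not current_text:
        raise ValueError("current_text is required")
    record = {"current_text": current_text}
    interaction = {
        "previous_text": previous_text,
        "previous_action": previous_action,
        "previous_outcome": previous_outcome,
        "recency_seconds": recency_seconds,
    }
    # Keep only the interaction fields that were actually provided.
    interaction = {k: v for k, v in interaction.items() if v is not None}
    if interaction:
        record["interaction"] = interaction
    return record


record = build_record(
    "Actually next Monday",
    previous_text="Set a reminder for Friday",
    previous_action="created_reminder",
    previous_outcome="success",
    recency_seconds=45,
)
# shlex.quote makes the JSON safe to paste after --input-json in a POSIX shell.
print(shlex.quote(json.dumps(record)))
```

Building the record in code rather than by hand avoids quoting mistakes in the shell command and keeps the optional fields consistent across calls.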

added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[MASK]": 128000
+}

model.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:56812518ded8d9e6f0596b54ad9733722045eb037f2a9637454e9a282a6de975
+size 565320515
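
The three lines added for `model.pt` are a Git LFS pointer, not the weights themselves. The sketch below parses such a pointer and checks a byte string against its size and SHA-256 digest. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names used only for illustration.

```python
import hashlib

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:56812518ded8d9e6f0596b54ad9733722045eb037f2a9637454e9a282a6de975
size 565320515
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(data, pointer):
    """Check raw bytes against the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash: " + pointer["algo"])
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

For a file of this size, hashing in chunks with `hashlib.sha256().update(...)` would be preferable to reading everything into memory; the in-memory check keeps the sketch short.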

model_config.json ADDED

	@@ -0,0 +1,60 @@

+{
+  "encoder_name": "microsoft/deberta-v3-small",
+  "dropout": 0.1,
+  "action_vocab": [
+    "none",
+    "create",
+    "update",
+    "send",
+    "store",
+    "route",
+    "schedule",
+    "dismissed",
+    "clarify",
+    "search",
+    "notify",
+    "cancel",
+    "complete",
+    "other"
+  ],
+  "outcome_vocab": [
+    "success",
+    "pending",
+    "failed",
+    "cancelled",
+    "unknown"
+  ],
+  "label_maps": {
+    "relation_to_previous": [
+      "new",
+      "follow_up",
+      "correction",
+      "confirmation",
+      "cancellation",
+      "closure"
+    ],
+    "actionability": [
+      "none",
+      "review",
+      "act"
+    ],
+    "retention": [
+      "ephemeral",
+      "useful",
+      "remember"
+    ],
+    "urgency": [
+      "low",
+      "medium",
+      "high"
+    ]
+  },
+  "structured_hidden_dim": 32,
+  "recency_embed_dim": 8,
+  "pooling_type": "attention",
+  "use_head_dependencies": true,
+  "dependency_hidden_dim": 32,
+  "feature_mode": "full_interaction",
+  "max_length": 128,
+  "recency_max": 3600
+}
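
The README describes a learned projection of `log1p(recency_seconds)`, and this config sets `recency_max` to 3600. One plausible way to turn elapsed seconds into a bounded scalar before that projection is sketched below. The clipping to `recency_max`, the normalization to [0, 1], and the zero default for missing values are all assumptions made for illustration; the source does not state how `recency_max` is applied.

```python
import math

RECENCY_MAX = 3600  # "recency_max" in model_config.json


def recency_feature(recency_seconds, recency_max=RECENCY_MAX):
    """Map elapsed seconds to [0, 1] via a clipped, normalized log1p.

    ASSUMPTION: clipping and normalization are illustrative; the source only
    states that log1p(recency_seconds) is fed to a learned projection.
    """
    if recency_seconds is None:
        return 0.0  # ASSUMPTION: missing recency is encoded as zero
    clipped = min(max(float(recency_seconds), 0.0), float(recency_max))
    return math.log1p(clipped) / math.log1p(recency_max)


for seconds in (0, 45, 3600, 86400):
    print(seconds, round(recency_feature(seconds), 4))
```

The logarithm compresses long gaps, so the difference between 5 and 50 seconds counts for more than the difference between 3000 and 3600 seconds.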

onnx/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[MASK]": 128000
+}

onnx/model_config.json ADDED

	@@ -0,0 +1,60 @@

+{
+  "encoder_name": "microsoft/deberta-v3-small",
+  "dropout": 0.1,
+  "action_vocab": [
+    "none",
+    "create",
+    "update",
+    "send",
+    "store",
+    "route",
+    "schedule",
+    "dismissed",
+    "clarify",
+    "search",
+    "notify",
+    "cancel",
+    "complete",
+    "other"
+  ],
+  "outcome_vocab": [
+    "success",
+    "pending",
+    "failed",
+    "cancelled",
+    "unknown"
+  ],
+  "label_maps": {
+    "relation_to_previous": [
+      "new",
+      "follow_up",
+      "correction",
+      "confirmation",
+      "cancellation",
+      "closure"
+    ],
+    "actionability": [
+      "none",
+      "review",
+      "act"
+    ],
+    "retention": [
+      "ephemeral",
+      "useful",
+      "remember"
+    ],
+    "urgency": [
+      "low",
+      "medium",
+      "high"
+    ]
+  },
+  "structured_hidden_dim": 32,
+  "recency_embed_dim": 8,
+  "pooling_type": "attention",
+  "use_head_dependencies": true,
+  "dependency_hidden_dim": 32,
+  "feature_mode": "full_interaction",
+  "max_length": 128,
+  "recency_max": 3600
+}

onnx/onnx_metadata.json ADDED

	@@ -0,0 +1,42 @@

+{
+  "model_file": "tiny_router.onnx",
+  "feature_mode": "full_interaction",
+  "heads": [
+    "relation_to_previous",
+    "actionability",
+    "retention",
+    "urgency"
+  ],
+  "max_length": 128,
+  "label_maps": {
+    "relation_to_previous": [
+      "new",
+      "follow_up",
+      "correction",
+      "confirmation",
+      "cancellation",
+      "closure"
+    ],
+    "actionability": [
+      "none",
+      "review",
+      "act"
+    ],
+    "retention": [
+      "ephemeral",
+      "useful",
+      "remember"
+    ],
+    "urgency": [
+      "low",
+      "medium",
+      "high"
+    ]
+  },
+  "temperature_scaling": {
+    "relation_to_previous": 1.333521,
+    "actionability": 2.113489,
+    "retention": 2.238721,
+    "urgency": 2.660725
+  }
+}
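
This metadata carries both the label maps and the per-head temperatures, which is what a consumer would need to turn raw head logits into a label and a calibrated confidence. The sketch below shows standard temperature scaling (divide the logits by T, then apply softmax). Whether the ONNX graph outputs raw logits and leaves this step to the caller is an assumption, and the example logits are made up.

```python
import math

LABEL_MAPS = {
    "relation_to_previous": ["new", "follow_up", "correction",
                             "confirmation", "cancellation", "closure"],
    "actionability": ["none", "review", "act"],
    "retention": ["ephemeral", "useful", "remember"],
    "urgency": ["low", "medium", "high"],
}
TEMPERATURES = {
    "relation_to_previous": 1.333521,
    "actionability": 2.113489,
    "retention": 2.238721,
    "urgency": 2.660725,
}


def calibrated_softmax(logits, temperature):
    """Softmax over logits divided by a temperature (T > 1 softens the distribution)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def decode_head(head, logits):
    """Return the top label and its calibrated confidence for one head."""
    probs = calibrated_softmax(logits, TEMPERATURES[head])
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABEL_MAPS[head][best], "confidence": probs[best]}


# Hypothetical raw logits for the urgency head (low, medium, high).
print(decode_head("urgency", [0.2, 3.1, 1.0]))
```

All four temperatures are above 1, so scaling lowers the top-class probability relative to an unscaled softmax. The predicted label is unchanged because dividing by a positive constant preserves the ordering of the logits.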

onnx/special_tokens_map.json ADDED

	@@ -0,0 +1,51 @@

+{
+  "bos_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

onnx/spm.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
+size 2464616

onnx/temperature_scaling.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "method": "per_head_temperature_scaling",
+  "source_split": "validation",
+  "per_head": {
+    "relation_to_previous": 1.333521,
+    "actionability": 2.113489,
+    "retention": 2.238721,
+    "urgency": 2.660725
+  }
+}

onnx/tiny_router.int8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:772b84fa513ff63a8406ecfa031a9f61ff4ac69a4cfa259c91089a3b1133bf62
+size 171758190

onnx/tiny_router.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:817549baf4f6c0531e3230124dcde592d1b8129b439d3a51fcc9f0f50c94f488
+size 565782364

onnx/tokenizer_config.json ADDED

	@@ -0,0 +1,59 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128000": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "[CLS]",
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": false,
+  "eos_token": "[SEP]",
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "sp_model_kwargs": {},
+  "split_by_punct": false,
+  "tokenizer_class": "DebertaV2Tokenizer",
+  "unk_token": "[UNK]",
+  "vocab_type": "spm"
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,15 @@

+{
+  "bos_token": "[CLS]",
+  "cls_token": "[CLS]",
+  "eos_token": "[SEP]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

spm.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
+size 2464616

temperature_scaling.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "method": "per_head_temperature_scaling",
+  "source_split": "validation",
+  "per_head": {
+    "relation_to_previous": 1.333521,
+    "actionability": 2.113489,
+    "retention": 2.238721,
+    "urgency": 2.660725
+  }
+}

tokenizer_config.json ADDED

	@@ -0,0 +1,59 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128000": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "[CLS]",
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "[CLS]",
+  "do_lower_case": false,
+  "eos_token": "[SEP]",
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "sp_model_kwargs": {},
+  "split_by_punct": false,
+  "tokenizer_class": "DebertaV2Tokenizer",
+  "unk_token": "[UNK]",
+  "vocab_type": "spm"
+}

training_args.json ADDED

	@@ -0,0 +1,26 @@

+{
+  "train_file": "data/synthetic/train.jsonl",
+  "validation_file": "data/synthetic/validation.jsonl",
+  "test_file": null,
+  "output_dir": "artifacts/tiny-router",
+  "encoder_name": "microsoft/deberta-v3-small",
+  "device": "auto",
+  "feature_mode": "full_interaction",
+  "pooling_type": "attention",
+  "use_head_dependencies": true,
+  "dependency_hidden_dim": 32,
+  "max_length": 128,
+  "recency_max": 3600,
+  "batch_size": 32,
+  "epochs": 20,
+  "encoder_lr": 2e-05,
+  "head_lr": 0.0001,
+  "weight_decay": 0.01,
+  "warmup_ratio": 0.1,
+  "dropout": 0.1,
+  "seed": 13,
+  "patience": 2,
+  "mixed_precision": false,
+  "confidence_threshold": 0.8,
+  "head_loss_weights": "{}"
+}
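
The `confidence_threshold` of 0.8 recorded here pairs naturally with the README's "conservative automation with review fallback" use case. The sketch below shows one way a downstream system might gate automation on that threshold. The `route` function and its policy (requiring both `overall_confidence` and the `actionability` confidence to clear the threshold) are hypothetical and not taken from the project.

```python
CONFIDENCE_THRESHOLD = 0.8  # "confidence_threshold" in training_args.json


def route(prediction, threshold=CONFIDENCE_THRESHOLD):
    """Return 'automate', 'review', or 'ignore' for one prediction.

    ASSUMPTION: gating on overall_confidence and the actionability head is an
    illustrative policy, not the project's documented behavior.
    """
    action = prediction["actionability"]
    confident = (prediction["overall_confidence"] >= threshold
                 and action["confidence"] >= threshold)
    if not confident:
        return "review"  # low confidence always falls back to a human
    if action["label"] == "act":
        return "automate"
    if action["label"] == "review":
        return "review"
    return "ignore"


# The prediction from the README's "Example inference" section.
example = {
    "relation_to_previous": {"label": "correction", "confidence": 0.94},
    "actionability": {"label": "act", "confidence": 0.97},
    "retention": {"label": "useful", "confidence": 0.76},
    "urgency": {"label": "medium", "confidence": 0.81},
    "overall_confidence": 0.87,
}
print(route(example))  # → automate
```

Given the README's reported ECE of 0.3440, a production policy would likely need a stricter threshold or additional guardrails than this sketch applies.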