Token Importance Scoring (TIS) v8b - Hard Anchor Final
This checkpoint contains the Token Importance Scoring (TIS) components trained with hard-anchor forcing and tuned stability weights for publication results.
Model Description
This is the publication checkpoint for TIS v8b, featuring tuned hard-anchor forcing and a multi-objective loss. Its strongest result is at a low cache budget: 82% accuracy on the NIAH Hard benchmark with 25% of the KV cache retained.
Key Features:
- 82% NIAH accuracy at 25% cache budget (100% evidence survival)
- Hard-anchor forcing value: 1.0 (deterministic query/evidence preservation)
- Multi-objective loss: ranking + retrieval + stability
- Includes benchmark results (NIAH Hard)
Performance
NIAH Hard Benchmark Results
| Cache Budget | Accuracy | Evidence Survival |
|---|---|---|
| 10% | 8.0% | 19.3% |
| 25% | 82.0% | 100.0% |
| 50% | 68.0% | - |
| 75% | 40.0% | 69.8% |
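The card does not define how evidence survival is measured. A minimal sketch, assuming it is the share of evidence token positions still present in the cache after eviction (the function name and this definition are illustrative assumptions, not taken from the repository):

```python
def evidence_survival(kept_positions, evidence_positions):
    """Fraction of evidence token positions that remain in the compressed cache.

    Assumed definition for illustration only; the repository may compute it differently.
    """
    evidence = set(evidence_positions)
    if not evidence:
        raise ValueError("evidence_positions must not be empty")
    return len(evidence & set(kept_positions)) / len(evidence)


# Both evidence tokens (positions 2 and 7) survive eviction.
full = evidence_survival(kept_positions=[0, 2, 5, 7], evidence_positions=[2, 7])
# Only one of the two evidence tokens survives.
half = evidence_survival(kept_positions=[0, 1, 5, 7], evidence_positions=[2, 7])
print(full, half)
```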
Comparison vs Heuristic (H2O-style):
- At 25% budget: 82% vs 14% (5.9x improvement)
- At 50% budget: 68% vs 28% (2.4x improvement)
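The improvement factors are the ratio of TIS accuracy to heuristic accuracy at the same budget. A quick check using only the figures quoted above:

```python
# Accuracy (%) per cache budget, copied from the comparison above.
tis_accuracy = {0.25: 82.0, 0.50: 68.0}
heuristic_accuracy = {0.25: 14.0, 0.50: 28.0}


def improvement_factor(tis: float, baseline: float) -> float:
    """Return how many times higher the TIS accuracy is than the baseline."""
    return tis / baseline


factors = {
    budget: round(improvement_factor(tis_accuracy[budget], heuristic_accuracy[budget]), 1)
    for budget in tis_accuracy
}
for budget, factor in factors.items():
    print(f"{budget:.0%} budget: {factor}x")
```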
Training Configuration
Base Model: mistralai/Mistral-7B-v0.3
Approach: V8 Hard Anchor Restored
Training Steps: 2,000
Loss Weights:
- Ranking Loss: 1.0
- Retrieval Loss: 2.0
- Stability Loss: 0.5
Hyperparameters:
- Learning rate: 1e-3
- Gradient accumulation: 4
- Hard negative margin: 0.5
- Training budgets: [0.25, 0.5, 0.75]
- Hard-anchor force value: 1.0
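The card does not spell out how the hard negative margin enters the ranking loss. One common formulation is a pairwise hinge, sketched here as an assumption rather than as the repository's implementation; `margin_ranking_loss` is a hypothetical name:

```python
def margin_ranking_loss(pos_score: float, neg_score: float, margin: float = 0.5) -> float:
    """Pairwise hinge loss (assumed form, for illustration).

    The loss is zero once the positive (evidence) token outscores the
    hard negative by at least `margin`, and grows linearly otherwise.
    """
    return max(0.0, margin - (pos_score - neg_score))


# Positive leads by 0.75, which exceeds the 0.5 margin: no penalty.
satisfied = margin_ranking_loss(pos_score=1.0, neg_score=0.25)
# Positive leads by only 0.25: penalised by the 0.25 shortfall.
violated = margin_ranking_loss(pos_score=0.75, neg_score=0.5)
print(satisfied, violated)
```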
Model Architecture
This checkpoint contains:
- ImportanceUpdateHead: Hard-anchor forcing importance predictor
- Importance Embedding: Token-level importance embeddings
Components:
{
'importance_embedding': dict, # Token importance embeddings
'importance_head': dict, # Hard-anchor predictor (7 keys)
}
Note: This checkpoint has 7 keys in importance_head (vs 6 in stage3), indicating additional hard-anchor forcing components.
Usage
Installation
git clone https://github.com/nitroxido/token-importance-scoring
cd token-importance-scoring
python -m venv .venv
source .venv/bin/activate
pip install -e .
Load Checkpoint
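import torch
from token_importance.model.importance_head import ImportanceUpdateHead

# Use the GPU when one is available; otherwise fall back to CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Load TIS components (a dict holding one state dict per component)
checkpoint = torch.load('tis_components.pt', map_location=device)
# Extract the state dict of each component
importance_head_state = checkpoint['importance_head']
importance_embedding_state = checkpoint['importance_embedding']
print(f"Importance head keys: {list(importance_head_state.keys())}")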
Evaluate on NIAH Hard Benchmark
python scripts/eval_niah_hard.py \
--model oldman-dev/tis-v8b-hard-anchor \
--baseline tis \
--cache_budgets 0.25 0.5 0.75 \
--n_samples 50 \
--output results/niah_hard_eval.csv
Training Details
Training Objective:
Loss = ranking_loss + 2.0 * retrieval_loss + 0.5 * stability_loss
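The objective above is a plain weighted sum of the three terms. A minimal sketch, where `LOSS_WEIGHTS` and `total_loss` are illustrative names rather than identifiers from the repository:

```python
# Weights as listed in the training configuration.
LOSS_WEIGHTS = {"ranking": 1.0, "retrieval": 2.0, "stability": 0.5}


def total_loss(ranking_loss, retrieval_loss, stability_loss, weights=LOSS_WEIGHTS):
    """Combine the three loss terms into the scalar training objective."""
    return (
        weights["ranking"] * ranking_loss
        + weights["retrieval"] * retrieval_loss
        + weights["stability"] * stability_loss
    )


# With every term equal to 1.0 the total is the sum of the weights.
print(total_loss(1.0, 1.0, 1.0))
```

Because the function uses only arithmetic operators, it should work unchanged on scalar tensors as well as Python floats.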
Hard-Anchor Forcing:
- Forces importance scores of 1.0 for query tokens and evidence tokens
- Ensures deterministic preservation of critical information
- Prevents catastrophic forgetting in low-budget scenarios
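To illustrate why forcing matters at small budgets, here is a minimal sketch of anchoring followed by top-k selection. It assumes predicted scores lie below 1.0, so a forced score of 1.0 always ranks first; `apply_hard_anchor`, `keep_indices`, and the example scores are hypothetical and not taken from the repository.

```python
def apply_hard_anchor(scores, anchor_mask, force_value=1.0):
    """Overwrite the score of every anchored (query or evidence) token."""
    return [force_value if is_anchor else score
            for score, is_anchor in zip(scores, anchor_mask)]


def keep_indices(scores, budget):
    """Return the sorted positions of the top `budget` fraction of tokens."""
    n_keep = max(1, int(len(scores) * budget))
    # Rank by descending score; break ties by position for determinism.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:n_keep])


# Eight tokens; positions 2 and 7 are anchors with low predicted scores.
scores = [0.10, 0.80, 0.05, 0.30, 0.60, 0.20, 0.70, 0.15]
anchor_mask = [False, False, True, False, False, False, False, True]

kept_without = keep_indices(scores, budget=0.25)
kept_with = keep_indices(apply_hard_anchor(scores, anchor_mask), budget=0.25)
print(kept_without)  # anchors are evicted
print(kept_with)     # anchors are retained
```

At a 25% budget only two of the eight tokens fit in the cache. Without forcing, the two highest predicted scores win and both anchors are dropped; with forcing, the anchors take both slots. If the number of anchors exceeded the budget, some would still be evicted.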
Training Data: Synthetic retrieval tasks with hard negatives
Intended Use
Primary Use Cases:
- Low-budget KV cache compression (25-50% budgets)
- Retrieval-focused applications requiring evidence preservation
- Scenarios where query context must be preserved
Best Performance:
- 25% cache budget: 82% NIAH accuracy with perfect evidence survival
- Ideal for extreme memory-constrained deployments
Limitations:
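- Accuracy collapses at very low budgets: 8.0% at a 10% budget, with only 19.3% evidence survival
- Accuracy falls again at higher budgets (68% at 50%, 40% at 75%) and trails tis-stage3-ert at both 25% and 50%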
- Optimized for retrieval tasks, not general generation
Comparison with Other Checkpoints
| Checkpoint | NIAH @ 25% | NIAH @ 50% | Best Use Case |
|---|---|---|---|
| tis-v8b-hard-anchor (this) | 82% | 68% | Low-budget retrieval |
| tis-stage3-ert | 98% | 100% | General retrieval + LITM |
| tis-stage1-oracle | 100% | 100% | Oracle baseline |
Citation
If you use this checkpoint, please cite:
@software{token_importance_scoring_2026,
title={Token Importance Scoring: Learned KV Cache Compression for Long-Context LLMs},
author={Token Importance Scoring Contributors},
year={2026},
url={https://github.com/nitroxido/token-importance-scoring}
}
License
MIT License - See LICENSE
Acknowledgments
Training compute sponsored by GPU-Action.
More Information
- Repository: https://github.com/nitroxido/token-importance-scoring
- Documentation: See REPOSITORY-OVERVIEW.md and REPRODUCIBILITY-GUIDE.md in the repository
- Related Checkpoints:
  - tis-stage3-ert - Main ERT checkpoint (100% NIAH @ 50%)
  - tis-stage1-oracle - Oracle-labeled baseline