Token Importance Scoring (TIS) v8b - Hard Anchor Final

This checkpoint contains the Token Importance Scoring (TIS) components trained with hard-anchor forcing and tuned stability weights for publication results.

Model Description

This is the publication checkpoint for TIS v8b, featuring optimized hard-anchor forcing and multi-objective loss tuning. It is strongest at a 25% cache budget, where it reaches 82% accuracy on the NIAH (needle-in-a-haystack) Hard benchmark with 100% evidence survival.

Key Features:

  • ✅ 82% NIAH accuracy at 25% cache budget (100% evidence survival)
  • ✅ Hard-anchor forcing value: 1.0 (deterministic query/evidence preservation)
  • ✅ Multi-objective loss: ranking + retrieval + stability
  • ✅ Includes benchmark results (NIAH Hard)

Performance

NIAH Hard Benchmark Results

Cache Budget    Accuracy    Evidence Survival
10%             8.0%        19.3%
25%             82.0%       100.0%
50%             68.0%       -
75%             40.0%       69.8%

Comparison vs Heuristic (H2O-style):

  • At 25% budget: 82% vs 14% (5.9x improvement)
  • At 50% budget: 68% vs 28% (2.4x improvement)
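
The improvement factors above follow directly from the reported accuracies. A quick check in Python, using only the numbers listed in this card (the helper name `improvement_factor` is illustrative and not part of the repository):

```python
def improvement_factor(tis_accuracy: float, baseline_accuracy: float) -> float:
    """Ratio of TIS accuracy to the heuristic baseline accuracy."""
    return tis_accuracy / baseline_accuracy


# Accuracies (in percent) reported above: (TIS, H2O-style heuristic) per budget.
reported = {0.25: (82.0, 14.0), 0.50: (68.0, 28.0)}

factors = {
    budget: round(improvement_factor(tis, baseline), 1)
    for budget, (tis, baseline) in reported.items()
}

for budget, factor in factors.items():
    print(f"budget={budget:.0%}: {factor}x improvement")
```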

Training Configuration

Base Model: mistralai/Mistral-7B-v0.3
Approach: V8 Hard Anchor Restored
Training Steps: 2,000

Loss Weights:

  • Ranking Loss: 1.0
  • Retrieval Loss: 2.0
  • Stability Loss: 0.5

Hyperparameters:

  • Learning rate: 1e-3
  • Gradient accumulation: 4
  • Hard negative margin: 0.5
  • Training budgets: [0.25, 0.5, 0.75]
  • Hard-anchor force value: 1.0
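
This card does not spell out how the hard negative margin enters the ranking loss. One common reading is a hinge-style margin penalty, sketched below as an assumption rather than as the repository's actual loss; the function name and the example scores are hypothetical.

```python
def margin_ranking_penalty(positive_score: float,
                           negative_score: float,
                           margin: float = 0.5) -> float:
    """Hinge penalty: zero once the positive outscores the negative by `margin`.

    Assumption: a standard margin ranking formulation, not confirmed by the card.
    """
    return max(0.0, margin - (positive_score - negative_score))


# Evidence token scored 0.9, hard negative scored 0.6: the gap (0.3) is below
# the margin (0.5), so the pair is penalized.
penalty_violated = margin_ranking_penalty(0.9, 0.6)

# Evidence token scored 0.9, hard negative scored 0.2: the gap (0.7) exceeds
# the margin, so the pair contributes nothing.
penalty_satisfied = margin_ranking_penalty(0.9, 0.2)

print(penalty_violated, penalty_satisfied)
```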

Model Architecture

This checkpoint contains:

  • ImportanceUpdateHead: Hard-anchor forcing importance predictor
  • Importance Embedding: Token-level importance embeddings

Components:

{
  'importance_embedding': dict,  # Token importance embeddings
  'importance_head': dict,       # Hard-anchor predictor (7 keys)
}

Note: This checkpoint has 7 keys in importance_head (vs 6 in stage3), indicating additional hard-anchor forcing components.
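
A small layout check can catch loading the wrong checkpoint variant early. The sketch below relies only on the structure described above (two top-level entries, 7 keys in importance_head); the helper name and the placeholder key names are hypothetical, since the real key names are not listed in this card.

```python
EXPECTED_TOP_LEVEL_KEYS = {"importance_embedding", "importance_head"}
EXPECTED_HEAD_KEY_COUNT = 7  # per the note above; stage3 checkpoints have 6


def check_tis_checkpoint(checkpoint: dict) -> list:
    """Return a list of layout problems; an empty list means the layout matches."""
    problems = []
    for key in sorted(EXPECTED_TOP_LEVEL_KEYS - set(checkpoint)):
        problems.append(f"missing top-level entry: {key}")
    head = checkpoint.get("importance_head")
    if isinstance(head, dict) and len(head) != EXPECTED_HEAD_KEY_COUNT:
        problems.append(
            f"importance_head has {len(head)} keys, "
            f"expected {EXPECTED_HEAD_KEY_COUNT}"
        )
    return problems


# Stand-in dictionaries with placeholder key names; real state dicts hold tensors.
fake_v8b = {
    "importance_embedding": {"weight": None},
    "importance_head": {f"param_{i}": None for i in range(7)},
}
fake_stage3 = {
    "importance_embedding": {"weight": None},
    "importance_head": {f"param_{i}": None for i in range(6)},
}

print(check_tis_checkpoint(fake_v8b))
print(check_tis_checkpoint(fake_stage3))
```

In practice the dictionary returned by torch.load would be passed in place of the stand-ins.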

Usage

Installation

git clone https://github.com/nitroxido/token-importance-scoring
cd token-importance-scoring
python -m venv .venv
source .venv/bin/activate
pip install -e .

Load Checkpoint

import torch

# Module class that the 'importance_head' state dict belongs to
from token_importance.model.importance_head import ImportanceUpdateHead

# Load TIS components; mapping to CPU lets this run on machines without a GPU
checkpoint = torch.load('tis_components.pt', map_location='cpu')

# Extract the two state dicts stored in the checkpoint
importance_head_state = checkpoint['importance_head']
importance_embedding_state = checkpoint['importance_embedding']

print(f"Importance head keys: {list(importance_head_state.keys())}")

Evaluate on NIAH Hard Benchmark

python scripts/eval_niah_hard.py \
  --model oldman-dev/tis-v8b-hard-anchor \
  --baseline tis \
  --cache_budgets 0.25 0.5 0.75 \
  --n_samples 50 \
  --output results/niah_hard_eval.csv

Training Details

Training Objective:

Loss = ranking_loss + 2.0 * retrieval_loss + 0.5 * stability_loss
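
The objective above is a plain weighted sum. A minimal sketch of it follows, with hypothetical names (`LOSS_WEIGHTS`, `total_loss`) and made-up loss values for illustration; it is not the repository's training code.

```python
# Weights taken from the Loss Weights section of this card.
LOSS_WEIGHTS = {"ranking": 1.0, "retrieval": 2.0, "stability": 0.5}


def total_loss(losses: dict, weights: dict = LOSS_WEIGHTS):
    """Weighted sum of the loss terms; works for floats or framework tensors."""
    return sum(weights[name] * losses[name] for name in weights)


# Illustrative per-term values, not measured ones.
example_losses = {"ranking": 0.3, "retrieval": 0.2, "stability": 0.4}
combined = total_loss(example_losses)  # 1.0*0.3 + 2.0*0.2 + 0.5*0.4

print(combined)
```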

Hard-Anchor Forcing:

  • Sets the importance score of query tokens and evidence tokens to 1.0
  • Makes preservation of these tokens deterministic rather than dependent on the learned score
  • Prevents catastrophic loss of critical context in low-budget scenarios
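
The effect of forcing can be illustrated with a toy example. The sketch below assumes that learned scores lie below 1.0, that the cache keeps the top-scoring fraction of tokens, and that the number of kept tokens is rounded down; none of these details are confirmed by this card, and the function names are hypothetical.

```python
def force_anchors(scores, anchor_mask, force_value=1.0):
    """Overwrite the score of every anchor (query/evidence) token with force_value."""
    return [force_value if is_anchor else score
            for score, is_anchor in zip(scores, anchor_mask)]


def keep_indices(scores, budget):
    """Indices of the top `budget` fraction of tokens by score, in original order."""
    n_keep = max(1, int(len(scores) * budget))  # assumption: round down, keep >= 1
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Eight tokens with made-up learned scores; tokens 2 and 5 are anchors that the
# learned scores alone would rank low.
scores = [0.10, 0.80, 0.05, 0.30, 0.60, 0.20, 0.70, 0.15]
anchors = [False, False, True, False, False, True, False, False]

kept_without_forcing = keep_indices(scores, budget=0.25)
kept_with_forcing = keep_indices(force_anchors(scores, anchors), budget=0.25)

print(kept_without_forcing, kept_with_forcing)
```

In this sketch, if the anchors outnumber the slots available at a given budget, some of them are necessarily dropped, because forcing changes only the ranking and not the budget.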

Training Data: Synthetic retrieval tasks with hard negatives

Intended Use

Primary Use Cases:

  • Low-budget KV cache compression (25-50% budgets)
  • Retrieval-focused applications requiring evidence preservation
  • Scenarios where query context must be preserved

Best Performance:

  • 25% cache budget: 82% NIAH accuracy with perfect evidence survival
  • Suited to memory-constrained deployments that run at around a 25% budget

Limitations:

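  • Accuracy collapses at a 10% budget (8.0%, with 19.3% evidence survival), which lies outside the training budgets
  • Accuracy drops as the budget grows beyond 25% (68% at 50%, 40% at 75%) and trails tis-stage3-ert at both budgets in the comparison below (98% at 25%, 100% at 50%)
  • Optimized for retrieval tasks, not general generation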

Comparison with Other Checkpoints

Checkpoint                    NIAH @ 25%    NIAH @ 50%    Best Use Case
tis-v8b-hard-anchor (this)    82%           68%           Low-budget retrieval
tis-stage3-ert                98%           100%          General retrieval + LITM
tis-stage1-oracle             100%          100%          Oracle baseline

Citation

If you use this checkpoint, please cite:

@software{token_importance_scoring_2026,
  title={Token Importance Scoring: Learned KV Cache Compression for Long-Context LLMs},
  author={Token Importance Scoring Contributors},
  year={2026},
  url={https://github.com/nitroxido/token-importance-scoring}
}

License

MIT License - See LICENSE

Acknowledgments

Training compute sponsored by GPU-Action.
