ContextCrumb-32M

ContextCrumb-32M is a 32M-parameter token-classification model for deletion-only context compression. It predicts whether each input token should be kept or deleted, so that text can be shortened before it is sent to LLMs or agents.

This repository is private while packaging and documentation are being stabilized.

Labels

  • DELETE
  • KEEP

Usage

Recommended usage is through the contextcrumb Python package:

from contextcrumb import ContextCompressor

compressor = ContextCompressor()
result = compressor.compress(
    "ContextCrumb deletes low-value words while preserving useful context."
)
print(result.text)

The package loads the ONNX artifacts in onnx/ by default, so users do not need PyTorch or Transformers for normal inference. The original model.safetensors checkpoint remains available for Torch/Transformers workflows.

Golden adaptive cutoff mode is the default:

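# Reuses the compressor created above; text is any input string.
text = "ContextCrumb deletes low-value words while preserving useful context."
result = compressor.compress(text)
print(result.text)
print(result.stats["golden_cutoff"])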

Golden mode keeps at least one third of word-like tokens by default, so an extreme probability gap does not delete nearly all context. To keep less than that, set target_keep_ratio to an explicit fixed budget.
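
The keep floor described above can be illustrated with a small sketch. This is not the package's implementation: the function name select_keep_indices, its parameters, and the fallback rule (keep the highest-probability tokens, rounding the floor up) are assumptions made for illustration, and the way golden mode actually picks its cutoff is not shown here.

```python
import math
from fractions import Fraction


def select_keep_indices(keep_probs, cutoff, min_keep_ratio=Fraction(1, 3)):
    """Return indices of tokens to keep, enforcing a minimum keep ratio.

    Hypothetical helper: keep_probs holds one keep-probability per
    word-like token, and cutoff is whatever threshold was chosen upstream.
    """
    n = len(keep_probs)
    # First pass: keep every token at or above the cutoff.
    kept = [i for i, p in enumerate(keep_probs) if p >= cutoff]
    # Floor: at least min_keep_ratio of the tokens, rounded up (assumption).
    floor = math.ceil(n * min_keep_ratio)
    if len(kept) < floor:
        # Too few survivors: fall back to the top-`floor` tokens by probability.
        ranked = sorted(range(n), key=lambda i: keep_probs[i], reverse=True)
        kept = sorted(ranked[:floor])  # restore original token order
    return kept


probs = [0.9, 0.1, 0.2, 0.05, 0.3, 0.15]
print(select_keep_indices(probs, cutoff=0.5))
```

With a cutoff of 0.5 only one of the six tokens passes, so the floor of two tokens applies and the second most probable token is kept as well.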

Raw Transformers loading also works:

from transformers import AutoModelForTokenClassification, AutoTokenizer

model_id = "ymao20/contextcrumb-32m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
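
Once the model is loaded this way, turning its predictions into shortened text takes a little post-processing. The following is a minimal sketch, assuming that model.config.id2label maps class ids to the label strings KEEP and DELETE listed above and that the tokenizer is a fast tokenizer that can return character offsets. It uses a plain argmax over the two labels, not the package's golden cutoff mode, and the helper names rebuild_text and compress_with_transformers are hypothetical.

```python
def rebuild_text(text, offsets, labels, keep_label="KEEP"):
    """Keep only the characters covered by tokens labeled keep_label."""
    kept = [False] * len(text)
    for (start, end), label in zip(offsets, labels):
        if label == keep_label:
            # Special tokens usually have empty (0, 0) spans and mark nothing.
            for i in range(start, end):
                kept[i] = True
    # Deleted characters become spaces, then runs of whitespace are collapsed.
    chars = [ch if keep else " " for ch, keep in zip(text, kept)]
    return " ".join("".join(chars).split())


def compress_with_transformers(text, model_id="ymao20/contextcrumb-32m"):
    # Imported lazily so rebuild_text can be used without PyTorch installed.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    enc = tokenizer(text, return_tensors="pt", return_offsets_mapping=True)
    # The model does not accept offsets, so remove them before the forward pass.
    offsets = enc.pop("offset_mapping")[0].tolist()
    with torch.no_grad():
        logits = model(**enc).logits[0]
    # Assumption: id2label holds the strings "KEEP" and "DELETE".
    labels = [model.config.id2label[i] for i in logits.argmax(dim=-1).tolist()]
    return rebuild_text(text, offsets, labels)
```

Calling compress_with_transformers("some long prompt") would return the input with DELETE-labeled tokens removed. Inputs longer than the model's maximum sequence length would need to be split into chunks first, which this sketch does not handle.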

Intended Use

Use this model for experimental context compression, prompt shortening, and agent memory preprocessing. Review outputs before using it in high-stakes settings because deletion can remove important nuance.

Base Model

Fine-tuned from jhu-clsp/ettin-encoder-32m.
