ContextCrumb-32M
ContextCrumb-32M is a 32M-parameter token-classification model for deletion-only context compression. It predicts whether each input token should be kept or deleted, so text can be shortened before it is sent to LLMs or agents.
This repository is private while packaging and documentation are being stabilized.
Labels
- DELETE
- KEEP
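As a rough illustration of what deletion-only compression means, the sketch below applies one label per token and keeps only the tokens marked KEEP, in their original order. The function name `compress_with_labels` is hypothetical, and the labels in the example are hand-written for illustration, not model output.

```python
def compress_with_labels(tokens, labels):
    """Return the kept tokens joined by spaces.

    Deletion-only: tokens are never reordered or rewritten, so the
    output is always a subsequence of the input.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    kept = [tok for tok, label in zip(tokens, labels) if label == "KEEP"]
    return " ".join(kept)


# Hand-written labels, used only to show the mechanics.
tokens = ["Please", "note", "that", "the", "server", "restarts", "at", "noon"]
labels = ["DELETE", "DELETE", "DELETE", "KEEP", "KEEP", "KEEP", "KEEP", "KEEP"]
print(compress_with_labels(tokens, labels))
```

Because nothing is paraphrased, every word in the compressed text can be traced back to the input.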
Usage
Recommended usage is through the contextcrumb Python package:
from contextcrumb import ContextCompressor
compressor = ContextCompressor()
result = compressor.compress(
"ContextCrumb deletes low-value words while preserving useful context."
)
print(result.text)
The package loads the ONNX artifacts in onnx/ by default, so users do not need PyTorch or Transformers for normal inference. The original model.safetensors checkpoint remains available for Torch/Transformers workflows.
Golden adaptive cutoff mode is the default:
result = compressor.compress(text)
print(result.text)
print(result.stats["golden_cutoff"])
Golden mode keeps at least one third of word-like tokens by default, so an extreme probability gap cannot delete nearly all of the context. To compress below that floor, use target_keep_ratio to set an explicit fixed budget.
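The one-third floor can be pictured with the minimal sketch below. It is not the package's implementation: the function `keep_mask_with_floor`, its parameters, and the idea of topping up with the highest-probability tokens are assumptions made for illustration. How golden mode actually chooses its cutoff is not described here, so the cutoff is simply passed in. The sketch also treats every token as word-like.

```python
import math


def keep_mask_with_floor(keep_probs, cutoff, min_keep_ratio=1 / 3):
    """Keep tokens whose keep-probability is >= cutoff, but never
    fewer than ceil(len(keep_probs) * min_keep_ratio) tokens."""
    n = len(keep_probs)
    mask = [p >= cutoff for p in keep_probs]
    min_keep = math.ceil(n * min_keep_ratio)
    if sum(mask) < min_keep:
        # Assumed top-up rule: add the highest-probability tokens
        # until the floor is met.
        ranked = sorted(range(n), key=lambda i: keep_probs[i], reverse=True)
        for i in ranked[:min_keep]:
            mask[i] = True
    return mask


# One confident token and a large gap: the cutoff alone would keep 1 of 5.
probs = [0.9, 0.05, 0.04, 0.03, 0.02]
print(keep_mask_with_floor(probs, cutoff=0.5))
```

With five tokens the floor is two, so the second-highest token is kept even though it falls below the cutoff.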
Raw Transformers loading also works:
from transformers import AutoModelForTokenClassification, AutoTokenizer
model_id = "ymao20/contextcrumb-32m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
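Once the model and tokenizer are loaded, a compression pass could look roughly like the sketch below. It assumes the checkpoint's `id2label` uses the label string "KEEP" from the Labels section, that a fast tokenizer is available so character offsets can be returned, and that a plain argmax over the logits is an acceptable decision rule. The helper names `predict_keep_flags` and `rebuild_text` are hypothetical, and this path does not reproduce the golden cutoff logic of the contextcrumb package.

```python
def predict_keep_flags(text, model_id="ymao20/contextcrumb-32m"):
    """Run the token classifier and return (offsets, keep_flags)."""
    # Imported lazily so rebuild_text below works without PyTorch installed.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    enc = tokenizer(
        text, return_offsets_mapping=True, return_tensors="pt", truncation=True
    )
    offsets = enc.pop("offset_mapping")[0].tolist()
    with torch.no_grad():
        logits = model(**enc).logits[0]
    pred_ids = logits.argmax(dim=-1).tolist()
    # Assumption: the config maps one class id to the string "KEEP".
    keep_flags = [model.config.id2label[i] == "KEEP" for i in pred_ids]
    return offsets, keep_flags


def rebuild_text(text, offsets, keep_flags):
    """Blank out every character not covered by a kept token, then
    collapse runs of whitespace. Special tokens have empty (0, 0)
    offsets and therefore contribute nothing."""
    chars = [" "] * len(text)
    for (start, end), keep in zip(offsets, keep_flags):
        if keep:
            chars[start:end] = text[start:end]
    return " ".join("".join(chars).split())


# Usage (requires access to the repository):
# text = "ContextCrumb deletes low-value words while preserving useful context."
# offsets, flags = predict_keep_flags(text)
# print(rebuild_text(text, offsets, flags))
```

Working from character offsets rather than token strings avoids having to undo subword markers, at the cost of occasionally leaving a word fragment when only part of a word is kept.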
Intended Use
Use this model for experimental context compression, prompt shortening, and agent memory preprocessing. Review outputs before using it in high-stakes settings because deletion can remove important nuance.
Base Model
Fine-tuned from jhu-clsp/ettin-encoder-32m.