Context Sphere Locator

This repository contains the trained Context Locator checkpoint used by the Context Sphere artifact.

The Locator is the learned retrieval/perception component of Context Sphere. It scores issue-repository evidence pairs and provides the initial relevance signals used to form a repository centroid before AST-based neighborhood expansion.

Files

context_sphere_v3_best.pt: best validation checkpoint from the cloud training run.
context_sphere_v3_final.pt: final checkpoint from the same run.
context_sphere_v3_cloud_training_report.json: training report with data counts, hyperparameters, validation loss, and recall-at-5.
context_sphere_v3_micro_selector.pt: small local micro-training checkpoint retained for reproducibility of early selector experiments.
context_sphere_v3_neural_microtrain.json: report for the local micro-training checkpoint.
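
The two JSON reports can be inspected without knowing their exact layout. The following is a minimal sketch, assuming the report has been downloaded to models/ as described under Usage; the helper name summarize_report is hypothetical, and the code deliberately makes no assumption about which keys the report contains.

```python
import json
from pathlib import Path


def summarize_report(path):
    """Return {top-level key: type name} for a JSON report, without assuming a schema."""
    with Path(path).open(encoding="utf-8") as fh:
        report = json.load(fh)
    if not isinstance(report, dict):
        # The root may be a list or scalar; report its type instead of failing.
        return {"<root>": type(report).__name__}
    return {key: type(value).__name__ for key, value in report.items()}


if __name__ == "__main__":
    # Assumed location: the local_dir used in the download step under Usage.
    report_path = Path("models/context_sphere_v3_cloud_training_report.json")
    if report_path.exists():
        for key, kind in summarize_report(report_path).items():
            print(f"{key}: {kind}")
    else:
        print(f"{report_path} not found; download it first")
```

Listing keys first is a cautious way to find where the data counts, hyperparameters, and validation metrics live before writing code that depends on specific field names.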

Training Summary

The cloud training run used SWE-bench-derived issue/file supervision with 15,206 training items and 3,802 validation items. Gold-patch contents were not used as model inputs; patch-derived information was used only to construct touched-file labels. The best run completed five epochs on CUDA and reported a best validation loss of 0.8114 with validation recall-at-5 of 0.7850.
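
To make the labeling and the metric concrete, here is a small sketch of how touched-file labels could be derived from a patch and how recall-at-5 could be computed against a ranking. It is an illustration under stated assumptions, not the artifact's training code: the helper names touched_files and recall_at_k are hypothetical, the patch is assumed to be a git-style unified diff with no spaces in paths, and recall-at-k is taken here to mean the fraction of touched files found in the top k candidates. The report may use a different definition, such as whether any touched file appears in the top k.

```python
import re

# Matches git-style diff headers such as "diff --git a/pkg/x.py b/pkg/x.py".
# Assumption: paths contain no whitespace.
_DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def touched_files(patch_text):
    """Return the set of file paths named in the diff headers of a patch.

    Only the headers are read; the changed lines themselves are ignored,
    so patch contents never become part of the label.
    """
    return {match.group(2) for match in _DIFF_HEADER.finditer(patch_text)}


def recall_at_k(ranked_files, gold_files, k=5):
    """Fraction of gold files that appear among the top-k ranked candidates."""
    if not gold_files:
        raise ValueError("gold_files must not be empty")
    top_k = set(ranked_files[:k])
    return len(top_k & set(gold_files)) / len(gold_files)


# Toy patch touching two files (made-up paths, for illustration only).
patch = """\
diff --git a/pkg/fields.py b/pkg/fields.py
--- a/pkg/fields.py
+++ b/pkg/fields.py
@@ -1 +1 @@
-old
+new
diff --git a/pkg/state.py b/pkg/state.py
--- a/pkg/state.py
+++ b/pkg/state.py
@@ -1 +1 @@
-old
+new
"""

gold = touched_files(patch)
ranking = ["pkg/state.py", "pkg/a.py", "pkg/b.py", "pkg/c.py", "pkg/d.py", "pkg/fields.py"]

print(sorted(gold))                     # ['pkg/fields.py', 'pkg/state.py']
print(recall_at_k(ranking, gold, k=5))  # 0.5: only pkg/state.py is in the top 5
```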

Usage

This is a custom PyTorch checkpoint for the Context Sphere codebase rather than a standalone Transformers model. Loading code and evaluation scripts are available in the companion artifact repository:

https://github.com/johnZYW/context-sphere

Install the artifact package dependencies, then download this checkpoint into the default path expected by scripts/inference.py:

python - <<'PY'
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Zywdd/context-sphere-locator",
    repo_type="model",
    local_dir="models",
    allow_patterns=[
        "context_sphere_v3_best.pt",
        "context_sphere_v3_cloud_training_report.json",
    ],
)
PY

Run selector inference with:

find /path/to/target/repo -type f -name "*.py" > /tmp/context_sphere_candidate_files.txt

python scripts/inference.py \
  --checkpoint models/context_sphere_v3_best.pt \
  --problem-statement "Django crashes when resolving a model field during migration rendering" \
  --candidate-files /tmp/context_sphere_candidate_files.txt \
  --out outputs/locator_smoke.json
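
Where find is unavailable, the candidate list can be built in Python instead. This is a minimal sketch, assuming scripts/inference.py only needs one file path per line, as the find command above produces; the helper name write_candidate_list is hypothetical and not part of the artifact.

```python
from pathlib import Path


def write_candidate_list(repo_root, out_path, pattern="*.py"):
    """Write one matching file path per line and return how many were written."""
    root = Path(repo_root)
    # rglob walks the tree recursively; is_file() skips directories whose
    # names happen to match the pattern.
    paths = sorted(str(p) for p in root.rglob(pattern) if p.is_file())
    Path(out_path).write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return len(paths)
```

For example, write_candidate_list("/path/to/target/repo", "/tmp/context_sphere_candidate_files.txt") writes the same kind of list as the find command, except that the paths are sorted.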

For full benchmark reproduction, the same checkpoint is used by scripts/run_benchmarks.py through the default scripts/inference.py configuration.

Citation

@misc{zhang2026contextsphere,
  title        = {Context Sphere: Topology-Aware Context Orchestration for Cost-Efficient LLM Repository Repair},
  author       = {Zhang, Yuwen},
  year         = {2026},
  howpublished = {arXiv preprint and artifact release}
}
