Context Sphere Locator
This repository contains the trained Context Locator checkpoint used by the Context Sphere artifact.
The Locator is the learned retrieval and perception component of Context Sphere. It scores issue-repository evidence pairs, and those scores serve as the initial relevance signals from which a repository centroid is formed before AST-based neighborhood expansion.
Files
context_sphere_v3_best.pt: best validation checkpoint from the cloud training run.
context_sphere_v3_final.pt: final checkpoint from the same run.
context_sphere_v3_cloud_training_report.json: training report with data counts, hyperparameters, validation loss, and recall-at-5.
context_sphere_v3_micro_selector.pt: small local micro-training checkpoint retained for reproducibility of early selector experiments.
context_sphere_v3_neural_microtrain.json: report for the local micro-training checkpoint.
Training Summary
The cloud training run used SWE-bench-derived issue/file supervision with
15,206 training items and 3,802 validation items. Gold-patch contents were not
used as model inputs; patch-derived information was used only to construct
touched-file labels. The best run completed five epochs on CUDA and reported a
best validation loss of 0.8114 with validation recall-at-5 of 0.7850.
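For readers unfamiliar with the metric, recall-at-5 can be illustrated with a short sketch. The definition used here (the fraction of an issue's touched files that appear among its five highest-scored candidates) is an assumption; the training report may compute the metric differently, for example as a per-issue hit rate. The function name and the sample scores are hypothetical and are not taken from the artifact.

```python
from typing import Dict, Set


def recall_at_k(scores: Dict[str, float], gold: Set[str], k: int = 5) -> float:
    """Fraction of gold files found among the k highest-scored candidates.

    Assumed definition for illustration; not necessarily the one used
    in the training report.
    """
    if not gold:
        return 0.0
    # Rank candidate paths by descending score and keep the first k.
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return len(gold.intersection(top_k)) / len(gold)


# Hypothetical scores for one issue: six candidate files, two touched files.
example_scores = {
    "a.py": 0.9, "b.py": 0.8, "c.py": 0.7,
    "d.py": 0.6, "e.py": 0.5, "f.py": 0.4,
}
example_gold = {"a.py", "f.py"}

# a.py is ranked in the top five, f.py is ranked sixth.
example_recall = recall_at_k(example_scores, example_gold, k=5)
print(example_recall)
```

Averaging this per-issue value over all validation items would give a single dataset-level figure comparable in form to the one reported above.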
Usage
This is a custom PyTorch checkpoint for the Context Sphere codebase rather than a standalone Transformers model. Loading code and evaluation scripts are available in the companion artifact repository:
https://github.com/johnZYW/context-sphere
Install the artifact package dependencies, then download this checkpoint into
the default path expected by scripts/inference.py:
python - <<'PY'
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Zywdd/context-sphere-locator",
    repo_type="model",
    local_dir="models",
    allow_patterns=[
        "context_sphere_v3_best.pt",
        "context_sphere_v3_cloud_training_report.json",
    ],
)
PY
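After the download, the training report can be inspected without assuming anything about its schema. The sketch below lists only the top-level keys and their value types; the helper name summarize_report is hypothetical, and the path simply follows the local_dir used in the download step.

```python
import json
from pathlib import Path


def summarize_report(path):
    """Return a mapping of top-level key to the type name of its value."""
    with Path(path).open(encoding="utf-8") as fh:
        report = json.load(fh)
    if not isinstance(report, dict):
        # The report layout is not documented here, so handle non-object roots.
        return {"<root>": type(report).__name__}
    return {key: type(value).__name__ for key, value in report.items()}


if __name__ == "__main__":
    report_path = Path("models/context_sphere_v3_cloud_training_report.json")
    if report_path.exists():
        for key, kind in summarize_report(report_path).items():
            print(f"{key}: {kind}")
    else:
        print(f"{report_path} not found; run the download step first")
```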
Run selector inference with:
find /path/to/target/repo -name "*.py" > /tmp/context_sphere_candidate_files.txt
python scripts/inference.py \
  --checkpoint models/context_sphere_v3_best.pt \
  --problem-statement "Django crashes when resolving a model field during migration rendering" \
  --candidate-files /tmp/context_sphere_candidate_files.txt \
  --out outputs/locator_smoke.json
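On systems where find is unavailable, the candidate list can be built in Python instead. This is a minimal sketch that assumes scripts/inference.py reads one path per line, as the find output above implies; the helper name write_candidate_files is hypothetical and not part of the artifact.

```python
from pathlib import Path


def write_candidate_files(repo_root, out_path, pattern="*.py"):
    """Write one matching file path per line and return how many were written."""
    root = Path(repo_root)
    # rglob walks the tree recursively; is_file() skips directories whose
    # names happen to match the pattern.
    paths = sorted(str(p) for p in root.rglob(pattern) if p.is_file())
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return len(paths)


# Example call (paths are placeholders):
# write_candidate_files("/path/to/target/repo",
#                       "/tmp/context_sphere_candidate_files.txt")
```

Sorting the paths keeps the candidate file stable across runs, which makes repeated inference outputs easier to compare.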
For full benchmark reproduction, the same checkpoint is used by
scripts/run_benchmarks.py through the default scripts/inference.py
configuration.
Citation
@misc{zhang2026contextsphere,
  title = {Context Sphere: Topology-Aware Context Orchestration for Cost-Efficient LLM Repository Repair},
  author = {Zhang, Yuwen},
  year = {2026},
  howpublished = {arXiv preprint and artifact release}
}