Detect-crime Miner: Recipe to Beat the King
Target element: manak0/Detect-crime on subnet 423 (open-source / public track).
Read ANALYSIS.md first: it documents the king's model (the manak0 baseline) and where the gap lives.
Current king of record (2026-05-04 leaderboard): hotkey 5CSeBY…tv9f, score 0.576.
Crime is uncontested: there is no Detect-crime-winner HF repo, and the king's score
(0.576) sits close to the published baseline's overall_iou (0.597). Anybody who lands a
modest improvement takes the throne.
Layout
crime_miner/
├── ANALYSIS.md            ← analysis of the king + scoring + constraints
├── README.md              ← this file
├── miner.py               ← deployable inference (multi-scale TTA + WBF + CLAHE)
├── chute_config.yml       ← chute resource spec (16 GB GPU, matches king's)
├── class_names.txt        ← target class order (DO NOT REORDER)
└── training/
    ├── DATASET.md         ← dataset sources + pipeline (start here)
    ├── build_dataset.py   ← end-to-end builder: manako + Roboflow + COCO bat
    ├── poll_manako.py     ← background poller for in-domain frames + king's preds
    ├── train.py           ← two-stage YOLOv11 training (silver → clean fine-tune)
    ├── verify_dataset.py  ← QA over assembled YOLO dirs
    ├── export_onnx.py     ← export with NMS baked in -> [1, 300, 6]
    └── requirements.txt
What the miner does differently
miner.py keeps the king's I/O contract (single weights.onnx → TVFrameResult) but adds
six concrete improvements over the auto-generated subnet_bridge template the king ships:
- Letterboxed input at 1280 instead of stretch-resized 640. Small objects (balaclava ~30 px, glove ~25 px, spray paint can ~20 px) survive, whereas the king's stretch resize destroys them. This alone lifts recall on the four catastrophic classes.
- Per-class confidence floors. King uses one global 0.25 across all six classes; we set balaclava=0.05, bat=0.10, glove=0.05, graffiti=0.20, hoodie=0.20, spray paint=0.10. Synthetic-benchmark recalls were 0.034 / 0.143 / 0.064 / 0.321 / 0.274 / 0.161, so the bottleneck is recall, and the FFPI cap has plenty of headroom (~6.5 preds/img today).
- Multi-scale TTA at {1280, 1536} × {orig, hflip} = 4 forward passes, collapsed to 2 when the ONNX export is static-shape. Pro_6000 has the budget (latency p95 = 10 s).
- Weighted Box Fusion across TTA streams. WBF averages cluster boxes weighted by score, which yields tighter localizations than always picking the highest-confidence proposal, and tighter boxes mean more cases cross the IoU≥0.5 bar that the scorer uses.
- CLAHE on dark frames only (luma gate). Crime CCTV is night-heavy. King applies no preprocessing.
- Class-aware NMS at IoU=0.45. King uses class-agnostic NMS, which suppresses balaclava-on-hoodie or glove-near-bat overlaps. Class-aware keeps both.
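The luma gate from the CLAHE item above can be sketched in a few lines. This is a minimal illustration, not the code in miner.py: it assumes the gate compares mean Rec. 601 luma on a 0-255 scale against CLAHE_DARK_THRESHOLD (default 70 in the tuning table), and the helper names mean_luma and should_apply_clahe are hypothetical.

```python
# Sketch of a luma gate; helper names are hypothetical, not taken from miner.py.
CLAHE_DARK_THRESHOLD = 70  # assumed to be mean luma on a 0-255 scale


def mean_luma(rgb_pixels):
    """Return the mean Rec. 601 luma of an iterable of (R, G, B) tuples."""
    total, count = 0.0, 0
    for r, g, b in rgb_pixels:
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    if count == 0:
        raise ValueError("frame has no pixels")
    return total / count


def should_apply_clahe(rgb_pixels, threshold=CLAHE_DARK_THRESHOLD):
    """Gate: enhance only frames whose mean luma falls below the threshold."""
    return mean_luma(rgb_pixels) < threshold


night_frame = [(12, 14, 20)] * 16    # toy dark frame
day_frame = [(180, 175, 160)] * 16   # toy bright frame
print(should_apply_clahe(night_frame), should_apply_clahe(day_frame))
```

In a real pipeline the enhancement itself would typically be OpenCV's cv2.createCLAHE applied to a luminance channel; the gate only decides whether to call it, which is why bright frames pass through untouched.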
Total ONNX inference cost on Pro_6000 with YOLOv11s + 2-scale TTA is well under 1 s/frame.
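The fusion step across TTA streams can be illustrated with a simplified, single-class Weighted Box Fusion. This is a sketch under stated assumptions rather than the implementation in miner.py: boxes are (x1, y1, x2, y2, score) tuples already mapped back to original-image coordinates, the fused score is a plain mean of member scores, and the per-stream count rescaling used by full WBF implementations is omitted. All function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _fuse(members):
    """Score-weighted average of member coordinates (equal weights if all scores are 0)."""
    total = sum(m[4] for m in members)
    if total <= 0:
        return tuple(sum(m[i] for m in members) / len(members) for i in range(4))
    return tuple(sum(m[i] * m[4] for m in members) / total for i in range(4))


def weighted_box_fusion(dets, iou_thr=0.55):
    """Fuse same-class detections pooled from several TTA streams."""
    clusters = []  # each cluster: {"members": [...], "box": fused coordinates}
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        best, best_iou = None, iou_thr
        for cluster in clusters:
            overlap = iou(det, cluster["box"])
            if overlap > best_iou:
                best, best_iou = cluster, overlap
        if best is None:
            clusters.append({"members": [det], "box": tuple(det[:4])})
        else:
            best["members"].append(det)
            best["box"] = _fuse(best["members"])
    fused = []
    for cluster in clusters:
        scores = [m[4] for m in cluster["members"]]
        fused.append((*cluster["box"], sum(scores) / len(scores)))
    return fused


# Two overlapping proposals of one object plus one distant box.
pooled = [(0, 0, 10, 10, 0.9), (2, 0, 12, 10, 0.3), (50, 50, 60, 60, 0.5)]
for box in weighted_box_fusion(pooled):
    print([round(v, 3) for v in box])
```

Unlike hard NMS, which would keep only the 0.9 box, the fused box is pulled slightly toward the lower-scored proposal in proportion to its score.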
How to deploy
You need: a weights.onnx exported in [1, 300, 6] layout (NMS baked in), produced by
training/export_onnx.py after training. Alternatively, ship the king's raw ONNX directly to
test the inference improvements alone.
Option 1: drop-in test with the king's weights
Sanity-check that the inference improvements alone help, before training:
cp /root/turbovision_crime/king_models/Detect-crime/weights.onnx ./weights.onnx
python miner.py # smoke test on /tmp/crime_proof.png
Expected: with the king's weights but our miner.py, you should already see a noticeable lift on the rare classes, driven mainly by the lower per-class conf floors. The 1280 letterboxed input only applies when the ONNX accepts dynamic shapes; the king's published ONNX is static at 640×640, so the dynamic letterbox path won't help unless you re-export (see Option 2).
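The letterbox path mentioned above comes down to one scale factor and two padding offsets. The sketch below is a hedged illustration of that geometry, not the code in miner.py: it assumes a square canvas with centered padding, and letterbox_params and to_original are hypothetical helper names.

```python
def letterbox_params(src_w, src_h, dst=1280):
    """Scale and padding that fit a src_w x src_h frame into a dst x dst canvas."""
    scale = min(dst / src_w, dst / src_h)  # one factor for both axes keeps aspect ratio
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (dst - new_w) / 2, (dst - new_h) / 2  # centered padding (assumed)
    return scale, pad_x, pad_y


def to_original(box, scale, pad_x, pad_y, src_w, src_h):
    """Map an (x1, y1, x2, y2) box from canvas coordinates back to the source frame."""
    def clamp(value, upper):
        return min(max(value, 0.0), float(upper))

    x1, y1, x2, y2 = box
    return (
        clamp((x1 - pad_x) / scale, src_w),
        clamp((y1 - pad_y) / scale, src_h),
        clamp((x2 - pad_x) / scale, src_w),
        clamp((y2 - pad_y) / scale, src_h),
    )


# Assumed 1920x1080 source frame, letterboxed into a 1280x1280 canvas.
scale, pad_x, pad_y = letterbox_params(1920, 1080, dst=1280)
print(round(scale, 4), pad_x, pad_y)
print(to_original((0, 280, 1280, 1000), scale, pad_x, pad_y, 1920, 1080))
```

For a 1920×1080 source (an assumed frame size), the letterbox scale at 1280 is about 0.67, while a stretch to 640 shrinks the width by a factor of 0.33, so an object that is 30 px wide in the source keeps roughly 20 px instead of 10 px.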
Option 2: train a real beating model
See training/DATASET.md for full data-pipeline notes. Quick path:
cd training
pip install -r requirements.txt
# 1) Start the manako poller in the background to accumulate in-domain frames
#    (each rotation surfaces a fresh challenge ~every few minutes during active scoring).
python poll_manako.py --out ../manako_pool --interval 120 --forever &
# 2) Build the silver dataset. Combine manako frames (king-labeled), Roboflow
#    per-class detection sets, and (optional) COCO baseball bat. Roboflow needs
#    ROBOFLOW_API_KEY in env.
python build_dataset.py \
    --out ../data \
    --king-onnx /root/turbovision_crime/king_models/Detect-crime/weights.onnx \
    --manako --manako-polls 30 --manako-poll-delay 120 \
    --roboflow balaclava=brainster/balaclava-detection-v3 \
    --roboflow glove=ppe-detection/gloves-v1 \
    --roboflow graffiti=graffiti-detection/graffiti-v3 \
    --roboflow "spray paint=tools/spray-paint-can-v1" \
    --coco-bat /path/to/coco/instances_train2017.json /path/to/coco/train2017 \
    --extra-dir ../manako_pool/images \
    --min-conf 0.10 --keep-empty --intra-threads 16
# 3) Verify the assembled dataset
python verify_dataset.py --data ../data/data.yaml --visualize 20
# 4) Stage A: silver pretrain
python train.py --data ../data/data.yaml --weights yolo11s.pt \
    --imgsz 1280 --batch 16 --stage A --epochs 200 --name crime_a
# 5) Build a clean set: hand-verify (or LLM-verify) ~300 manako frames into
#    ../data_clean/data.yaml with the same YOLO layout.
# 6) Stage B: clean fine-tune
python train.py --data ../data_clean/data.yaml \
    --weights ../runs/detect/crime_a/weights/best.pt \
    --imgsz 1280 --batch 16 --stage B --epochs 50 --name crime_b
# 7) Export with NMS baked in -> [1, 300, 6]
python export_onnx.py --weights ../runs/detect/crime_b/weights/best.pt \
    --imgsz 1280 --out ../weights.onnx
Option 3: deploy via the turbovision CLI
cd /root/turbovision_crime
sv -vv deploy-os-miner --model-path scratch/crime_miner --element-id manak0/Detect-crime
The CLI uploads miner.py, weights.onnx, class_names.txt, chute_config.yml to your
HF repo, builds the chute, and commits the on-chain pointer.
Tuning knobs (top of miner.py)
| Constant | Default | Effect of raising | Effect of lowering |
|---|---|---|---|
| PER_CLASS_CONF[0] (balaclava) | 0.05 | fewer FPs (good for FFPI) | more recall (better AP, better IoU) |
| PER_CLASS_CONF[2] (glove) | 0.05 | as above | as above |
| PER_CLASS_CONF[4] (hoodie) | 0.20 | fewer hoodie FPs | more boxes (may hurt precision) |
| TTA_SIZES | (1280, 1536) | better small-object recall | faster inference |
| WBF_IOU | 0.55 | more conservative fusion (tighter clusters) | more aggressive fusion (looser clusters) |
| NMS_IOU | 0.45 | keeps more near-duplicates | stricter dedup |
| MAX_DET | 100 | more boxes survive ranking | tighter cap |
| CLAHE_DARK_THRESHOLD | 70 | CLAHE on more frames | only the very dark ones |
When tuning, validate against runs/detect/crime_b/val_batch*.jpg and the manako latest
challenge image; don't hill-climb on the synthetic benchmark alone (it's only 50 frames).
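How PER_CLASS_CONF and MAX_DET interact can be shown with a short filter. This is a sketch, not the code in miner.py: it assumes each detection row is (x1, y1, x2, y2, score, class_id), that class ids follow the order in which the floors are listed in this README, and that PER_CLASS_CONF is indexable by class id. The function name apply_conf_floors is hypothetical.

```python
# Assumed class order, taken from the per-class floors listed in this README.
CLASS_NAMES = ["balaclava", "bat", "glove", "graffiti", "hoodie", "spray paint"]
PER_CLASS_CONF = [0.05, 0.10, 0.05, 0.20, 0.20, 0.10]
MAX_DET = 100


def apply_conf_floors(dets, floors=PER_CLASS_CONF, max_det=MAX_DET):
    """Drop rows below their class floor, then keep the top max_det by score."""
    kept = []
    for det in dets:
        class_id = int(det[5])
        if 0 <= class_id < len(floors) and det[4] >= floors[class_id]:
            kept.append(det)
    kept.sort(key=lambda d: d[4], reverse=True)  # rank before applying the cap
    return kept[:max_det]


raw = [
    (0, 0, 10, 10, 0.06, 0),  # balaclava above its 0.05 floor: kept
    (0, 0, 10, 10, 0.15, 4),  # hoodie below its 0.20 floor: dropped
    (0, 0, 10, 10, 0.30, 4),  # hoodie above its floor: kept
    (0, 0, 10, 10, 0.04, 2),  # glove below its 0.05 floor: dropped
]
for det in apply_conf_floors(raw):
    print(CLASS_NAMES[det[5]], det[4])
```

With a single global 0.25 threshold only the 0.30 hoodie row would survive, which is the recall loss on rare classes that the per-class floors are meant to avoid.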
Why these specific choices
- The IoU pillar dominates the live score (dashboard 0.576 ≈ baseline overall_iou 0.597). IoU is the label-agnostic AUC-F1 placement metric: what matters most is whether any well-placed box exists for each GT. So the optimal strategy is to flood predictions for the rare classes; the FFPI cap (10 FP/image, currently ~6.5 preds/img baseline) gives generous headroom.
- mAP@50 matters too because secondary pillars are likely weighted in. mAP@50 is per-class-averaged with strict label match. Raising recall on the four near-zero classes even modestly (0.03 → 0.20 on balaclava) lifts the per-class mean by ~0.03 alone.
- WBF over hard NMS: tighter localizations → more boxes clearing the IoU≥0.5 bar.
- Class-aware NMS: balaclava overlaps with hoodie geometry; bat overlaps with glove on a held bat. Class-agnostic NMS would silently kill one of each pair.
- CLAHE only on dark frames: applying CLAHE to bright frames hurts hoodie/graffiti texture. Luma gate keeps it surgical.
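The difference between class-aware and class-agnostic NMS described in the list above is easy to see on a toy pair of overlapping boxes. The following is a minimal sketch, not the code in miner.py: rows are assumed to be (x1, y1, x2, y2, score, class_id), the class ids 4 (hoodie) and 0 (balaclava) follow the order used in this README, and the box coordinates are made up for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr=0.45, class_aware=True):
    """Greedy NMS; with class_aware=True a box can only suppress its own class."""
    kept = []
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        suppressed = any(
            (not class_aware or k[5] == det[5]) and iou(det, k) > iou_thr
            for k in kept
        )
        if not suppressed:
            kept.append(det)
    return kept


# Toy overlap: a hoodie box and a lower-scored balaclava box with IoU 0.6.
dets = [(0, 0, 100, 100, 0.90, 4), (0, 0, 100, 60, 0.40, 0)]
print(len(nms(dets, class_aware=True)))   # both classes survive
print(len(nms(dets, class_aware=False)))  # the balaclava box is suppressed
```

The only change between the two modes is the class check inside the suppression test, so switching to class-aware NMS adds no extra cost.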
Verifying you're actually beating the king
Before committing on-chain:
- Pull the latest annotated challenge image+predictions:
  curl -sL "https://console.scorevision.io/api/v2/elements/manak0%2FDetect-crime?lookback_days=7" \
    | jq '.latestAnnotatedChallenge'
- Run your miner.py on that image; visually verify your boxes ≥ king's, especially on balaclava, glove, and spray paint.
- Run sv -vv run-once (per MINER.md) to score yourself end-to-end on a real challenge without committing; this confirms the chute deploys correctly and your output format matches.
- Only after the offline score is repeatedly above 0.62 (the king + a comfortable margin) should you deploy and commit.
Open questions / pending work
- Live pillar weights for Detect-crime: confirm by reading the active manifest with sv -vv elements list once .env is configured. The recipe above assumes IoU-dominated scoring; if mAP/precision/recall pillars are weighted higher, the per-class confidence floors should be raised (less recall, more precision).
- Real GT vs SAM3 PGT: confirm whether elements[].ground_truth = true in the live manifest. If real GT (Manako-internal), the synthetic_fixed dataset on HF is the closest proxy and we should overfit it carefully. If SAM3 PGT, the live targets are whatever SAM3 detects when prompted with the 6 class names, which is slightly fuzzier.
- Manako data pull: poll_manako.py is built but untested for Detect-crime. The endpoint shape is the same as petrol-station's; if Manako gates the API for low-traffic elements, fall back to using the king's ONNX as the silver labeler over Roboflow data.