Detect-crime Miner: Recipe to Beat the King

Target element: manak0/Detect-crime on subnet 423 (open-source / public track).

Read ANALYSIS.md first: it documents the king's model (the manak0 baseline) and where the gap lives.

Current king of record (2026-05-04 leaderboard): hotkey 5CSeBY…tv9f, score 0.576. Crime is uncontested: there is no Detect-crime-winner HF repo, and the king's score is close to the published baseline's overall_iou (0.597, about 0.02 higher). Anybody who lands a modest improvement takes the throne.

Layout

crime_miner/
├── ANALYSIS.md            ← analysis of the king + scoring + constraints
├── README.md              ← this file
├── miner.py               ← deployable inference (multi-scale TTA + WBF + CLAHE)
├── chute_config.yml       ← chute resource spec (16 GB GPU, matches king's)
├── class_names.txt        ← target class order: DO NOT REORDER
└── training/
    ├── DATASET.md         ← dataset sources + pipeline (start here)
    ├── build_dataset.py   ← end-to-end builder: manako + Roboflow + COCO bat
    ├── poll_manako.py     ← background poller for in-domain frames + king's preds
    ├── train.py           ← two-stage YOLOv11 training (silver → clean fine-tune)
    ├── verify_dataset.py  ← QA over assembled YOLO dirs
    ├── export_onnx.py     ← export with NMS baked in -> [1, 300, 6]
    └── requirements.txt

What the miner does differently

miner.py keeps the king's I/O contract (single weights.onnx → TVFrameResult) but adds six concrete improvements over the auto-generated subnet_bridge template the king ships:

  1. Letterboxed input at 1280 instead of stretch-resized 640. Small objects (balaclava ~30 px, glove ~25 px, spray paint can ~20 px) survive; the king's stretch resize destroys them. This alone lifts recall on the four catastrophic classes.
  2. Per-class confidence floors. King uses one global 0.25 across all six classes; we set balaclava=0.05, bat=0.10, glove=0.05, graffiti=0.20, hoodie=0.20, spray paint=0.10. Synthetic-benchmark recalls were 0.034 / 0.143 / 0.064 / 0.321 / 0.274 / 0.161, so the bottleneck is recall, and the FFPI cap has plenty of headroom (~6.5 preds/img today).
  3. Multi-scale TTA at {1280, 1536} × {orig, hflip} = 4 forward passes, collapsed to 2 when the ONNX export is static-shape. Pro_6000 has the budget (latency p95 = 10 s).
  4. Weighted Box Fusion across TTA streams. WBF averages cluster boxes weighted by score, which yields tighter localizations than always picking the highest-confidence proposal, and tighter boxes mean more cases cross the IoU≥0.5 bar that the scorer uses.
  5. CLAHE on dark frames only (luma gate). Crime CCTV is night-heavy. King applies no preprocessing.
  6. Class-aware NMS at IoU=0.45. King uses class-agnostic NMS, which suppresses balaclava-on-hoodie or glove-near-bat overlaps. Class-aware keeps both.
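
To make item 1 concrete, here is a minimal sketch of the letterbox geometry in plain Python. It is not the code in miner.py; the function names and the 1920×1080 source frame used in the note below are assumptions for illustration only.

```python
def letterbox_params(src_w, src_h, dst=1280):
    """Return (scale, pad_x, pad_y) for an aspect-preserving fit into a dst x dst square."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # The resized image is centred; the remaining border is padding.
    pad_x = (dst - new_w) / 2
    pad_y = (dst - new_h) / 2
    return scale, pad_x, pad_y


def to_original(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from letterboxed coordinates back to source pixels."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)


def stretched_size(obj_w, obj_h, src_w, src_h, dst=640):
    """Size of an object after a non-uniform stretch of the frame to dst x dst."""
    return obj_w * dst / src_w, obj_h * dst / src_h


if __name__ == "__main__":
    scale, pad_x, pad_y = letterbox_params(1920, 1080, 1280)
    print("letterbox scale:", scale, "padding:", pad_x, pad_y)
    print("30 px object after stretch to 640:", stretched_size(30, 30, 1920, 1080))
```

Under the assumed 1920×1080 frame, a 30 px object stays square at about 20 px after letterboxing to 1280, while a stretch to 640 squeezes it to roughly 10 × 18 px. Predictions made on the letterboxed image have to be passed through to_original before they are returned.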

Total ONNX inference cost on Pro_6000 with YOLOv11s + 2-scale TTA is well under 1 s/frame.
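
The fusion step from item 4 can be sketched as follows. This is a simplified, dependency-free illustration of Weighted Box Fusion, not the implementation in miner.py: the function names, the (box, score, class_id) tuple layout, and the same-class-only clustering are assumptions. The final score rescaling by min(n, n_streams) / n_streams follows the usual WBF convention of down-weighting boxes that only some streams agree on.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(members):
    """Score-weighted average of the boxes in one cluster."""
    total = sum(score for _, score, _ in members)
    return tuple(sum(box[i] * score for box, score, _ in members) / total
                 for i in range(4))


def weighted_box_fusion(detections, n_streams, iou_thr=0.55):
    """detections: (box, score, class_id) tuples pooled from all TTA streams."""
    clusters = []  # each: {"cls": class id, "members": [...], "box": fused box}
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        box, _, cls = det
        best, best_iou = None, iou_thr
        for cluster in clusters:
            if cluster["cls"] != cls:
                continue  # only fuse boxes of the same class
            overlap = iou(box, cluster["box"])
            if overlap > best_iou:
                best, best_iou = cluster, overlap
        if best is None:
            clusters.append({"cls": cls, "members": [det], "box": box})
        else:
            best["members"].append(det)
            best["box"] = fuse(best["members"])

    fused = []
    for cluster in clusters:
        n = len(cluster["members"])
        mean_score = sum(score for _, score, _ in cluster["members"]) / n
        # Down-weight clusters that fewer than n_streams streams contributed to.
        fused.append((cluster["box"],
                      mean_score * min(n, n_streams) / n_streams,
                      cluster["cls"]))
    return fused


if __name__ == "__main__":
    pooled = [
        ((100, 100, 200, 200), 0.9, 0),  # stream 1
        ((110, 100, 210, 200), 0.3, 0),  # stream 2, same object
        ((100, 100, 200, 200), 0.5, 1),  # different class, kept separate
    ]
    for box, score, cls in weighted_box_fusion(pooled, n_streams=2):
        print(cls, [round(v, 1) for v in box], round(score, 3))
```

Note the difference from NMS: the low-confidence box is not discarded but pulls the fused box slightly toward itself, in proportion to its score.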

How to deploy

You need a weights.onnx exported in [1, 300, 6] layout (NMS baked in). It is produced by training/export_onnx.py after training, or you can ship the king's raw ONNX directly to test the inference improvements alone.

Option 1: drop-in test with the king's weights

Sanity-check that the inference improvements alone help, before training:

cp /root/turbovision_crime/king_models/Detect-crime/weights.onnx ./weights.onnx
python miner.py   # smoke test on /tmp/crime_proof.png

Expected: with the king's weights but our miner.py, you should already see a noticeable lift on the rare classes, driven mainly by the lower per-class confidence floors. The 1280 letterbox path only applies when the ONNX accepts dynamic input shapes. The king's published ONNX is static at 640×640, so that path won't help unless you re-export (see the export step in Option 2).
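
Whether a given weights.onnx is static or dynamic can be checked before running the smoke test. The sketch below assumes onnxruntime is installed and that the model takes a single NCHW image input; the helper names are hypothetical and not part of miner.py.

```python
def is_dynamic(shape):
    """True if the spatial dims (H, W) of an NCHW input shape are not fixed integers.

    onnxruntime reports dynamic dimensions as strings (symbolic names) or None.
    """
    return any(not isinstance(dim, int) for dim in shape[2:])


def describe_model(path):
    """Report input/output shapes of an ONNX file (requires onnxruntime)."""
    import onnxruntime as ort  # imported lazily so is_dynamic works without it

    session = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    inp = session.get_inputs()[0]
    out = session.get_outputs()[0]
    return {"input": inp.shape, "output": out.shape, "dynamic": is_dynamic(inp.shape)}


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(describe_model(sys.argv[1]))
    else:
        print(is_dynamic([1, 3, 640, 640]))          # static export
        print(is_dynamic([1, 3, "height", "width"]))  # dynamic export
```

If the report shows a fixed 640×640 input, only the post-processing changes (confidence floors, NMS, fusion) are being exercised in the drop-in test.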

Option 2: train a real beating model

See training/DATASET.md for full data-pipeline notes. Quick path:

cd training
pip install -r requirements.txt

# 1) Start the manako poller in the background to accumulate in-domain frames
#    (each rotation surfaces a fresh challenge ~every few minutes during active scoring).
python poll_manako.py --out ../manako_pool --interval 120 --forever &

# 2) Build the silver dataset. Combine manako frames (king-labeled), Roboflow
#    per-class detection sets, and (optional) COCO baseball bat. Roboflow needs
#    ROBOFLOW_API_KEY in env.
python build_dataset.py \
    --out ../data \
    --king-onnx /root/turbovision_crime/king_models/Detect-crime/weights.onnx \
    --manako --manako-polls 30 --manako-poll-delay 120 \
    --roboflow balaclava=brainster/balaclava-detection-v3 \
    --roboflow glove=ppe-detection/gloves-v1 \
    --roboflow graffiti=graffiti-detection/graffiti-v3 \
    --roboflow "spray paint=tools/spray-paint-can-v1" \
    --coco-bat /path/to/coco/instances_train2017.json /path/to/coco/train2017 \
    --extra-dir ../manako_pool/images \
    --min-conf 0.10 --keep-empty --intra-threads 16

# 3) Verify the assembled dataset
python verify_dataset.py --data ../data/data.yaml --visualize 20

# 4) Stage A: silver pretrain
python train.py --data ../data/data.yaml --weights yolo11s.pt \
                --imgsz 1280 --batch 16 --stage A --epochs 200 --name crime_a

# 5) Build a clean set: hand-verify (or LLM-verify) ~300 manako frames into
#    ../data_clean/data.yaml with the same YOLO layout.

# 6) Stage B: clean fine-tune
python train.py --data ../data_clean/data.yaml \
                --weights ../runs/detect/crime_a/weights/best.pt \
                --imgsz 1280 --batch 16 --stage B --epochs 50 --name crime_b

# 7) Export with NMS baked in -> [1, 300, 6]
python export_onnx.py --weights ../runs/detect/crime_b/weights/best.pt \
                      --imgsz 1280 --out ../weights.onnx

Option 3: deploy via the turbovision CLI

cd /root/turbovision_crime
sv -vv deploy-os-miner --model-path scratch/crime_miner --element-id manak0/Detect-crime

The CLI uploads miner.py, weights.onnx, class_names.txt, and chute_config.yml to your HF repo, builds the chute, and commits the on-chain pointer.

Tuning knobs (top of miner.py)

| Constant | Default | Effect of raising | Effect of lowering |
|---|---|---|---|
| PER_CLASS_CONF[0] (balaclava) | 0.05 | fewer FPs (good for FFPI) | more recall (better AP, better IoU) |
| PER_CLASS_CONF[2] (glove) | 0.05 | as above | as above |
| PER_CLASS_CONF[4] (hoodie) | 0.20 | fewer hoodie FPs | more boxes (may hurt precision) |
| TTA_SIZES | (1280, 1536) | better small-object recall | faster inference |
| WBF_IOU | 0.55 | more conservative fusion (tighter clusters) | more aggressive fusion (more boxes merged per cluster) |
| NMS_IOU | 0.45 | keeps more near-duplicates | stricter dedup |
| MAX_DET | 100 | more boxes survive ranking | tighter cap |
| CLAHE_DARK_THRESHOLD | 70 | CLAHE on more frames | only the very dark ones |

When tuning, validate against runs/detect/crime_b/val_batch*.jpg and the manako latest challenge image. Don't hill-climb on the synthetic benchmark alone (it's only 50 frames).
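
As an illustration of how the PER_CLASS_CONF and MAX_DET knobs interact, here is a minimal sketch of the filtering step. It assumes each output row is laid out as [x1, y1, x2, y2, score, class_id] and that class indices follow the order balaclava, bat, glove, graffiti, hoodie, spray paint; both are assumptions, so check class_names.txt and the actual export before reusing it. The function name is hypothetical.

```python
# Floors taken from the list of improvements; index order is an assumption.
PER_CLASS_CONF = {0: 0.05, 1: 0.10, 2: 0.05, 3: 0.20, 4: 0.20, 5: 0.10}
GLOBAL_FALLBACK = 0.25  # the king's single global threshold, used for unknown ids
MAX_DET = 100


def filter_by_class_conf(rows, floors=PER_CLASS_CONF, max_det=MAX_DET):
    """Keep rows whose score clears their class floor, best-scoring first."""
    kept = [r for r in rows if r[4] >= floors.get(int(r[5]), GLOBAL_FALLBACK)]
    kept.sort(key=lambda r: r[4], reverse=True)
    return kept[:max_det]


if __name__ == "__main__":
    rows = [
        [10, 10, 40, 40, 0.07, 0],     # balaclava, clears 0.05
        [50, 50, 300, 400, 0.15, 4],   # hoodie, below 0.20
        [60, 60, 90, 90, 0.30, 2],     # glove, clears 0.05
    ]
    for row in filter_by_class_conf(rows):
        print(row)
```

With a single global 0.25 threshold only the glove row would survive; with per-class floors the low-confidence balaclava is kept while the weak hoodie is still dropped.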

Why these specific choices

  • The IoU pillar dominates the live score (dashboard 0.576 ≈ baseline overall_iou 0.597). IoU is the label-agnostic AUC-F1 placement metric: what matters most is whether any well-placed box exists for each GT. So the optimal strategy is to flood predictions for the rare classes; the FFPI cap (10 FP/image, currently ~6.5 preds/img baseline) gives generous headroom.
  • mAP@50 matters too because secondary pillars are likely weighted in. mAP@50 is per-class-averaged with strict label match. Raising recall on the four near-zero classes even modestly helps: if balaclava AP tracks its recall from 0.03 to 0.20, that class alone lifts the six-class mean by about 0.17 / 6 ≈ 0.03.
  • WBF over hard NMS: tighter localizations → more boxes clearing the IoU≥0.5 bar.
  • Class-aware NMS: balaclava overlaps with hoodie geometry; bat overlaps with glove on a held bat. Class-agnostic NMS would silently kill one of each pair.
  • CLAHE only on dark frames: applying CLAHE to bright frames hurts hoodie/graffiti texture. Luma gate keeps it surgical.
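
The class-aware versus class-agnostic distinction above can be shown with a small greedy NMS sketch. It is a plain-Python illustration, not the routine in miner.py; the function names and the (box, score, class_id) layout are assumptions.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_thr=0.45, class_aware=True):
    """Greedy NMS over (box, score, class_id) tuples.

    With class_aware=True a box is only suppressed by a higher-scoring box
    of the same class; with False, any overlapping box suppresses it.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        box, _, cls = det
        suppressed = any(
            iou(box, k[0]) > iou_thr and (not class_aware or k[2] == cls)
            for k in kept
        )
        if not suppressed:
            kept.append(det)
    return kept


if __name__ == "__main__":
    dets = [
        ((100, 100, 300, 300), 0.8, 4),  # hoodie
        ((120, 100, 300, 300), 0.3, 0),  # balaclava overlapping the hoodie box
    ]
    print(len(nms(dets, class_aware=True)))   # both classes survive
    print(len(nms(dets, class_aware=False)))  # the weaker box is dropped
```

In the toy example the two boxes overlap at IoU 0.9, so class-agnostic suppression removes the lower-scoring balaclava even though it is a different object class.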

Verifying you're actually beating the king

Before committing on-chain:

  1. Pull the latest annotated challenge image+predictions:
    curl -sL "https://console.scorevision.io/api/v2/elements/manak0%2FDetect-crime?lookback_days=7" \
      | jq '.latestAnnotatedChallenge'
    
  2. Run your miner.py on that image; visually verify your boxes are at least as good as the king's, especially on balaclava, glove, and spray paint.
  3. Run sv -vv run-once (per MINER.md) to score yourself end-to-end on a real challenge without committing. This confirms the chute deploys correctly and your output format matches.
  4. Only after the offline score is repeatedly above 0.62 (the king + a comfortable margin) should you deploy and commit.
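
For step 2, a rough numeric companion to the visual check is to measure how many reference boxes are covered by at least one of your boxes at IoU ≥ 0.5, ignoring labels. This is only a proxy for the intuition behind the IoU pillar, not the subnet's scoring code; the function names are hypothetical, and how you extract reference boxes from the challenge payload is left open because its schema is not documented here.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def coverage(reference_boxes, predicted_boxes, iou_thr=0.5):
    """Fraction of reference boxes matched by any predicted box (label-agnostic)."""
    if not reference_boxes:
        return 0.0
    hits = sum(
        1 for ref in reference_boxes
        if any(iou(ref, pred) >= iou_thr for pred in predicted_boxes)
    )
    return hits / len(reference_boxes)


if __name__ == "__main__":
    reference = [(0, 0, 100, 100), (200, 200, 300, 300)]
    mine = [(5, 5, 105, 105)]
    print(coverage(reference, mine))
```

Running the same function over the king's boxes and over yours on identical reference boxes gives a like-for-like comparison on a single frame; it says nothing about false positives, so keep an eye on the box count per image as well.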

Open questions / pending work

  • Live pillar weights for Detect-crime: confirm by reading the active manifest with sv -vv elements list once .env is configured. The recipe above assumes IoU-dominated scoring; if mAP/precision/recall pillars are weighted higher, the per-class confidence floors should be raised (less recall, more precision).
  • Real GT vs SAM3 PGT: confirm whether elements[].ground_truth = true in the live manifest. If real GT (Manako-internal), the synthetic_fixed dataset on HF is the closest proxy and we should overfit it carefully. If SAM3 PGT, the live targets are whatever SAM3 detects when prompted with the 6 class names, which is slightly fuzzier.
  • Manako data pull: poll_manako.py is built but untested for Detect-crime. The endpoint shape is the same as petrol-station's; if Manako gates the API for low-traffic elements, fall back to using the king's ONNX as the silver labeler over Roboflow data.