Insert Mouse Battery ResNet18 Reward V1

This repository contains a binary ResNet18 reward model for the insert-mouse-battery task.

Input Format

  • Image input: RGB three-view vertical stack in Wan world-model format.
  • View order: cam_high + cam_left_wrist + cam_right_wrist.
  • Image size: [3, 544, 320] as a CHW tensor (3 channels, height 544, width 320).
  • Layout: each view is 180x320 (height x width); stacking the three views gives 540x320, and the full stack is then resized to 544x320.
  • Intended preprocessing: the same normalization/preprocessing path used by the RLinf ResNet reward model.
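
The layout above can be sketched with NumPy as follows. This is only an illustration of the stacking and tensor shape, not the RLinf preprocessing path: the function name `stack_views`, the dictionary input, and the nearest-neighbour resize are assumptions, and normalization is deliberately left out because it should come from the RLinf ResNet reward model code.

```python
import numpy as np

# View order as listed in this README.
VIEW_ORDER = ("cam_high", "cam_left_wrist", "cam_right_wrist")
VIEW_H, VIEW_W = 180, 320   # per-view size (height, width)
OUT_H, OUT_W = 544, 320     # final stacked size (height, width)


def stack_views(views):
    """Stack three HWC uint8 RGB views vertically and return a CHW uint8 array.

    `views` is a hypothetical dict mapping camera name to a (180, 320, 3) array.
    """
    frames = []
    for name in VIEW_ORDER:
        frame = np.asarray(views[name])
        if frame.shape != (VIEW_H, VIEW_W, 3):
            raise ValueError(f"{name}: expected {(VIEW_H, VIEW_W, 3)}, got {frame.shape}")
        frames.append(frame)

    stacked = np.concatenate(frames, axis=0)  # (540, 320, 3)

    # Assumption: nearest-neighbour resize along height (540 -> 544).
    # The real pipeline may use a different interpolation mode.
    rows = np.arange(OUT_H) * stacked.shape[0] // OUT_H
    resized = stacked[rows]                   # (544, 320, 3)

    # HWC -> CHW. Normalization is NOT applied here.
    return np.ascontiguousarray(resized.transpose(2, 0, 1))
```

The returned array has shape (3, 544, 320) and still needs the RLinf normalization before it is passed to the model.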

Output Format

The checkpoint is a binary reward model. For a single input image, the model outputs one scalar logit. Applying sigmoid gives the estimated full-task success probability.
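
As a minimal sketch of that last step, the logit can be converted to a probability like this. The helper names and the 0.5 decision threshold are assumptions for illustration; the repository does not state a recommended threshold.

```python
import math


def success_probability(logit: float) -> float:
    """Map a scalar logit to a probability with a numerically stable sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def is_success(logit: float, threshold: float = 0.5) -> bool:
    """Binary decision from the logit. The 0.5 threshold is an assumption."""
    return success_probability(logit) >= threshold
```

A logit of 0 corresponds to a probability of 0.5, so positive logits favor success and negative logits favor failure.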

Files

File                  Description
full_weights.pt       RLinf ResNet18 reward checkpoint.
model_metadata.json   Input/output format and dataset construction metadata.
eval_summary.json     Train/validation/hard-validation metrics.
train_grid.jpg        Sampled training examples.
val_grid.jpg          Sampled validation examples.
hard_val_grid.jpg     Sampled hard-validation examples.

Test Results

split            samples  positives  negatives  accuracy  AUC     loss
train            5582     2791       2791       0.9946    0.9999  0.0189
validation       1388     694        694        0.9856    0.9915  0.1085
hard-validation  536      268        268        0.9832    0.9921  0.1076

Training Data

The dataset uses three-view videos from expert-data, success-and-hil-data, and failure-data.

  • Positive samples: tail frames from expert and success/HIL episodes.
  • Negative samples: early and middle frames from expert and success/HIL episodes, plus sampled frames from failure episodes.
  • Train/validation splitting is done at the episode level.
  • The train and validation sets are class-balanced.
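
Episode-level splitting means that all frames from one episode land in the same split, so near-duplicate frames cannot leak from train into validation. A minimal sketch of the idea follows; the function name, the `(episode_id, frame_index, label)` tuple layout, and the 0.2 validation fraction are hypothetical and are not taken from the actual dataset construction code.

```python
import random


def split_by_episode(frames, val_fraction=0.2, seed=0):
    """Split frame records into train/val so that no episode spans both sets.

    `frames` is a list of (episode_id, frame_index, label) tuples.
    `val_fraction` is the fraction of episodes (not frames) held out.
    """
    episode_ids = sorted({episode_id for episode_id, _, _ in frames})
    rng = random.Random(seed)
    rng.shuffle(episode_ids)

    n_val = max(1, round(len(episode_ids) * val_fraction))
    val_ids = set(episode_ids[:n_val])

    train = [f for f in frames if f[0] not in val_ids]
    val = [f for f in frames if f[0] in val_ids]
    return train, val
```

Class balancing would be applied separately within each split; how the original dataset does this is not described here.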