Reward Model Card for reward_classifier

A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
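One way to picture how a classifier output becomes a reward is the minimal sketch below. It assumes the classifier emits a single success logit per observation, which this card does not state, and the names `success_probability`, `binary_reward`, and `threshold` are hypothetical helpers, not part of LeRobot.

```python
import math


def success_probability(logit: float) -> float:
    """Map a raw classifier logit to a probability with a numerically stable sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def binary_reward(logit: float, threshold: float = 0.5) -> float:
    """Return 1.0 when the predicted success probability reaches the threshold, else 0.0.

    Assumption: a single logit per observation and a fixed threshold;
    the actual model may expose its output differently.
    """
    return 1.0 if success_probability(logit) >= threshold else 0.0
```

A higher threshold trades missed successes for fewer false positives, which usually matters more when the reward drives policy learning than when it is used only for offline evaluation.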

This reward model has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.

How to Get Started with the Reward Model

Train from scratch

lerobot-train \
  --dataset.repo_id=${HF_USER}/<dataset> \
  --reward_model.type=reward_classifier \
  --output_dir=outputs/train/<desired_reward_model_repo_id> \
  --job_name=lerobot_reward_training \
  --reward_model.device=cuda \
  --reward_model.repo_id=${HF_USER}/<desired_reward_model_repo_id> \
  --wandb.enable=true

Training writes checkpoints to outputs/train/<desired_reward_model_repo_id>/checkpoints/.
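To pick up the most recent checkpoint from that directory, something like the sketch below may help. It assumes each checkpoint is saved in a subdirectory named after its training step (for example `002000`); that layout is an assumption and is not stated in this card, and `latest_checkpoint` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(checkpoints_dir: Path) -> Optional[Path]:
    """Return the subdirectory with the highest numeric name, or None if there is none.

    Assumption: checkpoint folders are named by training step (digits only).
    Entries with non-numeric names are ignored.
    """
    if not checkpoints_dir.is_dir():
        return None
    step_dirs = [
        p for p in checkpoints_dir.iterdir() if p.is_dir() and p.name.isdigit()
    ]
    return max(step_dirs, key=lambda p: int(p.name), default=None)
```

For example, `latest_checkpoint(Path("outputs/train/<desired_reward_model_repo_id>/checkpoints"))` would return the highest-numbered step directory if the layout matches the assumption above.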

Load the reward model in Python

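from lerobot.rewards import make_reward_model

# Load the trained reward model from the Hub.
reward_model = make_reward_model(pretrained_path="<hf_user>/<reward_model_repo_id>")

# `batch` is not defined in this snippet; supply a batch of observations
# in the same format the model was trained on.
reward = reward_model.compute_reward(batch)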
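For offline evaluation, per-frame rewards can be reduced to an episode-level success signal. The sketch below assumes the rewards have already been converted to a plain sequence of floats, one per frame, which this card does not confirm; `first_success_step`, `threshold`, and `min_consecutive` are hypothetical names.

```python
from typing import Optional, Sequence


def first_success_step(
    rewards: Sequence[float],
    threshold: float = 0.5,
    min_consecutive: int = 3,
) -> Optional[int]:
    """Return the index where the first run of successful frames starts.

    A frame counts as successful when its reward is >= threshold. Requiring
    several consecutive successful frames filters out single-frame false
    positives. Returns None when no such run exists.
    """
    run = 0
    for i, r in enumerate(rewards):
        run = run + 1 if r >= threshold else 0
        if run >= min_consecutive:
            return i - min_consecutive + 1
    return None
```

An episode can then be counted as successful whenever the function returns an index rather than `None`.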

Model Details

License: apache-2.0

Weights format: Safetensors

Model size: 7.27M params

Tensor type: F32
