# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Repository Overview
This is a Hugging Face Space that provides a GPU-accelerated JupyterLab environment for training and simulating robots using the MuJoCo physics engine. The space covers a wide range of robotics applications including locomotion, manipulation, motion tracking, and general physics simulation. It is designed to run in a Docker container with NVIDIA GPU support for hardware-accelerated physics rendering.
## What This Environment Supports
This is a general-purpose MuJoCo training environment with sample notebooks covering:
- **General MuJoCo Physics** (`tutorial.ipynb`) - Comprehensive introduction to MuJoCo fundamentals including basic rendering, simulation loops, contacts, friction, tendons, actuators, sensors, and advanced rendering techniques
- **Locomotion** (`locomotion.ipynb`) - Training quadrupedal and bipedal robots for walking, running, and acrobatic behaviors. Includes environments for Unitree Go1/G1, Boston Dynamics Spot, Google Barkour, Berkeley Humanoid, Unitree H1, and more
- **Manipulation** (`manipulation.ipynb`) - Robot arm and dexterous hand control. Includes Franka Emika Panda pick-and-place tasks and Leap Hand dexterous manipulation with asymmetric actor-critic training
- **Motion Tracking** (`opentrack.ipynb`) - Humanoid motion tracking and retargeting using the OpenTrack system with motion capture data
## Architecture

### Container Environment

- **Base Image**: nvidia/cuda:12.8.1-devel-ubuntu22.04
- **Python**: 3.13 (Miniconda)
- **GPU Rendering**: EGL (OpenGL for headless rendering) with NVIDIA drivers
- **Web Server**: JupyterLab on port 7860
### Key Components

**GPU Initialization** (`init_gpu.py`): Validates GPU setup before starting JupyterLab
- Checks NVIDIA driver accessibility via `nvidia-smi`
- Verifies EGL library availability (libEGL.so.1, libGL.so.1, libEGL_nvidia.so.0)
- Tests EGL device initialization with multiple fallback methods (platform device, default display, surfaceless)
- Validates MuJoCo rendering at multiple resolutions (64x64, 240x320, 480x640)
- Critical environment variables: `MUJOCO_GL=egl`, `PYOPENGL_PLATFORM=egl`, `EGL_PLATFORM=surfaceless`

**MuJoCo Playground Setup** (`init_mujoco.py`): Downloads MuJoCo model assets
- Imports `mujoco_playground`, which automatically clones the `mujoco_menagerie` repository
- This repository contains robot models (quadrupeds, bipeds, arms, hands, etc.)

**Server Startup** (`start_server.sh`): Container entrypoint
- Sets up NVIDIA EGL library symlinks at runtime (searches /usr/local/nvidia/lib64, /usr/local/cuda/lib64, /usr/lib/nvidia)
- Runs GPU validation (`python init_gpu.py`)
- Downloads MuJoCo assets (`python init_mujoco.py`)
- Disables JupyterLab announcements
- Launches JupyterLab with iframe embedding support for Hugging Face Spaces
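A minimal sketch of the kind of headless-rendering check `init_gpu.py` performs (the real script also probes the driver and EGL fallbacks); the tiny XML scene here is a hypothetical stand-in used only to exercise the render path:

```python
import os

# Force EGL before importing mujoco (mirrors the container's environment variables)
os.environ.setdefault("MUJOCO_GL", "egl")
os.environ.setdefault("PYOPENGL_PLATFORM", "egl")

import mujoco

_XML = """
<mujoco>
  <worldbody>
    <light pos="0 0 3"/>
    <geom type="plane" size="1 1 0.1"/>
    <geom type="sphere" size="0.1" pos="0 0 0.5"/>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(_XML)
data = mujoco.MjData(model)
mujoco.mj_forward(model, data)

# Render at a small resolution; a failure here usually indicates a broken EGL setup
with mujoco.Renderer(model, height=64, width=64) as renderer:
    renderer.update_scene(data)
    pixels = renderer.render()
    print("Rendered frame:", pixels.shape)
```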
## Sample Notebooks

Sample notebooks are organized in individual folders within `samples/` and are automatically copied to `/data/workspaces/` at container startup:

- `samples/tutorial/` - Complete MuJoCo introduction (2258 lines) covering physics fundamentals, rendering, contacts, actuators, sensors, tendons, and camera control
- `samples/locomotion/` - Quadrupedal and bipedal locomotion training (1762 lines) with PPO, domain randomization, curriculum learning, and policy fine-tuning
- `samples/manipulation/` - Robot manipulation (649 lines) including pick-and-place (Panda arm) and dexterous manipulation (Leap Hand) with asymmetric actor-critic
- `samples/opentrack/` - Humanoid motion tracking/retargeting (603 lines) including dataset download, training, checkpoint conversion, and video generation

Each sample is copied to its own workspace directory (`/data/workspaces/<sample_name>/`) at runtime. Notebooks are only copied if they don't already exist, preserving any user modifications.
## Development Commands

### Running Locally with Docker

```bash
# Build the container
docker build -t mujoco-training .

# Run with GPU support
docker run --gpus all -p 7860:7860 mujoco-training
```

### Testing GPU Setup

```bash
# Validate GPU rendering capabilities (run inside container)
python init_gpu.py

# Check NVIDIA driver
nvidia-smi

# Test EGL libraries
ldconfig -p | grep EGL
```
### JupyterLab Access

- Default port: 7860
- Default token: "huggingface" (set via the `JUPYTER_TOKEN` environment variable)
- Default landing page: `/lab/tree/workspaces/locomotion/locomotion.ipynb`
- Notebook working directory: `/data` (when deployed as a Hugging Face Space)
## Persistent Storage and Workspaces

When deployed on Hugging Face Spaces, the `/data` directory is backed by persistent storage. At container startup, `start_server.sh` automatically:

- Creates `/data/workspaces/` if it doesn't exist
- For each sample in `samples/`, creates `/data/workspaces/<sample_name>/` if it doesn't exist
- Copies the `.ipynb` file only if it doesn't already exist in the workspace (preserving user modifications; sketched below)
- Copies any additional files from the sample directory (datasets, scripts, etc.)

This ensures:

- User modifications to notebooks are preserved across container restarts
- Each sample has its own isolated workspace for generated data, models, and outputs
- Sample notebooks can include supporting files that are copied to the workspace
- Users can create additional workspaces in `/data/workspaces/` for their own projects
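The copy-if-missing logic lives in `start_server.sh` (shell); the Python below is only a sketch of the same behavior, with the directory names taken from this document:

```python
import shutil
from pathlib import Path

SAMPLES_DIR = Path("samples")               # baked into the image
WORKSPACES_DIR = Path("/data/workspaces")   # persistent storage

for sample in SAMPLES_DIR.iterdir():
    if not sample.is_dir():
        continue
    workspace = WORKSPACES_DIR / sample.name
    workspace.mkdir(parents=True, exist_ok=True)
    for src in sample.iterdir():
        dst = workspace / src.name
        if dst.exists():
            continue  # never overwrite user modifications
        if src.is_dir():
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)
```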
## Critical EGL Configuration

The container requires specific EGL configuration for headless GPU rendering:

- **NVIDIA EGL Vendor Config**: Created at `/usr/share/glvnd/egl_vendor.d/10_nvidia.json`, pointing to `libEGL_nvidia.so.0`
- **Library Path**: `LD_LIBRARY_PATH` includes `/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64`
- **Runtime Symlinks**: `start_server.sh` creates symlinks to `libEGL_nvidia.so.0` from mounted NVIDIA directories
- **Environment Variables**: `__EGL_VENDOR_LIBRARY_DIRS=/usr/share/glvnd/egl_vendor.d`
### Troubleshooting EGL Issues

If MuJoCo rendering fails:

- Verify NVIDIA drivers: `nvidia-smi` should show GPU info
- Check the EGL vendor config: `cat /usr/share/glvnd/egl_vendor.d/10_nvidia.json`
- Verify library loading: `ldconfig -p | grep EGL`
- Run the comprehensive diagnostic: `python init_gpu.py`
- Check that `MUJOCO_GL=egl` is set: `echo $MUJOCO_GL`
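If the shell checks pass but rendering still fails from Python, a quick in-process check (a sketch, not part of the repo scripts) can confirm whether the environment variables and EGL libraries are visible to the interpreter:

```python
import ctypes.util
import os

# The renderer only uses EGL if this is set before mujoco is imported
print("MUJOCO_GL =", os.environ.get("MUJOCO_GL"))
print("PYOPENGL_PLATFORM =", os.environ.get("PYOPENGL_PLATFORM"))
print("__EGL_VENDOR_LIBRARY_DIRS =", os.environ.get("__EGL_VENDOR_LIBRARY_DIRS"))

# ctypes resolves libraries the same way the dynamic loader does
for lib in ("EGL", "GL"):
    print(f"lib{lib}:", ctypes.util.find_library(lib) or "NOT FOUND")
```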
## Training Workflows

### General MuJoCo Simulation (tutorial.ipynb)

Basic simulation loop:

```python
import mujoco

model = mujoco.MjModel.from_xml_string(xml)
data = mujoco.MjData(model)

# Simulation loop
mujoco.mj_resetData(model, data)
while data.time < duration:
    mujoco.mj_step(model, data)
    # Read sensors, apply controls, etc.
```

Rendering:

```python
with mujoco.Renderer(model, height, width) as renderer:
    mujoco.mj_forward(model, data)
    renderer.update_scene(data, camera="camera_name")
    pixels = renderer.render()
```
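The notebooks typically display rendered frames with `mediapy` (listed in requirements.txt); a minimal usage sketch:

```python
import mediapy as media

media.show_image(pixels)             # single rendered frame
# media.show_video(frames, fps=30)   # list of frames collected over a rollout
```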
### Locomotion Training (locomotion.ipynb)

Typical workflow using Brax + MuJoCo Playground (see the sketch below):

1. Load environment: `env = registry.load(env_name)`
2. Get config: `env_cfg = registry.get_default_config(env_name)`
3. Configure PPO: `ppo_params = locomotion_params.brax_ppo_config(env_name)`
4. Apply domain randomization: `randomizer = registry.get_domain_randomizer(env_name)`
5. Train: Use `brax.training.agents.ppo.train` with the environment and randomization function
6. Save checkpoints: Policies saved to `checkpoints/{env_name}/{step}/`
7. Fine-tune: Restore from checkpoint and continue training with a modified config
Available environments:

- Quadrupedal: Go1JoystickFlatTerrain, Go1JoystickRoughTerrain, Go1Getup, Go1Handstand, Go1Footstand, SpotFlatTerrainJoystick, SpotGetup, SpotJoystickGaitTracking, BarkourJoystick
- Bipedal: BerkeleyHumanoidJoystickFlatTerrain, BerkeleyHumanoidJoystickRoughTerrain, G1JoystickFlatTerrain, G1JoystickRoughTerrain, H1InplaceGaitTracking, H1JoystickGaitTracking, Op3Joystick, T1JoystickFlatTerrain, T1JoystickRoughTerrain
- Full list: `registry.locomotion.ALL_ENVS`

Key training techniques:

- **Domain Randomization**: Randomizes friction, armature, center of mass, and link masses for sim-to-real transfer
- **Energy Penalties**: `energy_termination_threshold`, `reward_config.energy`, `reward_config.dof_acc` control power consumption and smoothness
- **Curriculum Learning**: Fine-tune from checkpoints with progressively modified reward configs (sketched below)
- **Asymmetric Actor-Critic**: Actor receives proprioception, critic receives privileged simulation state
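Continuing the names from the locomotion sketch above, a fine-tuning pass with tightened energy terms might look like the following; the numeric values are illustrative, the checkpoint path is hypothetical, and `restore_checkpoint_path` is an assumption about the Brax PPO trainer to verify against the installed version:

```python
# Tighten energy-related terms before resuming training (illustrative values)
env_cfg = registry.get_default_config(env_name)
env_cfg.energy_termination_threshold = 400
env_cfg.reward_config.energy = -0.003
env_cfg.reward_config.dof_acc = -2.5e-7
finetune_env = registry.load(env_name, config=env_cfg)

# Resume from a saved policy; path layout follows checkpoints/{env_name}/{step}/
make_inference_fn, params, metrics = ppo.train(
    environment=finetune_env,
    eval_env=registry.load(env_name, config=env_cfg),
    wrap_env_fn=wrapper.wrap_for_brax_training,
    network_factory=network_factory,
    randomization_fn=randomizer,
    restore_checkpoint_path="checkpoints/Go1JoystickFlatTerrain/000200000000",  # hypothetical path
    **training_params,
)
```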
### Manipulation Training (manipulation.ipynb)

Similar to locomotion, but focuses on:

- **Pick-and-place tasks**: PandaPickCubeOrientation (trains in ~3 minutes on an RTX 4090)
- **Dexterous manipulation**: LeapCubeReorient (trains in ~33 minutes on an RTX 4090)
- **Asymmetric observations**: Use `policy_obs_key` and `value_obs_key` in the PPO params to train the actor on sensor-like data while the critic gets privileged state (see the sketch below)

Available environments: `registry.manipulation.ALL_ENVS`
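A rough illustration of the asymmetric setup; the observation key names (`state`, `privileged_state`) and the exact place the PPO networks accept them are assumptions based on the notebook description, so verify against the manipulation notebook:

```python
import functools

from brax.training.agents.ppo import networks as ppo_networks
from mujoco_playground.config import manipulation_params

ppo_params = manipulation_params.brax_ppo_config("LeapCubeReorient")

# Actor sees sensor-like observations, critic sees privileged simulator state
network_factory = functools.partial(
    ppo_networks.make_ppo_networks,
    policy_obs_key="state",            # assumed key name
    value_obs_key="privileged_state",  # assumed key name
)
```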
### Motion Tracking (opentrack.ipynb)

OpenTrack workflow for humanoid motion tracking:

1. Clone the repository: `git clone https://github.com/GalaxyGeneralRobotics/OpenTrack.git`
2. Download mocap data: from `huggingface.co/datasets/robfiras/loco-mujoco-datasets` (Lafan1/UnitreeG1)
3. Train a policy: `python train_policy.py --exp_name debug --terrain_type flat_terrain`
4. Convert the checkpoint: `python brax2torch.py --exp_name <exp_name>` (Brax → PyTorch)
5. Generate videos: `python play_policy.py --exp_name <exp_name> --use_renderer`
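One way to fetch the mocap data from the Hub; `snapshot_download` is standard `huggingface_hub` API, but the `allow_patterns` filter for the Lafan1/UnitreeG1 subset is a guess about the dataset layout:

```python
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="robfiras/loco-mujoco-datasets",
    repo_type="dataset",
    allow_patterns=["*Lafan1*UnitreeG1*"],  # hypothetical filter; check the dataset layout
)
print("Downloaded to:", local_dir)
```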
## Python Dependencies

Core stack (see requirements.txt):

- **JupyterLab**: 4.4.3 (with tornado 6.2 for compatibility)
- **JAX**: CUDA 12 support via `jax[cuda12]`
- **MuJoCo**: 3.3+ with MuJoCo MJX (JAX-based physics)
- **Brax**: JAX-based RL framework for massively parallel training
- **MuJoCo Playground**: Collection of robot environments and training utilities
- **Supporting libraries**: mediapy (video rendering), ipywidgets, nvidia-cusparse-cu12
## File Structure

```
/
├── Dockerfile           # Container with CUDA 12.8 + EGL setup
├── start_server.sh      # Container entrypoint
├── init_gpu.py          # GPU validation script (comprehensive EGL tests)
├── init_mujoco.py       # MuJoCo Playground asset downloader
├── requirements.txt     # Python dependencies
├── packages.txt         # System packages (currently empty)
├── on_startup.sh        # Custom startup commands (placeholder)
├── login.html           # Custom JupyterLab login page
└── samples/             # Example notebooks (organized by topic)
    ├── tutorial/
    │   └── tutorial.ipynb         # MuJoCo fundamentals (2258 lines)
    ├── locomotion/
    │   └── locomotion.ipynb       # Robot locomotion (1762 lines)
    ├── manipulation/
    │   └── manipulation.ipynb     # Robot manipulation (649 lines)
    └── opentrack/
        └── opentrack.ipynb        # Motion tracking (603 lines)
```

When deployed as a Hugging Face Space with persistent storage:

```
/data/                   # Persistent storage volume (mounted at runtime)
└── workspaces/          # Sample workspaces (created by start_server.sh)
    ├── tutorial/
    │   ├── tutorial.ipynb         # Copied from samples/, preserves user edits
    │   └── ...                    # User-generated data, models, outputs
    ├── locomotion/
    │   ├── locomotion.ipynb
    │   ├── checkpoints/           # Training checkpoints
    │   └── ...
    ├── manipulation/
    │   ├── manipulation.ipynb
    │   └── ...
    └── opentrack/
        ├── opentrack.ipynb
        ├── datasets/              # Downloaded mocap data
        ├── models/                # Trained models
        └── videos/                # Generated videos
```
## Performance Notes

- **Physics simulation**: Can exceed 50,000 steps per second on a single GPU with JAX/MJX (much faster than rendering)
- **Rendering**: Typically 30-60 Hz, much slower than physics
- **Training times** (RTX 4090 / L40S):
  - Simple manipulation: ~3 minutes
  - Quadrupedal joystick: ~7 minutes
  - Bipedal locomotion: ~17 minutes
  - Dexterous manipulation: ~33 minutes
- **Brax parallelization**: Uses thousands of parallel environments for fast training
- **Checkpointing**: Critical for curriculum learning and fine-tuning
## Common Patterns

### Visualization Options

```python
scene_option = mujoco.MjvOption()
scene_option.flags[mujoco.mjtVisFlag.mjVIS_JOINT] = True         # Show joints
scene_option.flags[mujoco.mjtVisFlag.mjVIS_CONTACTPOINT] = True  # Show contacts
scene_option.flags[mujoco.mjtVisFlag.mjVIS_CONTACTFORCE] = True  # Show forces
scene_option.flags[mujoco.mjtVisFlag.mjVIS_TRANSPARENT] = True   # Transparency
scene_option.flags[mujoco.mjtVisFlag.mjVIS_PERTFORCE] = True     # Show perturbations
```
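To apply these flags, pass the option object when updating the scene (standard `mujoco.Renderer` API):

```python
renderer.update_scene(data, camera="camera_name", scene_option=scene_option)
pixels = renderer.render()
```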
### Named Access Pattern

```python
# Instead of using indices
model.geom_rgba[geom_id, :]

# Use named access
model.geom('green_sphere').rgba
data.geom('box').xpos
data.joint('swing').qpos
data.sensor('accelerometer').data
```
### Rendering Modes

- RGB rendering: `renderer.render()` returns pixels
- Depth rendering: `renderer.enable_depth_rendering()`, then `renderer.render()`
- Segmentation: `renderer.enable_segmentation_rendering()` returns object IDs and types
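A short sketch of switching between these modes; the enable/disable methods are standard `mujoco.Renderer` API, and the array shapes noted in the comments follow the MuJoCo documentation:

```python
# RGB (default): uint8 array of shape (height, width, 3)
renderer.update_scene(data)
rgb = renderer.render()

# Depth: float32 array of distances from the camera, shape (height, width)
renderer.enable_depth_rendering()
depth = renderer.render()
renderer.disable_depth_rendering()

# Segmentation: int32 array of shape (height, width, 2) with (object id, object type)
renderer.enable_segmentation_rendering()
seg = renderer.render()
renderer.disable_segmentation_rendering()
```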
## Important Notes

- This environment is designed for Hugging Face Spaces with GPU instances (NVIDIA L40S or similar)
- All training uses JAX/Brax for massive parallelization across thousands of environments
- Policies are typically saved with Orbax checkpointing for fine-tuning
- Domain randomization is critical for sim-to-real transfer
- The environment supports multiple RL algorithms (PPO, SAC) through Brax
- Asymmetric actor-critic (different observations for policy and value function) is commonly used