LIBERO-Causal-Wan2.2-5BTI2V

This repository contains a Causal-Forcing / AR diffusion checkpoint for LIBERO robot video generation, adapted from Wan2.2-TI2V-5B.

The checkpoint was trained with the LightEWM LIBERO-Causal example on LIBERO videos with dense prompts.

Checkpoint

  • File: model.pt
  • Training step: 145000
  • Base model family: Wan2.2-TI2V-5B
  • Backend: Causal-Forcing AR diffusion
  • Dataset adapter: lightewm.dataset.causal_forcing.CausalForcingJsonlAdapter
  • Training data metadata: data/libero_i2v_train/metadata_dense_prompt.csv

Training Setup

The training configuration follows examples/LIBERO-Causal/train.yaml in LightEWM:

  • Official backend config: configs/ar_diffusion_tf_framewise_wan22_ti2v_5b_maze.yaml
  • Resolution: 224 x 224
  • Video length: 49 RGB frames
  • FPS: 10
  • Inference length: 13 latent frames (num_output_frames: 13)
  • variable_num_frames_train: false
  • max_training_video_frames: 49
  • model_kwargs.timestep_shift: 5.0
  • model_kwargs.seq_len: 1029
  • Distributed training: 8 processes
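
The 49 RGB frames and 13 latent frames listed above are consistent with a causal video VAE that keeps the first frame and compresses the remaining frames in time by a factor of 4. The sketch below only checks that arithmetic. The stride of 4 is an assumption about the Wan2.2 VAE, not a value read from train.yaml, and latent_frame_count is a hypothetical helper that is not part of LightEWM.

```python
def latent_frame_count(rgb_frames: int, temporal_stride: int = 4) -> int:
    """Number of latent frames for a causal VAE.

    Assumption: the first RGB frame maps to its own latent frame and every
    following group of `temporal_stride` RGB frames maps to one latent frame.
    """
    if rgb_frames < 1:
        raise ValueError("rgb_frames must be at least 1")
    if (rgb_frames - 1) % temporal_stride != 0:
        raise ValueError(
            f"{rgb_frames} frames does not fit the 1 + k * {temporal_stride} pattern"
        )
    return (rgb_frames - 1) // temporal_stride + 1


# 49 RGB frames -> (49 - 1) // 4 + 1 latent frames
print(latent_frame_count(49))  # 13
```

Under this assumption, clip lengths such as 45 or 53 frames would also be valid inputs to the VAE, but only 49 matches the setup this checkpoint was trained with.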

The LIBERO preprocessing pipeline resamples LIBERO demonstrations to 10 FPS and uses dense prompts generated for the converted metadata.
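
As an illustration of the resampling step, the sketch below selects which source frames to keep when converting a demonstration to 10 FPS. It is not the LightEWM preprocessing code: the 20 FPS source rate and the resample_indices helper are assumptions made for this example, and the real pipeline may choose frames differently.

```python
def resample_indices(num_frames: int, src_fps: int, dst_fps: int) -> list[int]:
    """Indices of source frames to keep when resampling to dst_fps.

    Uses integer arithmetic only, so the result does not depend on
    floating-point rounding. Each output frame takes the latest source
    frame at or before its timestamp.
    """
    if src_fps <= 0 or dst_fps <= 0:
        raise ValueError("frame rates must be positive")
    num_out = (num_frames * dst_fps) // src_fps
    return [min(num_frames - 1, (i * src_fps) // dst_fps) for i in range(num_out)]


# Hypothetical 98-frame demonstration recorded at an assumed 20 FPS.
indices = resample_indices(98, src_fps=20, dst_fps=10)
print(len(indices), indices[:3], indices[-1])  # 49 [0, 2, 4] 96
```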

Usage

Place model.pt under a local checkpoints directory and point the LightEWM LIBERO-Causal inference config at it:

runner:
  params:
    checkpoint_path: checkpoints/LIBERO-Causal-Wan2.2-5BTI2V/model.pt

Then run:

python run.py --config examples/LIBERO-Causal/infer.yaml
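
A missing checkpoint or config file is an easy mistake to make at this step. The following is a minimal pre-flight sketch, assuming it is run from the LightEWM repository root; the helper names are hypothetical and the script is not part of LightEWM. It only checks that the two paths mentioned above exist and builds the same command line.

```python
import subprocess
import sys
from pathlib import Path

# Paths taken from the usage instructions above, relative to the repo root.
CHECKPOINT = "checkpoints/LIBERO-Causal-Wan2.2-5BTI2V/model.pt"
CONFIG = "examples/LIBERO-Causal/infer.yaml"


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]


def build_command(config=CONFIG):
    """Command equivalent to `python run.py --config <config>`."""
    return [sys.executable, "run.py", "--config", config]


def main():
    """Run inference only if the checkpoint and config are both present."""
    missing = missing_files([CHECKPOINT, CONFIG])
    for path in missing:
        print(f"missing: {path}")
    if missing:
        return 1
    return subprocess.call(build_command())
```

Calling main() from the repository root either reports the missing paths and returns 1, or returns the exit code of run.py.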

The default LIBERO-Causal inference example uses:

  • Resolution: 224 x 224
  • Video length: 49 RGB frames
  • num_output_frames: 13
  • dense prompt metadata at data/libero_i2v_train/metadata_dense_prompt.csv

Intended Use

This checkpoint is intended for research on robot video prediction/generation and LIBERO-style manipulation trajectories. It is not intended for deployment in safety-critical robotic control systems without additional validation.

Limitations

  • The model is specialized to the LIBERO data distribution and may not generalize to unrelated robot embodiments, scenes, or camera viewpoints.
  • Outputs are generated videos, not verified executable robot policies.
  • Output quality depends on matching the training preprocessing, prompt format, resolution, and frame count at inference time.
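
One way to guard against mismatched inputs is to check a clip's shape before inference. The sketch below compares a shape against the 49-frame, 224 x 224 setup described in this card. The (frames, height, width, channels) layout and the check_clip_shape helper are assumptions for illustration; LightEWM may use a different tensor layout.

```python
EXPECTED_FRAMES = 49
EXPECTED_HEIGHT = 224
EXPECTED_WIDTH = 224
EXPECTED_CHANNELS = 3  # RGB


def check_clip_shape(shape):
    """Return a list of mismatch messages; an empty list means the shape matches.

    Assumes a (frames, height, width, channels) layout.
    """
    if len(shape) != 4:
        return [f"expected 4 dimensions, got {len(shape)}"]
    frames, height, width, channels = shape
    problems = []
    if frames != EXPECTED_FRAMES:
        problems.append(f"expected {EXPECTED_FRAMES} frames, got {frames}")
    if (height, width) != (EXPECTED_HEIGHT, EXPECTED_WIDTH):
        problems.append(
            f"expected {EXPECTED_HEIGHT} x {EXPECTED_WIDTH}, got {height} x {width}"
        )
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected {EXPECTED_CHANNELS} channels, got {channels}")
    return problems


print(check_clip_shape((49, 224, 224, 3)))  # []
```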

Citation

If you use this checkpoint, please cite the upstream Wan2.2, LIBERO, Causal-Forcing, and LightEWM resources as appropriate.
