
license: mit

iVideoGPT (Pre-trained on Open X-Embodiment, 64x64 resolution, action-free)

Pre-trained model introduced in the paper iVideoGPT: Interactive VideoGPTs are Scalable World Models.

See https://github.com/thuml/iVideoGPT for examples of how to use this model.

Citation

@article{wu2024ivideogpt,
    title={iVideoGPT: Interactive VideoGPTs are Scalable World Models},
    author={Jialong Wu and Shaofeng Yin and Ningya Feng and Xu He and Dong Li and Jianye Hao and Mingsheng Long},
    journal={arXiv preprint arXiv:2405.15223},
    year={2024},
}