---
license: gpl
---

This is the official pre-trained model of the paper "VIRT: Vision Instructed Robotic Transformer for Manipulation Learning". The model is pre-trained using the robotic imagery pre-training technique on the Droid dataset.

If you find this model useful, please cite:

```BibTeX
@article{li2024virt,
  title={VIRT: Vision Instructed Robotic Transformer for Manipulation Learning},
  author={Zhuoling, Li and Liangliang, Ren and Jinrong, Yang and Yong, Zhao and others},
  journal={arXiv preprint arXiv:2410.07169},
  year={2024}
}
```