---
library_name: transformers
tags:
- robotics
- vlm
- image-text-to-text
- multimodal
- pretraining
license: mit
language:
- en
pipeline_tag: image-text-to-text
---

# Prism with Qwen 2.5 0.5B backbone (Prismatic-Compatible Version)

This model is trained on the LLaVA-1.5 Instruct dataset.

## Usage Instructions

See the [MiniVLA GitHub README](https://github.com/Stanford-ILIAD/openvla-mini/blob/main/README.md) for instructions on using this checkpoint for downstream training and finetuning. An illustrative loading sketch is also included at the end of this card.

## Citation

**BibTeX:**

```bibtex
@article{belkhale24minivla,
  title={MiniVLA: A Better VLA with a Smaller Footprint},
  author={Suneel Belkhale and Dorsa Sadigh},
  url={https://github.com/Stanford-ILIAD/openvla-mini},
  year={2024}
}
```
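
## Example: Loading the Checkpoint (Sketch)

The snippet below is a minimal, unofficial sketch of loading this Prismatic-compatible checkpoint for inference. It assumes the openvla-mini fork exposes the same `prismatic.load` entry point and `generate` interface as the upstream Prismatic VLMs codebase; the checkpoint path, image file, and prompt are placeholders. Refer to the linked README for the supported workflow.

```python
# Sketch only: assumes the openvla-mini fork keeps the upstream Prismatic VLMs
# API (`prismatic.load`, prompt builder, `generate`); paths below are placeholders.
import torch
from PIL import Image
from prismatic import load

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder path to the downloaded Prismatic-compatible checkpoint
vlm = load("path/to/prism-qwen25-0_5b-checkpoint.pt")
vlm.to(device, dtype=torch.bfloat16)

# Build a chat-style prompt for an example image
image = Image.open("example.jpg").convert("RGB")
prompt_builder = vlm.get_prompt_builder()
prompt_builder.add_turn(role="human", message="What is in this image?")
prompt_text = prompt_builder.get_prompt()

# Run generation
generated_text = vlm.generate(
    image,
    prompt_text,
    do_sample=False,
    max_new_tokens=128,
)
print(generated_text)
```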