---
license: apache-2.0
language:
- en
---

## News

* [2024.05.28] 📃 Our RLAIF-V paper is now accessible on [arXiv](https://arxiv.org/abs/2405.17220)!
* [2024.05.20] 🎉 We introduce [RLAIF-V](https://github.com/RLHF-V/RLAIF-V), our new alignment framework that utilizes open-source models for feedback generation and achieves **super GPT-4V trustworthiness**. You can download the corresponding [dataset](https://huggingface.co/datasets/openbmb/RLAIF-V-Dataset) and models ([7B](https://huggingface.co/openbmb/RLAIF-V-7B), [12B](https://huggingface.co/openbmb/RLAIF-V-12B)) now (see the loading sketch at the end of this card)!
* [2024.04.11] 🔥 Our data is used in [MiniCPM-V 2.0](https://huggingface.co/openbmb/MiniCPM-V-2), an **end-side** multimodal large language model that exhibits **trustworthiness comparable to GPT-4V**!

## Citation

If you find this dataset helpful, please consider citing our papers 📝:

```
@article{yu2023rlhf,
  title={RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback},
  author={Yu, Tianyu and Yao, Yuan and Zhang, Haoye and He, Taiwen and Han, Yifeng and Cui, Ganqu and Hu, Jinyi and Liu, Zhiyuan and Zheng, Hai-Tao and Sun, Maosong and others},
  journal={arXiv preprint arXiv:2312.00849},
  year={2023}
}

@article{yu2024rlaifv,
  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness},
  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024}
}
```
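
## Loading the Dataset

Since the news above points to the [RLAIF-V dataset](https://huggingface.co/datasets/openbmb/RLAIF-V-Dataset) on the Hub, here is a minimal sketch of loading it with the Hugging Face `datasets` library. The `split="train"` argument is an assumption; check the dataset viewer for the actual splits and column schema.

```python
# Minimal sketch: load the RLAIF-V dataset from the Hugging Face Hub.
# Assumption: a "train" split exists; column names may differ from this example.
from datasets import load_dataset

ds = load_dataset("openbmb/RLAIF-V-Dataset", split="train")

print(ds)            # dataset size and feature schema
print(ds[0].keys())  # column names of the first example
```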