RLHF-V-SFT / README.md
Yirany's picture
Update README.md
df4823e verified
metadata
license: apache-2.0
language:
  - en

News

  • [2024.05.28] πŸ“ƒ Our RLAIF-V paper is accesible at arxiv now!
  • [2024.05.20] πŸŽ‰ We introduce RLAIF-V, our new alignment framework that utilize open-source models for feedback generation and reach super GPT-4V trustworthiness. You can download the corresponding dataset and models (7B, 12B) now!
  • [2024.04.11] πŸ”₯ Our data is used in MiniCPM-V 2.0, an end-side multimodal large language model that exhibits comparable trustworthiness with GPT-4V!

Citation

If you find this dataset helpful, please consider cite our papers πŸ“:

@article{yu2023rlhf,
  title={Rlhf-v: Towards trustworthy mllms via behavior alignment from fine-grained correctional human feedback},
  author={Yu, Tianyu and Yao, Yuan and Zhang, Haoye and He, Taiwen and Han, Yifeng and Cui, Ganqu and Hu, Jinyi and Liu, Zhiyuan and Zheng, Hai-Tao and Sun, Maosong and others},
  journal={arXiv preprint arXiv:2312.00849},
  year={2023}
}

@article{yu2024rlaifv,
  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness}, 
  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024},
}