Demo | Paper | arXiv | GitHub
We are a team from AI2, UCSB, UWaterloo, UPenn, NTU, UWM, and UCSC working on benchmarking vision-language models.
Team Members: Yujie Lu, Dongfu Jiang, Xingyu Fu, Hui Chen, Yingzi Ma, Jing Gu, Michael Saxon
Advisors: Bill Yuchen Lin, Wenhu Chen, Chaowei Xiao, Yejin Choi, Miguel Eckstein, William Yang Wang
Compare VLMs at WildVision-Arena and WildVision-Bench.
More chat and vote data will be released regularly. The evaluation script is available at WildVision-Bench.
Contact: Bill Yuchen Lin (yuchenl@allenai.org) and Yujie Lu (yujielu@ucsb.edu)
Citation: If you find this Hugging Face Space useful, please consider citing us:
@article{lu2024wildvision,
title={WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences},
author={Lu, Yujie and Jiang, Dongfu and Chen, Wenhu and Wang, William Yang and Choi, Yejin and Lin, Bill Yuchen},
journal={arXiv preprint arXiv:2406.11069},
year={2024}
}