![vbench_logo](https://raw.githubusercontent.com/Vchitect/VBench/master/asset/vbench_logo_short.jpg)
[![Paper](https://img.shields.io/badge/cs.CV-Paper-b31b1b?logo=arxiv&logoColor=red)](https://arxiv.org/abs/2311.17982)
[![Project Page](https://img.shields.io/badge/VBench-Website-green?logo=googlechrome&logoColor=green)](https://vchitect.github.io/VBench-project/)
[![PyPI](https://img.shields.io/pypi/v/vbench)](https://pypi.org/project/vbench/)
[![HuggingFace](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Leaderboard-blue)](https://huggingface.co/spaces/Vchitect/VBench_Leaderboard)
[![Video](https://img.shields.io/badge/YouTube-Video-c4302b?logo=youtube&logoColor=red)](https://www.youtube.com/watch?v=7IhCC8Qqn8Y)
[![Visitor](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FVchitect%2FVBench&count_bg=%23FFA500&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=visitors&edge_flat=false)](https://hits.seeyoufarm.com)
This repository contains the implementation of the following paper:
> **VBench: Comprehensive Benchmark Suite for Video Generative Models**
> [Ziqi Huang](https://ziqihuangg.github.io/)∗, [Yinan He](https://github.com/yinanhe)∗, [Jiashuo Yu](https://scholar.google.com/citations?user=iH0Aq0YAAAAJ&hl=zh-CN)∗, [Fan Zhang](https://github.com/zhangfan-p)∗, [Chenyang Si](https://chenyangsi.top/), [Yuming Jiang](https://yumingj.github.io/), [Yuanhan Zhang](https://zhangyuanhan-ai.github.io/), [Tianxing Wu](https://tianxingwu.github.io/), [Qingyang Jin](https://github.com/Vchitect/VBench), [Nattapol Chanpaisit](https://nattapolchan.github.io/me), [Yaohui Wang](https://wyhsirius.github.io/), [Xinyuan Chen](https://scholar.google.com/citations?user=3fWSC8YAAAAJ), [Limin Wang](https://wanglimin.github.io), [Dahua Lin](http://dahua.site/)+, [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao/index.html)+, [Ziwei Liu](https://liuziwei7.github.io/)+
> IEEE/CVF Conference on Computer Vision and Pattern Recognition (**CVPR**), 2024
## :fire: Updates
- [03/2024] :fire::fire: **[VBench-Reliability](https://github.com/Vchitect/VBench/tree/master/vbench2_beta_reliability)** :fire::fire: We now support evaluating the **reliability** (*e.g.*, culture, fairness, bias, safety) of video generative models.
- [03/2024] :fire::fire: **[VBench-I2V](https://github.com/Vchitect/VBench/tree/master/vbench2_beta_i2v)** :fire::fire: We now support evaluating **Image-to-Video (I2V)** models. We also provide [Image Suite](https://drive.google.com/drive/folders/1fdOZKQ7HWZtgutCKKA7CMzOhMFUGv4Zx?usp=sharing).
- [03/2024] We support **evaluating customized videos**! See [here](https://github.com/Vchitect/VBench/?tab=readme-ov-file#new-evaluate-your-own-videos) for instructions.
- [01/2024] PyPI package is released! [![PyPI](https://img.shields.io/pypi/v/vbench)](https://pypi.org/project/vbench/). Simply `pip install vbench`.
- [12/2023] :fire::fire: **[VBench](https://github.com/Vchitect/VBench?tab=readme-ov-file#usage)** :fire::fire: Evaluation code released for 16 **Text-to-Video (T2V) evaluation** dimensions.
    - `['subject_consistency', 'background_consistency', 'temporal_flickering', 'motion_smoothness', 'dynamic_degree', 'aesthetic_quality', 'imaging_quality', 'object_class', 'multiple_objects', 'human_action', 'color', 'spatial_relationship', 'scene', 'temporal_style', 'appearance_style', 'overall_consistency']`
- [11/2023] Prompt Suites released. (See prompt lists [here](https://github.com/Vchitect/VBench/tree/master/prompts))
## :mega: Overview
![overall_structure](./asset/fig_teaser_new.jpg)
We propose **VBench**, a comprehensive benchmark suite for video generative models. We design a comprehensive and hierarchical Evaluation Dimension Suite to decompose "video generation quality" into multiple well-defined dimensions to facilitate fine-grained and objective evaluation. For each dimension and each content category, we carefully design a Prompt Suite as test cases, and sample Generated Videos from a set of video generation models. For each evaluation dimension, we specifically design an Evaluation Method Suite, which uses a carefully crafted method or a designated pipeline for automatic objective evaluation. We also conduct Human Preference Annotation for the generated videos for each dimension, and show that VBench evaluation results are well aligned with human perceptions. VBench can provide valuable insights from multiple perspectives.
## :mortar_board: Evaluation Results
We visualize VBench evaluation results of various publicly available video generation models, as well as Gen-2 and Pika, across 16 VBench dimensions. We normalize the results per dimension for clearer comparisons. (See numeric values at our [Leaderboard](https://huggingface.co/spaces/Vchitect/VBench_Leaderboard))

## :hammer: Installation
### Install with pip
```
pip install vbench
```

To evaluate some video generation ability aspects, you need to install [detectron2](https://github.com/facebookresearch/detectron2) via:
```
pip install detectron2@git+https://github.com/facebookresearch/detectron2.git
```

If there is an error during [detectron2](https://github.com/facebookresearch/detectron2) installation, see [here](https://detectron2.readthedocs.io/en/latest/tutorials/install.html).

Download [VBench_full_info.json](https://github.com/Vchitect/VBench/blob/master/vbench/VBench_full_info.json) to your running directory to read the benchmark prompt suites.

### Install with git clone

    git clone https://github.com/Vchitect/VBench.git
    pip install -r VBench/requirements.txt
    pip install VBench

If there is an error during [detectron2](https://github.com/facebookresearch/detectron2) installation, see [here](https://detectron2.readthedocs.io/en/latest/tutorials/install.html).

## Usage
Use VBench to evaluate videos, and video generative models.
- A Side Note: VBench is designed for evaluating different models on a standard benchmark. Therefore, by default, we enforce evaluation on the **standard VBench prompt lists** to ensure **fair comparisons** among different video generation models. That's also why we give warnings when a required video is not found. This is done via defining the set of prompts in [VBench_full_info.json](https://github.com/Vchitect/VBench/blob/master/vbench/VBench_full_info.json).
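
The role of `VBench_full_info.json` can be illustrated with a minimal sketch that groups the standard prompts by dimension. It assumes (this is not confirmed in this README) that the file is a JSON list whose entries carry a `prompt_en` string and a `dimension` list. The helper name `prompts_by_dimension` is hypothetical and is not part of the `vbench` package. Under that assumption, `prompts_by_dimension("VBench_full_info.json").get("human_action", [])` would return the prompts listed under `human_action`.

```python
import json
from collections import defaultdict


def prompts_by_dimension(info_path):
    """Group prompts by evaluation dimension.

    Assumed schema (hypothetical, check it against your copy of the file):
    a JSON list of objects such as
    {"prompt_en": "a person dancing", "dimension": ["human_action"]}.
    """
    with open(info_path, "r", encoding="utf-8") as f:
        entries = json.load(f)

    grouped = defaultdict(list)
    for entry in entries:
        # One prompt may be listed under several dimensions.
        for dimension in entry.get("dimension", []):
            grouped[dimension].append(entry["prompt_en"])
    return dict(grouped)
```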
However, we understand that many users would like to use VBench to evaluate their own videos, or videos generated from prompts that do not belong to the VBench Prompt Suite, so we also added the function of **Evaluating Your Own Videos**. Simply turn the `custom_input` flag on, and you can evaluate your own videos.

### **[New]** Evaluate Your Own Videos
We support evaluating any video. Simply provide the path to the video file, or the path to the folder that contains your videos. There is no requirement on the videos' names.
- Note: We support customized videos / prompts for the following dimensions: `'subject_consistency', 'background_consistency', 'motion_smoothness', 'dynamic_degree', 'aesthetic_quality', 'imaging_quality'`

To evaluate videos with custom input prompts, run our script with the `custom_input` flag on:
```
python evaluate.py \
    --dimension $DIMENSION \
    --videos_path /path/to/folder_or_video/ \
    --custom_input
```
Alternatively, you can use our command:
```
vbench evaluate \
    --dimension $DIMENSION \
    --videos_path /path/to/folder_or_video/ \
    --custom_input
```

### Evaluation on the Standard Prompt Suite of VBench
##### command line
```bash
vbench evaluate --videos_path $VIDEO_PATH --dimension $DIMENSION
```
For example:
```bash
vbench evaluate --videos_path "sampled_videos/lavie/human_action" --dimension "human_action"
```
##### python
```python
from vbench import VBench
my_VBench = VBench(device,