---
title: COVER
emoji: π
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 4.36.1
python_version: 3.9
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# [CVPRW 2024] [COVER](https://openaccess.thecvf.com/content/CVPR2024W/AI4Streaming/papers/He_COVER_A_Comprehensive_Video_Quality_Evaluator_CVPRW_2024_paper.pdf): A Comprehensive Video Quality Evaluator
**Winner solution for the [Video Quality Assessment Challenge](https://codalab.lisn.upsaclay.fr/competitions/17340) at the 1st [AIS 2024](https://ai4streaming-workshop.github.io/) workshop @ CVPR 2024**
Official code, demo, and weights for the [CVPRW 2024] paper *"COVER: A Comprehensive Video Quality Evaluator"* ([PDF](https://openaccess.thecvf.com/content/CVPR2024W/AI4Streaming/papers/He_COVER_A_Comprehensive_Video_Quality_Evaluator_CVPRW_2024_paper.pdf)).
- 29 May, 2024: We create a space for [COVER](https://huggingface.co/spaces/Sorakado/COVER) on Hugging Face.
- 09 May, 2024: We upload the code of [COVER](https://github.com/vztu/COVER).
- 12 Apr, 2024: COVER has been accepted by the CVPR 2024 Workshop.
![visitors](https://visitor-badge.laobi.icu/badge?page_id=vztu/COVER) [![](https://img.shields.io/github/stars/vztu/COVER)](https://github.com/vztu/COVER)
[![State-of-the-Art](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/vztu/COVER)
<a href="https://huggingface.co/spaces/Sorakado/COVER"><img src="./figs/deploy-on-spaces-sm-dark.svg" alt="hugging face log"></a>
## Introduction
- Existing UGC VQA models strive to quantify quality degradation mainly from the technical aspect, with a few considering aesthetic or semantic aspects, but no model has addressed all three aspects simultaneously.
- The demand for high-resolution and high-frame-rate videos on social media platforms presents new challenges for VQA tasks, as they must ensure effectiveness while also meeting real-time requirements.
## The Proposed COVER
*This inspires us to develop a comprehensive and efficient model for the UGC VQA task.*
![Fig](./figs/approach.jpg)
### COVER
Results comparison:
| Dataset: YT-UGC | SROCC | KROCC | PLCC | RMSE | Run Time |
| ---- | ---- | ---- | ---- | ---- | ---- |
| [**COVER**](https://github.com/vztu/COVER/release/Model/COVER.pth) | 0.9143 | 0.7413 | 0.9122 | 0.2519 | 79.37ms |
| TVQE (Wang *et al.*, CVPRW 2024) | 0.9150 | 0.7410 | 0.9182 | ------- | 705.30ms |
| Q-Align (Zhang *et al.*, CVPRW 2024) | 0.9080 | 0.7340 | 0.9120 | ------- | 1707.06ms |
| SimpleVQA+ (Sun *et al.*, CVPRW 2024) | 0.9060 | 0.7280 | 0.9110 | ------- | 245.51ms |
The run time is measured on an NVIDIA A100 GPU. A clip of 30 frames at 4K resolution (3840×2160) is used as input.
## Install
The repository can be installed via the following commands:
```shell
git clone https://github.com/vztu/COVER
cd COVER
pip install -e .
mkdir pretrained_weights
cd pretrained_weights
wget https://github.com/vztu/COVER/release/Model/COVER.pth
cd ..
```
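If you want to verify the download before running anything else, the snippet below is a minimal sanity check. It is a sketch only: it assumes the checkpoint was written with `torch.save` and makes no assumption about its internal key layout beyond an optional `state_dict` wrapper.
```python
# Optional sanity check: confirm the downloaded checkpoint can be loaded.
import torch

ckpt = torch.load("pretrained_weights/COVER.pth", map_location="cpu")
# The checkpoint may or may not wrap its parameters in a "state_dict" key.
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"Loaded checkpoint with {len(state)} top-level entries")
```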
## Evaluation: Judge the Quality of Any Video
### Try on Demos
You can run a single command to judge the quality of the demo videos in comparison with videos in VQA datasets.
```shell
python evaluate_one_video.py -v ./demo/video_1.mp4
```
or
```shell
python evaluate_one_video.py -v ./demo/video_2.mp4
```
Or choose any video you like to predict its quality:
```shell
python evaluate_one_video.py -v $YOUR_SPECIFIED_VIDEO_PATH$
```
### Outputs
The script can directly score the video's overall quality (considering all perspectives).
```shell
python evaluate_one_video.py -v $YOUR_SPECIFIED_VIDEO_PATH$
```
The final output score is the sum of the scores from all perspectives (semantic, aesthetic, and technical).
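For reference, that combination step can be sketched as a plain sum over the per-perspective outputs. The branch names below follow the three aspects discussed in the Introduction and are assumptions; the actual variable names inside `evaluate_one_video.py` may differ.
```python
# Illustrative sketch only: combining per-perspective scores into the overall
# COVER score. Branch names (semantic, aesthetic, technical) are assumptions.
def overall_score(semantic: float, aesthetic: float, technical: float) -> float:
    return semantic + aesthetic + technical

# Hypothetical per-branch outputs for one video:
print(overall_score(0.12, 0.05, 0.21))  # -> 0.38
```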
## Evaluate on an Existing Video Dataset
```shell
python evaluate_one_dataset.py -in $YOUR_SPECIFIED_DIR$ -out $OUTPUT_CSV_PATH$
```
## Evaluate on a Set of Unlabelled Videos
```shell
python evaluate_a_set_of_videos.py -in $YOUR_SPECIFIED_DIR$ -out $OUTPUT_CSV_PATH$
```
The results are stored as `.csv` files under `cover_predictions` in your `OUTPUT_CSV_PATH`.
Please feel free to use COVER to pseudo-label video datasets that lack quality annotations, as sketched below.
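As a rough sketch of how those pseudo-labels could be consumed afterwards, the snippet below reads one of the generated CSV files with pandas. The file name is a placeholder and the column layout is an assumption; inspect the files in `cover_predictions` for the actual schema.
```python
# Sketch: turning COVER's per-video predictions into pseudo-labels.
# "cover_predictions/your_videos.csv" is a placeholder path; the code only
# assumes the first column identifies the video and the last column is a score.
import pandas as pd

preds = pd.read_csv("cover_predictions/your_videos.csv")
print(preds.head())

pseudo_labels = dict(zip(preds.iloc[:, 0], preds.iloc[:, -1]))  # video -> score
```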
## Data Preparation
We have already converted the labels for the most popular Blind Video Quality Assessment datasets; the download links for the **videos** are as follows:
:book: LSVQ: [Github](https://github.com/baidut/PatchVQ)
:book: KoNViD-1k: [Official Site](http://database.mmsp-kn.de/konvid-1k-database.html)
:book: LIVE-VQC: [Official Site](http://live.ece.utexas.edu/research/LIVEVQC)
:book: YouTube-UGC: [Official Site](https://media.withyoutube.com)
*(Please contact the original authors if the download links are unavailable.)*
After downloading, put the videos under `../datasets` (or any directory you like, but remember to update `data_prefix` accordingly in the [config file](cover.yml)).
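If it helps, here is a small helper that walks `cover.yml` and prints every `data_prefix` entry so you can see what needs updating. This is a sketch: it only assumes the config is plain YAML readable by PyYAML and searches for the key recursively rather than assuming any particular nesting.
```python
# Sketch: list every data_prefix entry in cover.yml so you know what to edit.
# Assumes PyYAML is installed (pip install pyyaml).
import yaml

with open("cover.yml") as f:
    cfg = yaml.safe_load(f)

def find_prefixes(node, path="cover.yml"):
    """Recursively print every 'data_prefix' key and its current value."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "data_prefix":
                print(f"{path} -> data_prefix: {value}")
            find_prefixes(value, f"{path}.{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            find_prefixes(value, f"{path}[{i}]")

find_prefixes(cfg)
```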
## Training: Adapt COVER to Your Video Quality Dataset!
Now you can employ ***head-only/end-to-end transfer*** of COVER to get dataset-specific VQA prediction heads.
```shell
python transfer_learning.py -t $YOUR_SPECIFIED_DATASET_NAME$
```
For the existing public datasets, use the respective commands:
- `python transfer_learning.py -t val-kv1k` for KoNViD-1k.
- `python transfer_learning.py -t val-ytugc` for YouTube-UGC.
- `python transfer_learning.py -t val-cvd2014` for CVD2014.
- `python transfer_learning.py -t val-livevqc` for LIVE-VQC.
As the backbone is not updated in this mode, the checkpoint saving process only saves the regression heads. To use such a checkpoint, load the official weights [COVER.pth](https://github.com/vztu/COVER/release/Model/COVER.pth) and replace the head weights with the newly fine-tuned ones; a sketch is given below.
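A minimal sketch of that merging step follows. The file names are placeholders, and the optional `state_dict` unwrapping is an assumption about how the checkpoints are stored; check the files produced by `transfer_learning.py` for the real layout.
```python
# Sketch: combine a head-only checkpoint from transfer_learning.py with the
# official backbone weights. File names are placeholders.
import torch

official = torch.load("pretrained_weights/COVER.pth", map_location="cpu")
heads = torch.load("your_dataset_head_checkpoint.pth", map_location="cpu")

# Assumption: both checkpoints are plain dicts, possibly wrapped in "state_dict".
official_sd = official.get("state_dict", official)
head_sd = heads.get("state_dict", heads)

# Overwrite only the regression-head parameters; everything else stays official.
official_sd.update(head_sd)
torch.save(official_sd, "pretrained_weights/COVER_with_custom_heads.pth")
```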
We also support ***end-to-end*** fine-tuning (change `num_epochs: 0` to `num_epochs: 15` in `./cover.yml`). This requires more GPU memory and more storage for the saved weights (full parameters), but yields the best accuracy.
## Visualization
### WandB Training and Evaluation Curves
You can monitor your training and evaluation curves on WandB!
## Acknowledgement
Thanks to every participant of the subjective studies!
## Citation
Should you find our work interesting and wish to cite it, please feel free to add these references!
```bibtex
%AIS 2024 VQA challenge
@article{conde2024ais,
title={AIS 2024 challenge on video quality assessment of user-generated content: Methods and results},
author={Conde, Marcos V and Zadtootaghaj, Saman and Barman, Nabajeet and Timofte, Radu and He, Chenlong and Zheng, Qi and Zhu, Ruoxi and Tu, Zhengzhong and Wang, Haiqiang and Chen, Xiangguang and others},
journal={arXiv preprint arXiv:2404.16205},
year={2024}
}
%cover
@inproceedings{cover2024cpvrws,
title={COVER: A comprehensive video quality evaluator},
author={He, Chenlong and Zheng, Qi and Zhu, Ruoxi and Zeng, Xiaoyang and Fan, Yibo and Tu, Zhengzhong},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
year={2024}
}
```