VideoScore

DongfuJiang committed 20 days ago

Commit cb5c689 (1 parent: 03b5d97)

update

Files changed (1)

app_regression.py CHANGED

@@ -37,26 +37,24 @@ for item in hd_examples:
 examples = hd_examples + examples
 VIDEO_EVAL_PROMPT = """
-Suppose you are an expert in judging and evaluating the quality of AI-generated videos,
-please watch the following frames of a given video and see the text prompt for generating the video,
-then give scores from 7 different dimensions:
 (1) visual quality: the quality of the video in terms of clearness, resolution, brightness, and color
-(2) object consistency, the consistency of objects or humans in video
 (3) dynamic degree, the degree of dynamic changes
-(4) motion smoothness, the smoothness of motion or movements
-(5) text-to-video alignment, the alignment between the text prompt and the video content
-(6) factual consistency, the consistency of the video content with the common-sense and factual knowledge
-for each dimension, output a float number from 1.0 to 4.0,
-the higher the number is, the better the video performs in that sub-score,
-the lowest 1.0 means Bad, the highest 4.0 means Perfect/Real (the video is like a real video)
 Here is an output example:
-visual quality: 3.2
-object consistency: 2.7
-dynamic degree: 4.0
-motion smoothness: 1.6
-text-to-video alignment: 2.3
-factual consistency: 1.8
 For this video, the text prompt is "{text_prompt}",
 all the frames of video are as follows:

 examples = hd_examples + examples
 VIDEO_EVAL_PROMPT = """
+Suppose you are an expert in judging and evaluating the quality of AI-generated videos,
+please watch the following frames of a given video and see the text prompt for generating the video,
+then give scores from 5 different dimensions:
 (1) visual quality: the quality of the video in terms of clearness, resolution, brightness, and color
+(2) temporal consistency, the consistency of objects or humans in video
 (3) dynamic degree, the degree of dynamic changes
+(4) text-to-video alignment, the alignment between the text prompt and the video content
+(5) factual consistency, the consistency of the video content with the common-sense and factual knowledge
+For each dimension, output a number from [1,2,3,4],
+in which ‘1’ means ‘Bad’, ‘2’ means ‘Average’, ‘3’ means ‘Good’,
+'4' means ‘Real’ or ‘Perfect’ (the video is like a real video)
 Here is an output example:
+visual quality: 4
+temporal consistency: 4
+dynamic degree: 3
+text-to-video alignment: 1
+factual consistency: 2
 For this video, the text prompt is "{text_prompt}",
 all the frames of video are as follows:
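
The updated prompt asks for one integer from 1 to 4 per dimension, written as `name: score` lines. Below is a minimal sketch of how such a reply could be parsed, assuming the model follows the example format shown in the prompt. `DIMENSIONS` and `parse_scores` are hypothetical names used for illustration; they are not taken from app_regression.py.

```python
import re

# Dimension names as listed in the updated prompt of this commit.
DIMENSIONS = [
    "visual quality",
    "temporal consistency",
    "dynamic degree",
    "text-to-video alignment",
    "factual consistency",
]


def parse_scores(output: str) -> dict:
    """Hypothetical helper: extract 'name: score' pairs from a model reply.

    Only single integers 1-4 are accepted. Values such as '3.2' (the old
    float format) or '42' are skipped, so the dimension is left out of
    the result instead of being silently truncated.
    """
    scores = {}
    for name in DIMENSIONS:
        pattern = rf"{re.escape(name)}\s*:\s*([1-4])(?!\.?\d)"
        match = re.search(pattern, output, re.IGNORECASE)
        if match:
            scores[name] = int(match.group(1))
    return scores


# The output example from the new prompt.
example = """visual quality: 4
temporal consistency: 4
dynamic degree: 3
text-to-video alignment: 1
factual consistency: 2"""

scores = parse_scores(example)
```

A dimension missing from the returned dict signals that the reply did not match the expected format for it, which a caller can treat as a parse failure.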