Spaces: ali-vilab/IDEA-Bench-Arena

JasiLiang committed on Dec 27, 2024

Commit fca928b (verified) · 1 parent: a248282

upload dataset

This view is limited to 50 files because it contains too many changes.

Files changed (50)

dataset/3d_effect_generation_single_reference_0001/auto_eval.jsonl +6 -0
dataset/3d_effect_generation_single_reference_0001/eval.json +34 -0
dataset/3d_effect_generation_single_reference_0001/images.txt +1 -0
dataset/3d_effect_generation_single_reference_0001/instruction.txt +1 -0
dataset/3d_effect_generation_single_reference_0001/meta.json +10 -0
dataset/3d_effect_generation_three-view_reference_0002/auto_eval.jsonl +6 -0
dataset/3d_effect_generation_three-view_reference_0002/eval.json +34 -0
dataset/3d_effect_generation_three-view_reference_0002/images.txt +3 -0
dataset/3d_effect_generation_three-view_reference_0002/instruction.txt +1 -0
dataset/3d_effect_generation_three-view_reference_0002/meta.json +10 -0
dataset/animal_attribute_editing_hair_editing_0002/eval.json +34 -0
dataset/animal_attribute_editing_hair_editing_0002/images.txt +1 -0
dataset/animal_attribute_editing_hair_editing_0002/instruction.txt +1 -0
dataset/animal_attribute_editing_hair_editing_0002/meta.json +10 -0
dataset/animal_attribute_editing_posture_editing_0001/eval.json +34 -0
dataset/animal_attribute_editing_posture_editing_0001/images.txt +1 -0
dataset/animal_attribute_editing_posture_editing_0001/instruction.txt +1 -0
dataset/animal_attribute_editing_posture_editing_0001/meta.json +10 -0
dataset/animal_attribute_editing_species_editing_0002/eval.json +34 -0
dataset/animal_attribute_editing_species_editing_0002/images.txt +1 -0
dataset/animal_attribute_editing_species_editing_0002/instruction.txt +1 -0
dataset/animal_attribute_editing_species_editing_0002/meta.json +10 -0
dataset/animal_growth_process_generation_with_reference_0001/eval.json +34 -0
dataset/animal_growth_process_generation_with_reference_0001/images.txt +1 -0
dataset/animal_growth_process_generation_with_reference_0001/instruction.txt +1 -0
dataset/animal_growth_process_generation_with_reference_0001/meta.json +10 -0
dataset/brand_merchandise_generation_0002/auto_eval.jsonl +6 -0
dataset/brand_merchandise_generation_0002/eval.json +34 -0
dataset/brand_merchandise_generation_0002/images.txt +1 -0
dataset/brand_merchandise_generation_0002/instruction.txt +1 -0
dataset/brand_merchandise_generation_0002/meta.json +10 -0
dataset/brand_merchandise_generation_0003/auto_eval.jsonl +6 -0
dataset/brand_merchandise_generation_0003/eval.json +34 -0
dataset/brand_merchandise_generation_0003/images.txt +1 -0
dataset/brand_merchandise_generation_0003/instruction.txt +1 -0
dataset/brand_merchandise_generation_0003/meta.json +10 -0
dataset/business_card_generation_0002/.DS_Store +0 -0
dataset/business_card_generation_0002/eval.json +34 -0
dataset/business_card_generation_0002/images.txt +0 -0
dataset/business_card_generation_0002/instruction.txt +1 -0
dataset/business_card_generation_0002/meta.json +10 -0
dataset/business_card_generation_0003/eval.json +34 -0
dataset/business_card_generation_0003/images.txt +0 -0
dataset/business_card_generation_0003/instruction.txt +1 -0
dataset/business_card_generation_0003/meta.json +10 -0
dataset/childrens_book_generation_0003/eval.json +34 -0
dataset/childrens_book_generation_0003/images.txt +0 -0
dataset/childrens_book_generation_0003/instruction.txt +1 -0
dataset/childrens_book_generation_0003/meta.json +10 -0
dataset/childrens_book_generation_0004/eval.json +34 -0
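
The paths listed above appear to follow a `dataset/<task>_<case id>/<file>` layout, where the trailing four digits match the `case_id` field seen in `meta.json`. A minimal Python sketch of splitting such a path, assuming that layout holds for every case (the `CaseDir` type and `parse_case_path` helper are hypothetical names, not part of the repository):

```python
import re
from typing import NamedTuple


class CaseDir(NamedTuple):
    task_slug: str  # e.g. "brand_merchandise_generation"
    case_id: str    # e.g. "0002"


# Assumed layout: dataset/<task_slug>_<4-digit case id>/<file name>
_CASE_RE = re.compile(r"^dataset/(?P<task>.+)_(?P<case>\d{4})/[^/]+$")


def parse_case_path(path: str) -> CaseDir:
    """Split a changed-file path into its task slug and case id."""
    match = _CASE_RE.match(path)
    if match is None:
        raise ValueError(f"unexpected path layout: {path!r}")
    return CaseDir(match.group("task"), match.group("case"))
```

Grouping the changed files by the returned `CaseDir` would give one bucket per benchmark case.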

dataset/3d_effect_generation_single_reference_0001/auto_eval.jsonl ADDED

	@@ -0,0 +1,6 @@

+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two images, with the top image as the reference picture for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided design sketch and text requirements.\nThe text requirement is:\n\"Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.\"\nYour review question is:\nDoes the generated 3D rendering accurately retain every detail of the shapes and outlines of the line drawing? 0 points: The shapes and outlines in the 3D rendering show noticeable deviations or distortions compared to the line drawing. 1 point: The 3D rendering accurately preserves the shapes and outlines from the line drawing in every detail.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two images, with the top image as the reference picture for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided design sketch and text requirements.\nThe text requirement is:\n\"Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.\"\nYour review question is:\nDoes the generated 3D rendering maintain the overall structure and proportions of the line drawing, ensuring consistency between the line drawing and the generated image? 0 points: The object's structure in the 3D rendering has been noticeably altered, with unbalanced proportions. 1 point: The structure and proportions of the object in the 3D rendering are consistent with the line drawing and are well-balanced.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two images, with the top image as the reference picture for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided design sketch and text requirements.\nThe text requirement is:\n\"Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.\"\nYour review question is:\nDoes the generated image effectively convey a sense of depth and three-dimensionality, presenting as a realistic 3D rendered image? 0 points: The image lacks a convincing 3D appearance, with minimal or no sense of depth. Key elements like lighting, shadows, and material effects are either absent or insufficiently applied, resulting in a flat or two-dimensional look. 1 point: The image successfully conveys a realistic 3D appearance, with well-applied lighting, shadows, and material effects that contribute to a sense of depth and dimensionality. The rendering effectively represents a believable 3D object with spatial volume.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two images, with the top image as the reference picture for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided design sketch and text requirements.\nThe text requirement is:\n\"Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.\"\nYour review question is:\nIf there are multiple angles, does the generated 3D rendering maintain consistency of the object across multiple angles in every detail? 0 points: The object appears inconsistent across different angles, with noticeable discrepancies in shape, proportions, or details between views. 1 point: There are no multiple angles, or the object remains consistent across all angles, with uniform shape, proportions, and details, creating a coherent and accurate representation from every perspective.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two images, with the top image as the reference picture for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided design sketch and text requirements.\nThe text requirement is:\n\"Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.\"\nYour review question is:\nDoes the rendering display realistic and well-defined shadows that enhance the 3D appearance of the object? 0 points: Shadows are poorly rendered, with inconsistent or unrealistic positioning, depth, or softness. 1 point: Shadows are accurately rendered, with appropriate depth, softness, and alignment to the light source.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two images, with the top image as the reference picture for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided design sketch and text requirements.\nThe text requirement is:\n\"Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.\"\nYour review question is:\nDoes the generated 3D rendering have an overall aesthetic appeal, with high-quality visual detailing and a cohesive style that meets professional standards? 0 points: The 3D rendering lacks aesthetic appeal, with poor visual detailing and an incohesive style. 1 point: The 3D rendering demonstrates strong aesthetic appeal, with high-quality visual detailing and a cohesive style.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
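
Each line of `auto_eval.jsonl` is one JSON object with `input_images`, `output_images`, and a `question` prompt that asks the evaluator to return `{'score': int, 'reason': str}`. The following Python sketch shows one way such a file and an evaluator reply could be parsed. It assumes the leading `+` is only a diff marker (absent from the file itself), that replies arrive as strict JSON, and that scores are limited to 0 or 1 as the prompts describe; `AutoEvalItem`, `load_auto_eval`, and `parse_evaluation` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AutoEvalItem:
    input_images: List[str]
    output_images: List[str]
    question: str


def load_auto_eval(text: str) -> List[AutoEvalItem]:
    """Parse JSONL text: one JSON object per non-empty line."""
    items = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        items.append(
            AutoEvalItem(
                record["input_images"],
                record["output_images"],
                record["question"],
            )
        )
    return items


def parse_evaluation(reply: str) -> Tuple[int, str]:
    """Validate a reply shaped like {"score": int, "reason": str}."""
    data = json.loads(reply)
    score, reason = data["score"], data["reason"]
    # bool is a subclass of int in Python, so check the exact type.
    if type(score) is not int or score not in (0, 1):
        raise ValueError(f"score must be 0 or 1, got {score!r}")
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")
    return score, reason
```

Rejecting anything other than 0 or 1 keeps a malformed evaluator reply from silently skewing a case's total.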

dataset/3d_effect_generation_single_reference_0001/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the generated 3D rendering accurately preserve every detail of the line drawing, including shapes and contours?",
+            "0_point_standard": "There are noticeable deviations or distortions in the shapes and contours in the 3D rendering compared to the line drawing.",
+            "1_point_standard": "The 3D rendering accurately preserves every detail of the shapes and contours in the line drawing."
+        },
+        {
+            "question": "Does the generated 3D rendering maintain the overall structure and proportions of the line drawing, ensuring consistency between the line drawing and the generated image?",
+            "0_point_standard": "There are significant changes in the structure of objects in the 3D rendering, with imbalanced proportions.",
+            "1_point_standard": "The structure and proportions of objects in the 3D rendering are consistent with the line drawing, and the proportions are balanced."
+        },
+        {
+            "question": "Does the generated image effectively convey depth and a sense of three-dimensionality, presenting a realistic 3D rendering effect?",
+            "0_point_standard": "The image lacks a convincing 3D appearance, with almost no sense of depth. Key elements such as lighting and material effects are missing or insufficiently applied, making the image appear flat.",
+            "1_point_standard": "The image successfully conveys a realistic 3D appearance, with well-applied lighting and material effects that enhance depth and three-dimensionality, presenting credible spatial volume."
+        },
+        {
+            "question": "If the generated 3D rendering includes multiple angles, does it maintain consistency of the object across angles, ensuring no deviation in details?",
+            "0_point_standard": "The object appears inconsistent across different angles, with noticeable differences in shape, proportion, or detail between perspectives.",
+            "1_point_standard": "There are no multiple angles, or the object maintains consistency across all angles, with uniform shape, proportion, and details, ensuring accurate consistency in each perspective."
+        },
+        {
+            "question": "Does the rendering display realistic and clear shadow effects, enhancing the 3D appearance of the object?",
+            "0_point_standard": "The shadow effects are poorly rendered, with inaccurate or unrealistic position, depth, or softness.",
+            "1_point_standard": "The shadow effects are accurately rendered, with appropriate depth, softness, and alignment with the light source."
+        },
+        {
+            "question": "Does the generated 3D rendering have an overall aesthetic appeal, with high-quality visual details and a unified style that meets professional standards?",
+            "0_point_standard": "The 3D rendering lacks aesthetic appeal, with poor visual details and an inconsistent style.",
+            "1_point_standard": "The 3D rendering exhibits strong aesthetic appeal, with high-quality visual details and a unified style."
+        }
+    ]
+}
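
Each entry in `eval.json` pairs a question with a 0-point and a 1-point standard, which is the same shape as the "question, 0 points, 1 point" text embedded in the `auto_eval.jsonl` prompts. The Python sketch below renders a rubric into that shape and combines binary scores. Both the exact rendering and the averaging rule are assumptions made for illustration; the commit does not state how scores are aggregated, and `rubric_lines` and `case_score` are hypothetical helpers.

```python
import json
from typing import List


def rubric_lines(eval_json: str) -> List[str]:
    """Render each rubric entry as 'question 0 points: ... 1 point: ...'."""
    rubric = json.loads(eval_json)
    return [
        f"{q['question']} 0 points: {q['0_point_standard']} "
        f"1 point: {q['1_point_standard']}"
        for q in rubric["questions"]
    ]


def case_score(scores: List[int]) -> float:
    """Fraction of questions that earned 1 point (assumed aggregation)."""
    if not scores:
        raise ValueError("no scores to aggregate")
    return sum(scores) / len(scores)
```

With six questions per case, as in the file above, `case_score` would return a value in sixths between 0.0 and 1.0.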

dataset/3d_effect_generation_single_reference_0001/images.txt ADDED

	@@ -0,0 +1 @@


+https://img.alicdn.com/imgextra/i3/O1CN01sdobeF1zREcpWHb46_!!6000000006710-0-tps-3508-2192.jpg

dataset/3d_effect_generation_single_reference_0001/instruction.txt ADDED

	@@ -0,0 +1 @@

+ Please generate a corresponding 3D rendering based on the provided design sketch. The task objective is to infer a plausible 3D form from the product design sketch, displaying the appearance, structure, and functional details of the product. The model should integrate the different perspectives shown in the sketch to create a complete and consistent 3D model, rendering realistic lighting, shadows, and material effects. Ensure that the generated 3D image adheres to the proportions and design intentions depicted in the sketch, accurately showcasing the product from all angles and details, resulting in a high-quality 3D rendering with depth and realism.

dataset/3d_effect_generation_single_reference_0001/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "3D effect generation with single reference",
+    "num_of_cases": 2,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0072",
+    "output_image_count": 1,
+    "case_id": "0001"
+}
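
The flags in `meta.json` seem to describe what the rest of the case directory contains: a case with `image_reference` set has URLs in `images.txt`, and `multi_image_reference` marks cases with more than one. A small Python consistency check can make those relationships explicit. The rules below are inferred from the cases visible in this diff rather than documented anywhere, and `check_meta` is a hypothetical helper.

```python
import json
from typing import List


def check_meta(meta_json: str, image_urls: List[str]) -> List[str]:
    """Return a list of inconsistencies between meta.json and images.txt."""
    meta = json.loads(meta_json)
    count = len(image_urls)
    problems = []

    # Assumed rule: no reference images means an empty images.txt.
    if not meta["image_reference"] and count != 0:
        problems.append("image_reference is false but images.txt has URLs")
    # Assumed rule: a single-reference case lists exactly one URL.
    if meta["image_reference"] and not meta["multi_image_reference"] and count != 1:
        problems.append("single-reference case should list exactly one URL")
    # Assumed rule: a multi-reference case lists at least two URLs.
    if meta["multi_image_reference"] and count < 2:
        problems.append("multi_image_reference is true but fewer than two URLs")
    # Assumed rule: multi_image_output mirrors output_image_count > 1.
    if meta["multi_image_output"] != (meta["output_image_count"] > 1):
        problems.append("multi_image_output disagrees with output_image_count")
    return problems
```

An empty return value means the case passed every assumed rule; it does not prove the metadata is correct.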

dataset/3d_effect_generation_three-view_reference_0002/auto_eval.jsonl ADDED

	@@ -0,0 +1,6 @@

+{"input_images": ["0001.jpg", "0002.jpg", "0003.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference three-view pictures for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided three-view pictures and text requirements.\nThe text requirement is:\n\"The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.\"\nYour review question is:\nLogical Consistency with Input Reference: 0 points: The output 3D rendering lacks logical consistency with the input three-view sketches, showing significant discrepancies in shape or structure that contradict the references. 1 point: The output 3D rendering logically aligns with the input three-view sketches, maintaining structural coherence and faithfully representing the object's intended form.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg", "0002.jpg", "0003.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference three-view pictures for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided three-view pictures and text requirements.\nThe text requirement is:\n\"The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.\"\nYour review question is:\nCompleteness of Detail Representation: 0 points: The 3D rendering lacks certain intricate details or specific design elements present in the input sketches, resulting in a simplified or incomplete representation. 1 point: The 3D rendering fully captures all details and intricate features from the input sketches, providing a complete and precise depiction of the object.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg", "0002.jpg", "0003.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference three-view pictures for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided three-view pictures and text requirements.\nThe text requirement is:\n\"The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.\"\nYour review question is:\nConsistency in Material and Texture Representation: 0 points: The 3D rendering fails to accurately represent materials or textures specified or implied in the input sketches, with inconsistencies that detract from realism. 1 point: The 3D rendering maintains consistent and realistic material and texture representations across all views, effectively enhancing the appearance and believability of the object.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg", "0002.jpg", "0003.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference three-view pictures for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided three-view pictures and text requirements.\nThe text requirement is:\n\"The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.\"\nYour review question is:\nAdherence to Text Description Instructions: 0 points: The model-generated 3D rendering does not fulfill the specific requirements outlined in the text description, failing to incorporate directed changes or enhancements. 1 point: The model effectively incorporates and adheres to the specific instructions provided in the text description, accurately reflecting all requested modifications or features.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg", "0002.jpg", "0003.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference three-view pictures for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided three-view pictures and text requirements.\nThe text requirement is:\n\"The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.\"\nYour review question is:\nIntegration of Realistic Lighting and Shadows: 0 points: The 3D rendering shows unrealistic or poorly implemented lighting and shadow effects, with unnatural or inconsistent lighting that detracts from the image’s overall quality. 1 point: The lighting and shadows are realistically integrated into the rendering, creating a natural sense of depth and enhancing the three-dimensional effect in a way that complements the object’s design.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg", "0002.jpg", "0003.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference three-view pictures for the design task and the bottom image as the response provided by a student. The task objective is to generate a realistic 3D rendering based on the provided three-view pictures and text requirements.\nThe text requirement is:\n\"The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.\"\nYour review question is:\nAesthetic Cohesion and Visual Appeal: 0 points: The 3D rendering lacks visual harmony, with elements that appear disjointed or poorly composed. The design may have awkward proportions, distracting artifacts, or an inconsistent style, resulting in a rendering that feels visually unbalanced or unpolished. 1 point: The 3D rendering demonstrates strong aesthetic cohesion, with a harmonious composition and well-balanced proportions. The style is consistent throughout, and the image has a polished, visually appealing look that aligns with professional design standards.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}

dataset/3d_effect_generation_three-view_reference_0002/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Consistency with the input reference image:",
+            "0_point_standard": "The output 3D rendering lacks logical consistency with the input three-view sketch, showing noticeable discrepancies in shape or structure, deviating from the reference image.",
+            "1_point_standard": "The output 3D rendering logically aligns with the input three-view sketch, maintaining structural coherence and faithfully representing the intended form of the object."
+        },
+        {
+            "question": "Completeness of detail representation:",
+            "0_point_standard": "The 3D rendering misses certain fine details or specific design elements from the input sketch, resulting in a simplified or incomplete representation.",
+            "1_point_standard": "The 3D rendering fully captures all details and refined features from the input sketch, providing a complete and accurate representation of the object."
+        },
+        {
+            "question": "Consistency of material and texture representation:",
+            "0_point_standard": "The 3D rendering fails to accurately depict the materials or textures specified or implied by the input sketch, showing inconsistencies that affect realism.",
+            "1_point_standard": "The 3D rendering maintains consistent and realistic material and texture representation across all views, effectively enhancing the object's appearance and credibility."
+        },
+        {
+            "question": "Adherence to text description instructions:",
+            "0_point_standard": "The 3D rendering generated by the model fails to meet specific requirements from the text description, omitting changes or enhancements mentioned in the instructions.",
+            "1_point_standard": "The model effectively adheres to and incorporates specific instructions provided in the text description, accurately reflecting all required modifications or features."
+        },
+        {
+            "question": "Realistic integration of lighting and shadows:",
+            "0_point_standard": "The lighting and shadow effects in the 3D rendering are unrealistic or poorly handled, with unnatural or inconsistent light sources affecting the overall image quality.",
+            "1_point_standard": "The lighting and shadow effects are realistically integrated into the rendering, creating a natural sense of depth and enhancing the three-dimensional effect, complementing the object design."
+        },
+        {
+            "question": "Aesthetic unity and visual appeal:",
+            "0_point_standard": "The 3D rendering lacks visual harmony, with elements appearing disjointed or poorly combined. The design may show disproportionate elements, distracting artifacts, or inconsistent styles, leading to an image that seems unbalanced or unrefined visually.",
+            "1_point_standard": "The 3D rendering exhibits strong aesthetic unity, with harmonious combinations and balanced proportions. The style is consistent, and the image has a refined, appealing appearance, meeting professional design standards."
+        }
+    ]
+}

dataset/3d_effect_generation_three-view_reference_0002/images.txt ADDED

	@@ -0,0 +1,3 @@

+https://img.alicdn.com/imgextra/i4/O1CN01lTEych1PcO6MkHln5_!!6000000001861-0-tps-909-459.jpg
+https://img.alicdn.com/imgextra/i3/O1CN015eFw5l1pzuSBLBnzQ_!!6000000005432-0-tps-384-479.jpg
+https://img.alicdn.com/imgextra/i2/O1CN01hAdyLY1KIcw1en0V8_!!6000000001141-0-tps-939-387.jpg

dataset/3d_effect_generation_three-view_reference_0002/instruction.txt ADDED

	@@ -0,0 +1 @@

+ The picture provided is of a shoe with a cloth upper and a rubber sole. Please generate a 3D rendering of the item based on the three views (front view, side view, top view) and text description of the item provided. The 3D form of the item should be accurately constructed by combining the information from all angles in the three views and refining the details of the material, colour, light and shadow effects of the item. The final 3D rendering should show the complete three-dimensional appearance of the item, conform to the scale and structure provided in the three-view drawings, and be consistent with the features of the item.

dataset/3d_effect_generation_three-view_reference_0002/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "3D effect generation with three-view reference",
+    "num_of_cases": 2,
+    "image_reference": true,
+    "multi_image_reference": true,
+    "multi_image_output": false,
+    "uid": "0073",
+    "output_image_count": 1,
+    "case_id": "0002"
+}

dataset/animal_attribute_editing_hair_editing_0002/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the edited image accurately retain the unchanged portions of the original animal image, ensuring no unintended alterations were made?",
+            "0_point_standard": "The edited image shows noticeable changes in areas that should not have been altered.",
+            "1_point_standard": "The edited image accurately retains the unchanged portions, with no unintended modifications outside the specified areas."
+        },
+        {
+            "question": "Does the edited image maintain the overall content and identity of the original animal, ensuring consistency between the input and output images?",
+            "0_point_standard": "The edited image significantly changes the animal's identity or main features, making it look different from the original image.",
+            "1_point_standard": "The edited image retains the animal's identity and main features, ensuring consistency with the original image."
+        },
+        {
+            "question": "Does the hair editing meet the specific requirements provided in the text description, such as style, color, or length modifications?",
+            "0_point_standard": "The hair editing does not meet the specific requirements described in the text, showing noticeable differences in style, color, or length.",
+            "1_point_standard": "The hair editing accurately meets the specified requirements in the text description."
+        },
+        {
+            "question": "Does the edited image effectively integrate the hair modifications, providing a seamless and natural appearance consistent with the animal's existing features?",
+            "0_point_standard": "The hair modifications appear unnatural or poorly integrated, disrupting the overall appearance of the animal.",
+            "1_point_standard": "The hair modifications are seamlessly integrated, maintaining a natural cohesive appearance with the animal's existing features."
+        },
+        {
+            "question": "Are the texture and details of the edited hair realistic, enhancing the visual quality of the animal image?",
+            "0_point_standard": "The texture and details of the edited hair look unrealistic or lack depth, reducing the visual quality of the image.",
+            "1_point_standard": "The texture and details of the edited hair are realistic, enhancing the overall visual quality of the image."
+        },
+        {
+            "question": "Does the edited image have a strong aesthetic appeal, providing a visually attractive and professionally executed final result?",
+            "0_point_standard": "The edited image lacks aesthetic appeal, appearing unprofessional or visually unpleasing.",
+            "1_point_standard": "The edited image exhibits strong aesthetic appeal, with high visual attractiveness and professional execution."
+        }
+    ]
+}

dataset/animal_attribute_editing_hair_editing_0002/images.txt ADDED

	@@ -0,0 +1 @@


+ https://img.alicdn.com/imgextra/i2/O1CN01Ge9no81W06IqBaGUI_!!6000000002725-0-tps-2480-1550.jpg

dataset/animal_attribute_editing_hair_editing_0002/instruction.txt ADDED

	@@ -0,0 +1 @@


+ Please modify the hair color of the white rabbit in the input image from white to gray. The goal is to keep the rabbit's posture and background unchanged but adjust the hair color to gray, ensuring the texture and lighting effects of the fur remain natural and realistic. The generated image should maintain the overall style of the original, but the fur color should clearly change to gray.

dataset/animal_attribute_editing_hair_editing_0002/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "animal hair editing",
+    "num_of_cases": 2,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0087",
+    "output_image_count": 1,
+    "case_id": "0002"
+}

dataset/animal_attribute_editing_posture_editing_0001/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the edited image accurately preserve the original state of unmodified parts of the animal, ensuring consistency with the original image without specified changes?",
+            "0_point_standard": "There are noticeable changes or inconsistencies in the unmodified parts of the animal compared to the original image.",
+            "1_point_standard": "The unmodified parts of the animal remain consistent and unchanged, accurately reflecting the original image."
+        },
+        {
+            "question": "Does the edited image maintain the overall style and characteristics of the animal, ensuring that the edited posture does not alter its recognizable features or species-specific traits?",
+            "0_point_standard": "The edited posture has altered the animal's recognizable features or species-specific traits, making it difficult to identify.",
+            "1_point_standard": "The edited posture retains the animal's style and characteristics, maintaining its recognizable features and species-specific traits intact."
+        },
+        {
+            "question": "Does the edited animal posture accurately reflect the requirements in the text description, effectively conveying the intended posture adjustments?",
+            "0_point_standard": "The edited posture does not reflect the changes specified in the text description or misrepresents the intended posture.",
+            "1_point_standard": "The edited posture accurately reflects the changes specified in the text description, conveying the intended posture adjustments."
+        },
+        {
+            "question": "Does the edited image include the modifications specified in the text without introducing inconsistencies or unrealistic elements in the animal's anatomy and position?",
+            "0_point_standard": "The modifications have introduced inconsistencies or unrealistic elements in the animal's anatomy and position.",
+            "1_point_standard": "The modifications are consistent with realistic anatomy and position, without introducing unrealistic elements."
+        },
+        {
+            "question": "Does the edited image exhibit high-quality texture details, maintaining the natural appearance and texture of the animal's fur, skin, or scales?",
+            "0_point_standard": "The texture details are poor, leading to an unnatural appearance of the animal's fur, skin, or scales.",
+            "1_point_standard": "The texture details are of high quality, maintaining the natural appearance of the animal's fur, skin, or scales."
+        },
+        {
+            "question": "Does the edited image have an overall aesthetic appeal, with visually pleasing composition that meets professional standards and user expectations?",
+            "0_point_standard": "The edited image lacks aesthetic appeal and has poor visual composition.",
+            "1_point_standard": "The edited image displays strong aesthetic appeal with high-quality visual composition that meets professional standards."
+        }
+    ]
+}

dataset/animal_attribute_editing_posture_editing_0001/images.txt ADDED

	@@ -0,0 +1 @@


+ https://img.alicdn.com/imgextra/i3/O1CN0116QVew1CSEJUFZfHs_!!6000000000079-0-tps-1279-1706.jpg

dataset/animal_attribute_editing_posture_editing_0001/instruction.txt ADDED

	@@ -0,0 +1 @@

+ Please modify the posture of the Corgi in the input image from a sitting position to a standing position. The goal is to keep the dog's facial expression and background unchanged, but adjust its posture to a natural standing position, ensuring that the body proportions and limb positioning are consistent with the physical characteristics of a Corgi. The generated image should look natural and realistic, with a smooth and accurate posture transition.

dataset/animal_attribute_editing_posture_editing_0001/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "animal posture editing",
+    "num_of_cases": 2,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0088",
+    "output_image_count": 1,
+    "case_id": "0001"
+}

dataset/animal_attribute_editing_species_editing_0002/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the edited image accurately retain the basic shape and outline of the original animal, except for the specified species modifications?",
+            "0_point_standard": "There are noticeable deviations or distortions in the animal's shape and outline, apart from the specified species modifications.",
+            "1_point_standard": "The edited image accurately retains the original animal's shape and outline, except for the specified species modifications."
+        },
+        {
+            "question": "Aside from the specified modifications, does the edited image maintain the overall structure and proportion of the animal, ensuring consistency between the original and edited images?",
+            "0_point_standard": "The structure of the animal in the edited image has undergone significant changes in unintended areas, with disproportionate balance.",
+            "1_point_standard": "The structure and proportion of the animal in the edited image are consistent with the original image (except for the intended modifications), and the proportions are balanced."
+        },
+        {
+            "question": "Does the edited image accurately reflect the specific species changes described in the text input?",
+            "0_point_standard": "The species changes are not accurately represented or are inconsistent with the description provided in the text input.",
+            "1_point_standard": "The species changes are accurately represented and consistent with the description provided in the text input."
+        },
+        {
+            "question": "Does the edited image correctly implement any additional features specified in the text description, such as colors, patterns, or unique characteristics?",
+            "0_point_standard": "The edited image fails to include the specified additional features or includes them inaccurately.",
+            "1_point_standard": "The edited image correctly includes all specified additional features, such as colors, patterns, or unique characteristics."
+        },
+        {
+            "question": "Does the edited image seamlessly integrate the modified species features with the rest of the image, ensuring a natural appearance?",
+            "0_point_standard": "The integration of the modified species features is poor, resulting in a disjointed or unnatural appearance.",
+            "1_point_standard": "The modified species features seamlessly integrate with the rest of the image, presenting a natural and harmonious appearance."
+        },
+        {
+            "question": "Does the edited image possess overall aesthetic appeal, being visually attractive and meeting professional expectations for digital image editing?",
+            "0_point_standard": "The edited image lacks aesthetic appeal and has poor visual quality.",
+            "1_point_standard": "The edited image exhibits strong aesthetic appeal, with high visual attractiveness, meeting professional standards."
+        }
+    ]
+}

dataset/animal_attribute_editing_species_editing_0002/images.txt ADDED

	@@ -0,0 +1 @@


+ https://img.alicdn.com/imgextra/i4/O1CN01dKgsjj1ta2XrUM1gu_!!6000000005917-0-tps-1952-1464.jpg

dataset/animal_attribute_editing_species_editing_0002/instruction.txt ADDED

	@@ -0,0 +1 @@

+ Please transform the input image of a bear into a lion. The goal is to keep the bear's posture and background unchanged but edit its species characteristics to resemble a lion. Ensure the edited image features the lion's facial traits, mane, body proportions, and fur characteristics while maintaining the same pose and background as the original image. The resulting image should look natural and realistic, accurately reflecting the lion's species traits.

dataset/animal_attribute_editing_species_editing_0002/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "animal species editing",
+    "num_of_cases": 2,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0089",
+    "output_image_count": 1,
+    "case_id": "0002"
+}

dataset/animal_growth_process_generation_with_reference_0001/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the output image originate from the input image and maintain a clear association in terms of content and style?",
+            "0_point_standard": "The output image does not have a clear association with the input image, and its elements do not match the content or style of the original animal.",
+            "1_point_standard": "The output image clearly originates from the input image, maintains consistent content and style, and retains recognizable features of the original animal."
+        },
+        {
+            "question": "Does the generated growth process image follow the specified timeline and show reasonable progression?",
+            "0_point_standard": "The images do not follow a reasonable timeline, with growth stages appearing inconsistent or in a jumbled order.",
+            "1_point_standard": "The images clearly demonstrate reasonable progression through the animal's growth stages and follow the specified timeline."
+        },
+        {
+            "question": "Has the model accurately implemented any specific modifications mentioned in the text description, such as changes in size or color?",
+            "0_point_standard": "The specified modifications have not been accurately implemented, with changes not matching the text description.",
+            "1_point_standard": "The model has accurately implemented the specified modifications in the text description, such as changes in size or color."
+        },
+        {
+            "question": "Do the unspecified parts of the image remain unchanged and maintain integrity with the input image?",
+            "0_point_standard": "Unnecessary alterations or distortions have occurred in parts of the image that should not have changed, affecting overall content.",
+            "1_point_standard": "The unspecified parts of the image remain unchanged, retaining the integrity and structure of the input image."
+        },
+        {
+            "question": "Is the image style consistent throughout the whole sequence of growth process images?",
+            "0_point_standard": "The image style is inconsistent, with variations disrupting the visual coherence of the growth stages.",
+            "1_point_standard": "The images maintain a consistent style throughout the entire sequence, ensuring visual coherence and continuity."
+        },
+        {
+            "question": "Does each image in the growth process sequence retain the original animal's key details and recognizable features, ensuring ID consistency?",
+            "0_point_standard": "The growth process images lack the original animal's key details and recognizable features, making it difficult to identify them as the same animal.",
+            "1_point_standard": "Each image retains the original animal's key details and recognizable features, ensuring identity consistency throughout the growth process."
+        }
+    ]
+}

dataset/animal_growth_process_generation_with_reference_0001/images.txt ADDED

	@@ -0,0 +1 @@


+ https://img.alicdn.com/imgextra/i2/O1CN010Vlkzq1VzduIHyO3P_!!6000000002724-0-tps-1920-1080.jpg

dataset/animal_growth_process_generation_with_reference_0001/instruction.txt ADDED

	@@ -0,0 +1 @@

+ Please generate 3 images showing the intermediate growth stages of this adult male lion based on the provided picture. The first image should depict the lion in its cub stage, with shorter fur, a smaller body, and undeveloped facial features. The second image should show the lion in its juvenile stage, where the body is growing larger, the fur is starting to thicken, but the mane is not yet fully developed. The third image should depict the lion in its sub-adult stage, with a body close to adult size, though the mane is still not fully grown. Ensure that all the generated images clearly show the same lion, with visible continuity in the appearance as the lion progresses through different growth stages, making it feel like the same animal at different points in time.

dataset/animal_growth_process_generation_with_reference_0001/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "animal growth process generation with reference",
+    "num_of_cases": 2,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": true,
+    "uid": "0048",
+    "output_image_count": 3,
+    "case_id": "0001"
+}

dataset/brand_merchandise_generation_0002/auto_eval.jsonl ADDED

	@@ -0,0 +1,6 @@

+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Create an image of a white baseball cap featuring a playful, abstract character embroidered on the front. The character should appear as a lively, hand-drawn figure composed of swirling purple lines forming a loose, scribble-like shape. It has simple black line-drawn arms and legs, a wide, open mouth, and small dots for eyes, giving it a fun and expressive look. The character’s face has a joyful, animated expression, with a touch of whimsical hair drawn as short, spiky black lines on top. The cap is displayed on a wooden stand in a well-lit, minimalist studio setup with a soft gray background, lending a professional product photo appearance. The design should have a unique, artistic style that makes the cap feel like part of a distinct and recognizable brand collection.\"\nYour review question is:\nBrand Visual Pattern Consistency: 0 points: The brand’s unique visual elements are missing or altered, making it hard to recognize the brand in the merchandise. 1 point: The merchandise clearly reflects every detail of the visual elements in the referenced pattern image, accurately transferring the brand’s pattern and style as specified. The visual pattern in the two images is exactly the same. \nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Create an image of a white baseball cap featuring a playful, abstract character embroidered on the front. The character should appear as a lively, hand-drawn figure composed of swirling purple lines forming a loose, scribble-like shape. It has simple black line-drawn arms and legs, a wide, open mouth, and small dots for eyes, giving it a fun and expressive look. The character’s face has a joyful, animated expression, with a touch of whimsical hair drawn as short, spiky black lines on top. The cap is displayed on a wooden stand in a well-lit, minimalist studio setup with a soft gray background, lending a professional product photo appearance. The design should have a unique, artistic style that makes the cap feel like part of a distinct and recognizable brand collection.\"\nYour review question is:\nProduct Type Accuracy: 0 points: The generated product does not match the specified type (e.g., a mug instead of a tote bag), or the structure is inconsistent with the description. 1 point: The merchandise accurately aligns with the specified product type, showing the correct form and structure as outlined.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Create an image of a white baseball cap featuring a playful, abstract character embroidered on the front. The character should appear as a lively, hand-drawn figure composed of swirling purple lines forming a loose, scribble-like shape. It has simple black line-drawn arms and legs, a wide, open mouth, and small dots for eyes, giving it a fun and expressive look. The character’s face has a joyful, animated expression, with a touch of whimsical hair drawn as short, spiky black lines on top. The cap is displayed on a wooden stand in a well-lit, minimalist studio setup with a soft gray background, lending a professional product photo appearance. The design should have a unique, artistic style that makes the cap feel like part of a distinct and recognizable brand collection.\"\nYour review question is:\nFidelity to Visual Details: 0 points: Key visual details requested, such as specific color adjustments or logo placements, are missing or incorrectly applied. 1 point: All specified visual details, including color adjustments and logo placement, are accurately and thoughtfully applied as per the description.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Create an image of a white baseball cap featuring a playful, abstract character embroidered on the front. The character should appear as a lively, hand-drawn figure composed of swirling purple lines forming a loose, scribble-like shape. It has simple black line-drawn arms and legs, a wide, open mouth, and small dots for eyes, giving it a fun and expressive look. The character’s face has a joyful, animated expression, with a touch of whimsical hair drawn as short, spiky black lines on top. The cap is displayed on a wooden stand in a well-lit, minimalist studio setup with a soft gray background, lending a professional product photo appearance. The design should have a unique, artistic style that makes the cap feel like part of a distinct and recognizable brand collection.\"\nYour review question is:\nPositioning and Proportion of the Visual Pattern: 0 points: The brand’s visual pattern is positioned awkwardly or disproportionately, failing to integrate naturally with the product’s shape and surface. 1 point: The brand’s visual pattern is applied with correct positioning and proportion, fitting naturally onto the product’s surface and enhancing its visual appeal.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Create an image of a white baseball cap featuring a playful, abstract character embroidered on the front. The character should appear as a lively, hand-drawn figure composed of swirling purple lines forming a loose, scribble-like shape. It has simple black line-drawn arms and legs, a wide, open mouth, and small dots for eyes, giving it a fun and expressive look. The character’s face has a joyful, animated expression, with a touch of whimsical hair drawn as short, spiky black lines on top. The cap is displayed on a wooden stand in a well-lit, minimalist studio setup with a soft gray background, lending a professional product photo appearance. The design should have a unique, artistic style that makes the cap feel like part of a distinct and recognizable brand collection.\"\nYour review question is:\nClarity and Quality of Text or Graphics: 0 points: The text or graphics appear blurry, pixelated, or washed out, reducing the professional quality and clarity of the brand. 1 point: The text and graphics are rendered sharply and vividly, enhancing both readability and the brand’s visual appeal.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Create an image of a white baseball cap featuring a playful, abstract character embroidered on the front. The character should appear as a lively, hand-drawn figure composed of swirling purple lines forming a loose, scribble-like shape. It has simple black line-drawn arms and legs, a wide, open mouth, and small dots for eyes, giving it a fun and expressive look. The character’s face has a joyful, animated expression, with a touch of whimsical hair drawn as short, spiky black lines on top. The cap is displayed on a wooden stand in a well-lit, minimalist studio setup with a soft gray background, lending a professional product photo appearance. The design should have a unique, artistic style that makes the cap feel like part of a distinct and recognizable brand collection.\"\nYour review question is:\nOverall Aesthetic and Professional Quality: 0 points: The merchandise lacks aesthetic cohesion or professionalism, with issues like poor composition, lighting, or unrealistic presentation, making it unfit for brand representation. 1 point: The merchandise displays high aesthetic appeal and professional quality, with balanced composition, effective lighting, and a realistic appearance suitable for showcasing as branded merchandise.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
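Each line of auto_eval.jsonl above is one JSON object with `input_images`, `output_images`, and `question`, and each question asks the judge to return `Evaluation = {'score': int, 'reason': str}`. The sketch below reads such a file and validates a reply against that schema. It assumes the judge answers with a JSON object and that scores are limited to 0 or 1, as the rubric wording suggests. The helper names `load_auto_eval` and `parse_evaluation` are hypothetical, and the sample record is shortened for illustration.

```python
import json

REQUIRED_KEYS = {"input_images", "output_images", "question"}


def load_auto_eval(text: str) -> list[dict]:
    """Parse JSONL text into records, checking the keys seen in this diff."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        records.append(record)
    return records


def parse_evaluation(reply: str) -> tuple[int, str]:
    """Validate a judge reply shaped like {'score': int, 'reason': str}.

    Assumption: the reply is valid JSON and the score is binary (0 or 1).
    """
    data = json.loads(reply)
    score, reason = data["score"], data["reason"]
    if isinstance(score, bool) or score not in (0, 1):
        raise ValueError(f"score must be 0 or 1, got {score!r}")
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")
    return score, reason


# Shortened, illustrative record; real questions are much longer.
sample_text = json.dumps({
    "input_images": ["0001.jpg"],
    "output_images": ["0001.jpg"],
    "question": "Product Type Accuracy: ...",
})
records = load_auto_eval(sample_text)
result = parse_evaluation('{"score": 1, "reason": "Cap matches the description."}')
print(len(records), result)
```

Rejecting `bool` explicitly matters because Python treats `True` as equal to 1, which would otherwise pass the membership test.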

dataset/brand_merchandise_generation_0002/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Brand Visual Pattern Consistency:",
+            "0_point_standard": "The unique visual elements of the brand are missing or altered, making it difficult to recognize the brand in the product.",
+            "1_point_standard": "The product clearly reflects every visual element detail from the reference pattern image, accurately conveying the brand's patterns and style. The visual patterns in both images are identical."
+        },
+        {
+            "question": "Product Type Accuracy:",
+            "0_point_standard": "The generated product does not match the specified type (e.g., a cup is generated instead of a handbag), or its structure is inconsistent with the description.",
+            "1_point_standard": "The product accurately matches the specified product type, displaying the correct form and structure described."
+        },
+        {
+            "question": "Fidelity of Visual Details:",
+            "0_point_standard": "Key visual details required by the description, such as specific color adjustments or logo placement, are missing or applied incorrectly.",
+            "1_point_standard": "All specified visual details, including color adjustments and logo placement, are applied accurately and thoughtfully according to the description."
+        },
+        {
+            "question": "Positioning and Proportion of Visual Patterns:",
+            "0_point_standard": "The brand's visual pattern is improperly positioned or out of proportion, failing to naturally integrate with the product's shape and surface.",
+            "1_point_standard": "The brand's visual pattern is applied with correct positioning and proportion, naturally adapting to the product's surface and enhancing visual appeal."
+        },
+        {
+            "question": "Clarity and Quality of Text or Graphics:",
+            "0_point_standard": "Text or graphics appear blurry, pixelated, or faded, reducing the professional quality and clarity of the brand.",
+            "1_point_standard": "Text and graphics are presented clearly and vividly, enhancing readability and the brand's visual appeal."
+        },
+        {
+            "question": "Overall Aesthetic and Professional Quality:",
+            "0_point_standard": "The product lacks aesthetic unity or professionalism, with issues such as poor composition, insufficient lighting, or unrealistic representation, making it unsuitable for brand presentation.",
+            "1_point_standard": "The product displays high aesthetic and professional quality, with balanced composition, good lighting effects, and a realistic appearance suitable for brand product presentation."
+        }
+    ]
+}

dataset/brand_merchandise_generation_0002/images.txt ADDED

	@@ -0,0 +1 @@


+ https://img.alicdn.com/imgextra/i1/O1CN01O6hbDm27Scki1rVvi_!!6000000007796-0-tps-564-705.jpg

dataset/brand_merchandise_generation_0002/instruction.txt ADDED

	@@ -0,0 +1 @@

+ Create an image of a white baseball cap with the purple scribble character embroidered on the front, keeping the same fun and playful expression from the original design. The character may be slightly simplified to fit embroidery details but should retain its unique style. The cap is displayed on a wooden stand in a well-lit studio setting with a soft gray background, creating a professional product photo feel. The design should clearly reflect the original character, making it easily identifiable as part of the same brand.

dataset/brand_merchandise_generation_0002/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "brand merchandise generation",
+    "num_of_cases": 3,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0063",
+    "output_image_count": 1,
+    "case_id": "0002"
+}

dataset/brand_merchandise_generation_0003/auto_eval.jsonl ADDED

	@@ -0,0 +1,6 @@

+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.\"\nYour review question is:\nBrand Visual Pattern Consistency: 0 points: The brand’s unique visual elements are missing or altered, making it hard to recognize the brand in the merchandise. 1 point: The merchandise clearly reflects every detail of the visual elements in the referenced pattern image, accurately transferring the brand’s pattern and style as specified. The visual pattern in the two images is exactly the same. \nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.\"\nYour review question is:\nProduct Type Accuracy: 0 points: The generated product does not match the specified type (e.g., a mug instead of a tote bag), or the structure is inconsistent with the description. 1 point: The merchandise accurately aligns with the specified product type, showing the correct form and structure as outlined.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.\"\nYour review question is:\nFidelity to Visual Details: 0 points: Key visual details requested, such as specific color adjustments or logo placements, are missing or incorrectly applied. 1 point: All specified visual details, including color adjustments and logo placement, are accurately and thoughtfully applied as per the description.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.\"\nYour review question is:\nPositioning and Proportion of the Visual Pattern: 0 points: The brand’s visual pattern is positioned awkwardly or disproportionately, failing to integrate naturally with the product’s shape and surface. 1 point: The brand’s visual pattern is applied with correct positioning and proportion, fitting naturally onto the product’s surface and enhancing its visual appeal.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.\"\nYour review question is:\nClarity and Quality of Text or Graphics: 0 points: The text or graphics appear blurry, pixelated, or washed out, reducing the professional quality and clarity of the brand. 1 point: The text and graphics are rendered sharply and vividly, enhancing both readability and the brand’s visual appeal.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
+{"input_images": ["0001.jpg"], "output_images": ["0001.jpg"], "question": "You are a professional image designer, and you are now required to conduct a strict evaluation of the following design work. The work consists of two rows of images, with the top row as the reference brand visual pattern for the design task and the bottom image as the response provided by a student. The task objective is to generate brand peripheral products containing the given brand visual pattern based on the text requirements.\nThe text requirement is:\n\"Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.\"\nYour review question is:\nOverall Aesthetic and Professional Quality: 0 points: The merchandise lacks aesthetic cohesion or professionalism, with issues like poor composition, lighting, or unrealistic presentation, making it unfit for brand representation. 1 point: The merchandise displays high aesthetic appeal and professional quality, with balanced composition, effective lighting, and a realistic appearance suitable for showcasing as branded merchandise.\nUse this JSON schema:\nEvaluation = {'score': int, 'reason': str}\nReturn: Evaluation"}
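
Each record in auto_eval.jsonl above asks a judge to return an `Evaluation` with an integer `score` and a string `reason`, and the rubric only defines 0 and 1 points. The following is a minimal sketch of how such a file and such a reply could be parsed. It assumes the reply arrives as JSON text, which the diff does not confirm; `load_auto_eval` and `parse_evaluation` are hypothetical helper names, not part of this repository.

```python
import json


def load_auto_eval(text: str) -> list[dict]:
    """Read JSON Lines text: one JSON record per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_evaluation(reply: str) -> dict:
    """Validate a judge reply shaped like {"score": 0 or 1, "reason": "..."}.

    Assumption: the reply is JSON text. The prompt only names the fields.
    """
    data = json.loads(reply)
    score, reason = data.get("score"), data.get("reason")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or score not in (0, 1):
        raise ValueError(f"score must be 0 or 1, got {score!r}")
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")
    return {"score": score, "reason": reason}
```

Note that the leading `+` on each line above is diff markup; the file itself would contain only the JSON objects.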

dataset/brand_merchandise_generation_0003/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Brand Visual Pattern Consistency:",
+            "0_point_standard": "The unique visual elements of the brand are missing or altered, making it difficult to recognize the brand in the product.",
+            "1_point_standard": "The product clearly reflects every detail of the visual elements in the reference pattern image, accurately conveying the brand's pattern and style. The visual patterns in both images are identical."
+        },
+        {
+            "question": "Product Type Accuracy:",
+            "0_point_standard": "The generated product does not match the specified type (e.g., a cup is generated instead of a handbag), or its structure is inconsistent with the description.",
+            "1_point_standard": "The product accurately matches the specified product type, displaying the correct form and structure as described."
+        },
+        {
+            "question": "Fidelity of Visual Details:",
+            "0_point_standard": "Key visual details required, such as specific color adjustments or logo placement, are missing or incorrectly applied.",
+            "1_point_standard": "All specified visual details, including color adjustments and logo placement, are accurately and thoughtfully applied as described."
+        },
+        {
+            "question": "Positioning and Proportion of Visual Pattern:",
+            "0_point_standard": "The brand's visual pattern is improperly positioned or disproportionate, failing to naturally integrate with the product's shape and surface.",
+            "1_point_standard": "The brand's visual pattern is applied with correct positioning and proportion, naturally adapting to the product's surface, enhancing visual appeal."
+        },
+        {
+            "question": "Clarity and Quality of Text or Graphics:",
+            "0_point_standard": "Text or graphics appear blurry, pixelated, or faded, reducing the professional quality and clarity of the brand.",
+            "1_point_standard": "Text and graphics are presented clearly and vividly, enhancing readability and the brand's visual appeal."
+        },
+        {
+            "question": "Overall Aesthetic and Professional Quality:",
+            "0_point_standard": "The product lacks aesthetic unity or professionalism, with poor composition, inadequate lighting, or unrealistic representation, unsuitable for brand display.",
+            "1_point_standard": "The product exhibits high aesthetic and professional quality, with balanced composition, good lighting effects, and a realistic appearance suitable for brand product display."
+        }
+    ]
+}
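
The eval.json files in this commit share one shape: a `questions` list whose items carry `question`, `0_point_standard`, and `1_point_standard`. As a rough sketch, the structure could be checked and a criterion flattened into the "title: 0 points: ... 1 point: ..." layout that the review questions in auto_eval.jsonl use. The helper names are hypothetical, and since the wording in auto_eval.jsonl differs from eval.json, this is not claimed to be how those prompts were produced.

```python
import json

REQUIRED_KEYS = ("question", "0_point_standard", "1_point_standard")


def load_rubric(text: str) -> list[dict]:
    """Parse eval.json text and check that every question has all three keys."""
    questions = json.loads(text)["questions"]
    for i, item in enumerate(questions):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            raise ValueError(f"question {i} is missing {missing}")
    return questions


def format_criterion(item: dict) -> str:
    """Flatten one rubric item into a single review-question string."""
    return (
        f"{item['question']} 0 points: {item['0_point_standard']} "
        f"1 point: {item['1_point_standard']}"
    )
```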

dataset/brand_merchandise_generation_0003/images.txt ADDED

	@@ -0,0 +1 @@


+ https://img.alicdn.com/imgextra/i3/O1CN01Cks11j1v9auHqr20g_!!6000000006130-0-tps-564-805.jpg

dataset/brand_merchandise_generation_0003/instruction.txt ADDED

	@@ -0,0 +1 @@

+ Generate an image of a large, eco-friendly water bottle with the colorful noodle graphic wrapping around the bottom third of the bottle. The design should remain vibrant and closely match the original image’s colors and playful style. The bottle is placed on a marble countertop with a few fresh vegetables like tomatoes and basil beside it, suggesting a healthy, eco-conscious theme. The background is a bright kitchen scene. The design should make it clear that this water bottle is part of the same brand as the original noodle graphic, ensuring brand recognition.

dataset/brand_merchandise_generation_0003/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "brand merchandise generation",
+    "num_of_cases": 3,
+    "image_reference": true,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0063",
+    "output_image_count": 1,
+    "case_id": "0003"
+}

dataset/business_card_generation_0002/.DS_Store ADDED

Binary file (6.15 kB).

dataset/business_card_generation_0002/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the business card design match the textual description and include all key information (e.g., name, position, contact details)?",
+            "0_point_standard": "The business card design does not match the description, with missing or incorrect key information.",
+            "1_point_standard": "The business card design matches the description, accurately displaying all key information."
+        },
+        {
+            "question": "Is the text on the business card clear and easy to read, and do the font style and layout meet design requirements?",
+            "0_point_standard": "The text is unclear, and the font style or layout does not meet requirements, affecting overall readability.",
+            "1_point_standard": "The text is clear and easy to read, and the font style and layout meet design requirements."
+        },
+        {
+            "question": "Does the overall color scheme and visual style of the business card align with the style requirements described in the text (e.g., simple, modern)?",
+            "0_point_standard": "The color scheme and visual style do not match the textual description and fail to convey the intended style.",
+            "1_point_standard": "The color scheme and visual style align with the textual description, conveying the intended design style."
+        },
+        {
+            "question": "Has the model accurately implemented the special design requirements mentioned in the text (e.g., logo, icon, or background pattern)?",
+            "0_point_standard": "The special design requirements mentioned in the text have not been accurately implemented, or lack detail.",
+            "1_point_standard": "The special design requirements mentioned in the text are accurately implemented, with precise details."
+        },
+        {
+            "question": "Is the layout of the business card clear and logical, with reasonably organized information that is easy to understand?",
+            "0_point_standard": "The layout is chaotic, with poorly organized information and a cluttered visual effect.",
+            "1_point_standard": "The layout is clear and logical, with reasonably organized information that is easy to understand and read."
+        },
+        {
+            "question": "Do the overall aesthetic and design quality of the business card meet professional standards, and does it have strong visual appeal?",
+            "0_point_standard": "The overall aesthetic of the business card is lacking, with weak design sense and insufficient visual appeal.",
+            "1_point_standard": "The business card has high aesthetic quality, a strong design sense, and good visual appeal."
+        }
+    ]
+}

dataset/business_card_generation_0002/images.txt ADDED

File without changes

dataset/business_card_generation_0002/instruction.txt ADDED

	@@ -0,0 +1 @@

+ This business card design has a nautical, seafood-themed aesthetic with a vintage-style illustration of octopus tentacles in dark blue on a cream background. The front side of the card features intricate tentacle sketches that curl and fill the space in an organic pattern. In the center, there is a red-bordered rectangle containing the word “RESTAURANT” in bold, uppercase red letters. Beneath this, the text “Sea Food” appears in smaller, black, uppercase letters, indicating the restaurant’s specialty. The tentacle illustrations wrap around the text box, creating a dynamic frame. On the back side, the tentacle pattern continues around the edges, forming a decorative border. The central area is left blank for contact information, which is displayed in a clean, minimalist layout. The top left features a phone icon followed by “+0123 456 7890,” an email icon followed by “Seafood@Restaurant.com,” and a location icon followed by “12 Harbour Lane, Restaurant, AB1 2CD.” A red line rectangle frames the contact information section, mirroring the front side’s design. The overall style is elegant, combining a marine theme with classic design elements, making it ideal for a high-end seafood restaurant. At the bottom, small text reads “designed by freepik,” indicating the design source. The only generated image contains both sides of the business card.

dataset/business_card_generation_0002/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "business card generation",
+    "num_of_cases": 3,
+    "image_reference": false,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0032",
+    "output_image_count": 1,
+    "case_id": "0002"
+}

dataset/business_card_generation_0003/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the business card design match the text description and include all key information (e.g., name, position, contact information)?",
+            "0_point_standard": "The business card design does not match the description; key information is missing or displayed incorrectly.",
+            "1_point_standard": "The business card design matches the description and accurately displays all key information."
+        },
+        {
+            "question": "Is the text on the business card clear and easy to read, and do the font style and layout meet the design requirements?",
+            "0_point_standard": "The text is unclear, or the font style or layout does not meet the requirements, affecting overall readability.",
+            "1_point_standard": "The text is clear and easy to read, and the font style and layout meet the design requirements."
+        },
+        {
+            "question": "Do the overall color scheme and visual style of the business card align with the style requirements described in the text (e.g., minimalist, modern)?",
+            "0_point_standard": "The color scheme and visual style do not match the text description and fail to convey the intended style.",
+            "1_point_standard": "The color scheme and visual style match the text description and convey the intended design style."
+        },
+        {
+            "question": "Has the model accurately implemented the special design requirements mentioned in the text (e.g., logos, icons, or background patterns)?",
+            "0_point_standard": "The special design requirements mentioned in the text have not been accurately implemented, or details are insufficient.",
+            "1_point_standard": "The special design requirements mentioned in the text are accurately implemented with precise details."
+        },
+        {
+            "question": "Is the layout of the business card clear and logical, and is the information organized reasonably and easy to understand?",
+            "0_point_standard": "The layout is chaotic, information is poorly organized, and the visual effect is cluttered.",
+            "1_point_standard": "The layout is clear and logical, information is organized reasonably, and it is easy to understand and read."
+        },
+        {
+            "question": "Do the overall aesthetics and design quality of the business card meet professional standards, and does it have strong visual appeal?",
+            "0_point_standard": "The overall aesthetics of the business card are insufficient, the design is weak, and it lacks visual appeal.",
+            "1_point_standard": "The business card has high aesthetics, strong design sense, and good visual appeal."
+        }
+    ]
+}

dataset/business_card_generation_0003/images.txt ADDED

File without changes

dataset/business_card_generation_0003/instruction.txt ADDED

	@@ -0,0 +1 @@

+ This business card design has a delicate, botanical theme in grayscale. The front side has a white background with detailed, hand-drawn floral illustrations in soft gray that cover the entire surface, giving it a vintage, elegant feel. Centered on this side is a rectangular frame with thin borders and decorative corners, within which the name “Emma Williams” is written in a refined serif font. This frame adds a touch of sophistication, highlighting the name as the main focal point. On the back side, the design is split, with floral illustrations continuing on the right side, while the left side is a blank white section where the contact information is displayed. The name “Emma Williams” appears at the top in the same serif font, followed by the title “(Graphic Designer)” in smaller, italicized text beneath it. Below, three contact details are listed with bullet points: “instagram,” “telegram,” and a website URL “www.emmawilliams.com,” each in lowercase letters and simple fonts. The overall design is clean, stylish, and professional, suitable for a graphic designer who values minimalism and classic aesthetics. The only generated image contains both sides of the business card.

dataset/business_card_generation_0003/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "business card generation",
+    "num_of_cases": 3,
+    "image_reference": false,
+    "multi_image_reference": false,
+    "multi_image_output": false,
+    "uid": "0032",
+    "output_image_count": 1,
+    "case_id": "0003"
+}

dataset/childrens_book_generation_0003/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the sequence of images present a coherent narrative in a logical order?",
+            "0_point_standard": "The sequence of images is not arranged in chronological order or lacks logical flow, disrupting the narrative.",
+            "1_point_standard": "The sequence of images clearly presents a coherent narrative in a logical chronological order."
+        },
+        {
+            "question": "Does the content of the images match the text descriptions provided in the children's book?",
+            "0_point_standard": "The images do not accurately reflect the text descriptions and deviate significantly from the story.",
+            "1_point_standard": "The images perfectly match the text descriptions, accurately depicting the story elements."
+        },
+        {
+            "question": "Is the illustration style consistent throughout the entire book?",
+            "0_point_standard": "The illustration style is inconsistent, leading to a disjointed visual experience.",
+            "1_point_standard": "All illustrations maintain a consistent style, creating a harmonious visual effect throughout the book."
+        },
+        {
+            "question": "Is the depiction of main characters or objects consistent across all illustrations?",
+            "0_point_standard": "The depiction of main characters or objects is inconsistent across different images, making them difficult to recognize as the same characters or objects.",
+            "1_point_standard": "The depiction of main characters or objects is consistent, clearly recognizable as the same characters or objects across all illustrations."
+        },
+        {
+            "question": "Is the portrayal of narrative and characters logically accurate and suitable for the children's age group?",
+            "0_point_standard": "The portrayal is illogical, inaccurate, or unsuitable for the target age group, with noticeable errors.",
+            "1_point_standard": "The portrayal is logically accurate, suitable for the target age group, and reflects the expected narrative standards."
+        },
+        {
+            "question": "Do the illustrations exhibit a professional level of detail and aesthetic appeal, enhancing the book's visual attraction?",
+            "0_point_standard": "The illustrations lack detail and aesthetic appeal, falling short of the visual standards for children's books.",
+            "1_point_standard": "The illustrations are rich in detail and have excellent aesthetic appeal, meeting professional standards and significantly enhancing visual attraction."
+        }
+    ]
+}

dataset/childrens_book_generation_0003/images.txt ADDED

File without changes

dataset/childrens_book_generation_0003/instruction.txt ADDED

	@@ -0,0 +1 @@

+ This is a children's picture book illustration generation task consisting of 8 pages, titled “The Talking Cloud.” Scene and character IDs need to remain consistent throughout the book to ensure stylistic uniformity and character continuity. The main characters include a little girl named Amy (ID: Amy) and a talking cloud (ID: Cloudy). Each page will depict the daily life of children in different countries. Page 1: The story begins on a clear afternoon with Amy playing alone in the park. She looks up at the sky and suddenly notices a cloud drifting down. To her surprise, the cloud starts talking and greets her warmly, inviting her to explore the world together. The scene is set in an open park, with children playing in the distance and a few fluffy clouds in the sky. Amy stands on the grass, looking up at the cloud, which is smiling as it approaches her from the sky. Character IDs: Amy, Cloudy. Page 2: The cloud takes Amy up into the sky, and they arrive at an African savanna, where they see a girl named Kara (ID: Kara) helping her family water the cattle. Amy is amazed to see how Kara and her family rely on water to care for their animals. The scene is set in a vast savanna, with a few large trees in the distance. Kara is by a river, drawing water for the cattle, and in the background, other animals like giraffes and zebras can be seen. Character IDs: Amy, Cloudy, Kara. Page 3: Next, Amy and the cloud fly to Japan, where they see a boy named Kenta (ID: Kenta) preparing to participate in his school's sports day with his friends. The cloud explains to Amy that sports day is an important event in Japanese schools, where students show their teamwork through competitions. The scene is set on a Japanese school playground, with flags flying overhead. Kenta and his friends are dressed in sports uniforms, ready for a race, and the background features the school buildings and cherry blossom trees around the field. Character IDs: Amy, Cloudy, Kenta. 
Page 4: Then, the cloud takes Amy to Paris, France, where they see a girl named Sophie (ID: Sophie) enjoying lunch with her parents. The cloud tells Amy that French people love to enjoy meals outdoors, especially on warm days. The scene is set on a Parisian street café, with the Eiffel Tower faintly visible in the background. Sophie and her family are seated at a table, enjoying bread and cheese, with a warm and relaxed atmosphere. Character IDs: Amy, Cloudy, Sophie. Page 5: Amy and the cloud continue flying and arrive in a snowy landscape, where they meet a boy named Ivan (ID: Ivan), who is building a large snowman with his family. The cloud tells Amy that Ivan lives in Russia, where they enjoy playing in the snow during the winter. The scene is set in a snowy Russian village, with Ivan and his family building a snowman. In the background, wooden cottages and pine trees are covered in snow, and snowflakes are gently falling. Character IDs: Amy, Cloudy, Ivan. Page 6: Afterward, Amy and the cloud fly to India, where they see a girl named Anya (ID: Anya) creating colorful patterns outside her house. The cloud tells Amy that this is called “Rangoli,” a traditional Indian decoration that symbolizes good luck and happiness. The scene is set in the courtyard of an Indian home, with Anya drawing beautiful patterns on the ground with colored powders. In the background, the house is decorated with lanterns and flower garlands. Character IDs: Amy, Cloudy, Anya. Page 7: Finally, Amy and the cloud return to her hometown, where the sky is painted with the colors of the setting sun. Amy thanks the cloud for taking her on a journey around the world and learning about the lives of children from different cultures. The cloud smiles and slowly disappears into the sky, leaving Amy sitting on the grass, reminiscing about her wonderful adventure. The scene is set in the park at sunset, with the golden glow of the sun lighting up the sky. 
Trees sway gently in the breeze, and Amy smiles as she watches the cloud fade away. Character IDs: Amy, Cloudy. Page 8: In the final page, Amy is back at home, and she takes out her drawing tools to illustrate her adventure with the cloud. Her walls are covered with pictures of the friends she met around the world, and she decides to keep this journey in her heart forever. The scene is set in Amy's bedroom, with pictures of the children she met on her travels decorating the walls. On her desk, she is working on a colorful drawing, and the room is filled with a warm, cozy atmosphere. Character IDs: Amy.

dataset/childrens_book_generation_0003/meta.json ADDED

	@@ -0,0 +1,10 @@

+{
+    "task_name": "childrens book generation without reference",
+    "num_of_cases": 4,
+    "image_reference": false,
+    "multi_image_reference": false,
+    "multi_image_output": true,
+    "uid": "0019",
+    "output_image_count": 8,
+    "case_id": "0003"
+}
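
The fields in this meta.json can be sanity-checked against each other. A minimal sketch follows. It assumes (the commit does not state this) that `multi_image_output` being true implies an `output_image_count` greater than 1, and that `multi_image_reference` can only be true when `image_reference` is also true. `check_meta` is a hypothetical helper, not part of the dataset tooling.

```python
import json

# Contents of dataset/childrens_book_generation_0003/meta.json as added in this commit.
META_JSON = """
{
    "task_name": "childrens book generation without reference",
    "num_of_cases": 4,
    "image_reference": false,
    "multi_image_reference": false,
    "multi_image_output": true,
    "uid": "0019",
    "output_image_count": 8,
    "case_id": "0003"
}
"""


def check_meta(meta: dict) -> list[str]:
    """Return a list of consistency problems (empty if none were found).

    The rules below are assumptions inferred from the field names only.
    """
    problems = []
    # Assumed rule: several output images are expected when the flag is set.
    if meta["multi_image_output"] and meta["output_image_count"] <= 1:
        problems.append("multi_image_output is true but output_image_count <= 1")
    # Assumed rule: multiple reference images require references at all.
    if meta["multi_image_reference"] and not meta["image_reference"]:
        problems.append("multi_image_reference is true but image_reference is false")
    return problems


meta = json.loads(META_JSON)
problems = check_meta(meta)
print(problems or "meta.json looks consistent under the assumed rules")
```

For the file above, both assumed rules hold: eight output images are declared for a multi-image task, and no reference images are used.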

dataset/childrens_book_generation_0004/eval.json ADDED

	@@ -0,0 +1,34 @@

+{
+    "questions": [
+        {
+            "question": "Does the sequence of images present a coherent narrative in logical order?",
+            "0_point_standard": "The sequence of images is not arranged in chronological order or lacks logical flow, disrupting the narrative.",
+            "1_point_standard": "The sequence of images clearly presents a coherent narrative in logical chronological order."
+        },
+        {
+            "question": "Do the image contents match the textual descriptions provided in the children's book?",
+            "0_point_standard": "The images do not accurately reflect the textual descriptions, with noticeable deviations from the story.",
+            "1_point_standard": "The images completely match the textual descriptions, accurately depicting the story elements."
+        },
+        {
+            "question": "Is the illustration style consistent throughout the book?",
+            "0_point_standard": "The illustration style is inconsistent, leading to a disjointed visual effect.",
+            "1_point_standard": "All illustrations maintain a consistent style, creating a harmonious visual effect throughout the book."
+        },
+        {
+            "question": "Are the images of main characters or objects consistent across all illustrations?",
+            "0_point_standard": "The images of main characters or objects are inconsistent across different images, making it difficult to recognize them as the same character or object.",
+            "1_point_standard": "The images of main characters or objects are consistent, clearly recognizable as the same character or object in all illustrations."
+        },
+        {
+            "question": "Is the depiction of narrative and characters logically accurate and suitable for the children's age group?",
+            "0_point_standard": "The depiction is illogical, inaccurate, or unsuitable for the target age group, with evident errors.",
+            "1_point_standard": "The depiction is logically accurate, suitable for the target age group, and reflects the expected narrative standards."
+        },
+        {
+            "question": "Do the illustrations exhibit professional-level detail and aesthetic, enhancing the book's visual appeal?",
+            "0_point_standard": "The illustrations lack detail, have poor aesthetic quality, and do not meet the visual standards of children's books.",
+            "1_point_standard": "The illustrations are rich in detail and have excellent aesthetic quality, meeting professional standards with significant visual appeal."
+        }
+    ]
+}
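
Each question in this eval.json defines only a 0-point and a 1-point standard, so a case is judged with one binary score per question. A minimal sketch of turning such judgments into a single case score follows. It assumes a plain unweighted mean, since the aggregation rule is not given in this commit. `score_case` and the shortened `EVAL_SPEC` (two of the six questions, standards omitted) are hypothetical illustrations.

```python
import json

# Shortened stand-in for eval.json: same "questions" structure, two entries only.
EVAL_SPEC = json.loads("""
{
    "questions": [
        {"question": "Does the sequence of images present a coherent narrative in logical order?"},
        {"question": "Is the illustration style consistent throughout the book?"}
    ]
}
""")


def score_case(eval_spec: dict, answers: list[int]) -> float:
    """Average binary (0 or 1) judgments, one per question in eval_spec.

    The unweighted mean is an assumption, not a rule taken from the dataset.
    """
    questions = eval_spec["questions"]
    if len(answers) != len(questions):
        raise ValueError("expected exactly one answer per question")
    if any(a not in (0, 1) for a in answers):
        raise ValueError("each answer must be 0 or 1")
    return sum(answers) / len(questions)


# One question judged at the 1-point standard, one at the 0-point standard.
case_score = score_case(EVAL_SPEC, [1, 0])
print(case_score)
```

Rejecting any value other than 0 or 1 keeps the sketch aligned with the two standards each question defines.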