Daiqing committed
Commit f81d842
Parent: 76c0dd5

Update README.md

Files changed (1): README.md (+12 −1)
README.md CHANGED
@@ -67,6 +67,11 @@ During the user study, we give users instructions to evaluate image pairs based
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63855d851769b7c4b10e1f76/o3Bt62qFsTO9DkeX2yLua.png)
 
+| Model                          | Overall FID |
+| ------------------------------ | ----------- |
+| SDXL-1-0-refiner               | 9.55        |
+| playground-v2-1024px-aesthetic | **7.07**    |
+
 We introduce a new benchmark, [MJHQ-30K](https://huggingface.co/datasets/playgroundai/MJHQ30K), for automatic evaluation of a model’s aesthetic quality. The benchmark computes FID on a high-quality dataset to gauge aesthetic quality.
 
 We curate the high-quality dataset from Midjourney with 10 common categories, each category with 3K samples. Following common practice, we use aesthetic score and CLIP score to ensure high image quality and high image-text alignment. Furthermore, we take extra care to make the data diverse within each category.
@@ -77,6 +82,12 @@ We release this benchmark to the public and encourage the community to adopt it
 
 ### Base Models for all resolutions
 
-< INSERT TABLE HERE >
+| Model | FID | CLIP Score |
+| ---------------------------- | ------ | ---------- |
+| SDXL-1-0-refiner | 13.04 | 32.62 |
+| [playground-v2-256px-base](https://huggingface.co/playgroundai/playground-v2-256px-base) | 9.83 | 31.90 |
+| [playground-v2-512px-base](https://huggingface.co/playgroundai/playground-v2-512px-base) | 9.55 | 32.08 |
+| [playground-v2-1024px-base](https://huggingface.co/playgroundai/playground-v2-1024px-base) | 9.97 | 31.90 |
+
 
 Apart from playground-v2-1024px-aesthetic, we release all intermediate checkpoints at different training stages to the community in order to foster foundation model research in pixels. Here, we report the FID score and CLIP score on the MSCOCO14 evaluation set for reference purposes. (Note that our reported numbers may differ from the numbers reported in SDXL’s published results, as our prompt list may be different.)
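The FID scores reported above are Fréchet Inception Distances: each image set is summarized by the mean and covariance of its Inception features, and the benchmark reports the Fréchet distance between those two Gaussians. As a minimal sketch of the distance itself (the Inception feature-extraction step and the exact MJHQ-30K pipeline are omitted; this is the standard formula, not the benchmark's actual code):

```python
# Sketch of the Frechet distance underlying FID:
#   d^2 = ||mu1 - mu2||^2 + Tr(Sigma1 + Sigma2 - 2 * (Sigma1 @ Sigma2)^(1/2))
# Real FID pipelines first map images to InceptionV3 features; here we
# only compute the distance between two fitted Gaussians.
import numpy as np
from scipy import linalg


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; numerical noise can
    # introduce tiny imaginary components, which we discard.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)


# Toy check: identical feature statistics give a distance of (near) zero.
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 8))  # stand-in for Inception features
mu, sigma = feats.mean(axis=0), np.cov(feats, rowvar=False)
print(frechet_distance(mu, sigma, mu, sigma))
```

Lower is better: a model whose feature statistics sit closer to the reference set's statistics gets a smaller FID, which is why the benchmark can rank aesthetic quality against a curated Midjourney reference set.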