Linaqruf/animagine-xl-2.0

@@ -33,7 +33,7 @@ widget:
   }
   .title {
-    font-size: 2vw;
     text-align: center;
     color: #333;
     font-family: 'Helvetica Neue', sans-serif;
@@ -127,8 +127,18 @@ widget:
     /* Fallback for browsers that do not support this effect */
     text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.7);
     /* Enhanced text shadow for better legibility */
   }
 </style>
 <h1 class="title">
@@ -282,6 +292,7 @@ image = pipe(
 ## Usage Guidelines
 ### Prompt Guidelines
 Animagine XL 2.0 responds effectively to natural language descriptions for image generation. For example:
 ```
 A girl with mesmerizing blue eyes looks at the viewer. Her long, white hair is adorned with blue butterfly hair ornaments.
@@ -327,6 +338,18 @@ For higher quality outcomes, prepend prompts with:
 masterpiece, best quality
 ```
 ### Multi Aspect Resolution
 This model supports generating images at the following dimensions:
@@ -342,6 +365,8 @@ This model supports generating images at the following dimensions:
 | 1536 x 640      | 12:5 Horizontal |
 | 640 x 1536      | 5:12 Vertical   |
 ## Training and Hyperparameters
@@ -360,6 +385,29 @@ This model supports generating images at the following dimensions:
 *Note: The model's training configuration is subject to future enhancements.*
 ## Direct Use
 The Animagine XL 2.0 model, with its advanced text-to-image diffusion capabilities, is highly versatile and can be applied in various fields:
@@ -390,4 +438,8 @@ We extend our gratitude to:
 - **Camenduru Server Community:** For invaluable insights and support.
 - **NovelAI:** For inspiring the Quality Tags feature.
 - **Waifu Diffusion Team:** For inspiring the optimal training pipeline with larger datasets.
-- **Shadow Lilac:** For the image classification model ([Hugging Face - shadowlilac/aesthetic-shadow](https://huggingface.co/shadowlilac/aesthetic-shadow)) crucial in our quality assessment process.

   }
   .title {
+    font-size: 2.5em;
     text-align: center;
     color: #333;
     font-family: 'Helvetica Neue', sans-serif;
     /* Fallback for browsers that do not support this effect */
     text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.7);
     /* Enhanced text shadow for better legibility */
+  .overlay-subtext {
+    font-size: 0.75em;
+    margin-top: 0.5em;
+    font-style: italic;
   }
+  .overlay,
+  .overlay-subtext {
+    text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5);
+  }
 </style>
 <h1 class="title">
 ## Usage Guidelines
 ### Prompt Guidelines
 Animagine XL 2.0 responds effectively to natural language descriptions for image generation. For example:
 ```
 A girl with mesmerizing blue eyes looks at the viewer. Her long, white hair is adorned with blue butterfly hair ornaments.
 masterpiece, best quality
 ```
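
The `masterpiece, best quality` tags shown above can also be added programmatically. The following is a minimal sketch, assuming prompts are plain strings; `with_quality_tags` and `QUALITY_TAGS` are hypothetical names introduced for illustration and are not part of the model or of any library.

```python
QUALITY_TAGS = "masterpiece, best quality"  # tags quoted in the guidelines above


def with_quality_tags(prompt: str, tags: str = QUALITY_TAGS) -> str:
    """Prepend the quality tags to a prompt unless it already starts with them."""
    prompt = prompt.strip()
    if prompt.lower().startswith(tags.lower()):
        return prompt  # already tagged, leave unchanged
    return f"{tags}, {prompt}" if prompt else tags


print(with_quality_tags("A girl with mesmerizing blue eyes looks at the viewer."))
```

Checking for an existing prefix keeps the helper idempotent, so calling it twice on the same prompt does not duplicate the tags.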
+<table class="custom-table">
+  <tr>
+    <td>
+      <div class="custom-image-container">
+        <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/m6BGzrJgYTb9QrZprVAqZ.png" alt="sample1">
+        <div class="overlay" style="font-size: 3vw;"> Twilight Contemplation <div class="overlay-subtext" style="font-size: 0.75em; font-style: italic;">"Stelle, Amidst Shooting Stars and Mountain Silhouettes"</div>
+        </div>
+      </div>
+    </td>
+  </tr>
+</table>
 ### Multi Aspect Resolution
 This model supports generating images at the following dimensions:
 | 1536 x 640      | 12:5 Horizontal |
 | 640 x 1536      | 5:12 Vertical   |
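
When a requested size is not one of the supported dimensions, one option is to snap it to the supported size with the nearest aspect ratio. The sketch below is an illustration under stated assumptions: `closest_supported_size` is a hypothetical helper, and `SUPPORTED_SIZES` contains only the two rows visible in this excerpt, so it would need to be extended with the rest of the table before real use.

```python
import math

# Only the two rows visible in this excerpt; the full table lists more sizes.
SUPPORTED_SIZES = [
    (1536, 640),   # 12:5 horizontal
    (640, 1536),   # 5:12 vertical
]


def closest_supported_size(width, height, sizes=SUPPORTED_SIZES):
    """Return the supported (width, height) whose aspect ratio is nearest the request."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Compare ratios in log space so 2:1 and 1:2 are equally far from 1:1.
    target = math.log(width / height)
    return min(sizes, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


width, height = closest_supported_size(1920, 1080)
print(width, height)
```

The resulting pair could then be passed as the `width` and `height` arguments of the pipeline call, assuming the pipeline accepts them as the standard SDXL pipelines do.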
+## Examples
 ## Training and Hyperparameters
 *Note: The model's training configuration is subject to future enhancements.*
+## Model Comparison (Animagine XL 1.0 vs Animagine XL 2.0)
+### Image Comparison
+In Animagine XL 2.0, we addressed the 'broken neck' issue that was prevalent in poses such as "looking back" and "from behind". Characters now default to "looking at viewer", which makes the generated images more natural and accurate.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/oSssetgmuLEV6RlaSC5Tr.png)
+### Training Config
+| Configuration Item    | Animagine XL 1.0   | Animagine XL 2.0        |
+|-----------------------|--------------------|--------------------------|
+| **GPU**               | A100 40G           | A100 80G                 |
+| **Dataset**           | 8000 images        | 170k + 83k images        |
+| **Global Epochs**     | Not Applicable     | 20                       |
+| **Learning Rate**     | 4e-7               | 1e-6                     |
+| **Batch Size**        | 16                 | 32                       |
+| **Train Text Encoder**| False              | True                     |
+| **Train Special Tags**| False              | True                     |
+| **Image Resolution**  | 1024               | 1024                     |
+| **Bucket Resolution** | 1024 x 256         | 2048 x 512               |
+| **Caption Dropout**   | 0.5                | 0                        |
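
To get a rough sense of the scale these settings imply, the Animagine XL 2.0 column can be turned into an estimate of optimizer steps. This is a back-of-the-envelope sketch, not the authors' actual schedule. It assumes that both dataset parts (170k and 83k, which are rounded figures) are trained together, that every image is seen once per epoch, and that the batch size of 32 is the effective global batch size with no gradient accumulation.

```python
import math

# Values from the Animagine XL 2.0 column; "170k" and "83k" are rounded figures.
dataset_size = 170_000 + 83_000
batch_size = 32   # assumed to be the effective global batch size
epochs = 20

# Assumes one pass over every image per epoch and no gradient accumulation.
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * epochs

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps:     {total_steps}")
```

Under these assumptions, the run works out to roughly 7.9k steps per epoch and about 158k steps in total.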
 ## Direct Use
 The Animagine XL 2.0 model, with its advanced text-to-image diffusion capabilities, is highly versatile and can be applied in various fields:
 - **Camenduru Server Community:** For invaluable insights and support.
 - **NovelAI:** For inspiring the Quality Tags feature.
 - **Waifu Diffusion Team:** For inspiring the optimal training pipeline with larger datasets.
+- **Shadow Lilac:** For the image classification model ([shadowlilac/aesthetic-shadow](https://huggingface.co/shadowlilac/aesthetic-shadow)) crucial in our quality assessment process.
+<h1 class="title">
+  <span>Anything you can Imagine!</span>
+</h1>