Update README.md
README.md CHANGED
@@ -41,7 +41,7 @@ license: mit
<img src="./card_images/11.png" class="wide" alt="Sample Image 11">
</div>

-**Momo XL** is an anime-style model based on SDXL, fine-tuned to produce high-quality anime-style images with detailed and vibrant aesthetics.
+**Momo XL** is an anime-style model based on SDXL, fine-tuned to produce high-quality anime-style images with detailed and vibrant aesthetics. (Oct 6, 2024)

## Key Features:
@@ -66,3 +66,35 @@ This model may produce unexpected or unintended results. **Use with caution and
- **Data Sources**: The model was trained on publicly available datasets. While efforts have been made to filter and curate the training data, some undesirable content may remain.

Thank you! 😊
+
+------------------------------------------------------
+## Momo XL - Training Details (Oct 15, 2024)
+
+### Dataset
+Momo XL was trained on a dataset of more than **400,000 images** sourced from Danbooru.
+
+### Base Model
+Momo XL was built on top of SDXL, incorporating knowledge from two fine-tuned models:
+- Formula (see the merge sketch below): `SDXL_base + (Animagine 3.0 base - SDXL_base) * 1.0 + (Pony V6 - SDXL_base) * 0.5`
+
+For more details:
+- [Animagine 3.0 base](https://huggingface.co/Linaqruf/animagine-xl-3.0)
+- [Pony V6](https://huggingface.co/LyliaEngine/Pony_Diffusion_V6_XL)
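The formula above is a weighted "add difference" merge. Below is a minimal sketch of that computation over `safetensors` state dicts; the file names are placeholders, and the actual merge tooling used to produce Momo XL is not documented in this card.

```python
# Rough sketch of the add-difference merge (assumed workflow, placeholder paths):
#   merged = SDXL_base + 1.0 * (Animagine 3.0 - SDXL_base) + 0.5 * (Pony V6 - SDXL_base)
from safetensors.torch import load_file, save_file

def add_difference(base, donors):
    """Return base + sum(alpha * (donor - base)) over all shared tensor keys."""
    merged = {k: v.clone() for k, v in base.items()}
    for donor, alpha in donors:
        for key, tensor in merged.items():
            if key in donor and donor[key].shape == tensor.shape:
                delta = donor[key].float() - base[key].float()
                merged[key] = (tensor.float() + alpha * delta).to(tensor.dtype)
    return merged

base = load_file("sdxl_base.safetensors")  # placeholder file names throughout
merged = add_difference(
    base,
    [
        (load_file("animagine-xl-3.0.safetensors"), 1.0),      # * 1.0
        (load_file("pony_diffusion_v6_xl.safetensors"), 0.5),  # * 0.5
    ],
)
save_file(merged, "momo_xl_base.safetensors")
```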
+
+### Training Process
+Training was conducted on **A100 80GB GPUs** and totaled more than **2,000 GPU hours**. The training was divided into three stages:
+- **Finetuning - First Stage**: Trained on the entire dataset with a defined set of training configurations.
+- **Finetuning - Second Stage**: Also trained on the entire dataset, with some variations in the settings.
+- **Adjustment Stage**: Focused on aesthetic adjustments to improve the overall visual quality.
+
+The final model, **Momo XL**, was assembled by combining the Text Encoder from the Finetuning Second Stage with the UNet from the Adjustment Stage (a sketch of this component swap follows below).
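As an illustration of that final assembly step, one way to express the component swap with `diffusers` is sketched below; the checkpoint paths and the use of `StableDiffusionXLPipeline` are assumptions, since the actual release scripts are not part of this card.

```python
# Hypothetical sketch: start from the Adjustment Stage checkpoint (UNet source),
# then swap in the text encoders from the Finetuning Second Stage checkpoint.
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained("checkpoints/adjustment-stage")    # placeholder path
stage2 = StableDiffusionXLPipeline.from_pretrained("checkpoints/finetune-stage-2")  # placeholder path

# SDXL pipelines carry two text encoders; take both from the second-stage model.
pipe.text_encoder = stage2.text_encoder
pipe.text_encoder_2 = stage2.text_encoder_2

pipe.save_pretrained("momo-xl")  # placeholder output directory
```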
+
+### Hyperparameters
+
+| Stage | Epochs | UNet LR | Text Encoder LR | Batch Size | Resolution | Noise Offset | Optimizer | LR Scheduler |
+|--------------------------|--------|---------|-----------------|------------|------------|--------------|-----------|--------------|
+| **Finetuning 1st Stage** | 10 | 2e-5 | 1e-5 | 256 | 1024² | N/A | AdamW8bit | Constant |
+| **Finetuning 2nd Stage** | 10 | 2e-5 | 1e-5 | 256 | max. 1280² | N/A | AdamW | Constant |
+| **Adjustment Stage** | 0.25 | 8e-5 | 4e-5 | 1024 | max. 1280² | 0.05 | AdamW | Constant |
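To make the table concrete, here is a rough sketch of how the Finetuning 1st Stage optimizer and scheduler settings could be expressed with `bitsandbytes` and `diffusers`. The tooling, the SDXL repo id used as a stand-in starting point, and the omission of the data pipeline (batch size 256, resolution 1024²) are all assumptions; the actual training scripts are not published here.

```python
# Assumed mapping of the 1st-stage settings: AdamW8bit, UNet LR 2e-5,
# Text Encoder LR 1e-5, constant LR schedule.
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel
from diffusers.optimization import get_constant_schedule
from transformers import CLIPTextModel

base = "stabilityai/stable-diffusion-xl-base-1.0"  # stand-in starting checkpoint
unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder")

optimizer = bnb.optim.AdamW8bit(
    [
        {"params": unet.parameters(), "lr": 2e-5},          # UNet LR
        {"params": text_encoder.parameters(), "lr": 1e-5},  # Text Encoder LR
    ]
)
lr_scheduler = get_constant_schedule(optimizer)  # "Constant" LR scheduler
```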