ZeroCool94 committed
Commit 1495b68
1 Parent(s): 68ef273

Update README.md

Files changed (1)
  1. README.md +12 -4
README.md CHANGED
@@ -89,7 +89,7 @@ image.save("fantasy_forest_illustration.png")
 - [Sygil Diffusion v0.2](https://huggingface.co/Sygil/Sygil-Diffusion/blob/main/sygil-diffusion-v0.2.ckpt): Resumed from Sygil Diffusion v0.1 and trained for a total of 1.77 million steps.
 - [Sygil Diffusion v0.3](https://huggingface.co/Sygil/Sygil-Diffusion/blob/main/sygil-diffusion-v0.3.ckpt): Resumed from Sygil Diffusion v0.2 and trained for a total of 2.01 million steps so far.
 - #### Beta:
-- [sygil-diffusion-v0.4_2216300_lora.ckpt](https://huggingface.co/Sygil/Sygil-Diffusion/blob/main/sygil-diffusion-v0.4_2216300_lora.ckpt): Resumed from Sygil Diffusion v0.3 and trained for a total of 2.21 million steps so far.
+- [sygil-diffusion-v0.4_2318263_lora.ckpt](https://huggingface.co/Sygil/Sygil-Diffusion/blob/main/sygil-diffusion-v0.4_2318263_lora.ckpt): Resumed from Sygil Diffusion v0.3 and trained for a total of 2.31 million steps so far.
 
 Note: Checkpoints under the Beta section are updated daily, or at least 3-4 times a week, which is usually the equivalent of 1-2 training sessions;
 this is done until they are stable enough to be moved into a proper release, usually every 1 or 2 weeks.
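The checkpoints listed in the hunk above are single-file `.ckpt` weights. For reference, here is a minimal sketch of loading one of them with 🤗 diffusers; it assumes a recent diffusers release (where `StableDiffusionPipeline.from_single_file` is available), and the prompt and inference settings are placeholders rather than part of this commit.

```python
import torch
from diffusers import StableDiffusionPipeline

# Build a pipeline directly from one of the single-file checkpoints listed above
# (v0.3 here as an example); from_single_file also accepts a local path.
pipe = StableDiffusionPipeline.from_single_file(
    "https://huggingface.co/Sygil/Sygil-Diffusion/blob/main/sygil-diffusion-v0.3.ckpt",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Placeholder prompt, matching the style of the usage example earlier in the README.
image = pipe("fantasy forest illustration", num_inference_steps=30).images[0]
image.save("fantasy_forest_illustration.png")
```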
@@ -105,14 +105,14 @@ The model was trained on the following dataset:
 
 **Hardware and others**
 - **Hardware:** 1 x Nvidia RTX 3050 8GB GPU
-- **Hours Trained:** 804 hours approximately.
+- **Hours Trained:** 840 hours approximately.
 - **Optimizer:** AdamW
 - **Adam Beta 1**: 0.9
 - **Adam Beta 2**: 0.999
 - **Adam Weight Decay**: 0.01
 - **Adam Epsilon**: 1e-8
 - **Gradient Checkpointing**: True
-- **Gradient Accumulations**: 4
+- **Gradient Accumulations**: 400
 - **Batch:** 1
 - **Learning Rate:** 1e-7
 - **Learning Rate Scheduler:** cosine_with_restarts
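The hyperparameters in the hunk above map onto a fairly standard PyTorch / 🤗 diffusers training setup. The sketch below only illustrates how those values would typically be wired together; it is not the actual training script, and `unet`, `dataloader`, `compute_loss`, and `max_train_steps` are placeholders.

```python
import torch
from diffusers.optimization import get_scheduler

# Placeholders: `unet`, `dataloader`, `compute_loss`, and `max_train_steps`
# stand in for the real model, data, loss function, and step budget.
optimizer = torch.optim.AdamW(
    unet.parameters(),
    lr=1e-7,             # Learning Rate
    betas=(0.9, 0.999),  # Adam Beta 1 / Adam Beta 2
    weight_decay=0.01,   # Adam Weight Decay
    eps=1e-8,            # Adam Epsilon
)

lr_scheduler = get_scheduler(
    "cosine_with_restarts",              # Learning Rate Scheduler
    optimizer=optimizer,
    num_warmup_steps=0,                  # assumption: no warmup is listed above
    num_training_steps=max_train_steps,
)

unet.enable_gradient_checkpointing()     # Gradient Checkpointing: True

grad_accum_steps = 400  # Gradient Accumulations; with Batch: 1 this is an effective batch of 400
for step, batch in enumerate(dataloader):
    loss = compute_loss(unet, batch) / grad_accum_steps
    loss.backward()
    if (step + 1) % grad_accum_steps == 0:
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
```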
@@ -120,7 +120,15 @@ The model was trained on the following dataset:
 - **Lora unet Learning Rate**: 1e-7
 - **Lora Text Encoder Learning Rate**: 1e-7
 - **Resolution**: 512 pixels
-- **Total Training Steps:** 2,216,300
+- **Total Training Steps:** 2,318,263
+
+
+Note: For the learning rate I'm testing something new. After switching from the `constant` scheduler to `cosine_with_restarts` once v0.3 was released, I noticed
+it settles close to the optimal learning rate while minimizing the loss value. So, when a training session finishes, I start the next one from the learning rate
+shown during the last few steps of the previous session, which makes the rate decrease at a steady pace over time. When I add a lot of data to the training dataset
+at once, I reset the learning rate to 1e-7 and let the scheduler walk it down again as the model learns from the new data. This keeps the training from
+overfitting and from using a learning rate so low that the model stops learning anything new for a while.
+
 
 Developed by: [ZeroCool94](https://github.com/ZeroCool940711) at [Sygil-Dev](https://github.com/Sygil-Dev/)
 
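A small, self-contained sketch of the session-to-session learning-rate handling described in the note above: each new session resumes from the learning rate observed at the end of the previous one, and the rate is reset to 1e-7 whenever a large amount of new data is added. The function and variable names are hypothetical, not taken from the training code, and the example values are illustrative.

```python
BASE_LR = 1e-7  # value the learning rate is reset to after a large dataset addition

def next_session_lr(last_observed_lr: float, added_lots_of_new_data: bool) -> float:
    """Pick the starting learning rate for the next training session."""
    if added_lots_of_new_data:
        # Reset so cosine_with_restarts can walk the rate down again on the new data.
        return BASE_LR
    # Otherwise resume from where the previous session's scheduler left off,
    # so the rate keeps decreasing steadily across sessions.
    return last_observed_lr

# Example: resuming after a normal session vs. after adding a lot of new data at once.
print(next_session_lr(6.3e-8, added_lots_of_new_data=False))  # 6.3e-8, keep decaying
print(next_session_lr(6.3e-8, added_lots_of_new_data=True))   # 1e-7, start over
```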