dataautogpt3 committed
Commit a197189 (1 Parent: 9409833)

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -31,10 +31,12 @@ widget:
   url: ComfyUI_00614_.png
 ---
 <Gallery />
-This model is a fine-tuned version of Stable Diffusion 1.5, specifically enhanced for generating high-quality images of people, hands, and text. It has been trained on 131,000 high-quality, captioned image pairs generated using DALL-E 3. The training was conducted on four NVIDIA 3090 GPUs with NVLink over 16 hours, spanning 8 epochs.
-
-The model demonstrates notable proficiency in rendering human figures and intricate details like hand gestures and written text, although it shows less effectiveness with animal imagery. This specialization makes it well-suited for applications requiring precise human and text representations.
-
-The fine-tuning process involved 13,100 unique examples, contributing to a total dataset size of 131,000 images. Each training epoch processed 31,000 examples, with a total train batch size of 40. The model underwent a total of 26,200 optimization steps, maintaining a gradient accumulation of 1 throughout the training period.
-
-The enhancements in this version aim to minimize common image generation flaws such as blurriness, disproportion, noise, and low resolution, ensuring clear and anatomically accurate outputs.
+ComfyUI_00641_.png
+
+This model is an iteration of Stable Diffusion 1.5, modestly adapted for more refined generation of human figures, hands, and text. The training, while not groundbreaking, was conducted on a reasonable setup of four NVIDIA 3090 GPUs and spanned a modest 16 hours for 8 epochs.
+
+Its capabilities are somewhat specialized: it is more adept at creating images of people and textual elements, and less so with animals. This selective improvement makes it a suitable, though not exceptional, tool for tasks requiring detailed human figures or textual accuracy.
+
+The training process incorporated a set of 13,100 unique examples, leading to a dataset of 131,000 images. Each epoch dealt with 31,000 examples, and the model was trained with a batch size of 40. The optimization steps totaled 26,200, with a constant gradient accumulation of 1, emphasizing gradual and steady learning.
+
+The improvements, while not radical, aim to address common issues in image generation such as blurriness and disproportion. The goal was to achieve clearer, more anatomically coherent results, although the advancements are more evolutionary than revolutionary.
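The training figures quoted in the diff can be sanity-checked with a few lines of arithmetic. This is a minimal sketch, assuming (which the commit does not state outright) that every one of the 8 epochs passes over the full 131,000-image dataset at the stated batch size of 40 with gradient accumulation of 1:

```python
# Sanity check of the training figures quoted in the README diff.
# Assumption (not explicit in the commit): each epoch covers the
# full 131,000-image dataset.
dataset_size = 131_000   # captioned image pairs
epochs = 8
batch_size = 40          # total train batch size
grad_accum = 1           # gradient accumulation steps

# One optimizer step consumes batch_size * grad_accum examples.
optimizer_steps = dataset_size * epochs // (batch_size * grad_accum)
print(optimizer_steps)   # 26200 — matches the 26,200 steps reported
```

Under this assumption the reported 26,200 optimization steps fall out exactly, which suggests the step count was derived from the full dataset size rather than the per-epoch figure given in the text.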