---
license: apache-2.0
language:
- en
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
- art
---
# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements
- [x] 10k dataset proof of concept (completed): [model](https://huggingface.co/dataautogpt3/ProteusSigma)
- [ ] 200k+ dataset finetune (in testing/training)
- [ ] 12M dataset finetune (planned)
<style>
.logo {
width: 600px;
margin: 20px auto;
display: block;
background: linear-gradient(180deg, rgba(0,0,0,0) 0%, rgba(137,27,171,0.2) 100%);
padding: 20px;
}
.logo-text-main {
font-family: 'Arial Black', sans-serif;
fill: none;
stroke-width: 2;
stroke-linejoin: round;
animation: glow 2s ease-in-out infinite alternate;
}
.logo-text-outline {
stroke: #ff00ff;
stroke-width: 8;
stroke-linejoin: round;
fill: none;
}
.logo-text-fill {
fill: url(#retroGradient);
stroke: none;
}
.logo-text-shadow {
fill: none;
stroke: #00ffff;
stroke-width: 2;
filter: blur(3px);
}
.subtitle {
font-family: 'Arial', sans-serif;
fill: #00ffff;
font-size: 20px;
filter: drop-shadow(0 0 2px #00ffff);
}
@keyframes glow {
from {
filter: drop-shadow(0 0 2px #ff00ff)
drop-shadow(0 0 4px #ff00ff)
drop-shadow(0 0 6px #00ffff);
}
to {
filter: drop-shadow(0 0 4px #ff00ff)
drop-shadow(0 0 8px #ff00ff)
drop-shadow(0 0 12px #00ffff);
}
}
</style>
<svg class="logo" viewBox="0 0 800 200" xmlns="http://www.w3.org/2000/svg">
<defs>
<linearGradient id="retroGradient" x1="0%" y1="0%" x2="0%" y2="100%">
<stop offset="0%" style="stop-color:#ff00ff;stop-opacity:1" />
<stop offset="50%" style="stop-color:#ff71ce;stop-opacity:1" />
<stop offset="100%" style="stop-color:#b967ff;stop-opacity:1" />
</linearGradient>
<filter id="chrome">
<feGaussianBlur in="SourceAlpha" stdDeviation="2" result="blur" />
<feOffset in="blur" dx="2" dy="2" result="offsetBlur" />
<feMerge>
<feMergeNode in="offsetBlur" />
<feMergeNode in="SourceGraphic" />
</feMerge>
</filter>
</defs>
<!-- Main text with effects -->
<g transform="translate(400,100)" text-anchor="middle">
<!-- Shadow layer -->
<text class="logo-text-main logo-text-shadow"
x="-100" y="0" font-size="80px">Proteus</text>
<!-- Outline layer -->
<text class="logo-text-main logo-text-outline"
x="-100" y="0" font-size="80px">Proteus</text>
<!-- Gradient fill layer -->
<text class="logo-text-main logo-text-fill"
x="-100" y="0" font-size="80px">Proteus</text>
<!-- Sigma symbol -->
<text x="120" y="0"
font-size="80px"
fill="#00ffff"
filter="url(#chrome)">Σ</text>
<!-- Subtitle -->
<text class="subtitle" y="40">STABLE DIFFUSION XL</text>
</g>
<!-- Grid effect -->
<path d="M0 180 L800 180" stroke="#ff00ff" stroke-width="1" opacity="0.5"/>
<path d="M0 185 L800 185" stroke="#00ffff" stroke-width="1" opacity="0.3"/>
<path d="M0 190 L800 190" stroke="#ff00ff" stroke-width="1" opacity="0.2"/>
</svg>
## Example Outputs
<style>
.gallery {
display: flex;
flex-direction: row;
flex-wrap: wrap;
gap: 10px;
justify-content: center;
align-items: center;
width: 100%;
padding: 10px;
}
.gallery-item {
flex: 0 0 300px;
margin: 0;
position: relative;
}
.gallery-item.large { /* New class for larger item */
flex: 0 0 340px;
}
.gallery img {
width: 300px;
cursor: pointer;
transition: transform 0.2s;
border-radius: 8px;
}
.gallery-item.large img { /* Larger size for last image */
width: 512px;
}
.gallery img:hover {
transform: scale(1.05);
}
.caption {
position: absolute;
bottom: 0;
left: 0;
right: 0;
background: rgba(0, 0, 0, 0.4);
color: white;
padding: 8px;
font-size: 11px;
border-bottom-left-radius: 8px;
border-bottom-right-radius: 8px;
opacity: 0.7;
transition: opacity 0.3s ease;
}
.gallery-item:hover .caption {
opacity: 0.2;
}
.modal {
display: none;
position: fixed;
z-index: 1000;
top: 0;
left: 0;
width: 100%;
height: 100%;
background-color: rgba(0,0,0,0.9);
padding: 20px;
box-sizing: border-box;
}
.modal img {
max-width: 90%;
max-height: 90vh;
margin: auto;
display: block;
position: relative;
top: 50%;
transform: translateY(-50%);
}
.modal.active {
display: block;
}
</style>
<div class="gallery">
<div class="gallery-item">
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example.png" alt="Example Output 1" onclick="showImage(this.src)"/>
<div class="caption">A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.</div>
</div>
<div class="gallery-item">
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example2.png" alt="Example Output 2" onclick="showImage(this.src)"/>
<div class="caption">A Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room, real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1.2 ISO100 35MM</div>
</div>
<div class="gallery-item">
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example3.png" alt="Example Output 3" onclick="showImage(this.src)"/>
</div>
<div class="gallery-item">
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example4.png" alt="Example Output 4" onclick="showImage(this.src)"/>
</div>
<div class="gallery-item large"> <!-- Added 'large' class -->
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example5.png" alt="Example Output 5" onclick="showImage(this.src)"/>
</div>
</div>
<div class="modal" onclick="this.classList.remove('active')">
<img id="modal-img" src="" alt="Full size image"/>
</div>
<script>
function showImage(src) {
document.getElementById('modal-img').src = src;
document.querySelector('.modal').classList.add('active');
}
</script>
This model uses the combined Proteus and Mobius datasets.
## Recommended Inference Parameters
[ComfyUI workflow](https://huggingface.co/dataautogpt3/sdxl-ztsnr-sigma-10k/blob/main/ComfyUI-test10k.json)
- **Sampler:** `euler_ancestral` (best results with Euler Ancestral)
- **Scheduler:** `normal` (normal noise schedule)
- **Steps:** 28 (optimal step count)
- **CFG:** 7.5 (classifier-free guidance scale)
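
For readers who use `diffusers` rather than ComfyUI, the recommended settings above can be translated roughly as follows. This is a minimal sketch, not a tested recipe: it assumes the repository loads as a standard SDXL pipeline and that a CUDA device is available, and the helper names `to_diffusers_kwargs` and `generate` are made up for illustration. It does not reproduce the custom σ schedule, so the linked ComfyUI workflow remains the reference setup.

```python
# Settings copied from the recommended parameters above (ComfyUI KSampler names).
RECOMMENDED = {
    "sampler": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}


def to_diffusers_kwargs(settings):
    """Translate ComfyUI-style names into diffusers pipeline call arguments."""
    return {
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
    }


def generate(prompt, repo_id="dataautogpt3/ProteusSigma"):
    """Assumption: the repository can be loaded by diffusers as an SDXL pipeline."""
    # Imported lazily so the helpers above work without torch/diffusers installed.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
    # "euler_ancestral" in ComfyUI corresponds to this scheduler class in diffusers.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe.to("cuda")  # assumption: a CUDA GPU is present
    return pipe(prompt, **to_diffusers_kwargs(RECOMMENDED)).images[0]


if __name__ == "__main__":
    print(to_diffusers_kwargs(RECOMMENDED))
```

Calling `generate("a lich professor writing on a whiteboard")` would return a PIL image if the assumptions hold.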
## Model Details
- **Model Type:** SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
- **Base Model:** stabilityai/stable-diffusion-xl-base-1.0
- **Training Dataset:** 10,000 high-quality images
- **License:** Apache 2.0
## Key Features
- Zero Terminal SNR (ZTSNR) implementation
- Increased σ_max ≈ 20000.0 (NovelAI research)
- High-resolution coherence enhancements
- Tag-based CLIP weighting
- VAE improvements
### Technical Specifications
- **Noise Schedule**: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
- **Progressive Steps**: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
- **Resolution Scaling**: √(H×W)/1024
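
To see why σ_max ≈ 20000 is described as zero terminal SNR, the sketch below evaluates the signal-to-noise ratio at each listed σ. It assumes the usual variance-exploding convention x = x₀ + σ·ε with unit-variance data, under which SNR = 1/σ² and the equivalent cumulative alpha is 1/(1 + σ²). The function names are illustrative, and how the resolution factor is applied during training is not stated here, so the code only evaluates it.

```python
import math

# Sigma values from the "Progressive Steps" entry above.
SIGMAS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def snr(sigma):
    """SNR for x = x0 + sigma * noise, assuming unit-variance data and noise."""
    return 1.0 / sigma**2


def alpha_bar(sigma):
    """Equivalent DDPM cumulative alpha, from sigma^2 = (1 - a) / a."""
    return 1.0 / (1.0 + sigma**2)


def resolution_scale(height, width):
    """The sqrt(H*W)/1024 factor from the list above."""
    return math.sqrt(height * width) / 1024


if __name__ == "__main__":
    for s in SIGMAS:
        print(f"sigma={s:>10}  snr={snr(s):.3e}  alpha_bar={alpha_bar(s):.3e}")
    print(resolution_scale(1024, 1024), resolution_scale(1536, 1536))
```

At σ = 20000 the SNR is about 2.5e-9, which is effectively pure noise. The second listed value, 17.8, already gives an SNR near 3e-3.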
## Training Details
### Training Configuration
- **Learning Rate:** 1e-6
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Optimizer:** AdamW
- **Precision:** bfloat16
- **VAE Finetuning:** Enabled
- **VAE Learning Rate:** 1e-6
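
The configuration above can be captured in a small typed structure, which also makes the effective batch size explicit. This is only a sketch: the class and field names are illustrative and are not taken from the training repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Field names are illustrative; the values are the ones listed above.
    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self) -> int:
        """Number of samples contributing to each optimizer step."""
        return self.batch_size * self.gradient_accumulation_steps


if __name__ == "__main__":
    cfg = TrainingConfig()
    print(cfg.effective_batch_size)
```

With the listed values, each optimizer step sees a single sample.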
### CLIP Weight Configuration
- **Character Weight:** 1.5
- **Style Weight:** 1.2
- **Quality Weight:** 0.8
- **Setting Weight:** 1.0
- **Action Weight:** 1.1
- **Object Weight:** 0.9
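
This card does not describe how the category weights are combined during training. As one hypothetical reading, the sketch below attaches the listed weight to each tag by category and reduces a caption to a single scalar by averaging. The input format of `(tag, category)` pairs, the function names, and the averaging rule are all assumptions made for illustration.

```python
# Category weights from the list above.
TAG_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def weight_tags(tagged_caption, weights=TAG_WEIGHTS, default=1.0):
    """Pair each tag with its category weight; unknown categories get `default`."""
    return [(tag, weights.get(category, default)) for tag, category in tagged_caption]


def caption_weight(tagged_caption, weights=TAG_WEIGHTS, default=1.0):
    """Hypothetical reduction: the mean of the per-tag weights."""
    weighted = weight_tags(tagged_caption, weights, default)
    if not weighted:
        return default
    return sum(w for _, w in weighted) / len(weighted)


if __name__ == "__main__":
    caption = [("lich", "character"), ("digital illustration", "style"), ("whiteboard", "object")]
    print(weight_tags(caption))
    print(caption_weight(caption))
```

Other reductions, such as scaling individual token embeddings, are equally plausible. The linked training repository is the place to check the actual behaviour.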
## Performance Improvements
- 47% fewer artifacts at σ < 5.0
- Stable composition at σ > 12.4
- 31% better detail consistency
- Improved color accuracy
- Enhanced dark tone reproduction
## Repository and Resources
- **GitHub Repository:** [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
- **Training Code:** Available in the repository
- **Documentation:** [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
- **Issues and Support:** [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)
## Citation
```bibtex
@article{ossa2024improvements,
title={Improvements to SDXL in NovelAI Diffusion V3},
author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
journal={arXiv preprint arXiv:2409.15997v2},
year={2024}
}
```