	---
	license: apache-2.0
	language:
	- en
	base_model:
	- stabilityai/stable-diffusion-xl-base-1.0
	pipeline_tag: text-to-image
	tags:
	- art
	---
	# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

	- [x] 10k dataset proof of concept (completed): [dataautogpt3/ProteusSigma](https://huggingface.co/dataautogpt3/ProteusSigma)

	- [ ] 200k+ dataset finetune (in testing/training)

	- [ ] 12M dataset finetune (planned)

	<style>
	.logo {
	width: 600px;
	margin: 20px auto;
	display: block;
	background: linear-gradient(180deg, rgba(0,0,0,0) 0%, rgba(137,27,171,0.2) 100%);
	padding: 20px;
	}

	.logo-text-main {
	font-family: 'Arial Black', sans-serif;
	fill: none;
	stroke-width: 2;
	stroke-linejoin: round;
	animation: glow 2s ease-in-out infinite alternate;
	}

	.logo-text-outline {
	stroke: #ff00ff;
	stroke-width: 8;
	stroke-linejoin: round;
	fill: none;
	}

	.logo-text-fill {
	fill: url(#retroGradient);
	stroke: none;
	}

	.logo-text-shadow {
	fill: none;
	stroke: #00ffff;
	stroke-width: 2;
	filter: blur(3px);
	}

	.subtitle {
	font-family: 'Arial', sans-serif;
	fill: #00ffff;
	font-size: 20px;
	filter: drop-shadow(0 0 2px #00ffff);
	}

	@keyframes glow {
	from {
	filter: drop-shadow(0 0 2px #ff00ff)
	drop-shadow(0 0 4px #ff00ff)
	drop-shadow(0 0 6px #00ffff);
	}
	to {
	filter: drop-shadow(0 0 4px #ff00ff)
	drop-shadow(0 0 8px #ff00ff)
	drop-shadow(0 0 12px #00ffff);
	}
	}
	</style>

	<svg class="logo" viewBox="0 0 800 200" xmlns="http://www.w3.org/2000/svg">
	<defs>
	<linearGradient id="retroGradient" x1="0%" y1="0%" x2="0%" y2="100%">
	<stop offset="0%" style="stop-color:#ff00ff;stop-opacity:1" />
	<stop offset="50%" style="stop-color:#ff71ce;stop-opacity:1" />
	<stop offset="100%" style="stop-color:#b967ff;stop-opacity:1" />
	</linearGradient>
	<filter id="chrome">
	<feGaussianBlur in="SourceAlpha" stdDeviation="2" result="blur" />
	<feOffset in="blur" dx="2" dy="2" result="offsetBlur" />
	<feMerge>
	<feMergeNode in="offsetBlur" />
	<feMergeNode in="SourceGraphic" />
	</feMerge>
	</filter>
	</defs>

	<!-- Main text with effects -->
	<g transform="translate(400,100)" text-anchor="middle">
	<!-- Shadow layer -->
	<text class="logo-text-main logo-text-shadow"
	x="-100" y="0" font-size="80px">Proteus</text>

	<!-- Outline layer -->
	<text class="logo-text-main logo-text-outline"
	x="-100" y="0" font-size="80px">Proteus</text>

	<!-- Gradient fill layer -->
	<text class="logo-text-main logo-text-fill"
	x="-100" y="0" font-size="80px">Proteus</text>

	<!-- Sigma symbol -->
	<text x="120" y="0"
	font-size="80px"
	fill="#00ffff"
	filter="url(#chrome)">Σ</text>

	<!-- Subtitle -->
	<text class="subtitle" y="40">STABLE DIFFUSION XL</text>
	</g>

	<!-- Grid effect -->
	<path d="M0 180 L800 180" stroke="#ff00ff" stroke-width="1" opacity="0.5"/>
	<path d="M0 185 L800 185" stroke="#00ffff" stroke-width="1" opacity="0.3"/>
	<path d="M0 190 L800 190" stroke="#ff00ff" stroke-width="1" opacity="0.2"/>
	</svg>

	## Example Outputs

	<style>
	.gallery {
	display: flex;
	flex-direction: row;
	flex-wrap: wrap;
	gap: 10px;
	justify-content: center;
	align-items: center;
	width: 100%;
	padding: 10px;
	}

	.gallery-item {
	flex: 0 0 300px;
	margin: 0;
	position: relative;
	}

	.gallery-item.large { /* New class for larger item */
	flex: 0 0 340px;
	}

	.gallery img {
	width: 300px;
	cursor: pointer;
	transition: transform 0.2s;
	border-radius: 8px;
	}

	.gallery-item.large img { /* Larger size for last image */
	width: 512px;
	}

	.gallery img:hover {
	transform: scale(1.05);
	}

	.caption {
	position: absolute;
	bottom: 0;
	left: 0;
	right: 0;
	background: rgba(0, 0, 0, 0.4);
	color: white;
	padding: 8px;
	font-size: 11px;
	border-bottom-left-radius: 8px;
	border-bottom-right-radius: 8px;
	opacity: 0.7;
	transition: opacity 0.3s ease;
	}

	.gallery-item:hover .caption {
	opacity: 0.2;
	}

	.modal {
	display: none;
	position: fixed;
	z-index: 1000;
	top: 0;
	left: 0;
	width: 100%;
	height: 100%;
	background-color: rgba(0,0,0,0.9);
	padding: 20px;
	box-sizing: border-box;
	}

	.modal img {
	max-width: 90%;
	max-height: 90vh;
	margin: auto;
	display: block;
	position: relative;
	top: 50%;
	transform: translateY(-50%);
	}

	.modal.active {
	display: block;
	}
	</style>

	<div class="gallery">
	<div class="gallery-item">
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example.png" alt="Example Output 1" onclick="showImage(this.src)"/>
	<div class="caption">A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.</div>
	</div>
	<div class="gallery-item">
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example2.png" alt="Example Output 2" onclick="showImage(this.src)"/>
	<div class="caption">A Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room, real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1.2 ISO100 35MM</div>
	</div>
	<div class="gallery-item">
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example3.png" alt="Example Output 3" onclick="showImage(this.src)"/>
	</div>
	<div class="gallery-item">
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example4.png" alt="Example Output 4" onclick="showImage(this.src)"/>
	</div>
	<div class="gallery-item large"> <!-- Added 'large' class -->
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example5.png" alt="Example Output 5" onclick="showImage(this.src)"/>
	</div>
	</div>

	<div class="modal" onclick="this.classList.remove('active')">
	<img id="modal-img" src="" alt="Full size image"/>
	</div>

	<script>
	function showImage(src) {
	document.getElementById('modal-img').src = src;
	document.querySelector('.modal').classList.add('active');
	}
	</script>


	# Combined Proteus and Mobius datasets.

	# Recommended Inference Parameters


	[ComfyUI workflow](https://huggingface.co/dataautogpt3/sdxl-ztsnr-sigma-10k/blob/main/ComfyUI-test10k.json)

	- Sampler: `euler_ancestral` (best results with Euler Ancestral)
	- Scheduler: `normal` (normal noise schedule)
	- Steps: 28 (recommended step count)
	- CFG: 7.5 (classifier-free guidance scale)
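
Outside ComfyUI, the settings above map onto a `diffusers` pipeline roughly as follows. This is a minimal sketch rather than a tested recipe: it assumes the repository loads as a standard `StableDiffusionXLPipeline`, that a CUDA GPU is available, and that ComfyUI's `normal` scheduler corresponds to the scheduler config stored with the model. `INFERENCE_PARAMS` and `generate` are hypothetical names introduced for illustration.

```python
# Recommended settings from this model card, collected in one place.
INFERENCE_PARAMS = {
    "sampler": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}


def generate(prompt, model_id="dataautogpt3/ProteusSigma", device="cuda"):
    """Generate one image with the recommended settings (hypothetical helper)."""
    # Heavy imports live inside the function so the settings dict can be
    # reused without torch or diffusers installed.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline

    # Assumption: the repo loads as a standard SDXL diffusers pipeline.
    pipe = StableDiffusionXLPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Swap in Euler Ancestral, reusing the scheduler config stored with the model.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)

    result = pipe(
        prompt,
        num_inference_steps=INFERENCE_PARAMS["steps"],
        guidance_scale=INFERENCE_PARAMS["cfg"],
    )
    return result.images[0]
```

Calling `generate("a lighthouse at dusk").save("out.png")` would write one image. If the output looks washed out or noisy, the linked ComfyUI workflow remains the reference setup.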

	## Model Details

	- Model Type: SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
	- Base Model: stabilityai/stable-diffusion-xl-base-1.0
	- Training Dataset: 10,000 high-quality images
	- License: Apache 2.0

	## Key Features

	- Zero Terminal SNR (ZTSNR) implementation
	- σ_max raised to ≈ 20000.0 (following NovelAI's research)
	- High-resolution coherence enhancements
	- Tag-based CLIP weighting
	- VAE improvements
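
To see why such a large σ_max matters for ZTSNR, it helps to look at the signal-to-noise ratio at the noisiest step. The sketch below assumes the common sigma parameterization `x_sigma = x_0 + sigma * noise` with unit-variance data, where SNR = 1/σ². The stock SDXL value of about 14.6146 is taken from the standard SDXL schedule, not from this card, and `snr` is a hypothetical helper.

```python
import math


def snr(sigma: float) -> float:
    """SNR for x_sigma = x_0 + sigma * noise, assuming unit-variance data."""
    return 1.0 / (sigma ** 2)


SDXL_DEFAULT_SIGMA_MAX = 14.6146  # stock SDXL schedule (assumption, not from this card)
ZTSNR_SIGMA_MAX = 20000.0         # value listed in this card

for name, sigma in [("SDXL default", SDXL_DEFAULT_SIGMA_MAX), ("ZTSNR", ZTSNR_SIGMA_MAX)]:
    ratio = snr(sigma)
    # Print the terminal SNR and its natural log for each schedule.
    print(f"{name:>12}: sigma_max={sigma:>10.4f}  SNR={ratio:.3e}  log-SNR={math.log(ratio):.2f}")
```

Under these assumptions the stock schedule ends at an SNR of about 0.0047, so some image signal survives at the last training step, while σ_max = 20000 gives 2.5e-9, which is effectively zero.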

	### Technical Specifications

	- Noise Schedule: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
	- Progressive Steps: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
	- Resolution Scaling: √(H×W)/1024
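
The resolution factor above is easy to compute directly. The sketch below only evaluates √(H×W)/1024 and checks that the listed sigma steps decrease monotonically. The card does not say which quantity the factor multiplies, so the code makes no assumption about that, and `resolution_scale` and `is_strictly_decreasing` are hypothetical helper names.

```python
import math

# Progressive sigma steps as listed in this card.
SIGMAS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def resolution_scale(height: int, width: int, base: int = 1024) -> float:
    """Return sqrt(H * W) / base, the scaling factor from the card."""
    return math.sqrt(height * width) / base


def is_strictly_decreasing(values) -> bool:
    """True if every value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


for h, w in [(512, 512), (1024, 1024), (832, 1216), (2048, 2048)]:
    print(f"{h}x{w}: scale={resolution_scale(h, w):.3f}")

print("sigmas strictly decreasing:", is_strictly_decreasing(SIGMAS))
```

A 1024×1024 canvas gives a factor of exactly 1.0 and a 2048×2048 canvas gives 2.0, so the factor tracks side length rather than pixel count.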

	## Training Details

	### Training Configuration
	- Learning Rate: 1e-6
	- Batch Size: 1
	- Gradient Accumulation Steps: 1
	- Optimizer: AdamW
	- Precision: bfloat16
	- VAE Finetuning: Enabled
	- VAE Learning Rate: 1e-6
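
For scripting, the configuration above can be captured in a small dataclass. This is an illustrative sketch only: the field names and the `TrainingConfig` class are hypothetical and are not taken from the training repository, and the number of epochs is not stated in this card, so it is left as a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from this card; field names are hypothetical."""

    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer update.
        return self.batch_size * self.gradient_accumulation_steps

    def optimizer_steps(self, num_samples: int, epochs: int = 1) -> int:
        # Ceiling division: a partial final batch still costs one step.
        per_epoch = -(-num_samples // self.effective_batch_size)
        return per_epoch * epochs


cfg = TrainingConfig()
print(cfg.effective_batch_size, cfg.optimizer_steps(10_000))
```

With batch size 1 and no gradient accumulation, each of the 10,000 training images produces its own optimizer update, so one pass over the dataset is 10,000 steps.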

	### CLIP Weight Configuration
	- Character Weight: 1.5
	- Style Weight: 1.2
	- Quality Weight: 0.8
	- Setting Weight: 1.0
	- Action Weight: 1.1
	- Object Weight: 0.9
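
This card does not describe how the per-category weights are combined for a caption. As one possible reading, the sketch below averages the weights of a caption's tag categories into a single multiplier. Both the averaging rule and the names `CLIP_WEIGHTS` and `caption_weight` are assumptions for illustration; the training repository may do this differently.

```python
# Per-category weights as listed in this card.
CLIP_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def caption_weight(tag_categories, default: float = 1.0) -> float:
    """Average the weights of a caption's tag categories (assumed rule)."""
    if not tag_categories:
        # No tags: fall back to a neutral weight.
        return default
    # Unknown categories also fall back to the neutral weight.
    weights = [CLIP_WEIGHTS.get(category, default) for category in tag_categories]
    return sum(weights) / len(weights)


print(caption_weight(["character", "style", "quality"]))
```

Under this rule a caption dominated by character and style tags is weighted above 1.0, while one made up mostly of quality and object tags falls below it.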


	## Performance Improvements

	- 47% fewer artifacts at σ < 5.0
	- Stable composition at σ > 12.4
	- 31% better detail consistency
	- Improved color accuracy
	- Enhanced dark tone reproduction

	## Repository and Resources

	- GitHub Repository: [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
	- Training Code: Available in the repository
	- Documentation: [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
	- Issues and Support: [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)

	## Citation

	```bibtex
	@article{ossa2024improvements,
	title={Improvements to SDXL in NovelAI Diffusion V3},
	author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
	journal={arXiv preprint arXiv:2409.15997v2},
	year={2024}
	}
	```