metadata
license: apache-2.0
language:
  - en
base_model:
  - stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
  - art

SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

  • 10k dataset proof of concept (completed)

  • 200k+ dataset finetune (in testing/training)

  • 12M dataset finetune (planned)

Example Outputs

Combined Proteus and Mobius datasets.

Recommended Inference Parameters

ComfyUI workflow

"sampler": "euler_ancestral", # Best results with Euler Ancestral
"scheduler": "normal", # Normal noise schedule
"steps": 28, # Optimal step count
"cfg": 7.5 # Classifier-free guidance scale
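
For readers working in Python rather than ComfyUI, the settings above can be approximated with Hugging Face diffusers. This is a minimal sketch, not a tested recipe for this checkpoint: it assumes the weights load with `StableDiffusionXLPipeline`, that ComfyUI's "normal" scheduler corresponds to the scheduler's default spacing, and it does not configure the raised σ_max. The names `COMFY_SETTINGS`, `to_diffusers_kwargs`, `generate`, and `model_id` are hypothetical.

```python
# ComfyUI settings recommended in this card.
COMFY_SETTINGS = {
    "sampler": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}


def to_diffusers_kwargs(settings):
    """Translate the ComfyUI-style settings into diffusers call arguments."""
    if settings["sampler"] != "euler_ancestral":
        raise ValueError("this sketch only handles the euler_ancestral sampler")
    return {
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
    }


def generate(model_id, prompt, settings=COMFY_SETTINGS):
    """Load the pipeline and render one image (needs torch, diffusers, a GPU)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline

    # model_id is an assumption: pass the Hub repository id of this model.
    pipe = StableDiffusionXLPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Swap in Euler Ancestral, reusing the checkpoint's scheduler config.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to("cuda")
    return pipe(prompt, **to_diffusers_kwargs(settings)).images[0]
```

Calling `generate(model_id, "a prompt")` returns a PIL image. Keeping the settings translation in a separate function makes it easy to check without downloading any weights.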

Model Details

  • Model Type: SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
  • Base Model: stabilityai/stable-diffusion-xl-base-1.0
  • Training Dataset: 10,000 high-quality images
  • License: Apache 2.0

Key Features

  • Zero Terminal SNR (ZTSNR) implementation
  • Increased σ_max ≈ 20000.0 (NovelAI research)
  • High-resolution coherence enhancements
  • Tag-based CLIP weighting
  • VAE improvements
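
To make the link between ZTSNR and the large σ_max concrete, the sketch below converts a noise level into a signal-to-noise ratio. It assumes the common convention for unit-variance data, SNR(σ) = 1/σ² and ᾱ = 1/(1 + σ²); the function names are illustrative, and the σ_max ≈ 14.6146 used for comparison is the commonly cited SDXL default rather than a value from this card.

```python
def snr(sigma):
    """Signal-to-noise ratio at noise level sigma, assuming SNR = 1 / sigma^2."""
    return 1.0 / (sigma * sigma)


def alpha_bar(sigma):
    """Equivalent cumulative alpha, assuming sigma^2 = (1 - alpha_bar) / alpha_bar."""
    return 1.0 / (1.0 + sigma * sigma)


if __name__ == "__main__":
    # Compare the card's sigma_max and sigma_min with the assumed SDXL default.
    for label, sigma in [
        ("ProteusSigma sigma_max", 20000.0),
        ("assumed SDXL default sigma_max", 14.6146),
        ("sigma_min", 0.0292),
    ]:
        print(f"{label}: SNR={snr(sigma):.3e}, alpha_bar={alpha_bar(sigma):.6f}")
```

Under these assumptions the terminal SNR at σ = 20000 is on the order of 10⁻⁹, which is effectively zero, whereas the assumed default leaves a small but nonzero amount of signal at the last timestep.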

Technical Specifications

  • Noise Schedule: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
  • Progressive Steps: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
  • Resolution Scaling: √(H×W)/1024
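
The resolution scaling factor above is simple to evaluate for a given canvas. The sketch below only computes √(H×W)/1024 and lists the progressive σ values from this card; how the factor is applied to the schedule is not stated here, so the code does not guess. `resolution_scale` and `PROGRESSIVE_SIGMAS` are illustrative names.

```python
import math

# Progressive noise levels as listed in this card, from sigma_max to sigma_min.
PROGRESSIVE_SIGMAS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def resolution_scale(height, width, base=1024):
    """Return sqrt(H * W) / base, the scaling factor quoted in the card."""
    return math.sqrt(height * width) / base


if __name__ == "__main__":
    for h, w in [(1024, 1024), (1536, 1024), (2048, 2048)]:
        print(f"{h}x{w}: scale = {resolution_scale(h, w):.4f}")
```

A 1024×1024 canvas gives a factor of 1, so the formula is normalized to the SDXL base resolution.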

Training Details

Training Configuration

  • Learning Rate: 1e-6
  • Batch Size: 1
  • Gradient Accumulation Steps: 1
  • Optimizer: AdamW
  • Precision: bfloat16
  • VAE Finetuning: Enabled
  • VAE Learning Rate: 1e-6
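
The configuration above can be captured in a small structure for reuse in a training script. This is a sketch under assumptions: `TrainConfig` and `param_groups` are hypothetical names, and training the UNet and VAE in a single optimizer with separate parameter groups is one plausible arrangement, not something this card confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in this card."""
    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self):
        # Samples contributing to each optimizer step.
        return self.batch_size * self.gradient_accumulation_steps


def param_groups(config, unet_params, vae_params):
    """Build per-module parameter groups in the format PyTorch optimizers accept."""
    groups = [{"params": list(unet_params), "lr": config.learning_rate}]
    if config.finetune_vae:
        groups.append({"params": list(vae_params), "lr": config.vae_learning_rate})
    return groups
```

With real modules, the returned list can be passed to `torch.optim.AdamW`. With batch size 1 and no gradient accumulation, every optimizer step sees a single image.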

CLIP Weight Configuration

  • Character Weight: 1.5
  • Style Weight: 1.2
  • Quality Weight: 0.8
  • Setting Weight: 1.0
  • Action Weight: 1.1
  • Object Weight: 0.9
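
The category weights above amount to a lookup from tag category to a scalar. The sketch below shows only that lookup. How the weights enter the CLIP conditioning during training is not described in this card, and the example tags, the category labels attached to them, and the neutral default of 1.0 for unknown categories are all assumptions.

```python
# Per-category weights as listed in this card.
CATEGORY_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def weight_tags(tagged):
    """Map (tag, category) pairs to (tag, weight) pairs.

    Unknown categories fall back to 1.0, which is an assumption of this sketch.
    """
    return [(tag, CATEGORY_WEIGHTS.get(category, 1.0)) for tag, category in tagged]


if __name__ == "__main__":
    caption = [("knight", "character"), ("watercolor", "style"), ("forest", "setting")]
    for tag, weight in weight_tags(caption):
        print(f"{tag}: {weight}")
```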

Performance Improvements

  • 47% fewer artifacts at σ < 5.0
  • Stable composition at σ > 12.4
  • 31% better detail consistency
  • Improved color accuracy
  • Enhanced dark tone reproduction

Repository and Resources

Citation

@article{ossa2024improvements,
  title={Improvements to SDXL in NovelAI Diffusion V3},
  author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
  journal={arXiv preprint arXiv:2409.15997v2},
  year={2024}
}