Add files using upload-large-folder tool

Files changed (3)

README.md +227 -46
checkpoints/flux/flux1-dev-fp16.safetensors +3 -0
diffusion_models/flux/flux1-dev-fp16.safetensors +3 -0

README.md CHANGED

@@ -1,3 +1,5 @@
 ---
 license: apache-2.0
 library_name: diffusers
@@ -7,98 +9,277 @@ tags:
   - flux
   - flux.1-dev
   - image-generation
-  - stable-diffusion
   - fp16
-  - full-precision
 base_model: black-forest-labs/FLUX.1-dev
 ---
-# FLUX.1-dev FP16 Model Collection
-This repository contains the FP16 (full precision) variant of the FLUX.1-dev text-to-image generation model. This streamlined collection includes only FP16 precision models optimized for quality.
 ## Model Description
-FLUX.1-dev is a state-of-the-art text-to-image generation model that produces high-quality images from text prompts. This FP16 collection provides the best quality output with full precision weights.
 ## Repository Contents
-**Total Size**: ~72GB
-### Diffusion Models
-- `diffusion_models/flux1-dev-fp16.safetensors` (23GB) - Main diffusion model
-- `checkpoints/flux1-dev-fp16.safetensors` (23GB) - Checkpoint format
-### Text Encoders
-- `text_encoders/clip_g.safetensors` (1.3GB) - CLIP-G text encoder
-- `text_encoders/clip_l.safetensors` (235MB) - CLIP-L text encoder
-- `text_encoders/clip-vit-large.safetensors` (1.6GB) - CLIP ViT-Large encoder
-- `text_encoders/t5xxl_fp16.safetensors` (9.2GB) - T5-XXL FP16 text encoder
-- `clip/t5xxl_fp16.safetensors` (9.2GB) - T5-XXL alternative path
-### Vision Models
-- `clip_vision/clip_vision_h.safetensors` (1.2GB) - CLIP Vision H model
 ## Hardware Requirements
-- **VRAM**: 16GB+ recommended for optimal performance
-- **Disk Space**: 72GB
-- **Precision**: FP16 (full precision)
-- **Memory**: 32GB+ system RAM recommended
-## Usage
 ```python
-from diffusers import FluxPipeline
 import torch
-# Load the FP16 model
-pipe = FluxPipeline.from_pretrained(
-    "path/to/flux-dev-fp16",
     torch_dtype=torch.float16
 )
 pipe.to("cuda")
 # Generate an image
 image = pipe(
-    prompt="a beautiful mountain landscape at sunset",
     num_inference_steps=50,
-    guidance_scale=7.5
 ).images[0]
 image.save("output.png")
 ```
-## Model Precision Trade-offs
-**FP16 (This Collection)**:
-- Best quality output
-- Full precision weights
-- Requires more VRAM (16GB+)
-- Slower inference compared to FP8
-- Recommended for: Quality-focused applications, professional use
-**Alternatives**:
-- FP8: ~50% smaller, faster inference, minimal quality loss
-- GGUF: Quantized variants for memory-constrained scenarios
 ## License
-This model is released under the Apache 2.0 license.
 ## Citation
 ```bibtex
-@software{flux1-dev,
   author = {Black Forest Labs},
-  title = {FLUX.1-dev},
   year = {2024},
-  publisher = {Hugging Face},
-  url = {https://huggingface.co/black-forest-labs/FLUX.1-dev}
 }
 ```
-## Model Card Contact
-For questions or issues with this model collection, please refer to the original FLUX.1-dev model card and repository.

+<!-- README Version: v1.0 -->
 ---
 license: apache-2.0
 library_name: diffusers
   - flux
   - flux.1-dev
   - image-generation
   - fp16
+  - diffusion
+  - stable-diffusion
 base_model: black-forest-labs/FLUX.1-dev
 ---
+# FLUX.1-dev FP16 Model Repository
+High-quality text-to-image generation model from Black Forest Labs in FP16 precision format. FLUX.1-dev delivers state-of-the-art image synthesis with exceptional prompt adherence, visual quality, and detail preservation.
 ## Model Description
+FLUX.1-dev is a 12 billion parameter rectified flow transformer capable of generating high-resolution images from text descriptions. This FP16 precision version maintains maximum quality with no quantization loss, ideal for professional workflows requiring the highest fidelity output.
+**Key Capabilities**:
+- Advanced text-to-image generation with complex prompt understanding
+- High-resolution output (up to 2048x2048 and beyond)
+- Excellent composition, lighting, and detail rendering
+- Strong prompt adherence and instruction following
+- Superior handling of text rendering within images
+- Support for various artistic styles and photorealistic generation
 ## Repository Contents
+This repository contains the complete FLUX.1-dev FP16 model organized by component type:
+```
+flux-dev-fp16/
+├── checkpoints/flux/
+│   └── flux1-dev-fp16.safetensors          (23 GB)  # Complete model checkpoint
+├── diffusion_models/flux/
+│   └── flux1-dev-fp16.safetensors          (23 GB)  # Diffusion model weights
+├── text_encoders/
+│   ├── clip_l.safetensors                   (235 MB) # CLIP-L text encoder
+│   ├── clip_g.safetensors                   (1.3 GB) # CLIP-G text encoder
+│   ├── clip-vit-large.safetensors          (1.6 GB) # CLIP ViT-Large encoder
+│   └── t5xxl_fp16.safetensors              (9.2 GB) # T5-XXL text encoder
+├── clip/
+│   └── t5xxl_fp16.safetensors              (9.2 GB) # T5-XXL encoder (alternate location)
+└── clip_vision/
+    └── clip_vision_h.safetensors           (1.2 GB) # CLIP vision encoder
+Total Repository Size: 72 GB
+```
+**Model Components**:
+- **Main Model**: `flux1-dev-fp16.safetensors` (23 GB) - Core diffusion transformer
+- **Text Encoders**: CLIP-L, CLIP-G, T5-XXL for advanced text understanding
+- **Vision Encoder**: CLIP vision model for image understanding capabilities
 ## Hardware Requirements
+**Minimum Requirements** (for basic inference):
+- **GPU**: NVIDIA RTX 4090 (24 GB VRAM) or equivalent
+- **RAM**: 32 GB system memory
+- **Storage**: 80 GB free disk space
+- **OS**: Windows 10/11, Linux (Ubuntu 20.04+)
+**Recommended Requirements** (for optimal performance):
+- **GPU**: NVIDIA A100 (40/80 GB VRAM) or RTX 6000 Ada
+- **RAM**: 64 GB system memory
+- **Storage**: NVMe SSD with 100+ GB free space
+- **OS**: Linux with CUDA 12.1+
+**Performance Notes**:
+- FP16 precision requires substantial VRAM (20+ GB for standard workflows)
+- Batch generation and high resolutions require additional memory
+- Consider FP8 or quantized versions for lower VRAM requirements
+- Generation time: ~10-30 seconds per image depending on hardware and resolution
+## Usage Examples
+### Basic Text-to-Image Generation (Diffusers)
 ```python
 import torch
+from diffusers import FluxPipeline
+# Load the FLUX.1-dev model
+pipe = FluxPipeline.from_single_file(
+    "E:/huggingface/flux-dev-fp16/checkpoints/flux/flux1-dev-fp16.safetensors",
     torch_dtype=torch.float16
 )
 pipe.to("cuda")
 # Generate an image
+prompt = "A serene mountain landscape at sunset, with dramatic clouds and golden light"
 image = pipe(
+    prompt=prompt,
     num_inference_steps=50,
+    guidance_scale=7.5,
+    height=1024,
+    width=1024
 ).images[0]
 image.save("output.png")
 ```
+### Advanced Generation with Text Encoders
+```python
+import torch
+from diffusers import FluxPipeline
+from transformers import CLIPTextModel, T5EncoderModel
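+# Load the text encoders separately for finer control.
+# Assumption: the single .safetensors files under text_encoders/ ship without a
+# config.json, so from_pretrained() cannot read that folder directly; the encoders
+# are loaded from the (gated) base repository instead.
+text_encoder = CLIPTextModel.from_pretrained(
+    "black-forest-labs/FLUX.1-dev",
+    subfolder="text_encoder",
+    torch_dtype=torch.float16
+)
+text_encoder_2 = T5EncoderModel.from_pretrained(
+    "black-forest-labs/FLUX.1-dev",
+    subfolder="text_encoder_2",
+    torch_dtype=torch.float16
+)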
+# Load FLUX pipeline with custom encoders
+pipe = FluxPipeline.from_single_file(
+    "E:/huggingface/flux-dev-fp16/checkpoints/flux/flux1-dev-fp16.safetensors",
+    text_encoder=text_encoder,
+    text_encoder_2=text_encoder_2,
+    torch_dtype=torch.float16
+)
+pipe.to("cuda")
+# Generate with advanced parameters
+# (recent diffusers releases apply negative_prompt only when true_cfg_scale > 1)
+image = pipe(
+    prompt="A highly detailed cyberpunk street scene with neon signs and rain",
+    negative_prompt="blurry, low quality, distorted",
+    num_inference_steps=75,
+    guidance_scale=8.0,
+    height=1536,
+    width=1024
+).images[0]
+image.save("cyberpunk_output.png")
+```
+### Memory-Efficient Generation
+```python
+import torch
+from diffusers import FluxPipeline
+# Enable memory optimizations
+pipe = FluxPipeline.from_single_file(
+    "E:/huggingface/flux-dev-fp16/checkpoints/flux/flux1-dev-fp16.safetensors",
+    torch_dtype=torch.float16
+)
+# Enable CPU offloading for lower VRAM usage
+pipe.enable_model_cpu_offload()
+# Enable attention slicing
+pipe.enable_attention_slicing(1)
+# Enable VAE slicing for high-resolution outputs
+pipe.enable_vae_slicing()
+# Generate image with optimizations
+image = pipe(
+    prompt="An artistic portrait with intricate details",
+    num_inference_steps=50,
+    height=1024,
+    width=1024
+).images[0]
+image.save("optimized_output.png")
+```
+## Model Specifications
+| Specification | Details |
+|--------------|---------|
+| **Architecture** | Rectified Flow Transformer |
+| **Parameters** | 12 billion |
+| **Precision** | FP16 (16-bit floating point) |
+| **Format** | SafeTensors |
+| **Base Resolution** | 1024x1024 (supports flexible resolutions) |
+| **Max Resolution** | 2048x2048+ (hardware dependent) |
+| **Text Encoders** | CLIP-L, T5-XXL (CLIP-G ships in this repository but is not used by FLUX.1-dev) |
+| **Inference Steps** | 20-100 (50 recommended) |
+| **Guidance Scale** | 7.0-9.0 (7.5 recommended) |
+**Supported Features**:
+- Text-to-image generation
+- Complex prompt understanding
+- Multi-aspect ratio generation
+- Img2img workflows
+- Inpainting and outpainting
+- ControlNet compatibility
+- LoRA fine-tuning support
+## Performance Tips & Optimization
+**Speed Optimization**:
+- Use 20-30 inference steps for faster generation (slight quality trade-off)
+- Enable `xformers` or `torch.compile()` for attention optimization
+- Reduce guidance scale to 6.0-7.0 for faster convergence
+- Use lower resolutions (512x512, 768x768) for draft iterations
+**Memory Optimization**:
+- Enable CPU offloading: `pipe.enable_model_cpu_offload()`
+- Enable attention slicing: `pipe.enable_attention_slicing()`
+- Enable VAE slicing: `pipe.enable_vae_slicing()`
+- Use sequential CPU offload for extreme memory constraints
+- Consider switching to FP8 version for 50% memory reduction
+**Quality Optimization**:
+- Use 50-75 inference steps for maximum quality
+- Guidance scale 7.5-8.5 for strong prompt adherence
+- Add negative prompts to avoid common artifacts
+- Use higher resolutions (1536x1024, 2048x2048) for detail
+- Experiment with different samplers (DPM++, Euler a)
+**Workflow Optimization**:
+- Pre-load models at startup to avoid repeated loading
+- Batch generate similar prompts for efficiency
+- Cache text encoder outputs for prompt variations
+- Use FP16 mixed precision training for fine-tuning
 ## License
+FLUX.1-dev is licensed under the **Apache License 2.0**.
+**Usage Terms**:
+- Free for personal, research, and commercial use
+- Attribution to Black Forest Labs appreciated
+- No warranty provided, use at your own risk
+- See official license documentation for full terms
+**Ethical Use Guidelines**:
+- Do not generate harmful, illegal, or unethical content
+- Respect copyright and intellectual property
+- Follow platform-specific content policies
+- Consider social impact of generated media
 ## Citation
+If you use FLUX.1-dev in your research or projects, please cite:
 ```bibtex
+@software{flux1_dev_2024,
+  title = {FLUX.1-dev: High-Quality Text-to-Image Generation},
   author = {Black Forest Labs},
   year = {2024},
+  url = {https://huggingface.co/black-forest-labs/FLUX.1-dev},
+  note = {FP16 precision version}
 }
 ```
+## Links & Resources
+**Official Resources**:
+- Original Model: [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
+- Black Forest Labs: [https://blackforestlabs.ai](https://blackforestlabs.ai)
+- Documentation: [FLUX.1 Technical Documentation](https://blackforestlabs.ai/docs)
+**Community & Support**:
+- Hugging Face Diffusers: [https://github.com/huggingface/diffusers](https://github.com/huggingface/diffusers)
+- Community Forum: [Hugging Face Forums](https://discuss.huggingface.co/)
+- ComfyUI Integration: [ComfyUI FLUX Nodes](https://github.com/comfyanonymous/ComfyUI)
+**Related Models**:
+- FLUX.1-schnell (Fast version): [black-forest-labs/FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell)
+- FLUX.1-dev FP8 (Memory efficient): Available in sibling repository
+---
+**Model Version**: FLUX.1-dev
+**Precision**: FP16
+**Repository Version**: v1.0
+**Last Updated**: 2025-10-13

checkpoints/flux/flux1-dev-fp16.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4610115bb0c89560703c892c59ac2742fa821e60ef5871b33493ba544683abd7
+size 23802932552
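
The `size` field of this pointer can serve as a rough cross-check on the README's "12 billion parameter" and "FP16" claims. The sketch below assumes every tensor is stored as a 16-bit value and ignores the small safetensors header, so it gives an upper-bound estimate rather than an exact count. The function name `approx_param_count` is illustrative and not part of any library.

```python
# Size in bytes, copied from the `size` field of the LFS pointer.
FILE_SIZE_BYTES = 23_802_932_552
BYTES_PER_FP16_VALUE = 2  # a 16-bit float occupies two bytes


def approx_param_count(size_bytes: int, bytes_per_value: int = BYTES_PER_FP16_VALUE) -> int:
    """Estimate the parameter count, assuming one uniform dtype and no header."""
    return size_bytes // bytes_per_value


params = approx_param_count(FILE_SIZE_BYTES)
print(f"~{params / 1e9:.1f} billion parameters")
print(f"~{FILE_SIZE_BYTES / 1e9:.1f} GB decimal, ~{FILE_SIZE_BYTES / 2**30:.1f} GiB binary")
```

Under these assumptions the estimate comes to roughly 11.9 billion values, which is consistent with a 12-billion-parameter model stored at two bytes per weight.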

diffusion_models/flux/flux1-dev-fp16.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4610115bb0c89560703c892c59ac2742fa821e60ef5871b33493ba544683abd7
+size 23802932552
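
Both pointers above carry the same `oid` and `size`, so the two paths reference byte-identical content. A downloaded copy can be checked against such a pointer with a short script. The following is a minimal sketch using only the Python standard library; the helper names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical, and the parser assumes the simple three-line `key value` layout shown in this diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate leading '+' diff markers
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def file_matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a truncated download
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a multi-gigabyte file never sits in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4610115bb0c89560703c892c59ac2742fa821e60ef5871b33493ba544683abd7
size 23802932552
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `file_matches_pointer("flux1-dev-fp16.safetensors", pointer)` on a local download (the file name here is only an example path) would then report whether the copy is complete and uncorrupted.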