yujiepan
/

FLUX.1-dev-tiny-random

Text-to-Image

Diffusers

Safetensors

FluxPipeline

Inference Endpoints


yujiepan committed on Dec 1, 2024

Commit

98efed6

verified ·

1 Parent(s): 29d660f

Update README.md


README.md +156 -194


@@ -2,197 +2,159 @@
 library_name: diffusers
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🧨 diffusers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

+# yujiepan/FLUX.1-dev-tiny-random
+This pipeline is intended for debugging. It is adapted from [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) with a smaller size and randomly initialized parameters.
+## Usage
+```python
+import torch
+from diffusers import FluxPipeline
+pipe = FluxPipeline.from_pretrained("yujiepan/FLUX.1-dev-tiny-random", torch_dtype=torch.bfloat16)
+pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU; remove this if you have enough GPU memory
+prompt = "A cat holding a sign that says hello world"
+image = pipe(
+    prompt,
+    height=1024,
+    width=1024,
+    guidance_scale=3.5,
+    num_inference_steps=50,
+    max_sequence_length=512,
+    generator=torch.Generator("cpu").manual_seed(0)
+).images[0]
+# image.save("flux-dev.png")
+```
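To get a feel for how much work the request above asks of even a tiny model, the sketch below estimates how many image tokens the transformer sees. It assumes the usual FLUX layout (an 8x spatial downscale in the VAE followed by 2x2 latent patch packing). The function name `packed_image_tokens` and its default arguments are illustrative and not part of diffusers.

```python
def packed_image_tokens(height: int, width: int,
                        vae_downscale: int = 8, patch: int = 2) -> int:
    """Estimate the image tokens fed to the transformer (assumed FLUX layout)."""
    # One token covers a (vae_downscale * patch)-pixel square on each side.
    stride = vae_downscale * patch
    if height % stride or width % stride:
        raise ValueError(f"height and width should be multiples of {stride}")
    return (height // stride) * (width // stride)


print(packed_image_tokens(1024, 1024))  # 4096
print(packed_image_tokens(256, 256))    # 256
```

Under these assumptions, dropping `height` and `width` to 256 shortens the image sequence by 16x, which may be enough for a quick smoke test.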
+## Code
+```python
+import importlib
+import torch
+import transformers
+import diffusers
+import rich
+def get_original_model_configs(
+    pipeline_cls: type[diffusers.FluxPipeline],
+    pipeline_id: str
+):
+    pipeline_config: dict[str, list[str]] = \
+        pipeline_cls.load_config(pipeline_id)
+    model_configs = {}
+    for subfolder, import_strings in pipeline_config.items():
+        if subfolder.startswith("_"):
+            continue
+        module = importlib.import_module(".".join(import_strings[:-1]))
+        cls = getattr(module, import_strings[-1])
+        if issubclass(cls, transformers.PreTrainedModel):
+            config_class: transformers.PretrainedConfig = cls.config_class
+            config = config_class.from_pretrained(
+                pipeline_id, subfolder=subfolder)
+            model_configs[subfolder] = config
+        elif issubclass(cls, diffusers.ModelMixin) and issubclass(cls, diffusers.ConfigMixin):
+            config = cls.load_config(pipeline_id, subfolder=subfolder)
+            model_configs[subfolder] = config
+        elif subfolder in ['scheduler', 'tokenizer', 'tokenizer_2', 'tokenizer_3']:
+            pass
+        else:
+            raise NotImplementedError(f"unknown {subfolder}: {import_strings}")
+    return model_configs
+def load_pipeline(pipeline_cls: type[diffusers.DiffusionPipeline], pipeline_id: str, model_configs: dict[str, dict]):
+    pipeline_config: dict[str, list[str]] = pipeline_cls.load_config(pipeline_id)
+    components = {}
+    for subfolder, import_strings in pipeline_config.items():
+        if subfolder.startswith("_"):
+            continue
+        module = importlib.import_module(".".join(import_strings[:-1]))
+        cls = getattr(module, import_strings[-1])
+        print("Loading:", ".".join(import_strings))
+        if issubclass(cls, transformers.PreTrainedModel):
+            config = model_configs[subfolder]
+            component = cls(config)
+        elif issubclass(cls, transformers.PreTrainedTokenizerBase):
+            component = cls.from_pretrained(pipeline_id, subfolder=subfolder)
+        elif issubclass(cls, diffusers.ModelMixin) and issubclass(cls, diffusers.ConfigMixin):
+            config = model_configs[subfolder]
+            component = cls.from_config(config)
+        elif issubclass(cls, diffusers.SchedulerMixin) and issubclass(cls, diffusers.ConfigMixin):
+            component = cls.from_pretrained(pipeline_id, subfolder=subfolder)
+        else:
+            raise NotImplementedError(f"unknown {subfolder}: {import_strings}")
+        components[subfolder] = component
+        if 'transformer' in component.__class__.__name__.lower():
+            print(component)
+    pipeline = pipeline_cls(**components)
+    return pipeline
+def get_pipeline():
+    torch.manual_seed(42)
+    pipeline_id = "black-forest-labs/FLUX.1-dev"
+    pipeline_cls = diffusers.FluxPipeline
+    model_configs = get_original_model_configs(pipeline_cls, pipeline_id)
+    HIDDEN_SIZE = 8
+    model_configs["text_encoder"].hidden_size = HIDDEN_SIZE
+    model_configs["text_encoder"].intermediate_size = HIDDEN_SIZE * 2
+    model_configs["text_encoder"].num_attention_heads = 2
+    model_configs["text_encoder"].num_hidden_layers = 2
+    model_configs["text_encoder"].projection_dim = HIDDEN_SIZE
+    model_configs["text_encoder_2"].d_model = HIDDEN_SIZE
+    model_configs["text_encoder_2"].d_ff = HIDDEN_SIZE * 2
+    model_configs["text_encoder_2"].d_kv = HIDDEN_SIZE // 2
+    model_configs["text_encoder_2"].num_heads = 2
+    model_configs["text_encoder_2"].num_layers = 2
+    model_configs["transformer"]["num_layers"] = 2
+    model_configs["transformer"]["num_single_layers"] = 4
+    model_configs["transformer"]["num_attention_heads"] = 2
+    model_configs["transformer"]["attention_head_dim"] = HIDDEN_SIZE
+    model_configs["transformer"]["pooled_projection_dim"] = HIDDEN_SIZE
+    model_configs["transformer"]["joint_attention_dim"] = HIDDEN_SIZE
+    model_configs["transformer"]["axes_dims_rope"] = (4, 2, 2)
+    # model_configs["transformer"]["caption_projection_dim"] = HIDDEN_SIZE
+    model_configs["vae"]["layers_per_block"] = 1
+    model_configs["vae"]["block_out_channels"] = [HIDDEN_SIZE] * 4
+    model_configs["vae"]["norm_num_groups"] = 2
+    model_configs["vae"]["latent_channels"] = 16
+    pipeline = load_pipeline(pipeline_cls, pipeline_id, model_configs)
+    return pipeline
+pipe = get_pipeline()
+pipe = pipe.to(torch.bfloat16)
+from pathlib import Path
+save_folder = '/tmp/yujiepan/FLUX.1-dev-tiny-random'
+Path(save_folder).mkdir(parents=True, exist_ok=True)
+pipe.save_pretrained(save_folder)
+pipe = diffusers.FluxPipeline.from_pretrained(save_folder, torch_dtype=torch.bfloat16)
+pipe.enable_model_cpu_offload()
+prompt = "A cat holding a sign that says hello world"
+image = pipe(
+    prompt,
+    height=1024,
+    width=1024,
+    guidance_scale=3.5,
+    num_inference_steps=50,
+    max_sequence_length=512,
+    generator=torch.Generator("cpu").manual_seed(0)
+).images[0]
+configs = get_original_model_configs(diffusers.FluxPipeline, save_folder)
+rich.print(configs)
+pipe.push_to_hub(save_folder.removeprefix('/tmp/'))
+```
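The shrunken values in `get_pipeline` have to stay mutually consistent for the components to instantiate. Below is a minimal sketch of the constraints they appear to satisfy. The rules are assumptions about how the components consume these fields: attention heads dividing the hidden size, rotary axis dimensions being even and summing to the head dimension, group norm divisibility, and 2x2 latent packing into the transformer input. `check_tiny_config` is a hypothetical helper, not a diffusers API, and `transformer_in_channels=64` is assumed to be the unchanged value from the original config.

```python
def check_tiny_config(hidden_size: int = 8,
                      num_heads: int = 2,
                      attention_head_dim: int = 8,
                      axes_dims_rope: tuple[int, ...] = (4, 2, 2),
                      norm_num_groups: int = 2,
                      block_out_channels: tuple[int, ...] = (8, 8, 8, 8),
                      latent_channels: int = 16,
                      transformer_in_channels: int = 64) -> list[str]:
    """Return the violated constraints; an empty list means consistent."""
    problems = []
    # Text encoder: the hidden size is split evenly across attention heads.
    if hidden_size % num_heads:
        problems.append("hidden_size is not divisible by num_heads")
    # Transformer: rotary axes are assumed to partition the head dimension.
    if sum(axes_dims_rope) != attention_head_dim:
        problems.append("axes_dims_rope does not sum to attention_head_dim")
    # Rotary embeddings work on (cos, sin) pairs, so each axis must be even.
    if any(d % 2 for d in axes_dims_rope):
        problems.append("axes_dims_rope contains an odd dimension")
    # VAE: group norm needs channels divisible by the number of groups.
    if any(c % norm_num_groups for c in block_out_channels):
        problems.append("block_out_channels not divisible by norm_num_groups")
    # 2x2 packing of latents multiplies the channel count by 4 (assumption).
    if latent_channels * 4 != transformer_in_channels:
        problems.append("latent_channels * 4 != transformer in_channels")
    return problems


print(check_tiny_config())  # []
```

If these assumptions hold, they would also explain why `latent_channels` stays at 16 while most other sizes shrink to `HIDDEN_SIZE`.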