Add pipeline tag, library name, project page and sample usage
This PR enhances the model card by:
- Adding `pipeline_tag: unconditional-image-generation` for better discoverability of the model's functionality.
- Specifying `library_name: transformers` to enable the "how to use" widget, matching the inference code in the official GitHub repository; the resulting metadata header is sketched right after this list.
- Including a link to the project website: `https://imagination-research.github.io/distilled-decoding`.
- Adding a "Sample Usage" section directly from the official GitHub repository for clear, actionable instructions on how to get started with the model.
README.md CHANGED

@@ -1,6 +1,9 @@
 ---
 license: mit
+pipeline_tag: unconditional-image-generation
+library_name: transformers
 ---
+
 # Model Card for Distilled Decoding
 
 ## Model Details
@@ -34,6 +37,7 @@ We may release the text-to-image distilled decoding models in the future.
 ### Model Sources
 * Repository: https://huggingface.co/microsoft/distilled_decoding
 * Paper: https://arxiv.org/abs/2412.17153
+* Project Page: https://imagination-research.github.io/distilled-decoding
 
 ### Red Teaming
 Our models generate images based on predefined categories from ImageNet. Some of the ImageNet categories contain sensitive names such as "assault rifle". This test is designed to assess if the model could produce sensitive images from such categories.
@@ -76,6 +80,30 @@ These models are trained to mimic the generation quality of pretrained VAR and L
 ### Recommendations
 While these models are designed to generate images in one step, they also support multi-step sampling to enhance image quality. When the one-step sampling quality is not satisfactory, users are recommended to enable multi-step sampling.
 
+## Sample Usage
+
+You can use the `transformers` library to load and generate images with the model:
+
+```python
+import torch
+from transformers import AutoTokenizer
+from distilled_decoding.models.modeling_var_dd import VAR_DD
+
+# Load the DD model
+model_name = "microsoft/distilled_decoding"
+model = VAR_DD.from_pretrained(
+    model_name,
+    subfolder="VAR-DD-d16",
+    torch_dtype=torch.float16
+).cuda()
+tokenizer = AutoTokenizer.from_pretrained(model_name, subfolder="tokenizer")
+
+# Generate an ImageNet image (class-conditional)
+labels = torch.tensor([483]).cuda()  # Golden retriever label
+generated_img = model.generate(labels=labels, num_inference_steps=1)
+generated_img.save("golden_retriever.png")
+```
+
 ## How to Get Started with the Model
 
 Please see the GitHub repo for instructions: https://github.com/microsoft/distilled_decoding
@@ -121,4 +149,4 @@ Overall, the results demonstrate that our Distilled Decoding models are able to
 ## Model Card Contact
 We welcome feedback and collaboration from our audience. If you have suggestions, questions, or observe unexpected/offensive behavior in our technology, please contact us at Zinan Lin, zinanlin@microsoft.com.
 
-If the team receives reports of undesired behavior or identifies issues independently,
+If the team receives reports of undesired behavior or identifies issues independently, we will update this repository with appropriate mitigations.