boris meg HF staff committed on
Commit
5570ace
1 Parent(s): d37887a

Creating initial model card (#24)

- Creating initial model card (40990509670fb82e67beffb03f11a26191ea4436)


Co-authored-by: Margaret Mitchell <meg@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +42 -0
README.md ADDED
@@ -0,0 +1,42 @@
+
+
+ # Model Card: DALL·E Mini
+
+ This model is a reproduction of OpenAI’s DALL·E. See [the project report](https://wandb.ai/dalle-mini/dalle-mini/reports/DALL-E-Mini-Explained-with-Demo--Vmlldzo4NjIxODA) for project-specific details. Below, we include the original DALL·E model card, available on [the OpenAI GitHub](https://github.com/openai/DALL-E/edit/master/model_card.md).
+
+ ## Model Details
+
+ The dVAE was developed by researchers at OpenAI to reduce the memory footprint of the transformer trained on the
+ text-to-image generation task. The details involved in training the dVAE are described in [the paper][dalle_paper]. This
+ model card describes the first version of the model, released in February 2021. The model consists of a convolutional
+ encoder and decoder whose architectures are described in [`dall_e/encoder.py`](dall_e/encoder.py) and [`dall_e/decoder.py`](dall_e/decoder.py), respectively.
+ For questions or comments about the models or the code release, please file a GitHub issue.
+
+ ## Model Use
+
+ ### Intended Use
+
+ The model is intended for others to use for training their own generative models.
+
+ ### Out-of-Scope Use Cases
+
+ This model is inappropriate for high-fidelity image processing applications. We also do not recommend its use as a
+ general-purpose image compressor.
+
+ ## Training Data
+
+ The model was trained on publicly available text-image pairs collected from the internet. This data consists partly of
+ [Conceptual Captions][cc] and a filtered subset of [YFCC100M][yfcc100m]. We used a subset of the filters described in
+ [Sharma et al.][cc_paper] to construct this dataset; further details are described in [our paper][dalle_paper]. We will
+ not be releasing the dataset.
+
+ ## Performance and Limitations
+
+ The heavy compression from the encoding process results in a noticeable loss of detail in the reconstructed images. This
+ makes the model inappropriate for applications that require fine-grained details of the image to be preserved.
+
+ [dalle_paper]: https://arxiv.org/abs/2102.12092
+ [cc]: https://ai.google.com/research/ConceptualCaptions
+ [cc_paper]: https://www.aclweb.org/anthology/P18-1238/
+ [yfcc100m]: http://projects.dfki.uni-kl.de/yfcc100m/
+
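
The "heavy compression" noted under Performance and Limitations can be made concrete with a rough back-of-the-envelope sketch. The figures below are assumptions taken from the linked DALL·E paper rather than from this model card: a 256×256 RGB input, a 32×32 grid of image tokens, and a codebook of 8192 entries. The function name `compression_summary` and the 8-bit-per-channel input are illustrative choices, not part of the released code.

```python
# Back-of-the-envelope estimate of how much the dVAE compresses an image.
# All constants are assumptions drawn from the DALL·E paper, not from this card.
IMAGE_SIDE = 256        # assumed input resolution (pixels per side)
CHANNELS = 3            # RGB
BITS_PER_CHANNEL = 8    # assumed 8-bit colour depth
GRID_SIDE = 32          # assumed token grid size per side
CODEBOOK_SIZE = 8192    # assumed number of discrete codes


def compression_summary(image_side=IMAGE_SIDE, channels=CHANNELS,
                        bits_per_channel=BITS_PER_CHANNEL,
                        grid_side=GRID_SIDE, codebook_size=CODEBOOK_SIZE):
    """Compare the raw pixel representation with the token representation."""
    pixel_values = image_side * image_side * channels
    tokens = grid_side * grid_side
    # Bits needed to index one codebook entry (integer-only, no float log).
    bits_per_token = (codebook_size - 1).bit_length()
    raw_bits = pixel_values * bits_per_channel
    token_bits = tokens * bits_per_token
    return {
        "pixel_values": pixel_values,
        "tokens": tokens,
        "bits_per_token": bits_per_token,
        # How many times shorter the sequence seen by the transformer is.
        "context_reduction": pixel_values // tokens,
        # How many times fewer bits the token grid carries.
        "bit_ratio": raw_bits / token_bits,
    }


if __name__ == "__main__":
    for name, value in compression_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the token grid has 192 times fewer positions than the pixel array and roughly 118 times fewer bits, which is consistent with the loss of fine detail described in the card.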