title: DALL·E mini
emoji: 🥑
colorFrom: red
colorTo: purple
sdk: streamlit
app_file: app/app.py
pinned: false

DALL·E Mini

Generate images from a text prompt

Our logo was generated with DALL·E mini using the prompt "logo of an armchair in the shape of an avocado".

You can create your own pictures with the demo (temporarily in beta on Hugging Face Spaces but soon to be open to all).

How does it work?

Refer to our report.

Where does the logo come from?

The prompt "armchair in the shape of an avocado" was used by OpenAI when releasing DALL·E to illustrate the model's capabilities. Producing successful predictions for this prompt is a big milestone for us.

Development

This section is for adventurous people who want to look into the code.

Dependencies Installation

The root folder and its associated requirements.txt are only for the app.

You will find the necessary requirements in each sub-section.

You should create a new Python virtual environment and install the project dependencies inside it. Pass the -f (--find-links) option to pip so that it can find the appropriate libtpu required for TPU hardware.

Adapt the installation to your own hardware and follow each library's installation instructions.

$ pip install -r requirements.txt -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

If you use conda, you can create the virtual env and install everything using: conda env update -f environments.yaml
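
For readers not using conda, the virtual-environment steps described above can be sketched as a small shell function. This is only a minimal sketch: the names VENV_DIR, REQ_FILE and setup_env are illustrative assumptions and not part of the project, and the libtpu link only matters on TPU hardware.

```shell
#!/bin/sh
# Hypothetical helper: create a virtual environment and install the
# requirements of one sub-section. All names below are illustrative.

VENV_DIR="${VENV_DIR:-.venv}"              # where the environment is created (assumption)
REQ_FILE="${REQ_FILE:-requirements.txt}"   # requirements file of the sub-section you need

# Index page pip searches for libtpu builds (taken from the command above).
LIBTPU_LINKS="https://storage.googleapis.com/jax-releases/libtpu_releases.html"

setup_env() {
    # Create the environment; stop if that fails.
    python3 -m venv "$VENV_DIR" || return 1
    # Activate it in the current shell so pip installs into it.
    . "$VENV_DIR/bin/activate" || return 1
    # -f / --find-links lets pip locate libtpu; drop it on non-TPU hardware.
    pip install -r "$REQ_FILE" -f "$LIBTPU_LINKS"
}
```

After sourcing the file, calling setup_env from the folder that holds the relevant requirements.txt runs the three steps in order. Set REQ_FILE beforehand to point at a different requirements file.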

Training of VQGAN

The VQGAN was trained using taming-transformers.

We recommend using the latest version available.

Conversion of VQGAN to JAX

Use patil-suraj/vqgan-jax.

Training of Seq2Seq

Refer to the dev/seq2seq folder.

You can also adjust the sweep configuration file if you need to perform a hyperparameter search.

Inference

Refer to dev/notebooks/demo.

Authors

Acknowledgements