license: mit

Details on the code used to produce and use this model are available at:

https://github.com/schrum2/MarioDiffusion

That repo has instructions to check out this model and apply it to the generation of Super Mario Bros. level scenes. There is also an interactive GUI for constructing complete levels out of model-generated scenes.

This model uses https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1 as its text embedding model; the resulting embeddings condition a diffusion model that generates Mario levels. Mario scene captions contain multiple period-separated phrases, and each phrase is embedded as its own sentence embedding vector, so the diffusion model receives one vector per phrase as text conditioning.
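The per-phrase embedding described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names `split_caption` and `embed_phrases` are hypothetical, and it assumes the `sentence-transformers` package is installed. See the GitHub repo for the real pipeline.

```python
from typing import List


def split_caption(caption: str) -> List[str]:
    """Split a period-separated caption into non-empty, stripped phrases."""
    return [phrase.strip() for phrase in caption.split(".") if phrase.strip()]


def embed_phrases(phrases: List[str]):
    """Return one embedding vector per phrase as a 2D array (one row per phrase)."""
    # Imported lazily so split_caption stays usable without the dependency.
    from sentence_transformers import SentenceTransformer

    # Assumption: loading the embedding model directly by its Hub name;
    # the repository may load and cache it differently.
    model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-cos-v1")
    return model.encode(phrases)
```

For example, `embed_phrases(split_caption("full floor. one enemy. two pipes."))` would produce three vectors, one per phrase. The caption text here is made up for illustration; the actual caption vocabulary is defined in the repository.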

This model uses absence-style captions, which can be difficult to use correctly. For a model that uses sentence-transformers/multi-qa-MiniLM-L6-cos-v1 with regular captions and multiple sentence embeddings, see https://huggingface.co/schrum2/MarioDiffusion-MiniLM-multiple-regular0. For a model that embeds the entire caption as a single vector, see https://huggingface.co/schrum2/MarioDiffusion-MiniLM-single-regular0. For a model that uses a simple token-based transformer for text embedding, see https://huggingface.co/schrum2/MarioDiffusion-MLM-regular0.