---
license: openrail
metrics:
- accuracy
- bertscore
- bleu
- bleurt
- brier_score
- cer
- character
- charcut_mt
- chrf
- code_eval
tags:
- text-to-image
- sygil-devs
- Muse
- Sygil-Muse
---

# Model Card for Sygil-Muse

This model is based on [Muse](https://muse-model.github.io/) and trained using [`lucidrains/muse-maskgit-pytorch`](https://github.com/lucidrains/muse-maskgit-pytorch).

# Model Details

This is a new model trained from scratch, based on [Muse](https://muse-model.github.io/) and trained on the [Imaginary Network Expanded Dataset](https://github.com/Sygil-Dev/INE-dataset), with the big advantage of allowing the use of multiple namespaces (labeled tags) to control various parts of the final generation. The use of namespaces (e.g. “species:seal” or “studio:dc”) stops the model from misinterpreting a seal as the singer Seal, or DC Comics as Washington, DC.

Note: As of right now, only the first VAE has been trained; we still need to train the Base and Super Resolution VAE for the model to be usable.

If you find my work useful, please consider supporting me on [GitHub Sponsors](https://github.com/sponsors/ZeroCool940711)!

This model is still in its infancy and is meant to be constantly updated and trained with more and more data as time goes by, so feel free to give us feedback on our [Discord Server](https://discord.gg/UjXFsf6mTu) or in the discussions section on Hugging Face. We plan to improve it with more and better tags in the future, so any help is always welcome.

[![Join the Discord Server](https://badgen.net/discord/members/fTtcufxyHQ?icon=discord)](https://discord.gg/UjXFsf6mTu)

## Available Checkpoints:

- #### Stable:
    - No stable version available right now.

- #### Beta:
    - [vae.2410000.pt](https://huggingface.co/Sygil/Sygil-Muse/blob/main/vae.2410000.pt): VAE trained from scratch for 2.41M steps.
    - [maskgit.1206000.pt](https://huggingface.co/Sygil/Sygil-Muse/blob/main/maskgit.1206000.pt): MaskGit trained on top of the VAE for 1.2M steps.

Note: Checkpoints under the Beta section are updated daily, or at least 3-4 times a week, which is usually the equivalent of 1-2 training sessions. This is done until they are stable enough to be moved into a proper release, usually every 1 or 2 weeks. While the beta checkpoints can be used as they are, only the latest version is kept on the repo; older checkpoints are removed when a new one is uploaded to keep the repo clean.

## Training

**Training Data**:
The model was trained on the following dataset:
- [Imaginary Network Expanded Dataset](https://github.com/Sygil-Dev/INE-dataset)

**Hardware and others**
- **Hardware:** 1 x Nvidia RTX 3090 GPU
- **Hours Trained:** NaN.
- **Gradient Accumulations:** 1
- **Batch:** 1
- **Learning Rate:** 7e-08
- **Warmup Steps:** 10,000
- **Resolution/Image Size:** First trained at a resolution of 64x64 and then increased to 512x512. Check the note below for more details on this.
- **Dimension:** 256
- **vq_codebook_size:** 256
- **Total Training Steps:** 2,410,000

Note: With Muse we can change the `image_size` (resolution) at any time without having to train the model from scratch again. This lets us first train the model at a low resolution, using the same `dim` and `vq_codebook_size`, to train faster, and then increase the `image_size` to a higher resolution once the model has trained enough; a rough sketch of this setup is shown below.
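
As a rough illustration of the setup above, here is a minimal, hedged sketch of how the VAE stage could be configured with the upstream [`lucidrains/muse-maskgit-pytorch`](https://github.com/lucidrains/muse-maskgit-pytorch) API, using the hyperparameters listed in this card. It is not the exact training script used for these checkpoints: argument names (e.g. `vq_codebook_size` vs. `codebook_size`), the warmup handling, and the dataset path are assumptions and may differ in the Sygil-Dev training code.

```python
# Hedged sketch only: approximates the VAE training stage described above using
# the upstream lucidrains/muse-maskgit-pytorch API. The real Sygil-Dev training
# scripts (and exact argument names) may differ; all paths are placeholders.
from muse_maskgit_pytorch import VQGanVAE, VQGanVAETrainer

# VAE sized to match this card: dim = 256, vq_codebook_size = 256.
vae = VQGanVAE(
    dim = 256,
    vq_codebook_size = 256,   # may be named `codebook_size` depending on the library version
)

# Optionally resume from the released Beta checkpoint (vae.2410000.pt above).
# vae.load('vae.2410000.pt')

# Stage 1: train at a low 64x64 resolution for speed, with batch size 1,
# gradient accumulation 1, and a 7e-8 learning rate as listed in this card.
# The 10,000 warmup steps are assumed to be handled by the training scripts'
# LR scheduler and are not shown here.
trainer = VQGanVAETrainer(
    vae = vae,
    folder = '/path/to/INE-dataset',   # placeholder path to the training images
    image_size = 64,
    batch_size = 1,
    grad_accum_every = 1,
    lr = 7e-8,
    num_train_steps = 2_410_000,
).cuda()

trainer.train()

# Stage 2: since the VAE is fully convolutional, training can later continue at
# a higher resolution by rebuilding the trainer with image_size = 512 while
# keeping the same `dim` and codebook size (and the weights trained so far).
```

The MaskGit checkpoint listed above (`maskgit.1206000.pt`) is then trained on top of this VAE; see the upstream repository for the corresponding `MaskGit`/`MaskGitTransformer` setup.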
**Developed by:** [ZeroCool](https://github.com/ZeroCool940711) at [Sygil-Dev](https://github.com/Sygil-Dev/)

## Community Contributions:
- [Benjamin Trom (limiteinductive)](https://huggingface.co/limiteinductive): Thanks for providing us with extra compute to speed up the training.
- [Chad Kensington (isamu isozaki)](https://github.com/isamu-isozaki/muse-maskgit-pytorch): Thanks for helping with the training scripts and improving the code for Muse.

# License

This model is open access and available to all, with a CreativeML Open RAIL++-M License further specifying rights and usage.