---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- text-to-image
- image-to-image
- anime
datasets:
- Xiao215/pixiv-image-with-caption
metrics:
- Custom (Text-to-Image Quality)
model-index:
- name: LoRAniDiff
  results:
  - task:
      name: Text-to-Image Generation
      type: text-to-image-generation
    dataset:
      name: "Pixiv Image with Caption"
      type: Xiao215/pixiv-image-with-caption
    metrics:
    - name: Custom (Text-to-Image Quality)
      type: custom
      value: "To be evaluated"
---

# LoRAniDiff Model Card

## Model Details

- **Model Name**: LoRAniDiff
- **Model Type**: A model based on the Stable Diffusion architecture, fine-tuned with LoRA (Low-Rank Adaptation) to specialize in anime-style imagery.
- **Training Data**: LoRAniDiff was fine-tuned on the "Pixiv Image with Caption" dataset, available at [pixiv-image-with-caption](https://huggingface.co/datasets/Xiao215/pixiv-image-with-caption).

## Intended Use

LoRAniDiff generates anime-style artwork through text-to-image and image-to-image transformations. It is designed for enthusiasts and creators in the anime community who want to explore creative ideas and artistic expression.

### Use Restrictions

This model is intended for **non-commercial use only** and should be used as a tool for fun, personal projects, and artistic exploration within the anime domain.

## Primary Applications

- **Text-to-Image Generation**: Transform descriptive text into detailed anime-style artwork.
- **Image-to-Image Translation**: Adapt existing images to new contexts or concepts described by text, staying within the anime art style.

## Model Architecture

LoRAniDiff uses the Stable Diffusion architecture, which generates detailed images from textual descriptions. Fine-tuning with LoRA specializes the model in producing anime-style imagery.

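To make the LoRA idea concrete, here is a minimal pure-Python sketch of the standard low-rank update, in which a frozen weight matrix `W` is adapted as `W + (alpha / r) * B @ A`. The matrix sizes, the rank `r`, the scaling factor `alpha`, and all helper names are illustrative assumptions; this card does not state which layers were adapted or which hyperparameters were used for LoRAniDiff.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(A, B, alpha):
    """Return the low-rank update (alpha / r) * B @ A, where r is the rank."""
    r = len(A)  # A has shape (r, d_in)
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


def adapted_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A without modifying the frozen W."""
    delta = lora_delta(A, B, alpha)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]


def count_params(m):
    """Count the entries of a matrix stored as a list of rows."""
    return sum(len(row) for row in m)


# Hypothetical sizes chosen for illustration only.
rng = random.Random(0)
d_out, d_in, r, alpha = 64, 64, 4, 8.0

# Frozen base weight: never updated during LoRA fine-tuning.
W = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
# Trainable factors: A gets a small random init, B starts at zero,
# so the adapted layer is initially identical to the base layer.
A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]

W_adapted = adapted_weight(W, A, B, alpha)

print("frozen parameters:   ", count_params(W))
print("trainable parameters:", count_params(A) + count_params(B))
print("adapter starts as a no-op:", W_adapted == W)
```

Only `A` and `B` would receive gradient updates during fine-tuning, which is why a LoRA adapter trains far fewer parameters than the base weight it modifies.
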
## Training Procedure

Fine-tuning was conducted on the "Pixiv Image with Caption" dataset, using LoRA to adjust only a small set of added low-rank parameters while the base model's weights stay frozen. This lets LoRAniDiff inherit the base model's generative capabilities while focusing on anime-style content.

## Limitations and Biases

The model's output may reflect the biases and artistic styles present in the training dataset. LoRAniDiff targets anime-style image synthesis, so output quality and style adherence may vary significantly for prompts outside this domain.

## Ethical Considerations

LoRAniDiff should be used with respect for artistic integrity and copyright norms. Creators are urged to consider the implications of AI-generated art and to avoid producing content that could be harmful or offensive.

## Licensing and Citation

LoRAniDiff is made available for non-commercial use to foster creativity and innovation. For academic or project use, please cite the latent diffusion work on which the model is built:

```
@misc{rombach2021highresolution,
  title={High-Resolution Image Synthesis with Latent Diffusion Models},
  author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
  year={2021},
  eprint={2112.10752},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```