---
license: apache-2.0
language:
- en
---

# Ssebowa-Imagen

Ssebowa-Imagen is an open-source image synthesis model that combines diffusion modeling with generative adversarial networks (GANs) to generate high-quality images from text descriptions. It is trained on a dataset of 100 billion images and text descriptions, which helps it capture the nuances of real-world imagery and translate text descriptions into compelling visual representations.

## Features

Ssebowa-Imagen's main features include:

- Diffusion Modeling:

  Ssebowa-Imagen uses diffusion modeling to progressively refine noisy images into high-quality, photorealistic outputs. This step-by-step refinement allows for a more controlled image generation process.

- Generative Adversarial Networks (GANs):

  Ssebowa-Imagen employs GANs to enhance the realism and diversity of generated images. GANs pit two neural networks, a generator and a discriminator, against each other, pushing the generator to produce images that the discriminator cannot tell apart from real-world images.

- Large Dataset Training:

  Ssebowa-Imagen is trained on a massive dataset of over 100 billion images and text descriptions. This extensive dataset enables the model to learn intricate patterns and relationships between images and their textual descriptions, leading to more accurate and creative image generation.

- Multimodal Capabilities:

  Ssebowa-Imagen accepts several input modalities, including text descriptions, sketches, and existing images, providing flexibility in image generation.

- Creative Control:

  Ssebowa-Imagen offers fine-tuning options that let users control aspects of the generated images such as style, composition, and lighting, enabling personalized artistic expression.

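The two mechanisms named in the feature list can be illustrated with a toy sketch. This is not Ssebowa-Imagen's implementation: the linear noise schedule, the function names, and the three-value "image" are all assumptions chosen for illustration. The first part shows the standard closed-form noising step that a diffusion model learns to reverse, and the second part shows the usual GAN losses for a discriminator and a (non-saturating) generator.

```python
import math
import random


def alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        bars.append(running)
    return bars


def add_noise(x0, noise, alpha_bar):
    """Forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * noise."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, noise)]


def remove_noise(xt, noise, alpha_bar):
    """Invert the forward step given the noise (a trained model would predict it)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(xt, noise)]


def discriminator_loss(d_real, d_fake):
    """GAN discriminator loss: -[log D(x) + log(1 - D(G(z)))]."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: -log D(G(z))."""
    return -math.log(d_fake)


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9]                      # a toy three-pixel "image"
noise = [rng.gauss(0.0, 1.0) for _ in pixels]
bars = alpha_bars(1000)
noisy = add_noise(pixels, noise, bars[500])   # noised version at step 500
restored = remove_noise(noisy, noise, bars[500])
```

In a real diffusion model the noise is unknown at generation time, so a neural network estimates it at every step. The generator loss falls as the discriminator's score for generated images rises, which is the pressure toward realism described above.
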
## Benefits

Ssebowa-Imagen offers several advantages, including:

- Enhanced Image Quality:

  Ssebowa-Imagen produces high-resolution images with intricate details and realistic textures, surpassing the image quality achievable by either diffusion modeling or GANs alone.

- Varied Artistic Styles:

  Ssebowa-Imagen can generate images in a range of artistic styles, from realistic portraits to abstract art, catering to diverse creative needs.

- Personalized Image Generation:

  Ssebowa-Imagen's fine-tuning capabilities give users precise control over the generated images, so they can create personalized and unique visual representations.

- Creative Exploration:

  Ssebowa-Imagen lets users explore their creativity and experiment with different image styles, concepts, and compositions.

## Usage

To use Ssebowa-Imagen, first install the required libraries. The following commands install `diffusers` from source:

```bash
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .
```

## Installing Ssebowa with pip

Next, install the `ssebowa` package with pip:

```bash
pip install ssebowa
```

Once Ssebowa is installed, you can import it into your Python code and start generating images.

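Before moving on, it can help to confirm that both installs succeeded. The sketch below uses only the standard library. It assumes the import names are `diffusers` and `ssebowa` (the latter taken from the import statements in this README), and `missing_packages` is a hypothetical helper, not part of either library.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["diffusers", "ssebowa"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are importable.")
```
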
## Using Ssebowa-Imagen

To generate an image from a text description, first import the model class and create an instance:

```python
from ssebowa import Ssebowa_imagen

model = Ssebowa_imagen()
```

### Generate an image from a text description

Let us generate an image from the prompt "A cat sitting on a bookshelf":

```python
image = model.generate_image("A cat sitting on a bookshelf")
```

### Save the image to a file

```python
image.save("cat_on_bookshelf.jpg")
```

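When generating images for several prompts, deriving the file name from the prompt avoids overwriting earlier results. The helper below is a hypothetical convenience, not part of the `ssebowa` package. It only builds a file name, and the commented lines show how it might be combined with the `generate_image` and `save` calls used above.

```python
import re


def prompt_to_filename(prompt, extension="jpg", max_length=60):
    """Turn a text prompt into a lowercase, underscore-separated file name."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    slug = slug[:max_length].rstrip("_") or "image"  # fall back for empty prompts
    return f"{slug}.{extension}"


filename = prompt_to_filename("A cat sitting on a bookshelf")

# Possible use with the model from this README (not executed here):
# image = model.generate_image("A cat sitting on a bookshelf")
# image.save(filename)
```
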
![finetune](https://ssebowa.s3.amazonaws.com/sdimage/image_generation_1.jpg)

![finetune](https://ssebowa.s3.amazonaws.com/sdimage/image_generation_2.jpg)

## Finetuning on your own data

- Prepare about 10-20 high-quality photos (jpg or png) and put them in a dedicated directory.

- Run the fine-tuning on a machine with a GPU that has at least 16GB of VRAM. (If you're fine-tuning SDXL, you'll need 24GB of VRAM.)

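Both prerequisites can be checked up front. The following is a minimal sketch, not part of the `ssebowa` package: `list_training_photos` and `gpu_vram_gb` are hypothetical helper names, the `data` directory is the one used in the fine-tuning example, and the GPU check assumes PyTorch is installed (it returns `None` otherwise).

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_training_photos(data_dir):
    """Return the jpg/png files found directly inside data_dir, sorted by path."""
    return sorted(
        p for p in Path(data_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def gpu_vram_gb():
    """Return the memory of GPU 0 in GB, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


if __name__ == "__main__" and Path("data").is_dir():
    photos = list_training_photos("data")
    if not 10 <= len(photos) <= 20:
        print(f"Found {len(photos)} photos; about 10-20 are recommended.")
    vram = gpu_vram_gb()
    if vram is None or vram < 16:
        print("No GPU with at least 16GB of VRAM was detected.")
```
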
```python
from ssebowa.dataset import LocalDataset
from ssebowa.model import SdSsebowaModel
from ssebowa.trainer import LocalTrainer
from ssebowa.utils.image_helpers import display_images
from ssebowa.utils.prompt_helpers import make_prompt

DATA_DIR = "data"  # The directory where you put your prepared photos
OUTPUT_DIR = "models"  # The output directory passed to the trainer

# Load the photos and preprocess them with face detection enabled
dataset = LocalDataset(DATA_DIR)
dataset = dataset.preprocess_images(detect_face=True)

SUBJECT_NAME = "<YOUR-NAME>"
CLASS_NAME = "person"

# Fine-tune the model on the prepared dataset
model = SdSsebowaModel(subject_name=SUBJECT_NAME, class_name=CLASS_NAME)
trainer = LocalTrainer(output_dir=OUTPUT_DIR)
predictor = trainer.fit(model, dataset)

# Use the prompt helper to create an awesome AI avatar!
prompt = next(make_prompt(SUBJECT_NAME, CLASS_NAME))
images = predictor.predict(
    prompt, height=768, width=512, num_images_per_prompt=2,
)

display_images(images, fig_size=10)
```

![finetune](https://ssebowa.s3.amazonaws.com/sdimage/Finetuning+on+your+own+data_image.jpg)

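To keep the generated avatars, the returned images can be written to disk. The sketch below assumes that `predictor.predict` returns a list of PIL-style image objects with a `save(path)` method, as the single-image example earlier in this README suggests. That return type is an assumption, and `save_images` is a hypothetical helper rather than part of `ssebowa`.

```python
from pathlib import Path


def save_images(images, out_dir, prefix="avatar"):
    """Save each image as <prefix>_<index>.png in out_dir and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, image in enumerate(images):
        target = out_path / f"{prefix}_{index}.png"
        image.save(target)  # assumes a PIL-style save(path) method
        saved.append(target)
    return saved


# Possible use after the fine-tuning example (not executed here):
# paths = save_images(images, "outputs")
```
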