---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
instance_prompt: photo of a bayc nft
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
inference: true
pipeline_tag: text-to-image
---
# DreamBooth - Bored Ape Yacht Club
## Model Description
This model is a [DreamBooth](https://dreambooth.github.io/) fine-tune of [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) on images from the Bored Ape Yacht Club (BAYC) NFT collection. It generates images in the BAYC style from text prompts.
### Training
Training images were retrieved through the Covalent API, using this [endpoint](https://www.covalenthq.com/docs/api/nft/get-nft-token-ids-for-contract-with-metadata/).
### Inference
The model generates original images that match the visual style of the Bored Ape Yacht Club collection, so new subjects can be rendered with the look of Bored Ape NFTs.
![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)
## Usage
Here is a basic example of how to generate images with this model:
```python
import numpy as np
import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline, UNet2DConditionModel
from IPython.display import display  # notebook only; in a script, call img.save(...) instead
from transformers import CLIPTextModel

model_id = "runwayml/stable-diffusion-v1-5"
repo_id = "ckandemir/boredape_diffusion"

# Load the fine-tuned UNet and text encoder in half precision.
unet = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet", torch_dtype=torch.float16)
text_encoder = CLIPTextModel.from_pretrained(repo_id, subfolder="text_encoder", torch_dtype=torch.float16)

# Swap them into the base Stable Diffusion v1.5 pipeline.
pipeline = StableDiffusionPipeline.from_pretrained(
    model_id, unet=unet, text_encoder=text_encoder, torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

prompt = ["a spiderman bayc nft"]
neg_prompt = ["realistic, disfigured face, disfigured eyes, deformed, bad anatomy"] * len(prompt)
num_samples = 3
guidance_scale = 9
num_inference_steps = 50
height = 512
width = 512

# Draw a random seed and print it so a result can be reproduced later.
seed = int(np.random.randint(0, 2**20 - 1))
print(f"Seed: {seed}")
generator = torch.Generator(device="cuda").manual_seed(seed)

with torch.inference_mode():
    imgs = pipeline(
        prompt,
        negative_prompt=neg_prompt,
        height=height,
        width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator,
    ).images

for img in imgs:
    display(img)
```
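The example pairs each prompt with one negative prompt by repeating the negative string `len(prompt)` times. When generating several subjects in one call, a small helper can keep the two lists aligned. The sketch below is illustrative only: `build_prompts`, `TRIGGER`, and `DEFAULT_NEGATIVE` are hypothetical names, and treating `bayc nft` as the trigger phrase is an assumption based on the `instance_prompt` in the metadata and the example prompt.

```python
from typing import List, Tuple

# Assumption: "bayc nft" is the trigger phrase, inferred from the card's
# instance_prompt ("photo of a bayc nft") and its example prompt.
TRIGGER = "bayc nft"

# Same negative terms as in the usage example.
DEFAULT_NEGATIVE = "realistic, disfigured face, disfigured eyes, deformed, bad anatomy"


def build_prompts(
    subjects: List[str], negative: str = DEFAULT_NEGATIVE
) -> Tuple[List[str], List[str]]:
    """Return (prompts, negative_prompts), two lists of equal length."""
    prompts = [f"a {subject.strip()} {TRIGGER}" for subject in subjects]
    negative_prompts = [negative] * len(prompts)
    return prompts, negative_prompts


prompts, neg_prompts = build_prompts(["spiderman", "pirate"])
print(prompts)
```

The two returned lists can then be passed as `prompt` and `negative_prompt` in place of the hard-coded lists in the example.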
## Further Optimization
Results may improve with further fine-tuning and adjustments to the training parameters.