---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
instance_prompt: photo of a bayc nft
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
inference: true
pipeline_tag: text-to-image
---
    
# DreamBooth - Bored Ape Yacht Club

## Model Description

This model is a [DreamBooth](https://dreambooth.github.io/) fine-tune of [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5), trained on images from the Bored Ape Yacht Club (BAYC) NFT collection. It generates images in the style of that collection from text prompts.


### Training

The training images were retrieved through the Covalent API, using this [endpoint](https://www.covalenthq.com/docs/api/nft/get-nft-token-ids-for-contract-with-metadata/).

### Inference

Given a text prompt, the model generates new images that match the visual style of the Bored Ape Yacht Club collection.

![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)


## Usage

Here's a basic example of generating images with this model:

```python
import numpy as np
import torch
from torch import autocast
from diffusers import DDIMScheduler, StableDiffusionPipeline, UNet2DConditionModel
from transformers import CLIPTextModel

model_id = "runwayml/stable-diffusion-v1-5"

# Load the fine-tuned UNet and text encoder from this repository.
# They are loaded in float16 to match the dtype of the rest of the pipeline.
unet = UNet2DConditionModel.from_pretrained(
    "ckandemir/boredape_diffusion", subfolder="unet", torch_dtype=torch.float16
)
text_encoder = CLIPTextModel.from_pretrained(
    "ckandemir/boredape_diffusion", subfolder="text_encoder", torch_dtype=torch.float16
)

# Build the base pipeline and swap in the fine-tuned components.
pipeline = StableDiffusionPipeline.from_pretrained(
    model_id, unet=unet, text_encoder=text_encoder, torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

prompt = ["a spiderman bayc nft"]
neg_prompt = ["realistic,disfigured face,disfigured eyes, deformed,bad anatomy"] * len(prompt)
num_samples = 3
guidance_scale = 9
num_inference_steps = 50
height = 512
width = 512

# Pick a random seed and print it so a result can be reproduced later.
seed = int(np.random.randint(0, 2**20 - 1))
print("Seed: {}".format(seed))
generator = torch.Generator(device="cuda").manual_seed(seed)

with autocast("cuda"), torch.inference_mode():
    imgs = pipeline(
        prompt,
        negative_prompt=neg_prompt,
        height=height, width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator
    ).images

# display() is available in Jupyter/IPython; in a plain script, use img.save(...) instead.
for img in imgs:
    display(img)
```
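
The example prompt reuses the words `bayc nft` from the `instance_prompt` in this card's metadata (`photo of a bayc nft`). As a minimal sketch, the helper below builds a batch of prompts in that pattern together with a negative-prompt list of matching length. The names `INSTANCE_TOKEN` and `build_prompts` are hypothetical and not part of this repository or of `diffusers`, and the `a <subject> bayc nft` template is an assumption inferred from the single example prompt above.

```python
# Hypothetical helper: names and the prompt template are assumptions,
# inferred from the instance_prompt "photo of a bayc nft" and the
# example prompt "a spiderman bayc nft".
INSTANCE_TOKEN = "bayc nft"
DEFAULT_NEGATIVE = "realistic,disfigured face,disfigured eyes, deformed,bad anatomy"


def build_prompts(subjects, negative=DEFAULT_NEGATIVE):
    """Return (prompts, negative_prompts) as two lists of equal length."""
    prompts = ["a {} {}".format(subject, INSTANCE_TOKEN) for subject in subjects]
    # The pipeline expects one negative prompt per prompt when lists are used.
    negative_prompts = [negative] * len(prompts)
    return prompts, negative_prompts


prompts, neg_prompts = build_prompts(["spiderman", "pirate"])
print(prompts)
```

The two returned lists can be passed as `prompt` and `negative_prompt` in the pipeline call above.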

## Further Optimization
Results can likely be improved with additional fine-tuning and by adjusting the training parameters.