---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
instance_prompt: photo of a bayc nft
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
inference: true
pipeline_tag: text-to-image
---
# DreamBooth - Bored Ape Yacht Club
## Model Description
This DreamBooth model is derived from [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) and fine-tuned on the Bored Ape Yacht Club (BAYC) NFT collection. Its weights were trained on images of BAYC NFTs using [DreamBooth](https://dreambooth.github.io/), so the model performs text-to-image synthesis in the style of the collection.
### Training
The training images were retrieved through the Covalent API, specifically via this [endpoint](https://www.covalenthq.com/docs/api/nft/get-nft-token-ids-for-contract-with-metadata/).
### Inference
At inference time, the model generates new images in the style of the Bored Ape Yacht Club collection from a text prompt. Prompts that contain `bayc nft`, as in the instance prompt `photo of a bayc nft`, steer the output toward the collection's aesthetics. Sample outputs:
![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)
## Usage
Here is a basic example of how to generate images with this model:
```python
import numpy as np
import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline, UNet2DConditionModel
from transformers import CLIPTextModel

model_id = "runwayml/stable-diffusion-v1-5"

# Load the fine-tuned UNet and text encoder in half precision to match the pipeline.
unet = UNet2DConditionModel.from_pretrained(
    "ckandemir/bayc-diffusion", subfolder="unet", torch_dtype=torch.float16
)
text_encoder = CLIPTextModel.from_pretrained(
    "ckandemir/bayc-diffusion", subfolder="text_encoder", torch_dtype=torch.float16
)

# Build the base pipeline and swap in the fine-tuned components.
pipeline = StableDiffusionPipeline.from_pretrained(
    model_id, unet=unet, text_encoder=text_encoder, torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

prompt = ["a spiderman bayc nft"]
neg_prompt = ["realistic,disfigured face,disfigured eyes, deformed,bad anatomy"] * len(prompt)
num_samples = 3
guidance_scale = 9
num_inference_steps = 50
height = 512
width = 512

# Draw a random seed and print it so a result can be reproduced later.
seed = int(np.random.randint(0, 2**20 - 1))
print(f"Seed: {seed}")
generator = torch.Generator(device="cuda").manual_seed(seed)

with torch.autocast("cuda"), torch.inference_mode():
    imgs = pipeline(
        prompt,
        negative_prompt=neg_prompt,
        height=height,
        width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator,
    ).images

# display() is built into Jupyter/IPython; in a plain script, use img.save(...) instead.
for img in imgs:
    display(img)
```
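The example above passes one negative prompt per prompt, so the two lists must have the same length. When generating for several subjects at once, a small helper can keep them aligned. The following is a minimal sketch. `build_prompts`, `INSTANCE_TOKEN`, and `NEGATIVE_PROMPT` are hypothetical names introduced here for illustration and are not part of the model repository. The prompt template `a <subject> bayc nft` is an assumption modeled on the example prompt above.

```python
# Hypothetical helper for illustration; not part of the model repository.
INSTANCE_TOKEN = "bayc nft"  # assumed from the instance prompt "photo of a bayc nft"
NEGATIVE_PROMPT = "realistic,disfigured face,disfigured eyes, deformed,bad anatomy"


def build_prompts(subjects):
    """Return (prompts, negative_prompts) as two lists of equal length."""
    # Assumed template, modeled on the example prompt "a spiderman bayc nft".
    prompts = [f"a {subject} {INSTANCE_TOKEN}" for subject in subjects]
    # Repeat the negative prompt once per prompt so the lists stay aligned.
    negative_prompts = [NEGATIVE_PROMPT] * len(prompts)
    return prompts, negative_prompts


prompts, negative_prompts = build_prompts(["spiderman", "pirate"])
print(prompts)
```

The two returned lists can then be passed to the pipeline as `prompt` and `negative_prompt` in place of the single-element lists used above.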
## Further Optimization
Results can likely be improved with additional fine-tuning and by adjusting the training parameters.