Create a Diffusers-compatible Dataset for Stable Diffusion Fine-tuning

Community Article Published July 19, 2024

To appeal to the youth, I will be speaking in you kids' lingo. 😎

Yo, what’s up squad! 🎉 Today, we’re diving into how to make a diffusers dataset (yeah, I know, it’s kinda meh, but hang tight!). 🤘

Step 1. Load your groovy image-to-text (captioning) pipeline and send it to the GPU (whatever that means) 😎😎😎

from transformers import pipeline

pipe = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large", device="cuda") # I'm using BLIP, since BLIP-style captions are a common choice for SD fine-tuning datasets
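
Not sure a GPU is even in the building? Here's a minimal sketch for picking the device before you build the pipeline. `pick_device` is a made-up helper name for illustration (it is not part of transformers), and it assumes PyTorch is the backend, falling back to the CPU when PyTorch or CUDA is missing.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the backend behind the pipeline
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Captioning will run on: {device}")
```

You could then pass `device=device` to `pipeline(...)` instead of hard-coding `"cuda"`.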

Step 2: Time for a Break (You #earnedIt! 🏖️😎)

Chill for a sec and let’s regroup. Grab a snack or something. 🍕🌟

Step 3. Epic to have you back, now let's set up your Super Cool Image Folder 📁✨

import os

folder = "your_image_folder" # file structure is your_image_folder/train

image_folder = os.path.join(folder, "train")
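
Before going any further, it can save some pain to check that the folder actually exists and has images in it. Here's a rough sketch, assuming the images sit directly inside `your_image_folder/train`. The `list_images` helper and the `IMAGE_EXTENSIONS` set are made up for this example, so tweak them to match your files.

```python
import os

# Assumption: these are the extensions you care about; adjust to taste.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def list_images(image_folder):
    """Return the sorted image file names found directly inside image_folder."""
    if not os.path.isdir(image_folder):
        raise FileNotFoundError(f"No such folder: {image_folder}")
    return sorted(
        name
        for name in os.listdir(image_folder)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
    )


image_folder = os.path.join("your_image_folder", "train")
if os.path.isdir(image_folder):
    print(f"Found {len(list_images(image_folder))} images in {image_folder}")
else:
    print(f"{image_folder} does not exist yet, go make it!")
```

This only looks at the top level of the folder, which is enough for a quick sanity check.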

Step 4. That was totally exhausting, all those characters I had to copy and paste without giving credit, let's take another break 🥵🥵🥵🥵🔥

Rest those fingers, ’cause we’re about to get back to it. 🤯💥

Step- I forgot.. 4 I think? Load That Folder #LikeABoss 🚀🕶️

from datasets import load_dataset

dataset = load_dataset('imagefolder', data_dir=image_folder, split='train')

Step 4 (yeah that sounds right). Craft Your Formatting Function with Style 🧙‍♂️🎨

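def generate_captions(examples):
    # With batched=True, `examples` maps each column name to a list of values
    captions = []
    for image in examples['image']:
        result = pipe(image)  # a list of dicts, each with a 'generated_text' key
        caption = result[0]['generated_text']
        captions.append(caption)
    return {"text": captions}  # one caption per image, stored in a new "text" column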
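
Want to see what this function does without downloading a whole model? Here's a sketch of the batched contract: the function gets a dict mapping column names to lists, and hands back a dict of new columns with one entry per input row. `fake_pipe` is a made-up stand-in for the real captioning pipeline, the extra `pipe` argument exists only for this sketch, and the strings stand in for PIL images.

```python
def fake_pipe(image):
    """Stand-in for the captioning pipeline: same output shape, no model."""
    return [{"generated_text": f"a photo of {image}"}]


def generate_captions(examples, pipe=fake_pipe):
    # `examples` maps each column name to a list of values for the batch
    captions = [pipe(image)[0]["generated_text"] for image in examples["image"]]
    # One caption per image, returned as a new "text" column
    return {"text": captions}


batch = {"image": ["cat.png", "dog.png"]}  # strings standing in for PIL images
print(generate_captions(batch))
# prints {'text': ['a photo of cat.png', 'a photo of dog.png']}
```

The returned list has to be the same length as the input batch, otherwise the new column cannot line up with the existing rows.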

Step 4. Map the dataset (data again? that's so lame)

dataset = dataset.map(generate_captions, batched=True)

Step 4. Bring back the train split 😎😎🔪

from datasets import DatasetDict

dataset_dict = DatasetDict({
    'train': dataset
})

dataset_dict # check that the new "text" column is there in train; if you were paying attention, it should be

Step 4. Push It to the Hub (Is it a cool club now? 🤷‍♂️) 💥🌈

dataset_dict.push_to_hub("your_output_repo") # you need to be logged in to the Hub first (e.g. via huggingface-cli login)

Step 4. 🎉 You Did It! 🎉 (finally)

Now go watch Vine or whatever. Wait- that's not right.. you kids are using that new short-form-video platform.. Tik Tak?