library_name: diffusers
license: creativeml-openrail-m
datasets:
  - vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000
language:
  - en

Fine-Tuned Pokemon Generator Model Card

This model was fine-tuned on a dataset of Pokemon and Pokemon card images, with Stable Diffusion v2-1 as the base model.

Most of the documentation is the same as in the base model's repo; this card adds details about the fine-tuning.

Base Model Repo: https://huggingface.co/stabilityai/stable-diffusion-2-1

Dataset: https://huggingface.co/datasets/vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000

Stable Diffusion v2-1 text2image fine-tuning - vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000

The model was fine-tuned on the vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000 dataset. Some example images are shown below.

img_0 img_1 img_2

How to Get Started with the Model

# Build the pipeline with the fine-tuned model from Hugging Face
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipeline = DiffusionPipeline.from_pretrained("vwu142/fine-tuned-pokemon-and-pokemon-card-generator-13000")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
pipeline = pipeline.to("cuda")

# Image generation
prompt = "A Pokemon Card of the format tag team,with pokemon of type dragon and ghost with the title Gratina in the Tag Team form from Sun & Moon with an Electric type Pikachu as the buddy of the Tag Team"
images = pipeline(prompt).images
images
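In a notebook, the bare `images` expression at the end of the snippet displays the result; in a plain script nothing would be shown. A minimal sketch for writing the images to disk instead, assuming `images` is the list of PIL images returned by the pipeline. The helper name `save_images`, the `outputs` directory, and the `pokemon` prefix are illustrative choices and not part of the original card.

```python
from pathlib import Path


def save_images(images, out_dir="outputs", prefix="pokemon"):
    """Save each image as a numbered PNG file and return the written paths.

    `images` is assumed to be a list of objects with a PIL-style
    `.save(path)` method, such as the `.images` list of a diffusers pipeline.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing

    paths = []
    for index, image in enumerate(images):
        path = out / f"{prefix}_{index:02d}.png"
        image.save(path)  # PIL infers the PNG format from the file extension
        paths.append(path)
    return paths
```

After the generation step, calling `save_images(images)` would write the results to the `outputs` folder.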

Training Details

Training Procedure

The weights were trained on the free GPU provided by Google Colab.

The data it was trained on comes from this dataset: https://huggingface.co/datasets/vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000

It contains images of Pokemon cards and of Pokemon, each paired with a caption describing the image.

Training Hyperparameters

!accelerate launch diffusers/examples/text_to_image/train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name --caption_column="caption" \
  --use_ema \
  --use_8bit_adam \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=8 \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --max_train_steps=$max_training_epochs \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="pokemon-card-model"
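The command above relies on three variables (`MODEL_NAME`, `dataset_name`, `max_training_epochs`) that are not defined in this card. The sketch below shows one way they might be set from a notebook cell before launching, and works out the effective batch size implied by the flags. The model and dataset names come from this card; the step count is a hypothetical placeholder, the single-GPU setting is an assumption based on the Colab note, and the dataset size of 13,000 is inferred only from the dataset's name. Note that, despite the variable's name, `--max_train_steps` counts optimizer steps, not epochs.

```python
import math
import os

# Values taken from this card.
os.environ["MODEL_NAME"] = "stabilityai/stable-diffusion-2-1"
os.environ["dataset_name"] = (
    "vwu142/Pokemon-Card-Plus-Pokemon-Actual-Image-And-Captions-13000"
)

# Hypothetical value: the card does not state how many steps were used.
max_train_steps = 1000
os.environ["max_training_epochs"] = str(max_train_steps)

# Flags from the launch command.
train_batch_size = 1
gradient_accumulation_steps = 8
num_processes = 1  # assumption: a single Colab GPU

# One optimizer step consumes this many images.
effective_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_processes
)
images_seen = effective_batch_size * max_train_steps

# Assumption: about 13,000 examples, inferred from the dataset name only.
dataset_size = 13_000
steps_per_epoch = math.ceil(dataset_size / effective_batch_size)

print(f"effective batch size: {effective_batch_size}")
print(f"images seen in {max_train_steps} steps: {images_seen}")
print(f"optimizer steps for one pass over the data: {steps_per_epoch}")
```

Under these assumptions, one full pass over the dataset takes 1625 optimizer steps, which gives a rough scale for choosing the step count.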