Karlo v1 alpha

Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture. It improves on the standard super-resolution model, which upscales from 64px to 256px, by recovering high-frequency details in only a small number of denoising steps.

Karlo is available in diffusers!

from diffusers import UnCLIPPipeline
import torch

# Load the pipeline weights in half precision and move them to the GPU.
pipe = UnCLIPPipeline.from_pretrained("fusing/karlo_unclip", torch_dtype=torch.float16)
pipe = pipe.to('cuda')

prompt = "a high-resolution photograph of a big red frog on a green leaf."

# The pipeline accepts a list of prompts; take the first generated image.
image = pipe([prompt]).images[0]

image.save("./frog.png")
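Since the pipeline runs three stages (prior, decoder, super-resolution), it can be useful to set the number of denoising steps per stage and to fix a seed for reproducibility. The following is a minimal sketch, not part of the original README: `build_call_kwargs` and `generate` are hypothetical helper names, and the default step counts (25, 25, 7) are assumed to match the defaults of `UnCLIPPipeline.__call__` in recent diffusers releases, so check them against your installed version.

```python
def build_call_kwargs(prior_steps=25, decoder_steps=25, super_res_steps=7, num_images=1):
    """Collect keyword arguments for UnCLIPPipeline.__call__ (hypothetical helper).

    The defaults are assumed to mirror the pipeline's own defaults.
    """
    values = {
        "prior_num_inference_steps": prior_steps,
        "decoder_num_inference_steps": decoder_steps,
        "super_res_num_inference_steps": super_res_steps,
        "num_images_per_prompt": num_images,
    }
    # Reject values that cannot be valid step or image counts.
    for name, value in values.items():
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return values


def generate(prompt, seed=0, **step_overrides):
    """Generate images for one prompt with a fixed seed (hypothetical helper)."""
    # Heavy imports are deferred so this module can be imported without a GPU.
    import torch
    from diffusers import UnCLIPPipeline

    pipe = UnCLIPPipeline.from_pretrained(
        "fusing/karlo_unclip", torch_dtype=torch.float16
    ).to("cuda")

    # A seeded generator makes repeated runs produce the same images.
    generator = torch.Generator(device="cuda").manual_seed(seed)

    result = pipe([prompt], generator=generator, **build_call_kwargs(**step_overrides))
    return result.images
```

For example, `generate("a big red frog on a green leaf", seed=42, super_res_steps=7)[0].save("./frog.png")` would write a single image, assuming a CUDA device is available.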

Original codebase

This alpha version of Karlo is trained on 115M image-text pairs, including the COYO-100M high-quality subset, CC3M, and CC12M. For a better version of Karlo trained on larger-scale, high-quality datasets, please visit the landing page of our application, B^DISCOVER.