Could you give an exact example of informative initialization for 1-step generation with LoRA?

by greasebig - opened Apr 7, 2024

Discussion

greasebig

Apr 7, 2024

For example, how do I generate the latent used for informative initialization?

Luo-Yihong

Owner Apr 8, 2024

edited Apr 8, 2024

The README already has a demo for IPI.

greasebig

Apr 8, 2024

I think it works like the SDXL mechanism: take the latent from the base model and pass it to the refiner model. But I ended up with a WORSE result.
Here is how I got my latent:

 image = pipeline_yoso_lora(
        prompt=prompt,
        num_inference_steps=steps, 
        num_images_per_prompt = 1,
        generator = torch.Generator(device="cuda").manual_seed(seed),
        guidance_scale=cfg,
        negetive_prompt = negetive_prompt,
        output_type="latent",
        ).images
bs = 1
latents = image # maybe some latent codes of real images or SD generation
latent_mean = latents.mean(dim=0)
noise = torch.randn([1,bs,64,64])
noise = noise.to('cuda')
timesteps = torch.randint(0, pipeline_yoso_lora.scheduler.config.num_train_timesteps, (bs,), device=latents.device)
timesteps = timesteps.long()
input_latent = pipeline_yoso_lora.scheduler.add_noise(latent_mean.repeat(bs,1,1,1), noise, timesteps)
input_latent = input_latent.to(torch.float16)

Luo-Yihong

Owner Apr 8, 2024

Your code is based on the old version of the README. I found a bug in that version and updated the README today. Please try again following the new version.

There is also a serious bug in your code: you sample timesteps uniformly from [0, 1000), which is bound to produce bad results. Please check your own code before raising questions.
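
To see why the timestep matters, here is a minimal sketch of the weights that a DDPM-style `add_noise` applies, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The schedule parameters below (1000 steps, `scaled_linear` betas from 0.00085 to 0.012) are an assumption based on the usual Stable Diffusion scheduler config, not values confirmed in this thread, and the helper names are illustrative only.

```python
import math

def alphas_cumprod(num_train_timesteps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for a 'scaled_linear' beta schedule.

    The default values are assumed (typical Stable Diffusion config).
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    out, running = [], 1.0
    for i in range(num_train_timesteps):
        # scaled_linear: interpolate linearly in sqrt(beta), then square
        beta = (lo + (hi - lo) * i / (num_train_timesteps - 1)) ** 2
        running *= 1.0 - beta
        out.append(running)
    return out

def noise_coefficients(t, acp):
    """Return (signal, noise) weights in x_t = signal * x_0 + noise * eps."""
    return math.sqrt(acp[t]), math.sqrt(1.0 - acp[t])

acp = alphas_cumprod()
for t in (0, 500, 999):
    signal, noise = noise_coefficients(t, acp)
    print(f"t={t:3d}  signal={signal:.4f}  noise={noise:.4f}")
```

Under these assumptions, a randomly drawn small `t` leaves the input almost noise-free, while `t = 999` leaves it almost pure noise with only a small contribution from the initial latent. A one-step generator is presumably run at the terminal timestep, so it would see the former as an out-of-distribution input.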

Your code can be updated as follows:

 image = pipeline_yoso_lora(
        prompt=prompt,
        num_inference_steps=steps, 
        num_images_per_prompt = 1,
        generator = torch.Generator(device="cuda").manual_seed(seed),
        guidance_scale=cfg,
        negetive_prompt = negetive_prompt,
        output_type="latent",
        ).images
bs = 1
latents = image # maybe some latent codes of real images or SD generation
latent_mean = latents.mean(dim=0)
noise = torch.randn([bs,1,64,64])
noise = noise.to('cuda')
timesteps = torch.ones(bs).to(latents.device) * 999
timesteps = timesteps.long()
init_latent = latent_mean.repeat(bs,1,1,1) + latents.std() *  torch.randn_like(noise)
input_latent = pipeline_yoso_lora.scheduler.add_noise(init_latent , noise, timesteps)
input_latent = input_latent.to(torch.float16)

greasebig

Apr 10, 2024

edited Apr 10, 2024

However, I followed your fixed code and still got WORSE results. Here is an example:
Prompt: a motorcycle

I am curious how to correctly use an SD-generated latent for IPI in one-step generation.

Luo-Yihong

Owner Apr 10, 2024

There is still a bug in the code, and you should fix it. Moreover, we compute the mean and variance from the training set; estimating them from generated samples is suboptimal.
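
As a rough illustration of estimating the statistics from a set of latents rather than from a single generated sample, here is a minimal sketch. It mirrors the `mean(dim=0)` and scalar `std()` used in the snippet earlier in this thread; the function names are hypothetical, and the random tensor only stands in for VAE-encoded training images, which are not available here. It is not the authors' implementation.

```python
import torch

def latent_statistics(latents):
    """Per-position mean (C, H, W) and scalar std over a batch of latents (N, C, H, W)."""
    return latents.mean(dim=0), latents.std()

def sample_init_latent(mean, std, batch_size, generator=None):
    """Draw mean + std * eps with eps ~ N(0, I), one sample per batch item."""
    eps = torch.randn((batch_size, *mean.shape), generator=generator)
    return mean.unsqueeze(0) + std * eps

# Stand-in data (assumption): in practice these would be latents of training images.
g = torch.Generator().manual_seed(0)
dataset_latents = torch.randn(64, 4, 64, 64, generator=g) * 0.8 + 0.1

mean, std = latent_statistics(dataset_latents)
init_latent = sample_init_latent(mean, std, batch_size=2, generator=g)
print(init_latent.shape, float(std))
```

The resulting `init_latent` would then take the place of `init_latent` in the snippet above, before noise is added at the terminal timestep and the tensor is cast to the pipeline's dtype and device.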

Since IPI may be complicated for the community to use, we may soon release a version distilled from IPI.
