RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list.

#38
by gebaltso - opened

I get the error in the title. I have already replaced c_in with in_channels in config.json.

I have the same problem. How did you solve it, @gebaltso?

I broadcast dimension 0 (the batch dimension) of kv so that it matches the other tensor, and that works.
Alternatively, you can simply set the number of output images to 1.
@gebaltso
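
For readers who want to see the broadcasting workaround in isolation, here is a minimal sketch. It assumes the failure comes from a torch.cat along dimension 1 where one tensor has batch size 2 and the other has batch size 1. The tensor names and shapes are stand-ins chosen for illustration, not the model's real sizes.

```python
import torch

# Stand-in shapes for illustration only (not the model's real sizes).
x = torch.randn(2, 4, 8)   # batch of 2, e.g. num_images_per_prompt = 2
kv = torch.randn(1, 3, 8)  # batch of 1, e.g. conditioning from a single prompt

# Concatenating along dim 1 requires every other dimension to match,
# so the batch mismatch (2 vs 1) raises a RuntimeError.
try:
    torch.cat([x, kv], dim=1)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

# Broadcast dimension 0 (the batch dimension) of kv to match x.
# expand() creates a view and does not copy the data.
kv_expanded = kv.expand(x.shape[0], -1, -1)
merged = torch.cat([x, kv_expanded], dim=1)

print(error_message)
print(merged.shape)
```

Applying this inside the library would mean patching the model code, so the simpler options discussed in this thread are probably preferable.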

Thanks @nuaazs, directly setting the number of output images to 1 seems to solve the problem!

Change: num_images_per_prompt = 1

I've solved the issue by providing the prompt as a list like below:

import torch
from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

device = "cuda"
num_images_per_prompt = 2

prior = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16).to(device)
decoder = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade",  torch_dtype=torch.float16).to(device)

prompt = "Anthropomorphic cat dressed as a pilot"
negative_prompt = ""

prior_output = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    negative_prompt=negative_prompt,
    guidance_scale=4.0,
    num_images_per_prompt=num_images_per_prompt,
    num_inference_steps=20
)
decoder_output = decoder(
    image_embeddings=prior_output.image_embeddings.half(),
    prompt=[prompt] * len(prior_output.image_embeddings), # <-- CHANGE HERE
    negative_prompt=negative_prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=10
).images

If I set num_images_per_prompt=2, then len(prior_output) is 3, so I use num_images_per_prompt instead of len(prior_output).
@kaan-aytekin

@arcral You are right, I meant to write len(prior_output.image_embeddings) instead of len(prior_output).
I have fixed the original comment, thank you.
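
The fix discussed in this thread comes down to passing the decoder one prompt per image embedding. A small helper can make that explicit. This is only a sketch: expand_prompts is a hypothetical name, not part of diffusers, and it assumes the embeddings for each prompt are grouped consecutively.

```python
from typing import List, Sequence, Union


def expand_prompts(prompt: Union[str, Sequence[str]], batch_size: int) -> List[str]:
    """Repeat prompt(s) so there is exactly one prompt per image embedding.

    Hypothetical helper, not part of diffusers. Assumes the embeddings
    for each prompt are grouped together, in prompt order.
    """
    prompts = [prompt] if isinstance(prompt, str) else list(prompt)
    if not prompts:
        raise ValueError("at least one prompt is required")
    if batch_size <= 0 or batch_size % len(prompts) != 0:
        raise ValueError(
            f"batch_size={batch_size} is not a positive multiple of "
            f"the number of prompts ({len(prompts)})"
        )
    repeats = batch_size // len(prompts)
    # Repeat each prompt consecutively: [a, a, b, b] rather than [a, b, a, b].
    return [p for p in prompts for _ in range(repeats)]


# One prompt, two images per prompt.
print(expand_prompts("Anthropomorphic cat dressed as a pilot", 2))
```

With the snippet above, this would be used as prompt=expand_prompts(prompt, len(prior_output.image_embeddings)) in the decoder call.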
