CompVis/stable-diffusion-v1-4 · How to fix "RuntimeError: expected scalar type Half but found Float" when using fp16

Aug 23, 2022

edited Aug 23, 2022

Replace lines 272-273 in <pythondistr>\Lib\site-packages\torch\nn\modules\normalization.py

        return F.group_norm(
            input, self.num_groups, self.weight, self.bias, self.eps)

with

        return F.group_norm(
            input, self.num_groups, self.weight.type(input.dtype), self.bias.type(input.dtype), self.eps)
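
Edits under site-packages are lost whenever torch is upgraded. A less invasive variant of the same idea, as a minimal sketch: subclass nn.GroupNorm and cast the affine parameters to the input's dtype in forward. The class name DtypeSafeGroupNorm is hypothetical, and the float64 input on CPU is used only so the snippet runs without a GPU; this is not how torch or diffusers handle the case internally.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DtypeSafeGroupNorm(nn.GroupNorm):
    """Hypothetical GroupNorm variant that casts weight/bias to the input dtype."""

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # affine=False leaves weight and bias as None, so guard the cast.
        weight = None if self.weight is None else self.weight.to(input.dtype)
        bias = None if self.bias is None else self.bias.to(input.dtype)
        return F.group_norm(input, self.num_groups, weight, bias, self.eps)


# Parameters are created in float32; the input deliberately uses another dtype.
norm = DtypeSafeGroupNorm(num_groups=2, num_channels=4)
x = torch.randn(1, 4, 8, 8, dtype=torch.float64)
out = norm(x)
print(out.dtype, out.shape)
```

The stored parameters keep their original dtype; only the copies passed to F.group_norm are cast.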

In <pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py, after this section (around lines 102-107)

        latents = torch.randn(
            (batch_size, self.unet.in_channels, height // 8, width // 8),
            generator=generator,
            device=self.device,
        )

add

        latents = latents.half()
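
Hard-coding .half() would break again if the pipeline were later loaded in float32. A hedged alternative is to cast the latents to whatever dtype the UNet weights use. The helper name make_latents and its parameters are assumptions for illustration, not part of diffusers; the noise is sampled in float32 and cast afterwards so the sketch also runs on CPU.

```python
import torch


def make_latents(batch_size, channels, height, width, dtype, device="cpu", generator=None):
    """Hypothetical helper: sample latents, then match the model's dtype."""
    latents = torch.randn(
        (batch_size, channels, height // 8, width // 8),
        generator=generator,
        device=device,
    )
    # Cast once, to the dtype of the weights that will consume the latents.
    return latents.to(dtype)


# In the pipeline, the dtype could be read from the model itself, e.g.
# next(self.unet.parameters()).dtype; a fixed value stands in for it here.
latents = make_latents(1, 4, 512, 512, dtype=torch.float16)
print(latents.dtype, tuple(latents.shape))
```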

Finally, in <pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py, replace this (on lines 160-161)

        safety_cheker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
        image, has_nsfw_concept = self.safety_checker(images=image, clip_input=safety_cheker_input.pixel_values)

with

        safety_cheker_input = self.feature_extractor(self.numpy_to_pil(image), return_tensors="pt").to(self.device)
        safety_cheker_input.pixel_values = safety_cheker_input.pixel_values.half()
        image, has_nsfw_concept = self.safety_checker(images=image, clip_input=safety_cheker_input.pixel_values)

twobob

Aug 23, 2022

Hug

patrickvonplaten

Aug 23, 2022

Hey @TessaCoil ,

Thanks for the fix here! Does it happen when loading weights in torch.float16?

patrickvonplaten

Aug 23, 2022

Could you maybe post a code snippet that currently leads to an error/bug? :-)

twobob

Aug 23, 2022

@patrickvonplaten I think it happens when you switch to CPU instead of using the default GPU device selection, at least in one setup (likely the self-hosted one) that I have seen people complain about in the wild. That is how I understood it.

Setting the device to CPU then complains about no support for halfs, or vice versa. At first glance, this looks like a fix for that.

njmaeff

Sep 1, 2022

Here is a code snippet that causes the error.

import torch
from diffusers import StableDiffusionPipeline

# get your token at https://huggingface.co/settings/tokens
TOKEN = 'hugging_face_token'


def run():
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        revision="fp16",
        torch_dtype=torch.float16,
        use_auth_token=TOKEN,
    ).to("cuda")

    prompt = "a photo of an astronaut riding a horse on mars"
    image = pipe(prompt)["sample"][0]

    image.save("astronaut_rides_horse.png")


if __name__ == '__main__':
    run()

erenyeagar

Sep 5, 2022

edited Sep 5, 2022

@TessaCoil - I get the same error around line 82 (<pythondistr>\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py):

text_embeddings = self.text_encoder(text_input.input_ids.to(self.device))[0]

so none of the later modifications are reached. What do you reckon I should change?

Edit: I forgot to wrap pipe(prompt)["sample"][0] in autocast("cuda").

nightfury

Oct 12, 2022

Hey, I'm facing a similar issue on the 'cpu' device in https://huggingface.co/spaces/nightfury/SD-InPainting/blob/main/app.py, as no GPU (CUDA) is available.

If I set torch_dtype=torch.float16, it throws
RuntimeError: expected scalar type Float but found BFloat16

If I set torch_dtype=torch.bfloat16, it throws
RuntimeError: expected scalar type BFloat16 but found Float

If I set torch_dtype=torch.half, it throws
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

If I set torch_dtype=torch.double, it throws
RuntimeError: expected scalar type BFloat16 but found Double

If I set torch_dtype=torch.long, it throws
raise TypeError('nn.Module.to only accepts floating point or complex '
TypeError: nn.Module.to only accepts floating point or complex dtypes, but got desired dtype=torch.int64

So I am really confused about which torch_dtype to use for a successful run.

Shakibyzn

Mar 14, 2023

I came across the same error. I am also using diffusers 1.4. I added the line with torch.autocast("cuda"): above the pipe(prompt, latents=latents) call, and the problem was solved.
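
The autocast fix described above could look roughly like this. It is only a sketch: run_pipeline is a hypothetical helper, pipe stands for an already loaded StableDiffusionPipeline, and the stand-in callable exists only so the snippet runs without downloading the model.

```python
import torch


def run_pipeline(pipe, prompt, **kwargs):
    """Hypothetical helper: call the pipeline inside an autocast region."""
    # On a CUDA machine this matches the fix above, torch.autocast("cuda").
    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    with torch.autocast(device_type):
        return pipe(prompt, **kwargs)


# Stand-in for a real pipeline so the example is self-contained.
def fake_pipe(prompt, **kwargs):
    return {"sample": [f"image for: {prompt}"]}


result = run_pipeline(fake_pipe, "a photo of an astronaut riding a horse on mars")
print(result["sample"][0])
```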

chalecao

Apr 6, 2023

Thanks, I also added with torch.autocast("cuda"): and it works for me.

MohammadMi

May 3, 2023

Hi! I have a problem: Input type (float) and bias type (struct c10::Half) should be the same.
The error is in \Lib\site-packages\torch\nn\modules\conv.py:

File "C:\Users\user\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\user\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same

Do you know what I can do here?

pcuenq

CompVis org May 3, 2023

Hi @MohammadMi ! As others mentioned, this usually happens when attempting to run the model in half precision on CPU, because CPU does not support half floats. Do you have a GPU in your computer, and are you trying to use it? Do you have a code snippet that demonstrates the problem?
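
One way to avoid half precision on CPU is to choose the dtype from the available device before loading the pipeline. A minimal sketch, where pick_device_and_dtype is a hypothetical helper and the commented-out from_pretrained call mirrors the snippet posted earlier in this thread:

```python
import torch


def pick_device_and_dtype():
    """Hypothetical helper: float16 only when a CUDA device is present."""
    if torch.cuda.is_available():
        return "cuda", torch.float16
    # Many CPU kernels are not implemented for Half, so fall back to float32.
    return "cpu", torch.float32


device, dtype = pick_device_and_dtype()
print(device, dtype)

# The values would then be passed on when loading, for example:
# pipe = StableDiffusionPipeline.from_pretrained(
#     "CompVis/stable-diffusion-v1-4", torch_dtype=dtype
# ).to(device)
```

This does not cover GPUs that have trouble with half precision; on such cards float32 may still be needed even though CUDA is available.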

MohammadMi

May 3, 2023

Yes, I have a GPU: a GTX 1060Ti.
I didn’t change any settings.
Do you need to see my webui-user file? I set --no-half --lowvram --opt-split-attention.
Which code snippet do you mean?

pcuenq

CompVis org May 3, 2023

I think that card doesn't properly support half float, unfortunately. See here for details about a similar card and some tricks to make it work using the diffusers library.

doubleZ0108

Aug 26, 2024

I ran into this problem when using concurrent.futures for multi-threaded inference, and I have not been able to solve it yet.
However, when I set num_workers = 1, everything works fine.
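
One possible cause, not confirmed in this thread: PyTorch documents autocast state as thread-local, so a with torch.autocast(...) block entered in the main thread does not apply inside worker threads. If that is what happens here, entering autocast inside each worker may help. A sketch with a hypothetical infer function and a small matrix product standing in for the pipeline call; device_type is "cpu" only so the snippet runs without a GPU:

```python
from concurrent.futures import ThreadPoolExecutor

import torch


def infer(seed, device_type="cpu"):
    """Hypothetical worker: enters autocast itself instead of relying on the caller."""
    a = torch.ones(2, 2) * seed
    with torch.autocast(device_type):
        # Stand-in for pipe(prompt); report the dtype the op actually ran in.
        return torch.mm(a, a).dtype


with ThreadPoolExecutor(max_workers=2) as pool:
    worker_dtypes = list(pool.map(infer, range(4)))

# Reference value computed in the main thread, inside its own autocast region.
main_dtype = infer(0)
print(worker_dtypes, main_dtype)
```

Because every worker opens its own autocast region, the dtype seen in the threads should match the one seen in the main thread.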

Massyzs

Oct 9, 2024

Hi, thanks for your wonderful model and weights. I came across a similar problem:
It always reports that mat1 and mat2 do not match, with a half vs. float dtype error. I checked, and all UNet-related parameters are float16 and the inputs are float16. When I change everything to float32 it works, but I lose the speed. Could you tell me the possible cause and a solution?

    controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=weight_dtype)
    pipeline = StableDiffusionControlNetPipeline.from_pretrained(
        self.config.diffusion_ckpt,
        vae=vae,
        text_encoder=text_encoder,
        tokenizer=tokenizer,
        unet=unet,
        controlnet=controlnet,
        safety_checker=None,
        # revision=args.revision,
        # variant=args.variant,
        torch_dtype=weight_dtype,
    )
    # pipeline.scheduler = UniPCMultistepScheduler.from_config(pipeline.scheduler.config)
    pipeline = pipeline.to(accelerator.device)


    # disp = disp.to(torch.float32)
    # self.tmp_pipe.to(torch.float32)
    latent = latent.to(torch.float32)
    disp = disp.to(torch.float32)
    pipeline.to(torch.float32)
    result = pipeline(
        prompt=[self.positive_prompt],
        negative_prompt=[self.negative_prompts],
        latents=latent,
        image=disp,
        num_inference_steps=self.num_inference_steps,
        guidance_scale=self.guidance_scale,
        controlnet_conditioning_scale=self.controlnet_conditioning_scale,
        eta=self.eta,
        output_type='pt',
    ).images[0]
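
When a half/float mismatch persists even though the UNet looks correct, it can help to count parameter dtypes per component, since the VAE, text encoder, or ControlNet may have been loaded separately in another precision. A minimal sketch; dtype_report is a hypothetical helper and the tiny modules stand in for pipeline components:

```python
import torch
import torch.nn as nn
from collections import Counter


def dtype_report(components):
    """Hypothetical helper: map component name -> Counter of parameter dtypes."""
    return {
        name: Counter(str(p.dtype) for p in module.parameters())
        for name, module in components.items()
    }


# Stand-ins for e.g. pipeline.unet and pipeline.vae; one is left in float32.
components = {
    "unet": nn.Linear(4, 4).to(torch.float16),
    "vae": nn.Linear(4, 4),
}
report = dtype_report(components)
for name, counts in report.items():
    print(name, dict(counts))
```

With a real pipeline, the dictionary could be built from attributes such as pipeline.unet, pipeline.vae, pipeline.text_encoder, and pipeline.controlnet; any component that reports a different dtype from the rest is a candidate for the mismatch.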

thesorcerer1900

Nov 18, 2024

Found the fix here
https://github.com/balazik/ComfyUI-PuLID-Flux/commit/5867d664e55a9a2736f10f451ee6b4a64c679ee3