About the VAE in SDXL: when I encode an image with the VAE, the result is NaN
#39
by zideliu
>>> from diffusers.models import AutoencoderKL
>>> import torch
>>> vae16 = AutoencoderKL.from_pretrained("sdxl-vae",torch_dtype=torch.float16).to('cuda')
>>> vae = AutoencoderKL.from_pretrained("sdxl-vae").to('cuda')
>>> import numpy as np
>>> from PIL import Image
>>> x = np.array(Image.open('1.png').convert('RGB'))
>>> x = x/127.5 -1.0
>>> x = torch.from_numpy(x)
>>> x = x.permute(2,0,1)
>>> x.shape
torch.Size([3, 1024, 1024])
>>> im = x.to('cuda',dtype=torch.float32)
>>> im16 = x.to('cuda',dtype=torch.float16)
>>> t16 = vae16.encode(im16).latent_dist.sample()
>>> t16
tensor([[[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]]], device='cuda:0',
dtype=torch.float16, grad_fn=<AddBackward0>)
>>> t = vae.encode(im).latent_dist.sample()
>>> t
tensor([[[[-10.6650, -10.7863, -8.9922, ..., -4.9603, -7.7862, -5.2480],
[ -7.2087, -11.2612, -7.2466, ..., -5.0280, -8.4497, -5.2412],
[ -9.8503, -7.0086, -8.6645, ..., -6.8731, -5.7120, -5.2228],
...,
[ -0.8512, -2.7118, -3.6125, ..., -8.5498, -6.7364, -4.5577],
[ -4.7474, -4.0266, -2.2864, ..., -5.6377, -5.6006, -7.2353],
[ -3.9695, -5.3364, -1.7182, ..., -4.1163, -3.1237, -8.5114]],
[[ 0.5291, 3.1062, 3.9364, ..., 6.1754, 6.9985, 10.2787],
[ 4.3542, 2.2514, 4.5842, ..., 9.9785, 7.0942, 9.3612],
[ 2.3624, 7.6000, 4.0870, ..., 8.2662, 7.3177, 7.4536],
...,
[ 2.4479, 5.3318, 7.7907, ..., 2.2813, 3.3110, 7.9058],
[ 3.1145, 7.0900, 9.5676, ..., 4.0560, 7.3074, 5.4294],
[ 4.1593, 2.2279, 6.3064, ..., 5.3051, 2.4846, -0.6007]],
[[ -5.9309, 2.5895, -7.6263, ..., 7.6300, 9.7816, -3.2628],
[ 0.0919, -4.1963, -0.4588, ..., 8.5846, 7.2509, 1.5191],
[ -7.2874, 0.2553, -0.2289, ..., 7.3174, 12.3518, 0.0686],
...,
[ 7.4820, 4.8404, 2.6403, ..., -3.4686, -7.8857, -10.7730],
[ -3.9887, 5.6537, -5.4063, ..., -4.4793, 0.0799, -6.0221],
[ 5.6877, -0.1834, 1.8418, ..., 0.7933, 5.1394, -2.5688]],
[[ 5.9649, 7.5648, 6.7202, ..., 7.6067, 2.7659, 2.5190],
[ 11.1852, 4.1965, 11.4174, ..., 6.5834, 0.0270, -0.6252],
[ 5.5535, 8.5816, 10.5749, ..., -0.3417, 3.5020, 3.8151],
...,
[ 7.0386, 7.9634, 8.0208, ..., 2.0303, 6.1680, 3.0449],
[ 4.4080, 5.6513, 4.0449, ..., 3.7297, 9.5295, 0.9136],
[ 4.0043, 1.5003, 8.5650, ..., 10.1849, 6.6777, 1.8639]]]],
device='cuda:0', grad_fn=<AddBackward0>)
When I use fp16 the result is NaN; with fp32 the result is normal.
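This is the known failure mode of the original SDXL VAE in half precision: some intermediate activations exceed fp16's finite range (the largest representable value is 65504), so they saturate to inf, and inf flowing through later operations (inf - inf, inf / inf, 0 * inf) produces NaN. A minimal, torch-free sketch of the mechanism; `to_fp16` is just an illustrative helper that round-trips a Python float through IEEE 754 binary16 storage:

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite value in IEEE 754 half precision


def to_fp16(x: float) -> float:
    """Round-trip a Python float through 16-bit (binary16) storage."""
    return struct.unpack('e', struct.pack('e', x))[0]


# Values inside the fp16 range survive the round trip unchanged...
assert to_fp16(FP16_MAX) == 65504.0

# ...but fp16 cannot represent anything larger. When torch casts an
# out-of-range float32 activation to float16, it saturates to inf, and
# inf combined with inf in a subsequent op yields NaN:
print(math.inf - math.inf)  # nan
```

In fp32 the same activations fit comfortably (max ≈ 3.4e38), which is why the fp32 encode above produces ordinary values.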
I figured it out. It was a problem on my end.
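For anyone landing here with the same NaN: it is not a bug in your encoding code but a known numerical limitation of the original SDXL VAE weights in fp16. A sketch of two commonly used workarounds, assuming the community checkpoint `madebyollin/sdxl-vae-fp16-fix` on the Hugging Face Hub and a recent diffusers version (untested here; requires a GPU and a model download):

```python
import torch
from diffusers.models import AutoencoderKL

# Option 1: a VAE checkpoint finetuned so its activations stay within
# fp16 range, allowing true half-precision encoding/decoding.
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
).to("cuda")

# Option 2: keep the original weights but run the VAE in fp32, even if
# the rest of the pipeline is fp16. Recent diffusers versions expose a
# `force_upcast` flag in the VAE config for this purpose.
vae32 = AutoencoderKL.from_pretrained("sdxl-vae").to("cuda")
```

With either option, the `encode(...).latent_dist.sample()` call from the transcript above returns finite latents instead of NaN.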
zideliu changed discussion status to closed