4 GPUs with 12.2GiB each; it is not totally clear from the README files which steps are needed.

#21
by BigDeeper - opened

I have downloaded the lite versions of the models and want to use 'float32', as I am not certain my GPUs can handle 'bfloat16'.
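For what it's worth, whether a GPU supports bfloat16 can be checked directly; a minimal sketch (assuming PyTorch with CUDA; `torch.cuda.is_bf16_supported()` reports on the current device, so the loop switches devices):

```python
import torch

# Print each visible GPU and whether it supports bfloat16.
# is_bf16_supported() reports on the *current* device, so we
# switch devices inside the loop before querying.
for i in range(torch.cuda.device_count()):
    with torch.cuda.device(i):
        print(i, torch.cuda.get_device_name(i), torch.cuda.is_bf16_supported())
```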

I have also modified the stage_* files in configs to use 'float32' and the lite versions.
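For reference, this is roughly the kind of change I made (a hypothetical excerpt; the key names follow the repo's stage_c config, but the exact values and paths depend on which files were downloaded):

```yaml
# Hypothetical excerpt from a configs/inference/stage_c_*.yaml file.
# Key names and values are assumptions; check against the actual config.
dtype: float32                  # instead of bfloat16
model_version: 1B               # lite model; the full model is 3.6B
generator_checkpoint_path: models/stage_c_lite_bf16.safetensors
```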

I get the following:


ValueError Traceback (most recent call last)
Cell In[3], line 3
1 # SETUP MODELS & DATA
2 extras = core.setup_extras_pre()
----> 3 models = core.setup_models(extras)
4 models.generator.eval().requires_grad_(False)
5 print("STAGE C READY")

File ~/PROJECTS/StableCascade/train/train_c.py:164, in WurstCore.setup_models(self, extras)
162 else:
163 for param_name, param in load_or_fail(self.config.generator_checkpoint_path).items():
--> 164 set_module_tensor_to_device(generator, param_name, "cpu", value=param)
165 generator = generator.to(dtype).to(self.device)
166 generator = self.load_model(generator, 'generator')

File ~/mambaforge/envs/StableCascade/lib/python3.10/site-packages/accelerate/utils/modeling.py:345, in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics, tied_params_map)
343 if value is not None:
344 if old_value.shape != value.shape:
--> 345 raise ValueError(
346 f'Trying to set a tensor of shape {value.shape} in "{tensor_name}" (which has shape {old_value.shape}), this look incorrect.'
347 )
349 if dtype is None:
350 # For compatibility with PyTorch load_state_dict which converts state dict dtype to existing dtype in model
351 value = value.to(old_value.dtype)

ValueError: Trying to set a tensor of shape torch.Size([16, 1536, 1, 1]) in "weight" (which has shape torch.Size([16, 2048, 1, 1])), this look incorrect.

I suspect it has to do with bfloat16 vs. float32, but I am not certain what else needs to be modified.
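One thing worth noting: a dtype cast alone cannot change a tensor's shape, so the 1536-vs-2048 mismatch looks more like a model-size mismatch (a lite checkpoint being loaded into a full-size model definition, or vice versa) than a bfloat16/float32 issue. A minimal check of that reasoning, using the shapes from the traceback:

```python
import torch

# Casting between dtypes never changes a tensor's shape...
w_full = torch.zeros(16, 2048, 1, 1, dtype=torch.float32)
assert w_full.to(torch.bfloat16).shape == w_full.shape

# ...whereas a lite vs. full architecture mismatch does: the checkpoint
# holds a (16, 1536, 1, 1) weight while the instantiated model expects
# (16, 2048, 1, 1), which is exactly what the ValueError reports.
w_lite = torch.zeros(16, 1536, 1, 1)
assert w_lite.shape != w_full.shape
```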

Does the accelerate module work with these models to distribute layers to different GPUs?
