Error when finetuning this - fp16 vs int8?

#1 opened by hartleyterw

/usr/local/lib/python3.10/dist-packages/peft/tuners/lora.py in forward(self, x)
    815             result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
    816         elif self.r[self.active_adapter] > 0 and not self.merged:
--> 817             result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
    818
    819             x = x.to(self.lora_A[self.active_adapter].weight.dtype)

RuntimeError: self and mat2 must have the same dtype, but got Half and Int

Using QLoRA and peft 0.4.
I can't fine-tune the original (unquantized) model because Colab runs out of memory.
Is there any way to fix this?

Hello.

It should work with the sample below, but for some reason it doesn't work with this model.
I don't know the cause yet.
https://huggingface.co/dahara1/weblab-10b-instruction-sft-GPTQ/tree/main/finetune_sample
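
For reference, the general shape of that kind of setup (this is a rough sketch, not the notebook's exact contents) is to load the GPTQ checkpoint through transformers and attach LoRA with a GPTQ-aware peft (0.5 or newer, rather than 0.4). The repo id and LoRA hyperparameters below are placeholders:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Placeholder repo id; point this at the GPTQ model you want to fine-tune.
model_id = "webbigdata/ALMA-7B-Ja-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# transformers >= 4.32 reads quantization_config from the repo and loads the
# GPTQ weights through optimum/auto-gptq; no extra quantization arguments needed.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Standard preparation step for training on top of a quantized model.
model = prepare_model_for_kbit_training(model)

# Assumed LoRA settings; adjust r/alpha/target_modules as needed.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

With a GPTQ-aware peft, the LoRA layer dispatches to the quantized linear instead of calling F.linear on the packed int weights, which is what triggers the "Half and Int" error above.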

ALMA-7B-Ja-V2 is scheduled to be released soon, so I would like to reconsider the quantization method at that time.

Until then, please try the GGUF version.
https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_gguf_Free_Colab_sample.ipynb
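
If you want to run the GGUF build outside that notebook, a minimal llama-cpp-python sketch looks roughly like this (the file name and prompt wording are assumptions, not taken from the repo):

from llama_cpp import Llama

# Placeholder GGUF file name; download the actual file from the gguf repo first.
llm = Llama(model_path="ALMA-7B-Ja-q4_0.gguf", n_ctx=2048)

# ALMA-style translation prompt (the exact wording here is an assumption).
prompt = (
    "Translate this from Japanese to English:\n"
    "Japanese: 私は猫が好きです。\n"
    "English:"
)
out = llm(prompt, max_tokens=128, temperature=0.0)
print(out["choices"][0]["text"])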

Training on cpu... I'd rather wait for the new release... :'D Thank you for your work.

webbigdata org

Hello.
It's taken longer than I expected because I've remade it several times, but I think I'll be able to release V2 tomorrow.

As far as I can tell, it is still possible to create a LoRA for GPTQ-quantized models using axolotl.

However, when I merge the created LoRA back into the original model, the resulting file ends up the same size as the model before GPTQ quantization.
https://github.com/OpenAccess-AI-Collective/axolotl/issues/829

Using huggingface/peft directly increased the file size as well, so this may be a current limitation on the peft side.

webbigdata org

Hi.

According to this thread:
https://github.com/OpenAccess-AI-Collective/axolotl/issues/829

LoRA fine-tuning -> merge into one file -> GPTQ: this works.
GPTQ -> LoRA fine-tuning: this also works, but merging back into one file is not supported.
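
As a rough sketch of the first, supported order (repo ids and paths are placeholders, and the GPTQ step assumes transformers >= 4.32 with optimum and auto-gptq installed):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
from peft import PeftModel

base_id = "webbigdata/ALMA-7B-Ja-V2"   # placeholder: the fp16 model the LoRA was trained on
adapter_dir = "./lora-out"             # placeholder: LoRA adapter produced by fine-tuning

# 1. Merge the LoRA adapter back into the fp16 base model.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
merged.save_pretrained("./merged-fp16")

tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.save_pretrained("./merged-fp16")

# 2. GPTQ-quantize the merged full-precision model.
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
quantized = AutoModelForCausalLM.from_pretrained(
    "./merged-fp16",
    quantization_config=gptq_config,
    device_map="auto",
)
quantized.save_pretrained("./merged-gptq")
tokenizer.save_pretrained("./merged-gptq")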

And we're training a new version now.

Thanks for your reply.
I'm trying to use axolotl with the same config as you, except that I disabled bf32 and used a different dataset (not tokenized). But it gives me this error: ValueError: model_config.quantization_config is not set or quant_method is not set to gptq. Please make sure to point to a GPTQ model.

!accelerate launch -m axolotl.cli.train /content/gptq.yml

I tried with the V2 version as well; same error.

webbigdata org

Hmm, I don't see any errors in my environment.
The quantize_config.json is included in the repository, so it could be that the model is failing to download for some reason.

Check whether you're getting any out-of-memory or other errors before that error appears.
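
If it helps, one quick way to check whether the quantization metadata actually came down with the model is to load just the config (the repo id below is a placeholder):

from transformers import AutoConfig

# Placeholder repo id; use the GPTQ model you pointed axolotl at.
config = AutoConfig.from_pretrained("webbigdata/ALMA-7B-Ja-V2-GPTQ")
print(getattr(config, "quantization_config", None))
# A GPTQ repo should print something like {'quant_method': 'gptq', 'bits': 4, ...};
# if this prints None, axolotl raises exactly the ValueError above.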
