Is it possible to pre-train this model using AutoGPTQ on your dataset?

#2 opened by GordeyTsy

I tried training with a script taken from the AutoGPTQ repository (examples/peft/peft_adaption_prompt_clm_instruction_tuning.py) and got an adapter, but no matter what I tried I couldn't merge it into a single model, or even get the two to work together. I tried following the guide in your answer at https://huggingface.co/TheBloke/guanaco-65B-GPTQ/discussions/2, but it didn't work. I also tried loading the base model through AutoGPTQForCausalLM.from_quantized, but then I couldn't combine it with the adapter either. I've been struggling with this for 2-3 days now and I'm only posting here because I'm getting quite desperate, so I would really appreciate any help.

Firstly, it's not "my" dataset: the Guanaco dataset was produced by Tim Dettmers. And I didn't fine-tune this model either; that was Mikael, whose repo is linked in the description. I produced the GPTQ quantisations of it.

I have not yet tried AutoGPTQ PEFT training myself. But oobabooga's text-generation-webui provides an AutoGPTQ LoRA option, so you could either just use that, or look at how he's implemented it.

No, you can't merge an adapter into an already-quantised GPTQ model. Your options are:

  1. Download Meta's Llama 2 7B HF, run a standard PEFT LoRA training on it, merge the adapter with the base model to produce a new fp16 model, then either run that fp16 model in 4-bit with load_in_4bit or GPTQ it (the merge step is sketched after this list)
  2. Download Meta's Llama 2 7B HF, run a 4-bit QLoRA training on it, then merge and quantise as described in 1.
  3. Download my Llama 2 7B GPTQ and run an AutoGPTQ PEFT training on it, either from Python code (see the second sketch below) or using text-generation-webui.
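
For options 1 and 2, the merge step is the part people usually get stuck on. Here's a minimal sketch of it, assuming you've already trained a standard LoRA adapter; the model and adapter paths are placeholders, and note that for QLoRA you merge against the fp16 base model, not the 4-bit copy you trained on:

```python
# Minimal sketch: merge a trained LoRA adapter into the fp16 base model,
# then reload the merged model in 4-bit for inference. Paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"   # placeholder: the fp16 base model
adapter_dir = "./my-lora-adapter"      # placeholder: your trained LoRA adapter
merged_dir = "./llama-2-7b-merged"

# 1. Load the fp16 base and attach the adapter
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_dir)

# 2. Merge the LoRA weights into the base weights and drop the adapter wrappers
merged = model.merge_and_unload()

# 3. Save the merged fp16 model - this is what you then GPTQ-quantise, or...
merged.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(merged_dir)

# 4. ...reload it in 4-bit with bitsandbytes via load_in_4bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
)
model_4bit = AutoModelForCausalLM.from_pretrained(
    merged_dir, quantization_config=bnb_config, device_map="auto"
)
```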

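For option 3, here's a rough sketch of the AutoGPTQ PEFT route from Python, patterned on the scripts in the repo's examples/peft folder. Treat the model path and LoRA hyperparameters as placeholders, and check the exact function names against the AutoGPTQ version you have installed, as this API has changed between releases:

```python
# Rough sketch (assumptions: API as in recent AutoGPTQ releases).
# Loads a GPTQ model in trainable mode and attaches a LoRA adapter to it.
from auto_gptq import AutoGPTQForCausalLM
from auto_gptq.utils.peft_utils import GPTQLoraConfig, get_gptq_peft_model

model_dir = "TheBloke/Llama-2-7B-GPTQ"   # placeholder: the GPTQ model

# Load the quantised model; the PEFT examples use the triton backend
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    use_triton=True,
    trainable=True,
)

# Wrap it with a LoRA adapter that targets the quantised linear layers
peft_config = GPTQLoraConfig(
    r=16,                  # placeholder hyperparameters
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_gptq_peft_model(model, peft_config=peft_config, train_mode=True)

# From here, train `model` with your usual Trainer / training loop.
# model.save_pretrained("./gptq-lora-adapter") saves only the adapter;
# it stays separate from the GPTQ weights and cannot be merged into them.
```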