EleutherAI/gpt-j-6b · RuntimeError: expected scalar type Half but found Float

Hi, I'm trying to load GPT-J in 8-bit mode for fine-tuning using LoRA with the new PEFT library.
This is the part of the code I use for tokenizing the data and loading the model in 8-bit:

def tokenize(element):
    inputs = tokenizer(
        element['text'],
        truncation=True,
        padding=True,
        max_length=MAX_LEN,
    )
    return {
        'input_ids': inputs.input_ids,
        'attention_mask': inputs.attention_mask,
        'labels': inputs.input_ids,
    }

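from transformers import GPTJForCausalLM
from peft import (
    LoraConfig,
    TaskType,
    get_peft_model,
    prepare_model_for_int8_training,
)

# Load the base model with 8-bit weights (bitsandbytes)
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    device_map=device_map,
    load_in_8bit=True,
)
# Prepare the 8-bit model for training before attaching the adapters
model = prepare_model_for_int8_training(model)

# LoRA adapter configuration
config = LoraConfig(
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    target_modules=TARGET_MODULES,
    lora_dropout=LORA_DROPOUT,
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, config)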

When I try to run the training using the Trainer API I get the following error:
  File "/home/azureuser/.local/lib/python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 456, in backward
    grad_A = torch.matmul(grad_output, CB).view(ctx.grad_shape).to(ctx.dtype_A)
RuntimeError: expected scalar type Half but found Float
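
The message indicates that the matmul in the backward pass received tensors of two different dtypes (Half and Float). As a diagnostic only, and not a fix, here is a minimal sketch for counting which dtypes the parameters hold after `get_peft_model`. The helper names `count_param_dtypes` and `trainable_param_dtypes` are hypothetical, and the stand-in model at the bottom uses invented values so the snippet runs without downloading GPT-J; with the real model you would pass `model` instead.

```python
from collections import Counter


def count_param_dtypes(model):
    """Return a Counter mapping dtype name -> number of parameter tensors.

    Works with any object exposing named_parameters(), e.g. a torch.nn.Module.
    """
    return Counter(str(param.dtype) for _, param in model.named_parameters())


def trainable_param_dtypes(model):
    """Same count, restricted to parameters with requires_grad=True."""
    return Counter(
        str(param.dtype)
        for _, param in model.named_parameters()
        if param.requires_grad
    )


# Stand-in objects (assumption, for illustration only): they mimic the two
# attributes the helpers read. The dtypes below are invented, not measured.
class _FakeParam:
    def __init__(self, dtype, requires_grad):
        self.dtype = dtype
        self.requires_grad = requires_grad


class _FakeModel:
    def named_parameters(self):
        yield "base.weight", _FakeParam("torch.float16", False)
        yield "lora_A.weight", _FakeParam("torch.float32", True)
        yield "lora_B.weight", _FakeParam("torch.float32", True)


fake = _FakeModel()
print(count_param_dtypes(fake))
print(trainable_param_dtypes(fake))
```

Running the two helpers on the real `model` right before `trainer.train()` would show whether the frozen and trainable parameters end up in different precisions.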

Note that I ran a very similar script with a different model (CodeT5-Large) and it worked just fine, so nothing should be missing from the rest of the code. (I changed the parameters and swapped the Seq2Seq data collator for a causal LM one; the data itself has the prompt and completion appended into a single text column for CLM training.)
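
For readers unfamiliar with that data layout, here is a rough sketch of joining a prompt and its completion into one text column. The column names `prompt` and `completion`, the newline separator, and the helper name `merge_prompt_completion` are all assumptions for illustration; the actual preprocessing in the script may differ.

```python
def merge_prompt_completion(example, sep="\n"):
    """Join a prompt and its completion into a single 'text' field.

    The 'prompt' and 'completion' keys and the separator are hypothetical.
    """
    return {"text": example["prompt"] + sep + example["completion"]}


# Toy rows standing in for the real dataset
rows = [
    {"prompt": "def add(a, b):", "completion": "    return a + b"},
]
merged = [merge_prompt_completion(row) for row in rows]
print(merged[0]["text"])
```

The resulting `text` field is what a function like `tokenize` above would read via `element['text']`.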