Loss without grad_fn when using transformers Trainer suite

#10 by syboomsysy

I tried to finetune the model on the alpaca dataset. However, when I launched training, I found that the model returned the loss directly (does the model have a loss function built in?), and the loss had no grad_fn attached, so an error occurred during the backward pass. Could anyone tell me the cause of this problem?

Here are my package versions:
torch 2.0.0+cu117
transformers 4.35.2
peft 0.6.2

and here is a snapshot of my code:

[screenshot of training code]
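In case the image doesn't load, here is a rough sketch of the equivalent setup; the model name, hyperparameters, and dataset field names below are placeholders rather than my exact values:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# placeholder model name; any causal LM checkpoint
model_name = "my-base-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# wrap the frozen base model with LoRA adapters via peft
peft_config = LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16, lora_dropout=0.05)
model = get_peft_model(model, peft_config)

# tokenize the alpaca dataset (the "text" column holds the full prompt)
dataset = load_dataset("tatsu-lab/alpaca", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    gradient_checkpointing=True,  # turning this off makes the error go away
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # backward() fails here: the loss has no grad_fn
```

The data collator duplicates the input ids as labels, and the model computes the loss internally whenever labels are passed, which is why the Trainer receives a loss directly.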

and finally, the error log:

[screenshot of error log]

Update: gradient checkpointing seems to be the crux; training runs fine if I set it to False.
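For anyone who hits the same thing: my understanding is that with gradient checkpointing on a mostly frozen PEFT model, the inputs to the checkpointed blocks don't require grad, so the recomputed forward produces a loss detached from the autograd graph. Two workarounds that are commonly suggested (both are public transformers APIs, though I've only verified the behavior on my own setup):

```python
# Workaround 1: force the embedding outputs to require grad, so the
# checkpointed segments stay connected to the autograd graph.
model.enable_input_require_grads()

# Workaround 2 (transformers >= 4.35): switch to PyTorch's non-reentrant
# checkpoint implementation, which handles inputs that don't require grad.
args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```

Either way you keep the memory savings of gradient checkpointing instead of disabling it outright.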
