Manual training

#4
by edmond - opened

Hello, for some reason running:

import torch
import torch.nn.functional as F
from transformers import PaliGemmaForConditionalGeneration

vlm = PaliGemmaForConditionalGeneration.from_pretrained(**llm_args)
# feed everything except the last token and take the raw logits
pred = vlm(pixel_values=tensor, input_ids=input_ids[:, :-1],
           attention_mask=torch.ones_like(input_ids[:, :-1])).logits
# keep only the logits that should predict the answer tokens
pred = pred[:, -nb_tokens_answer:]

# cross_entropy expects (batch, vocab, seq), hence the permute
loss = F.cross_entropy(pred.permute((0, 2, 1)), input_ids[:, -nb_tokens_answer:],
                       reduction='mean')

gives me a very small loss. I have the feeling that the input and target tokens were mixed up.
Why is that?

This is driving me crazy. This bugfix was supposed to solve my problem: https://github.com/huggingface/transformers/pull/30967 ... (I'm checking on more data)

https://github.com/huggingface/transformers/issues/30993 OK, I got help: apparently this model also needs labels and token_type_ids in its inputs, unlike Imp or Moondream...
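For reference, a minimal sketch of what that could look like, reusing the tensor names from the snippet above; the prefix/suffix split, the -100 label masking, and the 0/1 token_type_ids convention are my assumptions based on the linked issue, not code posted in this thread:

# token_type_ids: 0 for the prefix (image + prompt), 1 for the suffix (answer)
nb_tokens_prefix = input_ids.shape[1] - nb_tokens_answer
token_type_ids = torch.cat([
    torch.zeros((input_ids.shape[0], nb_tokens_prefix), dtype=torch.long),
    torch.ones((input_ids.shape[0], nb_tokens_answer), dtype=torch.long),
], dim=1)

# labels: mask the prefix with -100 so only the answer tokens contribute to the loss
labels = input_ids.clone()
labels[:, :nb_tokens_prefix] = -100

out = vlm(pixel_values=tensor, input_ids=input_ids,
          attention_mask=torch.ones_like(input_ids),
          token_type_ids=token_type_ids, labels=labels)
loss = out.loss  # the model applies the label shift internally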

edmond changed discussion status to closed
Google org

@edmond sorry for the late response. It's best if you pass the suffix to the processor and actually pass the processor outputs to the model.
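Roughly along these lines (a sketch; the checkpoint id, image file, prompt, and suffix strings below are placeholders). Passing suffix makes the processor build input_ids, attention_mask, token_type_ids, and labels for you:

from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor

model_id = "google/paligemma-3b-pt-224"  # placeholder checkpoint
processor = PaliGemmaProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image
inputs = processor(text="answer en What is in the image?", images=image,
                   suffix="a photo of a cat", return_tensors="pt")
loss = model(**inputs).loss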
