Fine Tuning Transformers
https://huggingface.co/blog/paligemma
Hey everyone, maybe a silly question, but shouldn't the tokens for the answer be part of the input_ids? I'm trying to understand why the answer tokens are handled the way they are in the collator below. Can someone explain this to me?
import torch

# id of the special <image> placeholder token so it can be masked out of the loss
image_token = processor.tokenizer.convert_tokens_to_ids("<image>")

def collate_fn(examples):
    texts = ["answer " + example["question"] + "\n" + example["multiple_choice_answer"] for example in examples]
    images = [example["image"].convert("RGB") for example in examples]
    tokens = processor(text=texts, images=images,
                       return_tensors="pt", padding="longest",
                       tokenize_newline_separately=False)
    # build labels from the input ids, ignoring padding and image tokens in the loss
    labels = tokens["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100
    labels[labels == image_token] = -100
    tokens["labels"] = labels
    tokens = tokens.to(torch.bfloat16).to(device)
    return tokens
Hi,
PaliGemma requires the labels to be passed to the processor via the "suffix" keyword argument, rather than being built from the input_ids by hand.
See also the demo notebooks here: https://huggingface.co/docs/transformers/main/en/model_doc/paligemma#resources
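For reference, here is a minimal sketch of what the collator could look like with that argument, assuming the same processor, device and VQAv2-style example dicts as in your snippet (not the exact code from the blog or notebooks):

import torch

def collate_fn(examples):
    # The prefix holds only the prompt; the answers go in the "suffix" argument.
    # The processor then builds input_ids (prefix + suffix) and labels in which
    # everything except the suffix tokens is set to -100, so only the answer
    # contributes to the loss.
    texts = ["answer " + example["question"] for example in examples]
    answers = [example["multiple_choice_answer"] for example in examples]
    images = [example["image"].convert("RGB") for example in examples]
    tokens = processor(text=texts, images=images, suffix=answers,
                       return_tensors="pt", padding="longest")
    # .to(torch.bfloat16) only casts the floating-point tensors (pixel_values);
    # input_ids and labels stay integer tensors.
    tokens = tokens.to(torch.bfloat16).to(device)
    return tokens

With this setup the answer tokens do appear in the input_ids during training, and the processor returns the matching labels, so the manual label masking from your snippet is no longer needed.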