Fine Tuning Transformers
https://huggingface.co/blog/paligemma
Hey everyone, maybe a silly question, but shouldn't the tokens for the answer be part of the input_ids? I'm trying to understand why the answer tokens are handled the way they are in the collator below. Can someone explain this to me?
import torch

# id of the special <image> placeholder token so it can be masked out of the loss
image_token = processor.tokenizer.convert_tokens_to_ids("<image>")

def collate_fn(examples):
    texts = ["answer " + example["question"] + "\n" + example["multiple_choice_answer"] for example in examples]
    images = [example["image"].convert("RGB") for example in examples]
    tokens = processor(text=texts, images=images,
                       return_tensors="pt", padding="longest",
                       tokenize_newline_separately=False)
    # build labels from the input ids, ignoring padding and image tokens in the loss
    labels = tokens["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100
    labels[labels == image_token] = -100
    tokens["labels"] = labels
    tokens = tokens.to(torch.bfloat16).to(device)
    return tokens
Hi,
PaliGemma requires the labels to be passed to the processor via the "suffix" keyword argument, rather than being built from the input_ids by hand.
See also the demo notebooks here: https://huggingface.co/docs/transformers/main/en/model_doc/paligemma#resources
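For reference, here is a minimal sketch of what the collator could look like with that argument, assuming the same processor, device and VQAv2-style example dicts as in your snippet (not the exact code from the blog or notebooks):

import torch

def collate_fn(examples):
    # The prefix holds only the prompt; the answers go in the "suffix" argument.
    # The processor then builds input_ids (prefix + suffix) and labels in which
    # everything except the suffix tokens is set to -100, so only the answer
    # contributes to the loss.
    texts = ["answer " + example["question"] for example in examples]
    answers = [example["multiple_choice_answer"] for example in examples]
    images = [example["image"].convert("RGB") for example in examples]
    tokens = processor(text=texts, images=images, suffix=answers,
                       return_tensors="pt", padding="longest")
    # .to(torch.bfloat16) only casts the floating-point tensors (pixel_values);
    # input_ids and labels stay integer tensors.
    tokens = tokens.to(torch.bfloat16).to(device)
    return tokens

With this setup the answer tokens do appear in the input_ids during training, and the processor returns the matching labels, so the manual label masking from your snippet is no longer needed.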