Warning output
It seems harmless, but you can see this:
```
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:71013 for open-end generation.
```
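For context, a minimal sketch of the kind of call that triggers it; `gpt2` is just an illustrative checkpoint here, and the reported token id (71013 above) depends on the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "gpt2"  # illustrative checkpoint; any generative model behaves the same
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_ids = tokenizer("Hello, world", return_tensors="pt").input_ids
# Passing only input_ids (no attention_mask) to a model whose pad_token_id
# is unset makes generate() log the warning and fall back to eos_token_id.
output = model.generate(input_ids, max_new_tokens=20)
```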
Yes, this will be addressed in the follow-up PR on transformers:main :)
To eliminate the `attention_mask` warning:
inputs["attention_mask"] = torch.ones(inputs["input_ids"].shape, device="cuda:0")
generation_output = model.generate(**inputs, max_new_tokens=50, pad_token_id=model.config.eos_token_id)
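Note that an all-ones mask is only correct when nothing in the batch is padded, i.e. for a single sequence or equal-length inputs.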
@Colderthanice @dashesy With the recent transformers release, the `attention_mask` warning goes away because the mask is now indeed returned by the Processor :)

Batch generation is now also supported and uses left-padding, so I don't recommend setting `attention_mask` to `torch.ones(inputs["input_ids"].shape)` in the current version (an all-ones mask would wrongly attend to the left-pad tokens). Let us know if you encounter issues!
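For anyone landing here later, a minimal sketch of the now-recommended pattern, where the tokenizer/processor returns the `attention_mask` itself and batches are left-padded; the checkpoint name is illustrative:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "gpt2"  # illustrative; the same pattern applies to processor-based models
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 defines no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_id)

prompts = ["Hello, world", "The quick brown fox"]
# padding=True left-pads the shorter prompt and returns the matching
# attention_mask, so no manual torch.ones(...) mask is needed.
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
outputs = model.generate(**inputs, max_new_tokens=20, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```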