Generation config - "bos_token_id", "eos_token_id", "pad_token_id"

#1 opened by yonting

Thanks for making these models available. They're really useful.

Would it be possible to look at correcting the generation config? The model's config.json has "bos_token_id": 50256 and "eos_token_id": 50256, which I think is correct for the original gpt2 but not for the tokenizer used for this model. I think it might also be possible to set "pad_token_id".
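For reference, the mismatch can be confirmed by comparing the model config with the tokenizer directly. A quick sketch (outputs omitted; judging by the workaround further down, the tokenizer's IDs should come out as 0/1/2 rather than 50256):

>>> from transformers import AutoConfig, AutoTokenizer
>>> config = AutoConfig.from_pretrained("olm/olm-gpt2-latest")
>>> tokenizer = AutoTokenizer.from_pretrained("olm/olm-gpt2-latest")
>>> print(config.bos_token_id, config.eos_token_id)  # 50256 50256, per config.json
>>> print(tokenizer.bos_token_id, tokenizer.eos_token_id, tokenizer.pad_token_id)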

I noticed this when attempting to generate from an empty string:

>>> from transformers import pipeline
>>> generator = pipeline(
...     "text-generation",
...     model="olm/olm-gpt2-latest",
...     bad_words_ids=[[0,2]],
... )
>>> generator("")
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
[{'generated_text': ' Gasp <more text>'}]
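(Incidentally, the right-padding warning above is a separate issue that can be addressed by loading the tokenizer with left padding and passing it to the pipeline; just a sketch, unrelated to the token-ID problem:)

>>> from transformers import AutoTokenizer
>>> left_tok = AutoTokenizer.from_pretrained("olm/olm-gpt2-latest", padding_side="left")
>>> # pipeline("text-generation", model="olm/olm-gpt2-latest", tokenizer=left_tok, ...)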

The first token we get is ID 50256, which this tokenizer decodes to an ordinary word piece rather than a special token:

>>> generator.tokenizer.convert_ids_to_tokens(50256)
'ĠGasp'
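For comparison, the tokenizer's own view of its special tokens can be inspected like this (output omitted; I'd expect it to list tokens at the low IDs used in the workaround below, not 50256):

>>> generator.tokenizer.special_tokens_map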

You can work around this by overriding the token IDs in the pipeline, but then you get a deprecation warning about the generation config:

>>> generator = pipeline(
...     "text-generation",
...     model="olm/olm-gpt2-latest",
...     bad_words_ids=[[0,2]],
...     bos_token_id=0,
...     pad_token_id=1,
...     eos_token_id=2,
... )
>>> generator("")
<path>/site-packages/transformers/generation/utils.py:1186: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
  warnings.warn(
<path>/site-packages/transformers/generation/utils.py:1273: UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 50 (`generation_config.max_length`). Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
[{'generated_text': '<more text>'}]
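The non-deprecated fix, per the warning above, would be a generation config on the repo itself. A minimal sketch, assuming 0/1/2 really are the right IDs for this model's tokenizer:

>>> from transformers import GenerationConfig
>>> gen_config = GenerationConfig(bos_token_id=0, pad_token_id=1, eos_token_id=2)
>>> gen_config.save_pretrained("olm-gpt2-latest")  # writes generation_config.json locally
>>> # a maintainer could then upload it, e.g. gen_config.push_to_hub("olm/olm-gpt2-latest")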
