Question about Setting max_length and max_new_tokens in Generation Configuration for CNN/DM Dataset

by cooper521 - opened

In the generation_config.json file, I am quite puzzled about "max_length": 142, because the official documentation mentions:

  • max_length (int, optional, defaults to 20) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. Its effect is overridden by max_new_tokens, if also set.
  • max_new_tokens (int, optional) — The maximum number of tokens to generate, ignoring the number of tokens in the prompt.

If I read this correctly, configuring "max_length": 142 means the total length of the article plus summary would be capped at only 142 tokens. That seems clearly inappropriate for CNN/DM, where articles usually run to hundreds of tokens. So what would be a good setting for this parameter?
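For reference, the precedence the documentation describes (max_new_tokens, if set, overrides max_length) can be sketched in plain Python. This is only an illustration of the documented rule, not the actual transformers implementation, and the function name is hypothetical:

```python
def effective_max_length(prompt_len, max_length=20, max_new_tokens=None):
    """Illustrative sketch (not transformers internals): compute the total
    token count at which generation would stop, per the documented
    precedence where max_new_tokens, if set, overrides max_length."""
    if max_new_tokens is not None:
        # Cap applies only to newly generated tokens, on top of the prompt.
        return prompt_len + max_new_tokens
    # Otherwise max_length is a cap on the total (prompt + generated) length.
    return max_length

# With max_length=142 and no max_new_tokens, the total is capped at 142
# regardless of how long the prompt is:
print(effective_max_length(prompt_len=500, max_length=142))     # 142

# Setting max_new_tokens instead caps only the generated tokens:
print(effective_max_length(prompt_len=500, max_new_tokens=60))  # 560
```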