metadata
language: en
license: apache-2.0
pipeline_tag: summarization

Model Card

This model is identical to allenai/led-large-16384, except the generation_config.json has been updated from:

{
  "_from_model_config": true,
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1
}

to

{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1,
  "early_stopping": true,
  "length_penalty": 2.0,
  "max_length": 512,
  "min_length": 100,
  "no_repeat_ngram_size": 3,
  "num_beams": 4
}

We found the updated settings, which enable beam search with length and repetition constraints, to be much more stable when fine-tuning the model for summarization tasks.
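To make the difference between the two configurations explicit, here is a minimal sketch that compares them key by key using only the Python standard library. The helper name `diff_configs` and the variable names are illustrative and not part of this repository; the two dictionaries are copied from the JSON shown above.

```python
import json

# The original generation_config.json, as quoted in this model card.
original = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1
}
""")

# The updated generation_config.json, as quoted in this model card.
updated = json.loads("""
{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1,
  "early_stopping": true,
  "length_penalty": 2.0,
  "max_length": 512,
  "min_length": 100,
  "no_repeat_ngram_size": 3,
  "num_beams": 4
}
""")


def diff_configs(old, new):
    """Return (added, removed, changed) keys between two config dicts.

    Hypothetical helper written for this example only.
    """
    added = {k: new[k] for k in new if k not in old}
    removed = {k: old[k] for k in old if k not in new}
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    return added, removed, changed


added, removed, changed = diff_configs(original, updated)
print("added:  ", sorted(added))
print("removed:", sorted(removed))
print("changed:", sorted(changed))
```

The token IDs are untouched; the update only drops `_from_model_config` and adds generation parameters. If you use the `transformers` library, the same keys can be passed as keyword arguments to `GenerationConfig` or to `model.generate`, though that step is not shown here.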