--- tags: - summarization - summary - booksum - long-document - long-form license: apache-2.0 datasets: - kmfoda/booksum metrics: - rouge inference: false model-index: - name: pszemraj/long-t5-tglobal-large-pubmed-3k-booksum-16384-WIP results: - task: type: summarization name: Summarization dataset: name: kmfoda/booksum type: kmfoda/booksum config: kmfoda--booksum split: test metrics: - name: ROUGE-1 type: rouge value: 35.9969 verified: true - name: ROUGE-2 type: rouge value: 5.9272 verified: true - name: ROUGE-L type: rouge value: 16.0136 verified: true - name: ROUGE-LSUM type: rouge value: 32.941 verified: true - name: loss type: loss value: 2.9339466094970703 verified: true - name: gen_len type: gen_len value: 283.7198 verified: true - task: type: summarization name: Summarization dataset: name: samsum type: samsum config: samsum split: test metrics: - name: ROUGE-1 type: rouge value: 26.2412 verified: true - name: ROUGE-2 type: rouge value: 5.9791 verified: true - name: ROUGE-L type: rouge value: 18.7467 verified: true - name: ROUGE-LSUM type: rouge value: 22.5566 verified: true - name: loss type: loss value: 2.877626895904541 verified: true - name: gen_len type: gen_len value: 47.6532 verified: true - task: type: summarization name: Summarization dataset: name: xsum type: xsum config: default split: test metrics: - name: ROUGE-1 type: rouge value: 19.3209 verified: true - name: ROUGE-2 type: rouge value: 2.7978 verified: true - name: ROUGE-L type: rouge value: 12.5816 verified: true - name: ROUGE-LSUM type: rouge value: 15.0239 verified: true - name: loss type: loss value: 4.483709335327148 verified: true - name: gen_len type: gen_len value: 82.729 verified: true --- # long-t5-tglobal-large-pubmed-3k-booksum-16384-WIP > NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :) ## Updates _As I update this WIP checkpoint, I will post a note here._ - July 26, 2022: add two more epochs of training, metrics starting to be _almost_ as good as the more-tuned `base` variant - July 8, 2022: add checkpoint with ~4 epochs of training on A100, equating to approx 350 steps of functional batch size 128 - July 4, 2022: add checkpoint with six additional epochs of training with the dataset summary outputs filtered to 1024 **tokens**, resolving the prior issue of short summaries. ## About - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 26 epochs - max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final 10+ epochs** ## Comparisons - compare to [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). - **inference API has been disabled because it's too compute-intensive :/**