book-summary dataset & training

by ccdv - opened Jun 28, 2022

ccdv

Jun 28, 2022

I want to train my own long summarization model on the same book-summary dataset.
Where does this dataset come from (is there a paper)?
How long did the training take? Did you use gradient checkpointing? 32 GB GPUs?

Thank you

pszemraj

Owner Jun 29, 2022

Hi! Thanks for reaching out. Salesforce Research originally released the dataset:

paper https://arxiv.org/abs/2105.08209
repo https://github.com/salesforce/booksum

Training the tglobal attention variant to create this checkpoint took about seven days with gradient checkpointing on a V100 GPU with 52 GB of CPU memory, plus DeepSpeed. The local attention variant seems to train faster. I'm not sure precisely why, beyond the layman's idea that a "worse" sparse attention mechanism has to pay attention to less stuff during training.
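For anyone trying to reproduce a setup like this, here is a minimal sketch of what a DeepSpeed config for a single GPU with CPU offload could look like. To be clear, this is not the exact config used for this checkpoint: the ZeRO stage, the optimizer offload, and the helper names `build_deepspeed_config` and `write_deepspeed_config` are all assumptions made for illustration.

```python
import json
from pathlib import Path


def build_deepspeed_config(offload_to_cpu: bool = True) -> dict:
    """Build a DeepSpeed config dict.

    Assumption: ZeRO stage 2 with optimizer offload to CPU. The thread only
    says "deepspeed" was used, so every value here is a guess.
    """
    zero = {"stage": 2}
    if offload_to_cpu:
        # Keep optimizer state in CPU memory to free up GPU memory.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
    return {
        "zero_optimization": zero,
        # "auto" lets the Hugging Face Trainer fill these in from its own arguments.
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
    }


def write_deepspeed_config(path: str, offload_to_cpu: bool = True) -> Path:
    """Write the config as JSON and return the path that was written."""
    out = Path(path)
    out.write_text(json.dumps(build_deepspeed_config(offload_to_cpu), indent=2))
    return out
```

The resulting JSON file can then be passed to the Hugging Face training arguments via `deepspeed="ds_config.json"`, alongside `gradient_checkpointing=True`. The file name is just an example.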

pszemraj

Owner Jun 29, 2022

btw I don't plan to keep training the local variant much longer because I want to focus on getting this tglobal variant optimized (I am also training the large one), but if you'd rather start from a mid-training checkpoint of longT5-local-base on booksum, I am happy to post that publicly

ccdv

Jun 29, 2022

Thank you for your answer.
I plan to train my own long summarization model (from my repo) to compare its performance to LongT5. I will use a smaller model; it's too expensive to run otherwise.
Have you computed any ROUGE metrics yet?
Does n_positions=4096 in the config refer to the max input length?

pszemraj

Owner Jul 22, 2022

Hey, sorry, I somehow didn't see your question! Checking on this.

pszemraj

Owner Sep 30, 2022

@ccdv can do more detailed checking, but I believe n_positions might refer to an attention block rather than the input length; as far as I can tell, the max input length is indeed 16384 and is not reduced (you can also validate this by summarizing a long text and checking whether crucial information from the end of the text appears in the summary)
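A rough way to script that check, as a sketch only: measure how many words that occur only in the last part of the source also show up in the summary. The function name `tail_coverage` and the word-overlap heuristic are assumptions for illustration, not an established metric, and the summary itself would come from whatever model you are testing.

```python
import re


def _words(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tail_coverage(source: str, summary: str, tail_fraction: float = 0.1) -> float:
    """Fraction of words unique to the tail of `source` that appear in `summary`.

    A value near 0 on several long documents hints that the end of the input
    may be getting truncated; it is a heuristic, not proof.
    """
    words = _words(source)
    tail_len = max(1, int(len(words) * tail_fraction))
    cut = len(words) - tail_len
    head, tail = set(words[:cut]), set(words[cut:])
    tail_only = tail - head  # words that occur only near the end of the source
    if not tail_only:
        return 0.0
    return len(tail_only & set(_words(summary))) / len(tail_only)
```

One way to use it is to append a distinctive sentence at the very end of a long document, summarize it, and see whether the score stays at zero across several inputs.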

pszemraj changed discussion status to closed Sep 30, 2022
