New Summarization Model?

#1
by pszemraj - opened

Hey! I've played with your space before and see that you are working on it - great work, btw.

I have been working on a long-text summarization model that generalizes decently well; I was wondering if you have given it a shot for your use case, or if you have any thoughts on how it compares to the standard models?

Hi, thanks for the suggestion! Happy to test it and incorporate those models as options, very kind of you. Will revert with feedback, if any. Yeah, will probably trial the smaller version so it doesn't take too long to load.

Awesome! Feel free to ping me with questions.

I tried the led-base model on a text that was 5,000 words long. I first chunked the text into batches of 1,000 words, then summarized the list of batches. It took quite a while on CPU; any idea how I can speed it up by varying some of the args? Thanks!

Hey! I will answer you on both threads - sorry for the delay. In general, a few things can speed up the runtime or make it more consistent:

  • try chunking your text into X tokens as opposed to words. Sometimes numbers and other digits can screw up the counts, and the reality is that the tokens are what matters (i.e., if you have a lot of words that map to 4+ tokens or so, that batch might take forever). Example code here; see also the sketch after this list.
  • decrease num_beams to 1 for greedy search decoding
  • then, you can remove the penalties: set length_penalty=1 and/or repetition_penalty=1. While you can get rid of those, some form of repetition prevention is likely still needed, so I would keep no_repeat_ngram_size=3 or similar.
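
For reference, here's a rough sketch combining those tweaks (token-based chunking plus greedy decoding). The checkpoint name, chunk size, and `max_new_tokens` cap are placeholder assumptions, not the exact settings from the space:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Checkpoint is an example; swap in whichever summarizer you are testing.
model_name = "pszemraj/led-base-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def chunk_by_tokens(text, max_tokens=1024):
    """Split text into pieces of at most max_tokens tokens (not words)."""
    ids = tokenizer(text, truncation=False)["input_ids"]
    return [
        tokenizer.decode(ids[i : i + max_tokens], skip_special_tokens=True)
        for i in range(0, len(ids), max_tokens)
    ]

def summarize(chunk):
    inputs = tokenizer(chunk, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,      # cap on summary length (example value)
        num_beams=1,             # greedy decoding: fastest option
        no_repeat_ngram_size=3,  # keep some repetition prevention
        repetition_penalty=1.0,  # neutral, i.e. effectively disabled
        length_penalty=1.0,      # neutral (only matters with beam search anyway)
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

long_text = "..."  # your 5,000-word document goes here
summary = "\n".join(summarize(c) for c in chunk_by_tokens(long_text))
```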

Try those, but I think the long-token models are just compute-intensive. I think Spaces used to have more resources (just a feeling I get from compute times now), but if it can't run on CPU on Spaces, it's probably not viable without a GPU. You could also try the longt5-base model on my profile and see if that is more efficient.
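
If the longt5 route looks promising, the same generation settings carry over. A quick sketch with the pipeline API; the exact checkpoint name is my assumption, so check the profile for the right one:

```python
from transformers import pipeline

# Checkpoint name is a guess at the long-t5 model meant here; verify on the profile.
summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-16384-book-summary",
)

chunk_text = "..."  # one token-sized chunk of your document, as above
result = summarizer(
    chunk_text,
    max_new_tokens=256,
    num_beams=1,              # greedy decoding again
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```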

Feel free to "close" this item as needed, btw :)
