Text summarization with Bloom

#122
by mishavee - opened

Does BLOOM summarize text? If not, can I fine-tune it to summarize, and what are the prospects for a good result? Will it be as good as GPT-3? Is abstractive summarization possible? Is BLOOM as good as GPT-3 in general?

BigScience Workshop org

Does BLOOM summarize text?

It was not pretrained for summarization, but it is capable of text generation. One setup people have had a bit of success with is running it in a one-shot setting.

If not, can I fine-tune it to summarize, and what are the prospects for a good result?

There's a good chance it will be quite good at it.

Will it be as good as GPT-3?

Probably not. BLOOM is multilingual and thus sacrifices some performance in English to enable performance in other languages.

Is abstractive summarization possible?

Yes, see first answer.

Is BLOOM as good as GPT-3 in general?

It depends on the metric you're trying to optimize.

ty

As far as one-shot summarization goes, do I give it one example, then add the text I need summarized, and look at the output?

How many examples do I need to fine-tune it for summarization? How many epochs?

Can I train BLOOM for gap sentences?

What level of hardware do I need to fine-tune BLOOM reasonably? Where is best to go: AWS, Lambda?

BigScience Workshop org

As far as one-shot summarization goes, do I give it one example, then add the text I need summarized, and look at the output?

One-shot prompting is going to be something like:

Article: {ARTICLE_1}
Summary: {SUMMARY_1}

Article: {ARTICLE_2}
Summary:
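
For reference, a minimal sketch of that one-shot setup with the transformers library; the checkpoint name, placeholder texts, and generation settings here are my own illustrative assumptions, not a recommendation:

# One-shot summarization sketch; "bigscience/bloom-560m" is an illustrative
# small checkpoint, swap in whichever BLOOM size you actually run.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

article_1 = "The cat sat on the mat all day and refused to move."
summary_1 = "A cat lazed on a mat."
article_2 = "Heavy rain flooded the town square, closing shops for two days."

one_shot_prompt = (
    f"Article: {article_1}\nSummary: {summary_1}\n\n"
    f"Article: {article_2}\nSummary:"
)

inputs = tokenizer(one_shot_prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=30)
# Drop the prompt so only the generated summary is printed.
print(tokenizer.decode(output[0], skip_special_tokens=True)[len(one_shot_prompt):])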

How many examples do I need to fine-tune it for summarization? How many epochs?

It depends on the size of your fine-tuning dataset. We haven't played much with fine-tuning yet. Feel free to test it out yourself and share your results here.

Can I train BLOOM for gap sentences?

Not sure what you mean here. If you mean filling a gap in the middle of a sentence, then I'd guess yes; you just need to format your problem as a language modeling one.
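
If that's what you mean, here is a hedged sketch of casting Pegasus-style gap-sentence prediction as causal language modeling; the "<gap>" placeholder and the overall format are my own assumptions, not anything BLOOM was trained with:

# Turn a document into a gap-sentence example for causal LM fine-tuning.
# "<gap>" is an arbitrary marker chosen here, not a BLOOM special token.
sentences = [
    "Neural networks are circuits of neurons.",
    "Weights model the connection strengths.",
    "An activation function bounds the output.",
]
gap_index = 1  # sentence to hide; Pegasus selects "important" sentences instead

masked_doc = " ".join(
    "<gap>" if i == gap_index else s for i, s in enumerate(sentences)
)
# The model is then trained to continue the prompt with the hidden sentence.
training_text = f"{masked_doc}\nMissing sentence: {sentences[gap_index]}"
print(training_text)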

What level of hardware do I need to fine-tune BLOOM reasonably? Where is best to go: AWS, Lambda?

Hmm, you need something like 1-2 TB of GPU memory, depending on whether you fine-tune all the layers or only a subset of them.
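
As a rough back-of-the-envelope check on that figure (these are standard estimates for full Adam fine-tuning in mixed precision, not measurements):

# Rough memory estimate for full fine-tuning of BLOOM-176B with Adam,
# ignoring activations and any sharding or offloading.
params = 176e9
weights = params * 2               # bf16 weights, 2 bytes each
grads = params * 2                 # bf16 gradients
optimizer = params * (4 + 4 + 4)   # fp32 master weights + two Adam moments
total_tb = (weights + grads + optimizer) / 1e12
print(f"~{total_tb:.1f} TB")       # ~2.8 TB; tuning only a subset of layers
                                   # is what brings this down toward 1 TB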

ty

How long does it take, approximately, to do one epoch for all layers with 2 TB of GPU memory?

Gap sentences are when they remove a sentence from between other sentences and the model has to guess all the words of the missing sentence. Google's Pegasus was trained like this. Can you train BLOOM like this?

If I were to train BLOOM for summarization, would I feed many of these to it?
Article: {ARTICLE_1}
Summary: {SUMMARY_1}

How long would it take for a dataset of 311k examples, like CNN/Daily Mail, for one epoch with 2 TB of GPU memory? And how many epochs do you recommend for a good result?

BigScience Workshop org

You'll probably get much better summarization with BLOOMZ by prompting it, e.g. with:

A neural network is a network or circuit of biological neurons, or, in a modern sense, an artificial neural network, composed of artificial neurons or nodes.[1] Thus, a neural network is either a biological neural network, made up of biological neurons, or an artificial neural network, used for solving artificial intelligence (AI) problems. The connections of the biological neuron are modeled in artificial neural networks as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed. This activity is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be −1 and 1. Summarize the previous text in one sentence:
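
For reference, one hedged way to send such a prompt to the hosted Inference API; the endpoint pattern is the standard one for hosted models, and the token is a placeholder you'd replace with your own:

# Query bigscience/bloomz through the Hugging Face Inference API.
import requests

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloomz"
headers = {"Authorization": "Bearer hf_xxx"}  # placeholder token

prompt = (
    "A neural network is a network or circuit of biological neurons ... "  # abridged
    "Summarize the previous text in one sentence:"
)
response = requests.post(
    API_URL,
    headers=headers,
    json={"inputs": prompt, "parameters": {"max_new_tokens": 60}},
)
print(response.json()[0]["generated_text"])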

Thanks, is it possible to do this in a foreign language as well?

BigScience Workshop org

Yes, of course! You can try out the inference widget for BLOOMZ, and it should understand all BLOOM languages and many more.

I have tried it in English with many different paragraphs. Sometimes it does a good job, but not always. Is there an ideal length of text to summarize?

When summarizing other languages, do I still write "Summarize the previous text in one sentence" in English, or in the language of the text?

BigScience Workshop org

I have tried it in English with many different paragraphs. Sometimes it does a good job, but not always. Is there an ideal length of text to summarize?

Any text length less than 2048 tokens should be fine. Note that there is a limit on the tokens you can generate in the widget, so you may have to press "Compute" multiple times.
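
A quick way to check you're under that 2048-token budget before prompting (a minimal sketch, assuming the bigscience/bloomz tokenizer):

# Count tokens to stay under BLOOM's 2048-token context window.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz")
n_tokens = len(tokenizer("Your article text here...")["input_ids"])
print(n_tokens, "tokens; keep prompt plus generated tokens under 2048")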

When summarizing other languages, do I still write "Summarize the previous text in one sentence" in English, or in the language of the text?

It will probably work better when you ask in English, but it should also work in other languages. Specifying the summarization language should also work, e.g. "Summarize the previous text in French."

This is the code I used, and for long texts the summary is completely useless.

ChatGPT gave me VERY good answers.

# Assumed setup (not in the original post): a BLOOMZ checkpoint on GPU 0.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1", torch_dtype=torch.bfloat16).to(0)
prompt = "..."  # the long text to summarize

summary_instruction = "\n\nSummarize the previous text in three sentences:\n\n"
total_prompt = prompt + summary_instruction

# Tokenize total_prompt itself so the slice below lines up with the output
# (the original added an extra "\n\n", shifting the slice by two characters).
input_ids = tokenizer(total_prompt, return_tensors="pt").to(0)
# Without do_sample=True this is greedy; temperature/top_k have no effect.
sample = model.generate(**input_ids, max_length=5000, top_k=1, temperature=0.9, repetition_penalty=2.0)

# truncate_before_pattern is a CodeGen-tokenizer-only kwarg; BLOOM's ignores it.
result_string = tokenizer.decode(sample[0], skip_special_tokens=True)
print(result_string[len(total_prompt):])

Are you saying BLOOM doesn't summarize well?

I'm not getting good results with this setup, no. Neither with BLOOM nor with BLOOMZ.

However, I'd appreciate it if anyone could tell me what I can improve.

They would have to train BLOOM for text summarization; otherwise it won't work. I would like them to do it.
