Summarization with paraphrasing or word limit doesn't work

by keithhon - opened Feb 15, 2023

edited Feb 16, 2023

Text:
The U.K. inflation rate fell for the third month in a row in January to hit 10.1%, below economists’ expectations, but high food and energy prices continued to put the pressure on British households.

Economists polled by Reuters had forecast inflation would drop to 10.3% after the rate fell to 10.5% for December. Inflation has fallen consistently since hitting a 41-year-high of 11.1% in October.

Core CPI, which doesn’t include food, energy, alcohol or tobacco, was 5.3% compared to 5.8% in December, according to the ONS.

I tried the following prompts with the text above:

Summarize this text with paraphrasing:
Summarize this text with 20 words

Both give me the first sentence back unchanged:
The U.K. inflation rate fell for the third month in a row in January to hit 10.1%, below economists’ expectations, but high food and energy prices continued to put the pressure on British households.

@philschmid
Do you know why?

philschmid

Owner Feb 15, 2023

What generation arguments did you use?

keithhon

Feb 16, 2023

text = """
The U.K. inflation rate fell for the third month in a row in January to hit 10.1%, below economists’ expectations, but high food and energy prices continued to put the pressure on British households.

Economists polled by Reuters had forecast inflation would drop to 10.3% after the rate fell to 10.5% for December. Inflation has fallen consistently since hitting a 41-year-high of 11.1% in October.

Core CPI, which doesn’t include food, energy, alcohol or tobacco, was 5.3% compared to 5.8% in December, according to the ONS.
"""

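# `tokenizer` and `model` are assumed to be loaded already; the thread does not say which checkpoint.
# Prepend the instruction to the text and tokenize it into PyTorch tensors.
batch = tokenizer("Summarize this text with paraphrasing: " + text, return_tensors='pt')

# Only max_new_tokens is set; every other generation argument keeps its default.
output_tokens = model.generate(**batch, max_new_tokens=250)

# Decode the first (and only) returned sequence, dropping special tokens.
print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))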
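The call above passes only `max_new_tokens`, so everything else falls back to the model's default generation settings (typically greedy decoding). As a rough sketch of what could be tried instead, the helper below collects a few standard `generate` arguments. The helper names `build_generation_kwargs` and `summarize` are made up for this example, the values are illustrative rather than tuned, and nothing here is guaranteed to stop the model from copying the first sentence.

```python
def build_generation_kwargs(max_new_tokens=60, sample=True):
    """Return keyword arguments for `model.generate` (illustrative values only)."""
    kwargs = {
        "max_new_tokens": max_new_tokens,
        # Forbid any 3-gram from appearing twice in the generated output.
        "no_repeat_ngram_size": 3,
    }
    if sample:
        # Nucleus sampling: draw from the smallest token set covering 90% probability.
        kwargs.update({"do_sample": True, "top_p": 0.9, "temperature": 0.7})
    else:
        # Deterministic alternative: beam search with 4 beams.
        kwargs.update({"do_sample": False, "num_beams": 4})
    return kwargs


def summarize(model, tokenizer, prompt, **generation_kwargs):
    """Tokenize `prompt`, generate with the given arguments, and decode the result."""
    batch = tokenizer(prompt, return_tensors="pt")
    output_tokens = model.generate(**batch, **generation_kwargs)
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)
```

Usage would look like `summarize(model, tokenizer, "Summarize this text with paraphrasing: " + text, **build_generation_kwargs())`, with `model` and `tokenizer` loaded as in the snippet above.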

philschmid

Owner Feb 16, 2023

Got the same results as you. It feels like the model might not know what summarizing in 20 words means, and you would need to fine-tune it for that.
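To make the two symptoms measurable while experimenting with prompts, here is a minimal sketch of a checker. It only assumes that "words" means whitespace-separated tokens and that "not paraphrased" means the output appears verbatim in the source; the function names are invented for this example.

```python
import re


def word_count(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def normalize(text):
    """Lowercase and collapse runs of whitespace so newlines do not matter."""
    return re.sub(r"\s+", " ", text.strip().lower())


def is_verbatim_copy(summary, source):
    """True if the normalized summary occurs as a substring of the normalized source."""
    return normalize(summary) in normalize(source)


def check_summary(summary, source, max_words=20):
    """Report the word count, whether it fits the limit, and whether it was copied."""
    words = word_count(summary)
    return {
        "words": words,
        "within_limit": words <= max_words,
        "copied": is_verbatim_copy(summary, source),
    }
```

Calling `check_summary(output, text)` on the result reported in this thread would flag it on both counts: the returned sentence is copied from the source, and by whitespace count it has 34 words, well over the requested 20.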

keithhon

Feb 16, 2023

I see. How about paraphrasing?

keithhon

Feb 16, 2023

> Got the same results as you. It feels like the model might not know what summarizing in 20 words means, and you would need to fine-tune it for that.

I thought it was related to the format of the input, but it behaves the same even after I remove all the newlines.
