Generate shorter sentences
I need to generate shorter sentences.
I'm using mpt-instruct for creating titles.
I need to restrict the length of the sentences.
What options are available?
- I tried length_penalty = -0.9, but it didn't work. Is there a suggested value for generating shorter sentences?
- Should I control the length using max_tokens or max_length?
The shorter sentences will be used as titles/headings.
max_new_tokens and max_length should not be used to solve this problem, because they will simply cut the sentence off partway through.
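To see why a hard token limit is the wrong tool, here is a toy illustration. It is only a sketch: whitespace-separated words stand in for model tokens, and truncate_tokens is a hypothetical helper, not part of any library.

```python
def truncate_tokens(text, max_tokens):
    """Keep only the first max_tokens 'tokens' of text.

    Whitespace-separated words stand in for real model tokens here,
    which is a simplification made for illustration only.
    """
    tokens = text.split()
    return " ".join(tokens[:max_tokens])


title = "Five Italian legends with multiple hat-tricks for the national team"
# The cap does not make the model write a shorter title; it just stops early.
print(truncate_tokens(title, 4))  # Five Italian legends with
```

The result is not a shorter title, just an unfinished one, which is the behaviour described above.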
To generate shorter sequences you have a few options. It takes some trial and error, but it will work:
Options:
Few shots: Since this is the instruct model it will understand what you want to achieve by giving 3-6 examples in the prompt of "what short title" means. I assume you want to give the title for the given paragraph, so give the paragraph followed by the title. Repeat this for the other 3-4 examples and the model has context on what it needs to do.
length_penalty: Instead of a negative value such as -0.9, as you tried, provide a positive one, greater than 1. It is trial and error, mate! (One caveat: in Hugging Face transformers this parameter only takes effect with beam-based generation, i.e. num_beams > 1, and the documentation describes values below 0 as favoring shorter sequences, so it is worth testing both signs.)
penalty_alpha: Be careful with this one; it is advisable to keep it in the 0 to 1 range. It will also shorten the sequence, but in my experiments, using this parameter with a long prompt causes OOM errors. So, when you have a long prompt, don't use it.
Fine-tune: Gather paired examples of paragraphs and titles, fine-tune the model on them, and this will work.
Prompt engineer: Try instructing the model to generate the short sentence within 10 words.
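The few-shot and prompt-engineering options can be combined in one prompt. Below is a minimal sketch, assuming plain string formatting is enough; build_few_shot_prompt, the "Paragraph:"/"Title:" layout, and the example pairs are all made up for illustration and are not a format that mpt-instruct requires.

```python
def build_few_shot_prompt(examples, paragraph, max_words=5):
    """Assemble a few-shot prompt from (paragraph, title) pairs.

    The layout is an assumption for illustration; adjust it to whatever
    template works best with your model.
    """
    parts = []
    for example_paragraph, example_title in examples:
        parts.append(f"Paragraph:\n{example_paragraph}\nTitle: {example_title}\n")
    # The final block leaves the title empty so the model completes it.
    parts.append(
        f"Paragraph:\n{paragraph}\n"
        f"Generate a single short title for the paragraph above within {max_words} words.\n"
        "Title:"
    )
    return "\n".join(parts)


# Hypothetical example pairs; replace them with 3-6 pairs from your own data.
examples = [
    ("The city council approved a new cycling lane network on Monday.", "Council Approves Cycling Lanes"),
    ("Researchers measured record ocean temperatures during the summer.", "Record Ocean Temperatures Measured"),
    ("The library extended its opening hours for the exam season.", "Library Extends Opening Hours"),
]

prompt = build_few_shot_prompt(examples, "Your paragraph goes here.")
print(prompt)
```

Because the prompt ends with an empty "Title:", the model is nudged to continue with one title in the same style as the examples.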
Let me know if any of these help.
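For the length_penalty option, here is a hedged sketch of the generation settings. The Hugging Face documentation describes length_penalty as applying only to beam-based generation, with values below 0.0 favoring shorter sequences and values above 0.0 favoring longer ones, so treat the numbers below as starting points for the trial and error, not as recommended values. short_title_generation_kwargs is a hypothetical helper, and I have not verified how well beam search behaves with any particular MPT checkpoint.

```python
def short_title_generation_kwargs(max_new_tokens=20, num_beams=4, length_penalty=-1.0):
    """Return keyword arguments for model.generate() aimed at short outputs.

    All default values are assumptions to be tuned by trial and error.
    """
    return {
        # Safety cap only; on its own it truncates instead of shortening.
        "max_new_tokens": max_new_tokens,
        # length_penalty is only used by beam-based generation.
        "num_beams": num_beams,
        "length_penalty": length_penalty,
        # Stop beam search once enough finished candidates exist.
        "early_stopping": True,
    }


kwargs = short_title_generation_kwargs()
print(kwargs)

# Usage with an already loaded model and tokenizer (hypothetical variable names):
# inputs = tokenizer(prompt, return_tensors="pt")
# output_ids = model.generate(**inputs, **kwargs)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Keeping max_new_tokens in the settings is still useful as a safety cap, as long as it is generous enough that a normal title never reaches it.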
Have a look: without using any of the other options I described above, I just gave it a prompt that works for title generation:
Prompt:
The following is the paragraph on which you need to generate a short title:
The first Italy international football player to score a hat-trick was Pietro Lana, who scored three times in a 6–2 victory against France on 15 May 1910. The highest individual score in a single match is four goals, which has been achieved by six players: Carlo Biagi, Francesco Pernigo, Omar Sívori, Alberto Orlando, Gigi Riva, and Roberto Bettega. Five players have scored a hat-trick more than once: Giuseppe Meazza, Angelo Schiavio, Silvio Piola, Riva and Paolo Rossi (pictured). The highest number of Italy hat-tricks in a single match is three, which occurred during the third-place match at the 1928 Summer Olympics, an 11–3 victory over Egypt in which Angelo Schiavio, Elvio Banchero and Mario Magnozzi each scored three goals. In the 1982 FIFA World Cup second group-stage match, Italy won 3–2 against Brazil; Rossi scored a hat-trick in the match that allowed Italy to progress to the semi-finals.
Generate a single and short title for the paragraph above within 5 words.
Title:
Generation:
Here are some possible titles generated from this text:
- “Italy’s greatest footballers”
- “Five Italian legends with multiple ‘hat tricks’”
- “Three Italians scored hat trick thrice - here they are!"
See! I could even use something else that would generate a single title instead of multiple. You just need to understand how the model works. Then you are good to go.
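If changing the prompt is not enough, another way to end up with a single title is to post-process the output. This is only a sketch under assumptions: first_title is a hypothetical helper, and it assumes the generation looks like the one above (an intro line ending in a colon, then one candidate per line).

```python
def first_title(generation):
    """Return the first candidate title found in a generation, or None."""
    for line in generation.splitlines():
        candidate = line.strip()
        # Skip blank lines and intro lines such as "Here are some possible titles...:"
        if not candidate or candidate.endswith(":"):
            continue
        # Drop list markers, then surrounding straight or curly quotes.
        candidate = candidate.lstrip("-*• ").strip("\"'“”‘’ ")
        if candidate:
            return candidate
    return None


generation = (
    "Here are some possible titles generated from this text:\n"
    "- “Italy’s greatest footballers”\n"
    "- “Five Italian legends with multiple ‘hat tricks’”\n"
)
print(first_title(generation))  # Italy’s greatest footballers
```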
thanks @AayushShah
Thanks for the great explanation @AayushShah !