pszemraj/flan-t5-large-grammar-synthesis · determining and formatting according to specific tone during generation of text.

Jan 27

How can we tune the model to get sentences in a specific tone?

Owner Feb 21

Hi! Sorry for the late response here. It depends on what you mean: do you want the model to always output in a specific tone, or do you want to be able to specify the tone (or some other attribute) at inference time?

1. If you want it to always output in tone X, you simply need a dataset with one column containing the grammatically incorrect text and another containing the corrected text, always written in the tone you want.
2. If you want to control the output tone at inference, you need to use 'prefix tuning' in the sense of the original T5's task prefixes (not the parameter-efficient fine-tuning method of the same name): add 'tone tokens' as a prefix to the text in the 'input' column, and make the 'output' column the corrected version of the text in that tone.

Example for 2:

You can make the 'delimiter' token(s) whatever you want, but some choices may be more effective than others. T5's vocabulary already includes a set of special tokens that go unused during fine-tuning, so let's use one of those: <extra_id_0> (there are 100 of them, <extra_id_0> through <extra_id_99>; just be consistent).

input text:

formal<extra_id_0>i luvv meme w/ all heart & soul

output text:

I harbor an intense affection for memes, extending deeply into the recesses of my heart and soul.
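
A minimal sketch of how rows like this could be assembled, assuming plain Python dicts with 'input' and 'output' keys. The helper name `make_row` and the constant `DELIMITER` are hypothetical, made up for illustration; only the `<extra_id_0>` token and the example text come from the post above.

```python
# Assumption: <extra_id_0> is the delimiter, as in the example above.
# Any token works as long as it is used consistently.
DELIMITER = "<extra_id_0>"


def make_row(tone: str, source_text: str, target_text: str) -> dict:
    """Build one training row: tone + delimiter + raw text -> corrected text."""
    return {
        "input": f"{tone}{DELIMITER}{source_text}",
        "output": target_text,
    }


row = make_row(
    "formal",
    "i luvv meme w/ all heart & soul",
    "I harbor an intense affection for memes, extending deeply "
    "into the recesses of my heart and soul.",
)
print(row["input"])  # formal<extra_id_0>i luvv meme w/ all heart & soul
```

The same prefix format would then be applied to the text you pass to the model at inference, so the input looks like what it saw during training.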

Then you just need a lot of data like that, ideally balanced across the different tones you want to use. Also, a tip: one of the reasons this model is good/robust is the inclusion of 'negatives' in the training data. These are rows where the input text is already what you want the output to be, so the model learns "if it ain't broke, don't fix it".
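
As a rough sketch of both points (negatives and balance), assuming the same dict-based rows and `<extra_id_0>` delimiter as in the example: the helpers `add_negatives` and `tone_counts` are hypothetical names, and the sample sentences are made up for illustration.

```python
from collections import Counter

DELIMITER = "<extra_id_0>"  # assumption: same delimiter as in the example above


def add_negatives(rows, clean_texts_by_tone):
    """Append 'negative' rows: text that is already correct maps to itself."""
    out = list(rows)
    for tone, texts in clean_texts_by_tone.items():
        for text in texts:
            out.append({"input": f"{tone}{DELIMITER}{text}", "output": text})
    return out


def tone_counts(rows):
    """Count rows per tone prefix, to check how balanced the data is."""
    return Counter(r["input"].split(DELIMITER, 1)[0] for r in rows)


# Illustrative data only.
rows = [
    {
        "input": f"formal{DELIMITER}i luvv meme w/ all heart & soul",
        "output": "I harbor an intense affection for memes.",
    }
]
clean = {"formal": ["I greatly enjoy memes."], "casual": ["I love memes."]}

rows = add_negatives(rows, clean)
counts = tone_counts(rows)
print(counts)
```

If the counts are heavily skewed toward one tone, that is a hint to collect more rows for the others before training.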

Hope that helps!

pszemraj changed discussion status to closed Nov 5