scores for model.generate()
Hi,
I intend to fine-tune the T5 model with a custom loss function (REINFORCE), since I do not have access to the immediate decoder outputs. Instead, I'd be using the decoder output for another downstream task, which makes the pipeline non-differentiable. As such, I'm trying to find out how efficient it is to train with REINFORCE using the probability of the generated output: P(decoder_output). Is there a way to obtain the output generation logits for greedy decoding, something like:
model.generate(input_ids).logits
I tried setting output_scores=True, but that didn't work. Alternatively, I could generate the output and pass it alongside the input to get the logits:
output_ids = model.generate(input_ids)
model(input_ids=input_ids, decoder_input_ids=output_ids).logits
However, I think this might give incorrect probabilities because of teacher forcing. Is that correct?
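To make the teacher-forcing concern concrete, here is a minimal sketch with a toy stand-in for the decoder, not T5 itself. The names next_token_probs, greedy_decode and teacher_forced_logps are hypothetical and not part of transformers. It assumes the decoder is purely autoregressive and deterministic (e.g. dropout disabled). Under that assumption, re-scoring the greedy output with teacher forcing conditions every step on the same prefix that generation used, so the per-token log-probabilities should match. With a real model, small numerical differences may still appear.

```python
import math

VOCAB_SIZE = 5
START, EOS = 0, 1  # hypothetical special token ids for the toy model


def next_token_probs(prefix):
    """Toy stand-in for a decoder: next-token distribution given a prefix."""
    seed = sum((i + 1) * tok for i, tok in enumerate(prefix))
    logits = [math.sin(seed + 1.7 * v) for v in range(VOCAB_SIZE)]
    z = sum(math.exp(l) for l in logits)
    return [math.exp(l) / z for l in logits]


def greedy_decode(max_new_tokens=6):
    """Step-by-step greedy decoding, recording the log-prob of each chosen token."""
    seq, logps = [START], []
    for _ in range(max_new_tokens):
        probs = next_token_probs(seq)
        tok = max(range(VOCAB_SIZE), key=probs.__getitem__)
        logps.append(math.log(probs[tok]))
        seq.append(tok)
        if tok == EOS:
            break
    return seq, logps


def teacher_forced_logps(seq):
    """Score a finished sequence: step t is conditioned on the given prefix seq[:t]."""
    return [math.log(next_token_probs(seq[:t])[seq[t]]) for t in range(1, len(seq))]


sequence, step_logps = greedy_decode()
forced_logps = teacher_forced_logps(sequence)
sequence_logp = sum(forced_logps)  # log P(decoder_output), the quantity REINFORCE would weight
```

In this toy setting the two lists are identical, because both paths evaluate the same function on the same prefixes. Whether the same holds for a particular T5 setup would depend on things like dropout and any logits processing applied during generation.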
Thanks!