How to determine whether the final model output is good or bad?

by Jeremy1110 - opened Feb 3

Feb 3

Hello, author. I would like to ask about the final output logits of your model. What range of values should I expect for it to be considered reasonable?

I tested it with a random tgt and also with a translation model's output. The first case produced a value of 0.64, while the second case resulted in 0.7.

Additionally, I am using a translation model, but it sometimes generates hallucinated output. Would it be possible to filter such outputs using Quality Estimation?

ymoslem

Owner 23 days ago

edited 23 days ago

Hello! This model is for sentence-level QE. Hence, it is better to use COMET.
As for what counts as a good score, there is no fixed threshold. Usually, we compare the scores of two systems on the same source sentences, e.g. a baseline vs. a fine-tuned model.
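To make the comparison idea concrete, here is a minimal sketch of how two systems could be compared on sentence-level QE scores. It assumes you have already scored both systems on the same source sentences. The function name `compare_systems` and the score lists are hypothetical and used only for illustration, and the paired bootstrap is one common way to check whether a difference in means is stable, not something prescribed by this model.

```python
import random
from statistics import mean


def compare_systems(scores_a, scores_b, n_resamples=1000, seed=0):
    """Paired bootstrap over sentence-level QE scores.

    Returns the mean score of each system and the fraction of
    resamples in which system B has the higher mean.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("Both systems must be scored on the same source sentences")
    rng = random.Random(seed)
    n = len(scores_a)
    wins_b = 0
    for _ in range(n_resamples):
        # Resample sentence indices with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        if mean(scores_b[i] for i in idx) > mean(scores_a[i] for i in idx):
            wins_b += 1
    return {
        "mean_a": mean(scores_a),
        "mean_b": mean(scores_b),
        "b_win_rate": wins_b / n_resamples,
    }


# Hypothetical QE scores for the same six source sentences.
baseline = [0.61, 0.58, 0.70, 0.64, 0.55, 0.67]
finetuned = [0.66, 0.63, 0.72, 0.69, 0.60, 0.71]

result = compare_systems(baseline, finetuned)
print(result)
```

A win rate close to 1.0 (or close to 0.0) suggests the difference between the two systems is consistent across sentences, while a value near 0.5 suggests the gap in mean scores may be noise.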

ymoslem changed discussion status to closed 23 days ago
