Strange behaviour

#1
by andreP - opened

Hi,

thanks for this great model; I've had very good results with it for question answering.
I have just one observation:
The following pair is a 100% match, which is expected: "Wer ist antragsberechtigt?" and "Wer ist antragsberechtigt?" ("Who is eligible to apply?")
But the similarity drops below 70% after adding just one word: "Wer ist antragsberechtigt?" and "Wer ist antragsberechtigt? Test"
It drops even further with each additional appended "Test" word. Is this expected behaviour?
Is this because of the asymmetric nature of the question -> answer training?
On the other hand, other embedding models, and even cloud services like OpenAI's ada or Aleph Alpha's Luminous, don't react as harshly, even though they are also trained for Q->A.
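A toy sketch of why an appended word changes the score at all, assuming the model mean-pools token vectors into a sentence embedding (the vectors below are random stand-ins, not real model outputs):

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical "token embeddings" for the question tokens.
question_tokens = rng.normal(size=(4, 8))  # "Wer ist antragsberechtigt ?"
extra_token = rng.normal(size=(1, 8))      # an appended, unrelated word like "Test"

# Mean pooling, as used by many sentence-transformers models.
emb_original = question_tokens.mean(axis=0)
emb_extended = np.vstack([question_tokens, extra_token]).mean(axis=0)

print(cosine(emb_original, emb_original))  # identical inputs: similarity 1.0
print(cosine(emb_original, emb_extended))  # extra token shifts the mean: < 1.0
```

How far the similarity drops for a given model then depends on how its training shaped the token vectors, which is where the fine-tuning comes in.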

Thx

Hey,

as stated in the model card, this model is based on paraphrase-multilingual-mpnet-base-v2, which is trained on a wide range of embedding tasks, but mainly on 1-to-1 (same-size) pairs. That could be why the model still treats the length of the input as relevant to the similarity. I don't know whether this only affects the multilingual version or the English-only version as well.

Maybe try the base model and check whether it shows the same behaviour.

I tried the base model and it doesn't react as harshly to adding single words, but it's much worse for German question answering. This seems to come from the fine-tuning. You can't have it all :)

@andreP Yeah, if you only care about German QA performance, you could take the base model and fine-tune it on the GermanQuAD dataset. That should give you a model that is better at German QA and more robust to differences in sequence length.
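A rough sketch of what that fine-tuning could look like with sentence-transformers (the model and dataset names are assumptions from this thread; running it requires downloading both, so treat it as a starting point, not a tested recipe):

```python
# Hypothetical fine-tuning sketch: base model + GermanQuAD question/context pairs.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader
from datasets import load_dataset

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

# GermanQuAD provides question/context pairs usable as positives.
data = load_dataset("deepset/germanquad", split="train")
examples = [InputExample(texts=[row["question"], row["context"]]) for row in data]

loader = DataLoader(examples, shuffle=True, batch_size=16)
# MultipleNegativesRankingLoss treats the other contexts in a batch as
# negatives, a common choice for question -> answer retrieval training.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
model.save("mpnet-base-germanquad")
```

Whether this also reduces the length sensitivity would need to be checked empirically, e.g. with the "Test"-appending probe from the first post.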
