Very sensitive to sentence order

#15
by Tonylin52 - opened

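Hello,

While experimenting with the bge-reranker-large model, I found that it is very sensitive to sentence order. For example:
(1) 今天天气怎么样?今天天气晴。我们都是写代码的 ("What's the weather like today? It's sunny today. We all write code.")
(2) 今天天气怎么样?我们都是写代码的。今天天气晴。 ("What's the weather like today? We all write code. It's sunny today.")

The scores for these two differ greatly.

In real-world use of a reranker, the position of the key answer within a chunk is essentially random. If the model is this sensitive to sentence order, doesn't that mean the reranker is effectively useless?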
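
One way to put a number on this concern is to score every ordering of the passage's sentences against the same query and look at the gap between the best and worst score. The following is only a minimal sketch: `order_spread` and `dummy_score` are hypothetical names introduced here for illustration, and `dummy_score` is a toy, order-invariant stand-in, not the reranker. In practice `score_fn` would wrap whatever reranker call you use.

```python
from itertools import permutations
from typing import Callable, List, Tuple


def order_spread(
    score_fn: Callable[[str, str], float],
    query: str,
    sentences: List[str],
) -> Tuple[float, List[Tuple[str, float]]]:
    """Score every sentence ordering; return (max - min, per-ordering scores)."""
    results = []
    for perm in permutations(sentences):
        passage = "".join(perm)  # sentences already carry their punctuation
        results.append((passage, score_fn(query, passage)))
    scores = [s for _, s in results]
    return max(scores) - min(scores), results


def dummy_score(query: str, passage: str) -> float:
    # Hypothetical stand-in scorer: counts characters shared by query and passage.
    # It ignores order by construction, so its spread is always zero.
    return float(len(set(query) & set(passage)))


spread, results = order_spread(
    dummy_score,
    "今天天气怎么样?",
    ["今天天气晴。", "我们都是写代码的。"],
)
print(spread)
for passage, score in results:
    print(passage, score)
```

A spread that is large relative to the gap between relevant and irrelevant passages would support the concern; a small one would not. Note that the number of orderings grows factorially, so this is only practical for a handful of sentences.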

Beijing Academy of Artificial Intelligence org
edited May 11

Hi @Tonylin52, we did not find the scores of the two examples you provided to be very different. On the contrary, they are very close:

from FlagEmbedding import FlagReranker
reranker = FlagReranker('BAAI/bge-reranker-large', use_fp16=True) # Setting use_fp16 to True speeds up computation with a slight performance degradation

score = reranker.compute_score([['今天天气怎么样?', '今天天气晴。我们都是写代码的'], ['今天天气怎么样?', '我们都是写代码的。今天天气晴']])
print(score)

Here is the output: [1.6416015625, 1.5048828125]
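
The two values above are raw, unbounded relevance scores, so the size of the gap can be hard to judge by eye. As a rough illustration, assuming a sigmoid is an acceptable way to map such scores into the 0-1 range, the reported numbers can be compared like this (the `sigmoid` helper is defined here for the sketch and is not part of the snippet above):

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


# Raw scores copied from the output reported above.
raw_scores = [1.6416015625, 1.5048828125]

normalized = [sigmoid(s) for s in raw_scores]
gap = abs(normalized[0] - normalized[1])

print(normalized)
print(gap)
```

On this scale both passages land well above 0.5 and differ by only a few hundredths, which is consistent with the observation that the two orderings are scored similarly.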
