512 counts as long-range?

by JaheimLee - opened Oct 12, 2023

Discussion

JaheimLee

Oct 12, 2023

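Since the paper explicitly mentions long-range capability, why is the model architecture still BERT? Even if you didn't want to use a relative position encoding model such as DeBERTa or RoFormer, you could at least have extended the position range the way ERNIE does. Can 512 really be called long text these days?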

Shitao

Beijing Academy of Artificial Intelligence org Oct 12, 2023

edited Oct 12, 2023

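Hello. Here, "long-range" means that retrieval lets the large language model access a longer context; it does not mean that the embedding model itself can encode longer text. llm-embedder is an embedding model trained on multiple tasks with the LLM LLaMA as the scorer, so it is adapted to the retrieval needs of LLMs, but it does not extend the embedding model's own context length.
We are currently training an embedding model for long text, and will later release a BGE model capable of encoding long text.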
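
The distinction above can be illustrated with a minimal sketch of retrieval over a long text. This is not the llm-embedder implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `split_into_chunks`, `retrieve`, and the `max_words` limit are names and values assumed only for illustration. The point is that each chunk stays within a short encoder limit, while the LLM can still draw on any part of the long text through the retrieved chunks.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_chunks(text, max_words=50):
    """Split a long text into pieces that each fit a short encoder limit.

    max_words is an assumed word budget standing in for a token limit.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:top_k]
```

In a real setup, the retrieved chunks would be placed in the LLM prompt, so the LLM sees the relevant parts of a long text even though the encoder never embeds more than one short chunk at a time.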
