About multi-stage training

#21
by xxxcliu

Reading the paper, I'm a bit confused: is the pipeline first RetroMAE pre-training, then dense-retrieval training on unsupervised data, and then unified training of the three retrieval modes with self-distillation?

Beijing Academy of Artificial Intelligence org

Yes: RetroMAE -> dense retrieval -> unified fine-tuning.
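
For readers who want a concrete picture of the last stage, below is a minimal sketch of the self-distillation idea: the three retrieval modes' scores are integrated into a teacher signal, and each individual mode is trained to match it. The names `s_dense`, `s_sparse`, `s_colbert` and the exact loss weighting are illustrative assumptions, not the authors' actual code.

```python
import torch
import torch.nn.functional as F

def unified_self_distill_loss(s_dense, s_sparse, s_colbert, labels):
    """Sketch of unified fine-tuning with self-distillation.

    Each s_* is a [batch, num_candidates] tensor of similarity logits
    from one retrieval mode (dense, sparse, multi-vector); `labels` holds
    the index of the positive candidate for each query.
    """
    # Teacher: integrate the three modes' scores (simple average here;
    # the paper's exact combination may differ).
    s_teacher = (s_dense + s_sparse + s_colbert) / 3
    teacher_probs = F.softmax(s_teacher.detach(), dim=-1)

    loss = 0.0
    for s in (s_dense, s_sparse, s_colbert):
        # Each mode learns from the integrated teacher distribution...
        loss = loss + F.kl_div(
            F.log_softmax(s, dim=-1), teacher_probs, reduction="batchmean"
        )
        # ...and still gets the standard contrastive supervision.
        loss = loss + F.cross_entropy(s, labels)
    return loss
```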


Thanks for your reply! Do both bge and bge-m3 adopt a bi-encoder architecture? If so, is there actually a single RoBERTa encoder, or two separate models?

Beijing Academy of Artificial Intelligence org

bge and bge-m3 are both bi-encoder models; the query and passage share the same encoder.
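
To make the shared-encoder point concrete, here is a minimal sketch using the Hugging Face transformers library: the same set of weights encodes both queries and passages, rather than two separate towers. CLS pooling plus L2 normalization is a common choice for BGE-style dense retrieval, but treat the pooling details here as an assumption rather than the model's canonical usage.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
model = AutoModel.from_pretrained("BAAI/bge-m3")  # one set of weights

def embed(texts):
    # The SAME tokenizer and model are used for queries and passages.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # CLS pooling followed by L2 normalization (assumed here).
    emb = outputs.last_hidden_state[:, 0]
    return torch.nn.functional.normalize(emb, dim=-1)

query_emb = embed(["what is a bi-encoder?"])
passage_emb = embed(["A bi-encoder encodes queries and passages independently."])
score = query_emb @ passage_emb.T  # cosine similarity via dot product
print(score)
```

Because the two inputs are encoded independently, passage embeddings can be pre-computed and indexed offline, which is the main practical advantage of the bi-encoder design over a cross-encoder.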
