Further fine-tuning possible?
Hello @int-0 ,
Your work is amazing. I would like to further fine-tune the model on my own data (I created a 10k dataset with queries, positive passages, and negative passages). However, in section "4.2 Fine-tuning with Labeled Data" of your paper "Text Embeddings by Weakly-Supervised Contrastive Pre-training", I understand that you added "knowledge distillation from a cross-encoder (CE) teacher model". I can't find any mention of this teacher model, yet it seems necessary for further fine-tuning, since the fine-tuning loss is a linear interpolation of the InfoNCE loss and the KL divergence computed against the teacher model.
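If I read section 4.2 correctly, the fine-tuning objective is roughly the following (my own notation; the interpolation weight $\alpha$ is just a placeholder symbol):

$$
\mathcal{L}_{\text{ft}} = \alpha \, \mathcal{L}_{\text{InfoNCE}} + (1 - \alpha)\, D_{\mathrm{KL}}\big(p_{\text{CE teacher}} \,\|\, p_{\text{student}}\big)
$$

which is why I assumed the cross-encoder teacher is required.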
- Where can I find this cross-encoder teacher model?
- Would it be possible to skip the teacher model and just apply the InfoNCE loss directly?
- Could MultipleNegativesRanking loss be used to fine-tune multilingual-e5-large-instruct? I thought it used InfoNCE during fine-tuning.
Have a great day !
Hi, the cross-encoder teacher model is not publicly available. However, you can fine-tune very effectively without it using only the InfoNCE loss. The performance difference is minimal. InfoNCE and MultipleNegativesRanking are the same loss, just different names.
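To illustrate, here is a minimal sketch of InfoNCE-only fine-tuning with sentence-transformers' `MultipleNegativesRankingLoss` (not an official recipe; the example texts, batch size, and hyperparameters are placeholders to adapt to your 10k dataset):

```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Load the model you want to further fine-tune.
model = SentenceTransformer("intfloat/multilingual-e5-large-instruct")

# Each example is (query, positive passage, hard negative passage).
# Note: e5 models expect specific prefixes (e.g. "query: "/"passage: ",
# or an instruction prefix for the instruct variant); apply them here
# the same way you will at inference time.
train_examples = [
    InputExample(texts=[
        "how to fine-tune e5",        # query (placeholder)
        "a relevant passage ...",     # positive passage (placeholder)
        "an irrelevant passage ...",  # hard negative (placeholder)
    ]),
    # ... the rest of your 10k examples
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)

# MultipleNegativesRankingLoss is the sentence-transformers name for the
# in-batch-negatives InfoNCE objective; any extra texts beyond the
# positive are used as additional hard negatives.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
)
```

Larger batch sizes generally help this loss, since every other example in the batch serves as an additional negative.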