Further fine-tuning possible?

#55
by al-h - opened

Hello @intfloat,

Your work is amazing, and I would like to further fine-tune the model on my own data (I created a 10k dataset with query, positive passage and negative passages). But from your paper "Text Embeddings by Weakly-Supervised Contrastive Pre-training", section "4.2 Fine-tuning with Labeled Data", I understand that you added "knowledge distillation from a cross-encoder (CE) teacher model". I can't find any mention of where this teacher model is available, and it seems necessary for further fine-tuning, since the fine-tuning loss is a linear interpolation between the InfoNCE loss and the KL divergence computed against the teacher model.
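
If I read section 4.2 correctly, the fine-tuning objective is roughly the following (my own sketch of the interpolation described there; α stands for whatever mixing weight the paper uses):

$$\mathcal{L}_{\text{finetune}} = (1-\alpha)\,\mathcal{L}_{\text{InfoNCE}} + \alpha\,D_{\mathrm{KL}}\big(p_{\text{teacher}}\,\big\|\,p_{\text{student}}\big)$$

which is why I don't see how to compute the second term without access to the teacher. My questions: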

  • Where can I find this cross-encoder teacher model?
  • Would it be possible to skip the teacher model and just apply the InfoNCE loss directly?
  • Could MultipleNegativesRankingLoss be used to fine-tune multilingual-e5-large-instruct? I thought it used InfoNCE during fine-tuning.

Have a great day!

Hi, the cross-encoder teacher model is not publicly available. However, you can fine-tune very effectively without it using only the InfoNCE loss. The performance difference is minimal. InfoNCE and MultipleNegativesRanking are the same loss, just different names.
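
For reference, here is a minimal fine-tuning sketch with sentence-transformers using MultipleNegativesRankingLoss (the InfoNCE-style loss with in-batch negatives). The model id is the released checkpoint; the data, prefixes, batch size and output path are placeholders to adapt to your own setup:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Load the released checkpoint; everything below is placeholder data.
model = SentenceTransformer("intfloat/multilingual-e5-large-instruct")

# Check the model card for the expected text format: the instruct variant uses an
# instruction-prefixed query, while passages are used without a prefix.
train_examples = [
    InputExample(texts=[
        "Instruct: Given a web search query, retrieve relevant passages\nQuery: how to fine-tune e5",  # query
        "A relevant passage from your dataset",      # positive passage
        "An irrelevant passage from your dataset",   # hard negative
    ]),
    # ... one InputExample per (query, positive, negative) row of the 10k dataset
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)

# MultipleNegativesRankingLoss is InfoNCE with in-batch negatives; scale = 1 / temperature.
train_loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
    output_path="e5-finetuned",  # placeholder output directory
)
```

Other passages in the same batch act as additional negatives, so larger batch sizes generally help with this loss.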

intfloat changed discussion status to closed