Fine-tuning on a text domain

by wilfoderek - opened

Hi guys!
Please share the recipe for fine-tuning this amazing model on my own corpus.
Thank you in advance!

You can use any dense retrieval training framework, just replace the model initialization with this one.

Personally, I would recommend the Tevatron and SimLM codebases.
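Whichever framework you pick, the first step is usually getting your corpus into (query, positive, hard negatives) records. Here is a minimal sketch of that step. The field names (`query`, `positives`, `negatives`) and the helpers `build_example` and `to_jsonl` are hypothetical, so check the exact schema your framework expects. The "query: " and "passage: " prefixes are an assumption based on how E5 checkpoints are commonly used; verify them against the model card.

```python
import json

# Assumption: E5 checkpoints are usually fed text with "query: " and
# "passage: " prefixes; verify against the model card you are using.
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def build_example(query, positive, hard_negatives):
    """Return one training record: a query, its positive, and hard negatives."""
    return {
        "query": QUERY_PREFIX + query.strip(),
        "positives": [PASSAGE_PREFIX + positive.strip()],
        "negatives": [PASSAGE_PREFIX + n.strip() for n in hard_negatives],
    }


def to_jsonl(examples):
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)


# Toy record, invented purely for illustration.
example = build_example(
    "how do I reset my router",
    "Hold the reset button for ten seconds to restore factory settings.",
    ["Routers forward packets between networks."],
)
jsonl = to_jsonl([example])
```

Each line of `jsonl` is one training example; the hard negatives would normally come from a first-stage retriever run over your own corpus.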

What codebase can I refer to if I want to implement the pre-training stage from your E5 paper?

@Chuzhan The implementation of the pre-training part is even simpler: you can use the same codebase and just remove the hard negatives.
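To make "removing the hard negatives" concrete: without them, the contrastive loss only has the other passages in the batch to use as negatives. The sketch below shows that loss in plain Python on toy vectors. It is not the paper's code; the function name `in_batch_infonce` and the temperature value are illustrative assumptions, and a real run would compute this on GPU tensors inside the training framework.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def in_batch_infonce(query_embs, passage_embs, temperature=0.05):
    """Mean InfoNCE loss where passage i is the positive for query i and
    every other passage in the batch serves as a negative.

    The temperature default is an illustrative value, not a recommendation.
    """
    queries = [normalize(q) for q in query_embs]
    passages = [normalize(p) for p in passage_embs]
    total = 0.0
    for i, q in enumerate(queries):
        # Cosine similarity of this query to every passage in the batch.
        logits = [dot(q, p) / temperature for p in passages]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Negative log-probability of picking the matching passage.
        total += log_denominator - logits[i]
    return total / len(queries)


# Toy embeddings: aligned pairs should give a much lower loss than swapped ones.
aligned = in_batch_infonce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_infonce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Adding hard negatives back for fine-tuning would amount to appending their similarities to `logits` for each query, leaving the rest unchanged.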

Is there a tutorial on how to fine-tune this model on a Greek sentence similarity dataset?

I am not aware of any, but you can adapt an existing codebase to your specific needs.
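One possible way to adapt a sentence similarity dataset, sketched under assumptions: if it comes as sentence pairs with a graded score, you could keep only the high-scoring pairs and treat them as (query, positive) pairs for contrastive training. The helper `pairs_from_scored_sentences`, the 0 to 5 score scale, and the threshold are all hypothetical, and the two rows are made-up examples rather than real data.

```python
def pairs_from_scored_sentences(rows, threshold=4.0):
    """Keep sentence pairs whose similarity score meets the threshold and
    emit them in both directions as (query, positive) training pairs."""
    pairs = []
    for sent_a, sent_b, score in rows:
        if score >= threshold:
            pairs.append((sent_a, sent_b))
            pairs.append((sent_b, sent_a))
    return pairs


# Invented rows on an assumed 0-5 similarity scale.
rows = [
    ("Ο καιρός είναι καλός σήμερα.", "Σήμερα έχει ωραίο καιρό.", 4.6),
    ("Ο καιρός είναι καλός σήμερα.", "Το τρένο έχει καθυστέρηση.", 0.4),
]
train_pairs = pairs_from_scored_sentences(rows)
```

Low-scoring pairs are simply dropped here; whether to reuse them as hard negatives is a design choice that depends on how the dataset was annotated.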
