Fine tuning

#3
opened by gromag

Thank you for sharing this model and paper.
I'm investigating what it would take to further fine-tune Instructor-XL for legal-domain retrieval tasks.
I'm trying to assess a good starting training set size, a good loss temperature, and a good number k of negative pairs per positive pair.
I welcome any other heads-ups.

PS. With hindsight I feel a little daft asking about fine-tuning when the model card explicitly says "embeddings tailored to any task and domains [...] by simply providing the task instruction, without any finetuning." Please let me know if it is a stupid idea.

NLP Group of The University of Hong Kong

Thank you very much for your interest in INSTRUCTOR!

The instruction serves as an efficient option for adapting embeddings to specific domains, but you can also further enhance the model's ability through fine-tuning. To start, you may use all of your available training data (training for at most around 40K steps). For the other hyper-parameters, you may adopt our default settings (e.g., loss_temperature=0.01, k=4, etc.).
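To make the two hyper-parameters concrete, here is a minimal sketch (not the official INSTRUCTOR training script) of a contrastive InfoNCE-style loss in which loss_temperature scales the similarity logits and k is the number of negatives paired with each positive. The function name and tensor shapes are illustrative assumptions, not part of the released codebase.

```python
# Minimal sketch of where loss_temperature and k enter a contrastive
# fine-tuning objective; this is NOT the official training code.
import torch
import torch.nn.functional as F

def contrastive_loss(query_emb, pos_emb, neg_emb, temperature=0.01):
    """InfoNCE-style loss over one positive and k negatives per query.

    query_emb: (batch, dim)    embeddings of instruction-prefixed queries
    pos_emb:   (batch, dim)    embeddings of the positive documents
    neg_emb:   (batch, k, dim) embeddings of k negative documents per query
    """
    query_emb = F.normalize(query_emb, dim=-1)
    pos_emb = F.normalize(pos_emb, dim=-1)
    neg_emb = F.normalize(neg_emb, dim=-1)

    # Cosine similarity to the positive document: (batch, 1)
    pos_sim = (query_emb * pos_emb).sum(dim=-1, keepdim=True)
    # Cosine similarity to each of the k negatives: (batch, k)
    neg_sim = torch.einsum("bd,bkd->bk", query_emb, neg_emb)

    # Logits over [positive, negatives]; the correct class is index 0.
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)

# Example with the defaults suggested above (temperature=0.01, k=4):
batch, k, dim = 8, 4, 768
loss = contrastive_loss(
    torch.randn(batch, dim),
    torch.randn(batch, dim),
    torch.randn(batch, k, dim),
    temperature=0.01,
)
```

Lowering the temperature (e.g., 0.01) sharpens the softmax over the positive and the k negatives, so harder negatives contribute more to the gradient.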

Hope this helps! Feel free to add any further questions or comments!
