How can I fine-tune this for domain-specific applications?

by KaiKapioka - opened

Let's say I want to fine-tune it so it's better at medical terminology/language or something like that. How would I do it? Also, how should the data be formatted? I'm a little bit ignorant, so please be kind with me. Regards.

I haven't fine-tuned this model, but AFAIK you have to fine-tune the original model (the link is in this model card), and if you want the long context you have to apply the LSG conversion after the fine-tune. With the Transformers library, fine-tuning is a few lines of code, and the expected file format is well documented in the Hugging Face Transformers docs; there are also lots of Jupyter notebooks of the process on GitHub. This is a BERT transformer model, so search for how to fine-tune a BERT embeddings model. A rough example is sketched below.
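For instance, one common way to adapt the base model to medical language is continued masked-LM pretraining on domain text with the Transformers Trainer. This is only a minimal sketch, not the exact recipe for this model: the checkpoint name `bert-base-uncased` and the file `medical_corpus.txt` (one document per line) are placeholders, and you should start from the original 512-token model linked in the model card.

```python
# Minimal sketch: continued masked-LM pretraining on domain text with the
# Transformers Trainer. The checkpoint name and corpus file are placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "bert-base-uncased"  # placeholder: use the original (512-token) checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# One document per line in a plain-text file.
ds = load_dataset("text", data_files={"train": "medical_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

ds = ds.map(tokenize, batched=True, remove_columns=["text"])

# Randomly masks 15% of tokens so the model learns domain vocabulary in context.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert-medical",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    fp16=True,  # requires a GPU
)

Trainer(model=model, args=args, train_dataset=ds["train"],
        data_collator=collator).train()

model.save_pretrained("bert-medical")
tokenizer.save_pretrained("bert-medical")
```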

Owner

Also, there are some packages for fine-tuning embeddings; LlamaIndex supports fine-tuning this type of transformer with a few lines of code (see the sketch below). You will need a high-grade GPU if you don't want to wait 2 years for the fine-tune.
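As a concrete example, sentence-transformers is one such package (LlamaIndex wraps a similar workflow around it). This is a minimal sketch, assuming you have pairs of related domain sentences; the base model name and the training pairs below are placeholders.

```python
# Minimal sketch with sentence-transformers: contrastive fine-tuning of an
# embeddings model on pairs of related medical sentences (placeholders).
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("bert-base-uncased")  # placeholder base model

train_examples = [
    InputExample(texts=["myocardial infarction", "heart attack"]),
    InputExample(texts=["hypertension treatment", "managing high blood pressure"]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Treats the other pairs in the batch as negatives; works with just positive pairs.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("embeddings-medical")
```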

I was trying to run code to fine-tune it just for research and ran into both issues: lack of GPU memory, and the model ignoring the long context and behaving like a normal BERT. How do you add the LSG to the model? I mean, the first issue is just better hardware or software optimization, but for the architecture I'm really lost here.

Owner

The link to the repo for that is in this model card; it's pretty simple and straightforward.

I think you have to train the original model first (with 512-token max length) and then apply the LSG with the repo in my model card, roughly like the sketch below.
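A minimal sketch of that conversion step, assuming the lsg-converter package from the repo linked in the model card (installable with `pip install lsg-converter`); the checkpoint path and the LSG parameters are placeholders, and exact argument names may differ between converter versions, so check the repo README.

```python
# Minimal sketch: convert an already fine-tuned 512-token BERT checkpoint
# into its long-context LSG variant. Paths and parameters are placeholders.
from lsg_converter import LSGConverter

converter = LSGConverter(max_sequence_length=4096)

model, tokenizer = converter.convert_from_pretrained(
    "bert-medical",        # placeholder: your fine-tuned 512-token checkpoint
    num_global_tokens=7,
    block_size=128,
    sparse_block_size=128,
    sparsity_factor=2,
)

model.save_pretrained("bert-medical-lsg")
tokenizer.save_pretrained("bert-medical-lsg")
```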

Thanks, that's what I suspected. I'll try it. Thanks!
