Fine Tuning

#16
by drachs - opened

Do I have to do anything special to fine-tune this model, compared with a regular Mistral fine-tune? I have a task that requires very long attention, 60-100k tokens. I have plenty of data to work with, so I thought I'd try a LoRA-based fine-tune and see what happens.

Amazon Web Services org

@drachs

I think it is better to set sliding_window to 100k in the model config for your fine-tuning. Thank you! If possible, please share with us how it goes.
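
For anyone following along, here is a minimal sketch of one way to apply that suggestion, assuming you have a local copy of the model's config.json and that it uses the sliding_window key found in Mistral-style configs. The helper name set_sliding_window and the file path are hypothetical, and reading "100k" as exactly 100,000 tokens is also an assumption; adjust the value to your own sequence lengths.

```python
import json
from pathlib import Path


def set_sliding_window(config_path, window):
    """Set `sliding_window` in a config.json file and return the updated dict.

    Hypothetical helper for illustration: it only edits the JSON file and
    does not touch the model weights or any training code.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["sliding_window"] = window  # e.g. 100_000 for "100k" (assumed value)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call (assumes a config.json in the current directory):
# set_sliding_window("config.json", 100_000)
```

Editing the file before loading the model keeps the change visible and versionable next to the checkpoint. Whether the rest of a given fine-tuning setup (memory budget, attention implementation, LoRA tooling) handles sequences of that length is a separate question this sketch does not address.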
