How to create a LoRA for Mistral?

#1
by LeroyDyer - opened

Could you provide a notebook, on GitHub or here, showing how to create these LoRAs for a Mistral model?
Is this model a mixture of LoRAs?

I have LoRAs saved from my fine-tuning of Mistral models; many share the same setup, i.e. the same target modules/gates, rank, etc.
I was thinking of clipping them onto a model and then saving it as a pretrained checkpoint, but the mixture of LoRAs has a gating system, so maybe they should also be merged into the main model?
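Not from this repo, but a minimal PEFT sketch of both halves of that idea: attaching a fresh LoRA to a Mistral base, and folding an already-saved adapter back into the base weights (the adapter path and LoRA hyperparameters are placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, PeftModel

# Option A: attach a fresh LoRA adapter to a Mistral base model
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical Mistral attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()

# Option B: fold a previously saved adapter back into the base weights
base_fresh = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
merged = PeftModel.from_pretrained(base_fresh, "./my-saved-mistral-lora").merge_and_unload()
merged.save_pretrained("./mistral-lora-merged")
```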

I have also had luck merging architectures. The configuration for the LoRA-style layers would need to go inside the actual Mistral modelling file and be placed in the transformer network's forward/generation path, so that when the layer stack executes, this becomes a layer in the stack. The config for these extra layers would also need to go into the configuration file, so that models can be instantiated from the config file (with their parameters set), enabling correct training of the transformer network as a whole.
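A hedged sketch of that config idea: the extra-expert settings live in a config subclass so the extended model can be instantiated straight from a saved config. The class and field names (num_local_experts, expert_rank, expert_dropout) are my own assumptions, not from this repo.

```python
from transformers import MistralConfig

class MistralWithExpertsConfig(MistralConfig):
    model_type = "mistral_with_experts"

    def __init__(self, num_local_experts=4, expert_rank=16, expert_dropout=0.05, **kwargs):
        super().__init__(**kwargs)
        self.num_local_experts = num_local_experts  # how many LoRA-style experts per layer
        self.expert_rank = expert_rank              # rank of each expert's low-rank update
        self.expert_dropout = expert_dropout        # dropout on the gated expert path

# The modelling file would read these fields when building each decoder layer,
# so the expert/gating modules become part of the normal layer stack.
cfg = MistralWithExpertsConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
cfg.save_pretrained("./mistral-with-experts-config")
```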
Converting a model to this architecture would then be straightforward: load your Mistral model from pretrained using these files, put the model in train mode, train for a single cycle, and save to pretrained. That leaves you with the new model: the previous LLM weights plus the new section awaiting fine-tuning, i.e. retraining on your existing datasets so the previous data flows through the new tensors.
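As a sketch of that conversion recipe (MistralWithExpertsForCausalLM is a hypothetical custom class, and the paths are placeholders): load the existing Mistral weights into the extended architecture, run a single training step so the new modules are touched by backprop, then save everything as one pretrained checkpoint.

```python
import torch
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = MistralWithExpertsForCausalLM.from_pretrained(  # hypothetical custom modelling class
    "mistralai/Mistral-7B-v0.1", config=cfg              # cfg from the config sketch above
)
model.train()

# One token batch, one forward/backward pass, one optimizer step
batch = tok("hello world", return_tensors="pt")
out = model(**batch, labels=batch["input_ids"])
out.loss.backward()
torch.optim.AdamW(model.parameters(), lr=1e-5).step()

model.save_pretrained("./mistral-with-experts")  # old weights + new, untrained expert tensors
tok.save_pretrained("./mistral-with-experts")
```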
When creating a new PEFT adapter, it can target the new gates directly, since you specify their module names yourself.
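For example, if the new gate modules were named "expert_gate" in the custom layers (again, an assumed name), a fresh adapter could be pointed at them by name:

```python
from peft import LoraConfig, get_peft_model

expert_lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["expert_gate"],  # must match the module names in the custom modelling file
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, expert_lora)  # "model" = the extended model from the sketch above
peft_model.print_trainable_parameters()
```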

Hence a fully trained model with additional experts inside, without adjusting the base weights (perhaps only a few hundred thousand parameters for the new gates!). If this included a dropout layer as well, it would allow the model to automatically target these layers during backprop in training.
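A quick way to sanity-check that claim about the size of the addition, assuming the "expert_gate" naming from the sketches above:

```python
# Count the parameters added by the new gate modules versus the whole model
new_params = sum(p.numel() for n, p in model.named_parameters() if "expert_gate" in n)
total = sum(p.numel() for p in model.parameters())
print(f"new gate params: {new_params:,} / {total:,} ({100 * new_params / total:.4f}%)")
```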
