Further fine-tuning?

#4
opened by jspr

Hi @grimulkan ! I've been loving this model but I've found myself wanting the option to give it a bit more domain adaptation to particular fandoms, subgenres, etc. Would you be open to providing a bit of guidance on how to further fine-tune Aurelian, if it'd be possible for others to replicate your training environment? Not sure if/how much the LongLoRA base is prohibitive for this sort of thing.

You sure can fine-tune it, though v0.5 has its pitfalls and I've learnt more since then about what to avoid. You're free to fine-tune it further as you wish.

This is an old thread that went over some of the methods for training long-context models (outdated in some ways): https://www.reddit.com/r/LocalLLaMA/comments/16euhw5/training_long_context_32k_70b_llama/
I haven't published my particular training script yet, which uses a few methods that aren't commonly used out there, but I do intend to (that will probably be a separate Part 2 Reddit post).

However, you can train the model with any other training repo that supports Llama, such as Axolotl. You don't have to train it at the full 32K context length; just train it like any other model and make sure you use a RoPE scaling factor of 8. Chances are it will generalize to 32K anyway.
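For example, with plain HF `transformers` you can pass the scaling when you load the checkpoint for training. This is only a rough sketch, not my actual setup: the repo name, dtype, and the linear scaling type below are assumptions you'd adjust to whatever you actually downloaded.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo name; substitute the Aurelian v0.5 checkpoint you're using.
model_id = "grimulkan/aurelian-v0.5-70b-rope8-32K-fp16"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # RoPE scaling factor of 8 (linear interpolation assumed here), so the
    # positions line up with how the model was extended to 32K.
    rope_scaling={"type": "linear", "factor": 8.0},
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```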

Edit: You don't have to train it as a LongLoRA either. I think it will retain that capability even if you train a regular LoRA/QLoRA on top of it.
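Something like a standard PEFT LoRA on top of the model loaded above should do. The rank, alpha, and target modules here are placeholder values for illustration, not what Aurelian was trained with:

```python
from peft import LoraConfig, get_peft_model

# A plain LoRA over the attention projections; no LongLoRA-style shifted
# attention and no trainable embeddings/norms are assumed here.
lora_config = LoraConfig(
    r=16,            # placeholder rank
    lora_alpha=32,   # placeholder scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)  # `model` from the snippet above
model.print_trainable_parameters()          # sanity check: only adapter weights train
```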

Edit2: See the main page for how to convert stories into multi-round Q&A with my other model if that's what you meant. There's an update coming to that model soon as well, but the existing version is decent.

You rock, thanks @grimulkan. I'll probably go with Axolotl since I know it already.
