Training

#2 opened by freegheist

Any chance for hyperparameters or training config? :)

Here you go my man: https://gist.github.com/mtisz/5cd0e72844e552fd06e77535c81bbfae

This was for a 4xA100 machine. Play around with the following (see the sketch after this list for where they sit in the config):

  • learning_rate
  • lora_r (the rank of the LoRA adapters)
  • gradient_accumulation_steps
  • micro_batch_size
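
This isn't the actual config from the gist (the real values are there); it's just a rough Axolotl-style YAML sketch showing where those knobs live. The base model and every number below are placeholders, not the settings used for this model.

```yaml
# Illustrative Axolotl QLoRA config sketch -- real values are in the gist above.
base_model: your-org/your-base-model   # placeholder, substitute your base model
load_in_4bit: true
adapter: qlora

# The knobs worth tuning, as mentioned above (placeholder values)
learning_rate: 0.0002
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
gradient_accumulation_steps: 4
micro_batch_size: 2

sequence_len: 4096
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine

# Multi-GPU sharding for the 4xA100 run; comment these out before merging (see below)
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: true
  fsdp_state_dict_type: FULL_STATE_DICT
```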

Make sure to comment out the fsdp and fsdp_config sections when you're ready to merge the QLoRA adapter; there's a bug in Axolotl that makes the merge hang otherwise.
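
Concretely, that means disabling those two sections in the YAML before the merge step, roughly like this (placeholder values matching the sketch above):

```yaml
# Commented out for the merge step to work around the Axolotl hang
# fsdp:
#   - full_shard
#   - auto_wrap
# fsdp_config:
#   fsdp_offload_params: true
#   fsdp_state_dict_type: FULL_STATE_DICT
```

With those sections disabled, the adapter can then be merged with Axolotl's merge entry point, e.g. `python -m axolotl.cli.merge_lora your_config.yml --lora_model_dir ./your-qlora-output` in recent versions; check your install for the exact invocation.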

migtissera changed discussion status to closed
