
Thank you! Got more details on the fine tuning?

#1
by KnutJaegersberg - opened

Thank you for making this great fine-tune. Can you help me understand the parameter settings you chose for fine-tuning your model?
I still struggle to find good ones. I see you chose different target modules, but what about the other hyperparameters?
Could you list them?

Here's the W&B run:
https://wandb.ai/jondurbin/bagel-jamba-v0.5/runs/h730jkg1/overview?nw=nwuserjondurbin

TL;DR:

  • rank 16
  • alpha 32
  • learning rate 0.0001
  • per device batch size 4
  • gradient accumulation steps 4
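For anyone wanting to see how these settings map onto code, here is a minimal sketch using Hugging Face `peft` and `transformers`. The base model ID, target module list, dropout, and output path are placeholders/assumptions, not values taken from the actual bagel-jamba run or the W&B log; only the five numbers in the TL;DR above are from the run.

```python
# Minimal sketch of a LoRA setup with the hyperparameters listed above.
# Target modules, dropout, and the base model ID are assumptions, not the
# exact values used in the bagel-jamba run.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/Jamba-v0.1",      # assumed base model
    trust_remote_code=True,     # may be needed for custom model code
)

lora_config = LoraConfig(
    r=16,                       # rank 16
    lora_alpha=32,              # alpha 32
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder list
    lora_dropout=0.05,          # assumed; not listed in the TL;DR
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="bagel-jamba-lora",      # hypothetical output path
    learning_rate=1e-4,                 # learning rate 0.0001
    per_device_train_batch_size=4,      # per device batch size 4
    gradient_accumulation_steps=4,      # effective batch = 4 * 4 * num_gpus
)
```

With these settings the effective batch size per optimizer step is per-device batch size × gradient accumulation steps × number of GPUs, i.e. 16 per GPU here.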

Thanks so much!

KnutJaegersberg changed discussion status to closed
