Training details?

#6
by melmass - opened

Hi

I'd like to make a derivative of the VLM but wanted to roughly estimate the processing and time required.
Is this data available and I just did not find it?
If not, could you share some details?

Thanks

We fine-tuned a 7B VLM on WebSight for one epoch with a global batch size of 384, without packing (each batch element contains a single WebSight example), which took a few days.
It will vary a lot depending on your batch size, the number of steps, and the size of the VLM you use.
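
For a rough back-of-the-envelope estimate, something like the sketch below may help. It only does the arithmetic: optimizer steps follow from the number of examples and the global batch size (no packing, so one example per batch slot). The dataset size and the seconds-per-step figure are placeholder assumptions, not numbers from our run; you would need to plug in the size of the dataset split you use and a step time measured on your own hardware.

```python
import math


def estimate_training(num_examples, global_batch_size, epochs=1.0, seconds_per_step=None):
    """Return (optimizer_steps, wall_clock_hours or None).

    Assumes no packing: each batch slot holds exactly one example.
    """
    steps = math.ceil(num_examples * epochs / global_batch_size)
    hours = None if seconds_per_step is None else steps * seconds_per_step / 3600
    return steps, hours


# Hypothetical inputs: 1,000,000 examples and 60 s per step are assumptions
# for illustration only. The global batch size of 384 is the one quoted above.
steps, hours = estimate_training(
    num_examples=1_000_000,
    global_batch_size=384,
    epochs=1,
    seconds_per_step=60.0,
)
print(f"{steps} steps, about {hours:.1f} hours")
```

Timing a handful of steps with your own model and batch size, then feeding that into `seconds_per_step`, should give a more realistic figure than any number we could quote.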

HugoLaurencon changed discussion status to closed
