How long did it take you to pre-train, and how much compute did it cost?

#1
by damerajee - opened

Great model! I've been learning about VLMs and multimodal models.
Your code and GitHub repo really helped me, but before I start pre-training myself, I wanted to know: how much time did it take, and what was the compute cost?

Using the LLaVA-Pretrain-JA data, pre-training completed in about 1 day on a single RTX 4090.
For the LLM, I used a GPT-2-based 1.3B model.
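For a rough sense of what "1 day on one RTX 4090" translates to in money, here is a back-of-the-envelope sketch. The hourly rental rate is an assumed placeholder (cloud RTX 4090 prices vary); only the ~24-hour, single-GPU figure comes from the answer above.

```python
# Back-of-the-envelope compute-cost estimate for the run described above.
# ASSUMPTION: the $0.50/hr RTX 4090 rental rate is illustrative, not from this thread.
HOURLY_RATE_USD = 0.50   # assumed cloud rental price for one RTX 4090
TRAINING_HOURS = 24      # "about 1 day" on a single GPU, per the answer above
NUM_GPUS = 1             # single RTX 4090

cost = HOURLY_RATE_USD * TRAINING_HOURS * NUM_GPUS
print(f"Estimated cost: ${cost:.2f}")  # → Estimated cost: $12.00
```

On owned hardware the marginal cost is mostly electricity, so the real figure can be even lower, which is why a single-GPU, one-day pre-training run is attractive for experimentation.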

Oh wow, just 1 day? Crazy. Thanks a lot!

damerajee changed discussion status to closed
