How many hours of training?

#1
by damerajee - opened

Hey there, I noticed you trained the entire model on a free Kaggle GPU. If you don't mind, I wanted to know which GPU you used and how many hours you trained it for?

Owner

Hi damerajee, I trained only the last 3 layers (29, 30, 31), not the entire model. It took about 20 hours with 2 GPUs. The goals of these projects are twofold: first, to be able to train 7B models for free (on Kaggle), and second, to test the backward training notion, which is used in RNNs but not in LLMs.
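
For anyone wanting to try something similar, here is a minimal sketch of freezing everything except layers 29, 30 and 31. It is not the exact training script used here. It assumes parameter names follow the Hugging Face Mistral layout (for example "model.layers.29.self_attn.q_proj.weight"), and the helper names `layer_index` and `freeze_all_but` are made up for illustration. It also assumes the embeddings, final norm and output head stay frozen, which the reply above does not confirm.

```python
import re

# Assumption: decoder layer parameters are named like
# "model.layers.29.self_attn.q_proj.weight" (Hugging Face Mistral layout).
LAYER_PATTERN = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def layer_index(param_name):
    """Return the decoder layer index encoded in a parameter name, or None."""
    match = LAYER_PATTERN.search(param_name)
    return int(match.group(1)) if match else None


def freeze_all_but(model, trainable_layers=(29, 30, 31)):
    """Leave only the listed decoder layers trainable.

    `model` can be any object exposing named_parameters(), such as a
    torch.nn.Module. Returns the names of the parameters left trainable.
    """
    trainable = set(trainable_layers)
    kept = []
    for name, param in model.named_parameters():
        # Parameters outside any decoder layer (embeddings, final norm,
        # output head) have no layer index and are frozen here.
        param.requires_grad = layer_index(name) in trainable
        if param.requires_grad:
            kept.append(name)
    return kept
```

With a model loaded through Transformers, calling `freeze_all_but(model)` before building the optimizer should mean gradients and optimizer state are only kept for those three layers, which is likely what makes a 7B model fit on free Kaggle GPUs.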

Pretty cool, thanks for this info.
