
by ArthurFischel - opened

What kind of resources did it take to train this model?

I trained it on an NVIDIA A4000 GPU for a few hours.

Thanks for taking the time to respond. How much of the Pile did you train on? If you had to guess, what would be a rough token estimate?

I did not measure that for this project, but you might be interested in this:
