Continue pretraining of some larger models?
A Twitter contact suggested to me that it could make sense to continue pretraining one of the larger models, e.g. mpt-30b or falcon-40b, on German data.
What do you think about this?
Do you have any ideas on how to realize that? Continued pretraining on some 50B tokens would perhaps cost around 100k euros.
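For reference, here is a minimal sketch of what such a continued-pretraining run could look like with Hugging Face `transformers`. The model name comes from the suggestion above, but the dataset path, hyperparameters, and the single-process `Trainer` setup are placeholders only; an actual 30B/40B run would need a multi-node setup (FSDP, DeepSpeed, or similar) and a properly curated corpus.

```python
# Hedged sketch of continued pretraining; dataset path and hyperparameters
# are assumptions, not a vetted recipe for a 30B/40B-scale run.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "tiiuae/falcon-40b"  # or "mosaicml/mpt-30b"
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # these tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(base_model, trust_remote_code=True)

# Hypothetical German corpus as plain-text files; replace with the real data.
dataset = load_dataset("text", data_files={"train": "german_corpus/*.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Causal-LM collator: labels are the inputs shifted, no masking.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="falcon-40b-german-continued",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    learning_rate=1e-5,   # lower than from-scratch LR to limit forgetting
    num_train_epochs=1,
    bf16=True,
    logging_steps=50,
    save_steps=1000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```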
I am already preparing a 70B model training run on large-scale German data :-)
Stay tuned
If you want to speed up the process, we could talk about some ways to help finance it; my employer is the de facto main sponsor of the training and inference hardware in their data centers.
So one option would be to mention my name when buying new hardware, or to contact them directly to arrange financial support.
Pinging @jphme, who was also interested in a German LLM research group.
Maybe we could open a Slack workspace or something.
https://join.slack.com/t/slack-dtc7771/shared_invite/zt-219keplqu-hLwjm0xcFAOX7enERfBz0Q
Just created a Slack for German LLMs; would be happy to plan more training runs there.