Training config link is broken

#3
by davidgortega - opened

Hi, in the Training procedure section the link to the training config is missing.

I would like to continue pre-training. Is there any additional advice?

@jon-tow any thoughts here? I would like to continue pre-training.

Same issue here, the config seems to be missing from the repo :-)
Any news?

Anyone here?

The file can be found in the training branch here on Hugging Face. :facepalm:

I'll leave this open so the model card can be fixed.

Stability AI org

Hi @davidgortega ! Sorry, I was swamped and missed this. Let me know if you have any questions about the config; I can try to answer them ASAP.

Thanks!

I'm planning to continue pre-training with a dataset of 100M tokens for two epochs. Do you think that would be enough for the model to learn it?

Hello, I'm new here and kind of wanted to learn more and figure some things out. Some of this is way over my head; I feel like I need a book for dummies. But in your best words, how do I get the library for this and use it? I get lost a lot 🤔

Stability AI org

@davidgortega re:

I'm planning to continue pre-training with a dataset of 100M tokens for two epochs. Do you think that would be enough for the model to learn it?

If the domain of your data is relatively close to the pre-training dataset (see the dataset metadata), it should be enough. Otherwise, it is hard to tell 😅 I'd also suggest fine-tuning the released checkpoint as opposed to continued pre-training from the pre-cooldown version, since it's only 200M tokens.
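
If it helps, here's a rough sketch of what fine-tuning the released checkpoint could look like with the HF Trainer. The model id, data path, and hyperparameters are placeholders, not our exact setup; adjust them to your repo and hardware:

```python
# Rough fine-tuning sketch (placeholder model id, data path, and hyperparameters).
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "stabilityai/stablelm-base-alpha-7b"  # placeholder: use the released checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Plain-text corpus, one document per line (placeholder path).
dataset = load_dataset("text", data_files={"train": "train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="stablelm-domain-ft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=2,
    learning_rate=1e-5,
    bf16=True,
    logging_steps=50,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    # Standard causal-LM collator: labels are the (shifted) input ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```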

@jon-tow thanks for the reply.

It's wiki data in a specific domain (like fandom). I hope it works.
The problem with fine-tuning after the cooldown is that training on raw data with an empty prompt alone does not work as I expect. I have to combine the empty-prompt examples with synthetic instruct data generated from my corpus for the model to learn a little, and the output still hallucinates a bit too much. Maybe you have a recipe?
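
For reference, this is roughly how I build the mixed training file (simplified; the file names and the instruct template are made up):

```python
# Mix "empty prompt" raw passages with synthetic instruct pairs into one file.
# Paths and the prompt template are placeholders.
import json
import random

def raw_to_example(text):
    # Empty prompt: the raw wiki passage is used as-is as the training text.
    return {"text": text}

def instruct_to_example(item):
    # Synthetic Q/A generated from the corpus, rendered with a simple template.
    return {"text": f"### Instruction:\n{item['question']}\n\n### Response:\n{item['answer']}"}

raw_examples = [raw_to_example(line.strip())
                for line in open("wiki_corpus.txt") if line.strip()]
instruct_examples = [instruct_to_example(json.loads(line))
                     for line in open("synthetic_qa.jsonl")]

mixed = raw_examples + instruct_examples
random.shuffle(mixed)

with open("train_mixed.jsonl", "w") as f:
    for ex in mixed:
        f.write(json.dumps(ex) + "\n")
```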
