---
license: apache-2.0
---

openllama_v2 3B, second-stage pre-trained on OSCAR with a 4k sequence length. The model has seen about 5B tokens so far; the weights will be updated as training continues. It achieves a perplexity of 3.8 on the evaluation dataset and will be further pre-trained on a wiki dataset with a 16K context length.
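
A minimal usage sketch with the `transformers` library. The repository id below is an assumption, not confirmed by this card; substitute the actual Hugging Face repo id for this model.

```python
# Minimal usage sketch; repo_id is hypothetical, replace with the real repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Defetya/openllama-v2-3b"  # assumed repo id for illustration

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

# Generate a short continuation from a prompt.
prompt = "The OSCAR corpus is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```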