---
license: apache-2.0
---

openllama_v2 3B, second-stage pre-trained on OSCAR with a 4k sequence length. The model has seen about 5B tokens so far; the weights will be updated as training continues. It achieves a perplexity of 3.8 on the evaluation dataset and will be further pre-trained on a wiki dataset with a 16K context length.
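
A minimal usage sketch with the `transformers` library. The repository id below is an assumption, not confirmed by this card; substitute the actual Hugging Face repo id for this model.

```python
# Minimal usage sketch; repo_id is hypothetical, replace with the real repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Defetya/openllama-v2-3b"  # assumed repo id for illustration

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

# Generate a short continuation from a prompt.
prompt = "The OSCAR corpus is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```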