---
datasets:
- bigcode/the-stack-smol
- EleutherAI/the_pile
---

# Cerebras GPT 111M pretraining continuation on source code

15_000 step checkpoint model

Source: https://github.com/claysauruswrecks/pretrain-cerebras-gpt-111m

```txt
Epoch 0.25/2

Step     Training Loss
=====================
500      1.644200
1000     1.552200
1500     1.546600
2000     1.497400
2500     1.523500
3000     1.506100
3500     1.476600
4000     1.427400
4500     1.466000
5000     1.461100
5500     1.436800
6000     1.447200
6500     1.433600
7000     1.416400
7500     1.428600
8000     1.401900
8500     1.373500
9000     1.391300
9500     1.415700
10000    1.393300
10500    1.411500
11000    1.401900
11500    1.378400
12000    1.381700
12500    1.347900
13000    1.357900
13500    1.328000
14000    1.337400
14500    1.346600
15000    1.336100
```
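For reference, below is a minimal sketch of loading and sampling from a checkpoint like this with the Hugging Face `transformers` library. The repository id is a hypothetical placeholder, and the sketch assumes the checkpoint is published on the Hub together with its tokenizer.

```python
# Minimal usage sketch, assuming the checkpoint and tokenizer are on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id: substitute the actual checkpoint path.
repo_id = "your-username/cerebras-gpt-111m-code-15k"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Prompt the model with a code prefix, since training continued on source code.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```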