---
datasets:
- bigcode/the-stack-smol
- EleutherAI/the_pile
---

# Cerebras GPT 111M pretraining continuation on source code
This is the checkpoint from step 15,000 of the training run.

Source: https://github.com/claysauruswrecks/pretrain-cerebras-gpt-111m
```txt
Epoch 0.25/2
Step    Training Loss
=====================
500     1.644200
1000    1.552200
1500    1.546600
2000    1.497400
2500    1.523500
3000    1.506100
3500    1.476600
4000    1.427400
4500    1.466000
5000    1.461100
5500    1.436800
6000    1.447200
6500    1.433600
7000    1.416400
7500    1.428600
8000    1.401900
8500    1.373500
9000    1.391300
9500    1.415700
10000   1.393300
10500   1.411500
11000   1.401900
11500   1.378400
12000   1.381700
12500   1.347900
13000   1.357900
13500   1.328000
14000   1.337400
14500   1.346600
15000   1.336100
```
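The table above can be summarized programmatically. This is a minimal sketch (not part of the linked repo) that parses the logged step/loss pairs and reports the overall trend; the data is copied verbatim from the log:

```python
# Sketch: parse the training-loss log above and summarize the trend.
log = """
500 1.644200
1000 1.552200
1500 1.546600
2000 1.497400
2500 1.523500
3000 1.506100
3500 1.476600
4000 1.427400
4500 1.466000
5000 1.461100
5500 1.436800
6000 1.447200
6500 1.433600
7000 1.416400
7500 1.428600
8000 1.401900
8500 1.373500
9000 1.391300
9500 1.415700
10000 1.393300
10500 1.411500
11000 1.401900
11500 1.378400
12000 1.381700
12500 1.347900
13000 1.357900
13500 1.328000
14000 1.337400
14500 1.346600
15000 1.336100
"""

# Each non-empty line is "step loss"; parse into parallel tuples.
points = [tuple(map(float, line.split())) for line in log.splitlines() if line.strip()]
steps, losses = zip(*points)

print(f"steps {int(steps[0])}-{int(steps[-1])}")
print(f"loss {losses[0]:.4f} -> {losses[-1]:.4f}")
# Average the last 5 logged values to smooth out step-to-step noise.
print(f"mean of last 5 logged losses: {sum(losses[-5:]) / 5:.4f}")
```

The loss falls from 1.6442 at step 500 to 1.3361 at step 15,000, with the usual step-to-step noise along the way.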