Pythia-160m-PG19

This is an experiment to see whether Pythia can be pretrained from scratch on the PG-19 dataset. The model uses the same architecture and training settings as the regular Pythia-160M, and was trained on 150,000,000 tokens of PG-19. Better results could likely be achieved with longer training, but that is not the goal of this project.
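A minimal usage sketch, assuming the model is published on the Hugging Face Hub under `DarwinAnim8or/pythia-160m-pg19` and loads with the standard `transformers` auto classes, like other Pythia checkpoints:

```python
# Sketch: load the checkpoint and sample a continuation.
# Assumes the standard transformers auto classes work for this model,
# as they do for the regular Pythia checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DarwinAnim8or/pythia-160m-pg19"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "It was a dark and stormy night"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```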

I am currently working on a larger model trained on more CC0 datasets.
