
This is the third in a series of GPT-2 (124M) models I pretrained on different orderings of the same data. The results demonstrate that curriculum learning (https://arxiv.org/html/2405.07490v1) is not a viable method for improving LLM performance, and in fact reduces it.

I trained the models on data in three orderings: random, ascending reading level, and descending reading level.
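
As a rough illustration of the three orderings, here is a minimal sketch in Python. It is not the actual preprocessing used for these models: the readability metric (Flesch-Kincaid grade level with a crude vowel-run syllable counter) and the names `reading_level` and `order_documents` are assumptions chosen for the example.

```python
import random
import re


def count_syllables(word):
    # Crude heuristic (assumption): count runs of vowels, minimum one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def reading_level(text):
    # Flesch-Kincaid grade level, used here as a stand-in readability score.
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def order_documents(docs, mode, seed=0):
    # Return a new list of documents in the requested training order.
    if mode == "random":
        shuffled = list(docs)
        random.Random(seed).shuffle(shuffled)
        return shuffled
    if mode == "ascending":
        return sorted(docs, key=reading_level)
    if mode == "descending":
        return sorted(docs, key=reading_level, reverse=True)
    raise ValueError(f"unknown ordering mode: {mode!r}")


docs = [
    "The cat sat. The dog ran.",
    "Notwithstanding considerable methodological heterogeneity, "
    "longitudinal investigations demonstrate inconsistent conclusions.",
]
easy_first = order_documents(docs, "ascending")
hard_first = order_documents(docs, "descending")
shuffled = order_documents(docs, "random", seed=42)
```

Each mode yields the same set of documents, so the only thing that differs between training runs is the order in which the model sees them.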