Each model's name indicates how many layers it has and which dataset it was trained on. The checkpoint also stores other metadata, such as the lowest validation loss and the number of training iterations.

Training configs can be viewed here: https://api.wandb.ai/links/adam-karvonen/u783xspb

Dataset descriptions:
- lichess: 7GB, 16 million games from the Lichess database. No Elo filtering was performed.
- Lichess_gt_18k: ~4GB of games from Lichess. Following OpenAI's weak-to-strong generalization paper, filtered to include only games where White is rated above 1800 Elo.
- Stockfish: 4.5GB of games generated with Stockfish at Elo 3200 playing White against Stockfish at a range of Elo 1300-3200 playing Black.
- Lichess-stockfish mix: a 50/50 mix of > 1800 Elo Lichess games and Stockfish-generated games.
- Lichess results: the lichess dataset, but with the result included before every game. The hope is that we can then prompt the model with ";1-0#1.", indicating to the model that it is supposed to win the game.
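To make the Elo filter concrete, here is a minimal sketch of keeping only games where White is rated above 1800. It is not the pipeline used to build these datasets. It assumes games are PGN strings carrying the standard `WhiteElo` tag, and the helper names `white_elo` and `keep_game` are hypothetical.

```python
import re

# Matches PGN tag pairs such as [WhiteElo "1850"], one per line.
TAG_RE = re.compile(r'^\[(\w+)[ \t]+"([^"]*)"\][ \t]*$', re.MULTILINE)


def white_elo(pgn_game: str):
    """Return White's rating from the PGN headers, or None if missing or non-numeric."""
    tags = dict(TAG_RE.findall(pgn_game))
    value = tags.get("WhiteElo", "")
    return int(value) if value.isdigit() else None


def keep_game(pgn_game: str, min_elo: int = 1800) -> bool:
    """Keep only games where White is rated strictly above min_elo."""
    elo = white_elo(pgn_game)
    return elo is not None and elo > min_elo


# Tiny made-up games for illustration only.
games = [
    '[White "a"]\n[WhiteElo "2100"]\n[BlackElo "1500"]\n\n1.e4 e5 2.Nf3 1-0',
    '[White "b"]\n[WhiteElo "1800"]\n[BlackElo "2200"]\n\n1.d4 d5 0-1',
    '[White "c"]\n[WhiteElo "?"]\n\n1.c4 1/2-1/2',
]
filtered = [g for g in games if keep_game(g)]
```

Only the first game survives: the second has White at exactly 1800, and the third has no numeric rating.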

All models are trained on inputs that begin with ";", which is also the delimiter token between games. Performance will degrade if the prompt does not start with it.
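A minimal sketch of building prompts that follow this convention. The helper `build_prompt` is hypothetical, and the move-string format shown is an assumption for illustration. The result-prefixed form mirrors the ";1-0#1." example and should only make sense for a model trained on the Lichess results dataset.

```python
from typing import Optional

GAME_DELIMITER = ";"  # per the text above, every training input starts with this token


def build_prompt(moves: str = "", result: Optional[str] = None) -> str:
    """Build a prompt that starts with the game delimiter.

    If `result` is given (e.g. "1-0"), it is placed before the moves and
    separated from them by "#", as in the ";1-0#1." example.
    """
    if result is None:
        return GAME_DELIMITER + moves
    return f"{GAME_DELIMITER}{result}#{moves}"


plain = build_prompt("1.e4 e5 2.Nf3")           # ordinary prompt
conditioned = build_prompt("1.", result="1-0")  # result-conditioned prompt
```

Keeping the delimiter in one constant makes it harder to forget it when prompts are assembled in several places.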

Models saved with optimizer state use more storage, but you can easily resume training with them. Models saved without it use less storage and are fine for inference or for training linear probes.
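If you only need inference or probing, the optimizer state can be dropped from a full checkpoint. The sketch below assumes the checkpoint is a plain dictionary with the optimizer state stored under an `"optimizer"` key. That layout and the key names are assumptions, not something this README confirms. In practice the dictionary would typically come from `torch.load(path, map_location="cpu")` and be written back with `torch.save`.

```python
def strip_optimizer(checkpoint: dict, optimizer_key: str = "optimizer") -> dict:
    """Return a shallow copy of the checkpoint without the optimizer state."""
    return {k: v for k, v in checkpoint.items() if k != optimizer_key}


# Stand-in for a loaded checkpoint; all key names here are assumed.
full = {
    "model": {"w": [0.1, 0.2]},
    "optimizer": {"state": {}, "param_groups": []},
    "iter_num": 1000,
}
slim = strip_optimizer(full)
```

The copy is shallow, so the model weights are shared rather than duplicated, and the original dictionary is left unchanged.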

At some point, I started including the dataset as metadata in the checkpoint. Some models may not include it.
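Because some checkpoints may lack the dataset field, code that reads metadata should tolerate missing keys. A minimal sketch, assuming the checkpoint is a dictionary; the key names `"dataset"`, `"iter_num"`, and `"best_val_loss"` are assumptions and may differ from the actual files.

```python
def describe_checkpoint(checkpoint: dict) -> dict:
    """Collect metadata fields, falling back to defaults when a key is absent."""
    return {
        "dataset": checkpoint.get("dataset", "unknown"),
        "iter_num": checkpoint.get("iter_num"),
        "best_val_loss": checkpoint.get("best_val_loss"),
    }


# Stand-ins for an older and a newer checkpoint; values are made up.
older = {"iter_num": 5000, "best_val_loss": 0.31}
newer = {"iter_num": 5000, "best_val_loss": 0.31, "dataset": "lichess"}

older_info = describe_checkpoint(older)
newer_info = describe_checkpoint(newer)
```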

I also have 31 checkpoints from a single training run, in case you are interested in investigating how skills emerge during training. They are located here: https://huggingface.co/adamkarvonen/chess_llm_30_checkpoints