---
license: mit
datasets:
- RefinedWeb
- EleutherAI/OpenWebText2
library_name: open_lm
tokenizer: GPT-NeoX-20B
---
# Resolving Discrepancies in Compute-Optimal Scaling of Language Models: Checkpoints
This repository contains the model checkpoints from the paper ["Resolving Discrepancies in Compute-Optimal Scaling of Language Models"](https://arxiv.org/abs/2406.19146) by Tomer Porian, Mitchell Wortsman, Jenia Jitsev, Ludwig Schmidt, and Yair Carmon.
## Folder structure
Each checkpoint directory is stored at the path
`dataset={dataset}/hparams={hparams}_warmup={warmup}_decay={decay}/params={int(params / 1e6)}M_maxstep={maxstep}`
where `dataset`, `hparams`, `warmup`, `decay`, `params`, and `maxstep` are as defined in the [github repository](https://github.com/formll/resolving-scaling-law-discrepancies), which contains the code and data for reproducing the figures in the paper.
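For concreteness, here is a minimal Python sketch of how the template expands. The specific values below are hypothetical placeholders; the valid options are defined in the github repository.
```
# Minimal sketch of the path template above; all values are hypothetical
# placeholders, not an exhaustive list of valid options.
dataset = "RefinedWeb"
hparams = "chinchilla"   # hypothetical hyperparameter-grid name
warmup = 5000            # warmup steps
decay = "cosine"         # learning-rate decay schedule
params = 124_000_000     # model size in parameters
maxstep = 10000          # final training step of the checkpoint

path = (
    f"dataset={dataset}/"
    f"hparams={hparams}_warmup={warmup}_decay={decay}/"
    f"params={int(params / 1e6)}M_maxstep={maxstep}"
)
print(path)
# dataset=RefinedWeb/hparams=chinchilla_warmup=5000_decay=cosine/params=124M_maxstep=10000
```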
## Evaluation and text generation
The script `evaluating_checkpoint.py` evaluates checkpoints on validation shards and generates text from a prompt.
Copy it into your local `open_lm` checkout and run one of the following commands:
```
python evaluating_checkpoint.py --checkpoint "path/to/checkpoint" --input-text "The quick brown fox jumps over the lazy dog."
```
or
```
python evaluating_checkpoint.py --checkpoint "path/to/checkpoint" --val-data "path/to/validation/shards"
```
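To evaluate many checkpoints in one go, a loop such as the following sketch can invoke the script once per checkpoint directory. This assumes `--checkpoint` accepts a checkpoint directory, and the paths are hypothetical.
```
# Minimal sketch (hypothetical paths): evaluate every checkpoint under a
# downloaded dataset directory by invoking the script once per checkpoint.
import subprocess
from pathlib import Path

root = Path("dataset=RefinedWeb")  # hypothetical local download location
for ckpt in sorted(root.glob("*/params=*")):
    subprocess.run(
        [
            "python", "evaluating_checkpoint.py",
            "--checkpoint", str(ckpt),
            "--val-data", "path/to/validation/shards",
        ],
        check=True,
    )
```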
## Citation
```
@article{porian2024resolving,
  title={Resolving Discrepancies in Compute-Optimal Scaling of Language Models},
  author={Porian, Tomer and Wortsman, Mitchell and Jitsev, Jenia and Schmidt, Ludwig and Carmon, Yair},
  journal={arXiv:2406.19146},
  year={2024}
}
```