
StoriesLM: A Family of Language Models With Sequentially-Expanding Pretraining Windows

Model Family

StoriesLM is a family of language models with sequentially-expanding pretraining windows. The pretraining data comes from the American Stories dataset, a collection of text from historical American newspaper articles. The first model in the family is pretrained on data from 1900. Each subsequent model continues training from the previous year's checkpoint on data from the following year, up to and including 1963.
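
The year-by-year chaining described above can be sketched in a few lines of Python. This is only an illustration of the schedule, not the authors' training code. The repository naming pattern is an assumption extrapolated from the single identifier `StoriesLM/StoriesLM-v1-1905`, and the helper names `repo_id` and `training_schedule` are hypothetical.

```python
from typing import Iterator, Optional, Tuple

FIRST_YEAR = 1900  # first pretraining year stated in the model card
LAST_YEAR = 1963   # last pretraining year stated in the model card


def repo_id(year: int) -> str:
    """Build a repository name for a given year.

    Assumption: every year follows the pattern of the one known
    identifier, "StoriesLM/StoriesLM-v1-1905".
    """
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year must be in [{FIRST_YEAR}, {LAST_YEAR}], got {year}")
    return f"StoriesLM/StoriesLM-v1-{year}"


def training_schedule() -> Iterator[Tuple[int, Optional[str]]]:
    """Yield (data_year, parent_checkpoint) pairs in training order.

    The 1900 model has no earlier StoriesLM checkpoint, so its parent
    is None; every later model continues from the previous year's model.
    """
    for year in range(FIRST_YEAR, LAST_YEAR + 1):
        parent = repo_id(year - 1) if year > FIRST_YEAR else None
        yield year, parent


schedule = list(training_schedule())
```

Under this scheme the model for a given year has seen data from 1900 through that year and nothing later, which is what makes the pretraining window "sequentially expanding".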

Dataset

The StoriesLM family is pretrained on the American Stories dataset. If you use a model from this family, please also cite the original dataset's authors:

@article{dell2024american,
  title={American Stories: A Large-Scale Structured Text Dataset of Historical {U.S.} Newspapers},
  author={Dell, Melissa and Carlson, Jacob and Bryan, Tom and Silcock, Emily and Arora, Abhishek and Shen, Zejiang and D'Amico-Wong, Luca and Le, Quan and Querubin, Pablo and Heldring, Leander},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

Dataset used to train StoriesLM/StoriesLM-v1-1905