
A standard roberta-large model fine-tuned for a single pass (one epoch) over the entire Pile dataset.

See the paper "Test-time training on nearest neighbors for large language models" for details.
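
A minimal loading sketch follows. It is not taken from the paper or from the authors' code. It assumes the checkpoint can be loaded through the generic `AutoTokenizer` and `AutoModel` classes of the `transformers` library, and that the repository ships tokenizer files. The helper names `load_model` and `encode_text` are hypothetical and exist only for illustration.

```python
# Repository ID as shown on this model page.
MODEL_ID = "socialfoundations/roberta-large-pile-lr2e-5-bs16-8gpu-1700000"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and encoder from the Hugging Face Hub.

    Assumption: the checkpoint works with the generic Auto classes.
    Imports are local so that defining this function needs no extra packages.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model


def encode_text(text: str, tokenizer, model):
    """Return the last-layer hidden states for `text` (hypothetical helper)."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (1, sequence_length, hidden_size)
    return outputs.last_hidden_state
```

Calling `tokenizer, model = load_model()` and then `encode_text("some text", tokenizer, model)` would download the weights on first use. If the repository turns out not to include tokenizer files, the tokenizer of the base `roberta-large` model is a plausible substitute, though that is also an assumption.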


Dataset used to train socialfoundations/roberta-large-pile-lr2e-5-bs16-8gpu-1700000