This model is a standard attention model (Llama architecture) pretrained on 30B tokens of the Pile corpus.
The model implementation and the training code used to produce it are available at https://github.com/HazyResearch/based.
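
Since this is a Llama-architecture checkpoint, it can likely be loaded with the Hugging Face `transformers` library. The sketch below is a minimal, hedged example: the `model_id` is a placeholder (this card does not state the Hub repo id), and `trust_remote_code=True` is included in case the checkpoint uses a custom model class from the based repository.

```python
# Minimal sketch for loading and sampling from this checkpoint with
# transformers. The model_id below is hypothetical -- replace it with
# this model's actual Hugging Face Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hazyresearch/attn-model"  # placeholder, not the real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision for inference
    trust_remote_code=True,      # in case a custom model class is used
)

# Generate a short continuation.
inputs = tokenizer("The Pile is a dataset of", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```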