---
datasets:
- EleutherAI/pile
language:
- en
---
# Model Card
This model is a standard attention model (Llama architecture) pretrained on 30B tokens of the Pile corpus.
### Model Sources
The model implementation and the training code that produced this checkpoint are available at: https://github.com/HazyResearch/based