Update README.md
README.md
CHANGED
@@ -8,8 +8,9 @@ language:
 
 This model is pretrained Based model.
 
-As a quality reference, we include a pretrained Mamba model provided here: https://huggingface.co/hazyresearch/mamba-1b
-
+As a quality reference, we include a pretrained Mamba model provided here: https://huggingface.co/hazyresearch/mamba-1b, and a pretrained Attention (Llama architecture) model provided here: https://huggingface.co/hazyresearch/attn-1b
+
+All three checkpoints are pretrained on 10Bn tokens of the Pile in the exact same data order using next token prediction.
 
 
 ### Model Sources