bailin28 committed
Commit 85e27ce
1 Parent(s): af49087

Update README.md

Files changed (1): README.md (+3 −1)
README.md CHANGED
@@ -7,4 +7,6 @@ language:
 ---
 
 
-This checkpoint of the 1.3B GLA model used in the paper [Gated Linear Attention](https://arxiv.org/abs/2312.06635). See the model and loading script in this [repo](https://github.com/berlino/gated_linear_attention).
+This is the checkpoint of the 1.3B GLA model used in the paper [Gated Linear Attention](https://arxiv.org/abs/2312.06635). The model was trained on 100B tokens from the SlimPajama dataset, tokenized with the Llama2 tokenizer.
+
+See the model and loading script in this [repo](https://github.com/berlino/gated_linear_attention).