bailin28/gla-1B-100B

Text Generation


bailin28 committed on Feb 12, 2024

Commit af49087 (verified) · 1 Parent(s): 4b9828d

Create README.md

Files changed (1)

README.md +10 -0

README.md ADDED

	@@ -0,0 +1,10 @@

+---
+license: mit
+datasets:
+- cerebras/SlimPajama-627B
+language:
+- en
+---
+This is a checkpoint of the 1.3B GLA model used in the paper [Gated Linear Attention](https://arxiv.org/abs/2312.06635). See the model and loading script in this [repo](https://github.com/berlino/gated_linear_attention).