fla-hub/gsa-1.3B-100B
Text Generation
Safetensors
cerebras/SlimPajama-627B
English
fla
gsa
arXiv: 2409.07146
License: MIT
This is the model from the paper "Gated Slot Attention for Efficient Linear-Time Sequence Modeling".
Downloads last month: 40
Format: Safetensors
Model size: 1.38B params
Tensor type: BF16
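As a rough guide to what the listed size and tensor type imply, the sketch below estimates the memory needed for the weights alone. It assumes every one of the 1.38B parameters is stored in BF16 (2 bytes each) and ignores activations, recurrent state, and framework overhead; the helper name `weight_bytes` is illustrative, not part of any library.

```python
PARAMS = 1.38e9        # "1.38B params" as listed on the model card
BYTES_PER_PARAM = 2    # BF16 uses 16 bits, i.e. 2 bytes, per value


def weight_bytes(n_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the approximate number of bytes needed to hold the weights."""
    return n_params * bytes_per_param


total = weight_bytes(PARAMS)
# Report both decimal gigabytes and binary gibibytes.
print(f"{total / 1e9:.2f} GB ({total / 2**30:.2f} GiB)")  # prints "2.76 GB (2.57 GiB)"
```

Actual memory use at inference time will be higher than this figure, since it covers the stored weights only.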
The serverless Inference API does not yet support fla models for the text-generation pipeline.
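Because hosted inference is unavailable, one option is to run the checkpoint locally. The following is a minimal sketch, not an official recipe. It assumes that the `flash-linear-attention` package (imported as `fla`) is installed and registers its model classes with `transformers` on import, that the repository ships a tokenizer, and that a CUDA GPU is available for the library's kernels. The helper names `load_gsa` and `generate` are hypothetical.

```python
MODEL_ID = "fla-hub/gsa-1.3B-100B"


def load_gsa(model_id: str = MODEL_ID, device: str = "cuda"):
    """Load tokenizer and model; heavy imports are deferred until called."""
    import fla  # noqa: F401  (assumption: importing registers fla models with transformers)
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The card lists BF16 weights, so load them in that dtype.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model.to(device)


def generate(prompt: str, max_new_tokens: int = 32, device: str = "cuda") -> str:
    """Generate a short continuation of `prompt` with the loaded model."""
    tokenizer, model = load_gsa(device=device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Linear attention is")` would download the weights on first use, so expect a multi-gigabyte transfer.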
Dataset used to train fla-hub/gsa-1.3B-100B: cerebras/SlimPajama-627B
Collection including fla-hub/gsa-1.3B-100B: GSA (3 items)