Zymrael committed on
Commit 1e81fed
2 Parent(s): 2f9e2c6 70481ea

chore: sync readme

  1. README.md +15 -0
README.md CHANGED
@@ -1,3 +1,18 @@
  ---
  license: apache-2.0
+ language:
+ - en
  ---
+
+ ## StripedHyena-Hessian-7B (SH-7B)
+
+
+ ### Model Architecture
+
+ The architecture of StripedHyena-Hessian-7B is different from traditional decoder-only Transformers.
+
+ StripedHyena is a hybrid architecture composed of multi-head, grouped-query attention and gated convolutions arranged in [Hyena](https://arxiv.org/abs/2302.10866) blocks.
+ - Constant-memory decoding by representing convolutions as state-space models (modal or canonical form) or as truncated filters.
+ - Lower latency to preprocess long prompts.
+ - Improvements to training and inference compute-optimal scaling laws, compared to Transformers.
+ >>>>>>> 70481ea0fbb23e43c66663f8fb40d94661f235f0
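
The first bullet in the added README text (constant-memory decoding via a state-space representation) can be illustrated with a small sketch. It is not taken from the StripedHyena codebase: it assumes a filter given in modal form, `h[t] = sum_i residues[i] * poles[i]**t`, uses real-valued poles for brevity, and all names (`filter_taps`, `causal_conv`, `ModalDecoder`) are hypothetical. The point is only that the recurrent form yields the same outputs as the explicit convolution while keeping a state whose size does not grow with sequence length.

```python
# Illustrative sketch only; names and parameter values are assumptions,
# not part of the StripedHyena implementation.

def filter_taps(poles, residues, length):
    """Materialize the filter h[t] = sum_i residues[i] * poles[i]**t."""
    return [sum(r * p ** t for p, r in zip(poles, residues))
            for t in range(length)]


def causal_conv(h, u):
    """Direct causal convolution y[t] = sum_{k<=t} h[k] * u[t-k].

    Needs the full input history, so memory grows with sequence length.
    """
    return [sum(h[k] * u[t - k] for k in range(t + 1))
            for t in range(len(u))]


class ModalDecoder:
    """Recurrent (diagonal state-space) form of the same filter.

    The state holds one value per pole, independent of how many
    tokens have been processed.
    """

    def __init__(self, poles, residues):
        self.poles = list(poles)
        self.residues = list(residues)
        self.state = [0.0] * len(self.poles)

    def step(self, u_t):
        # x_i <- pole_i * x_i + u_t, then y = sum_i residue_i * x_i
        self.state = [p * x + u_t for p, x in zip(self.poles, self.state)]
        return sum(r * x for r, x in zip(self.residues, self.state))


poles = [0.9, 0.5, -0.3]
residues = [1.0, -0.5, 0.25]
u = [1.0, 0.0, 2.0, -1.0, 0.5]

decoder = ModalDecoder(poles, residues)
y_recurrent = [decoder.step(x) for x in u]          # one token at a time
y_conv = causal_conv(filter_taps(poles, residues, len(u)), u)

max_err = max(abs(a - b) for a, b in zip(y_recurrent, y_conv))
print("max abs difference:", max_err)
```

The truncated-filter alternative mentioned in the same bullet would instead keep only the first few taps of `h` and a fixed-length window of recent inputs, which also bounds memory but only approximates the full filter.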