Update README.md

README.md (CHANGED)

@@ -71,15 +71,14 @@ Zamba2-2.7B utilizes and extends our original Zamba hybrid SSM-attention archite
 
 Zamba2-2.7B achieves leading and state-of-the-art performance among models of <3B parameters and is competitive with some models of significantly greater size. Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer based models.
 
+<img src="https://cdn-uploads.huggingface.co/production/uploads/64e40335c0edca443ef8af3e/wXFMLXZA2-xz2PDyUMwTI.png" width="600"/>
+
 Zamba2-2.7B's high performance and small inference compute and memory footprint renders it an ideal generalist model for on-device applications.
 
 <center>
 <img src="https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/U7VD9PYLj3XcEjgV08sP5.png" width="700" alt="Zamba performance">
 </center>
 
-<center>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/3u8k7tcRi-oC_ltGhdHAk.png" width="800" alt="Zamba performance">
-</center>
 
 Time to First Token (TTFT) | Output Generation
 :-------------------------:|:-------------------------:
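The table this hunk leads into compares Time to First Token (TTFT) and output-generation speed. As a minimal sketch of how such numbers could be measured with the standard `transformers` generation API, the snippet below times the first new token and a longer completion separately. The model id `Zyphra/Zamba2-2.7B` and this loading path are assumptions for illustration, not taken from the diff above.

```python
# Rough TTFT and generation-speed measurement sketch (assumes a CUDA device
# and that the model loads via AutoModelForCausalLM; model id is hypothetical).
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/Zamba2-2.7B"  # assumed model id for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "What factors contributed to the fall of the Roman Empire?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# TTFT: wall-clock time to produce a single new token (prefill + first decode step).
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=1)
ttft = time.perf_counter() - start

# Output generation: rough tokens-per-second over a longer completion.
n_new = 128
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=n_new)
elapsed = time.perf_counter() - start

print(f"TTFT: {ttft:.3f} s, generation: {n_new / elapsed:.1f} tokens/s")
```

This is only a sketch of the benchmark shape; the figures referenced in the README were presumably produced with the authors' own harness and settings.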