Commit 40c8655 (verified) by qanthony-z · 1 Parent(s): b74422a

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -61,7 +61,11 @@ Zamba2-2.7B-Instruct punches dramatically above its weight, achieving extremely
 | StableLM-Zephyr-3B | 3B | 66.43 | 38.27 |
 
 
-Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B-Instruct achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer based models.
+Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B-Instruct achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer-based models.
+
+<center>
+<img src="https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/U7VD9PYLj3XcEjgV08sP5.png" width="700" alt="Zamba performance">
+</center>
 
 Time to First Token (TTFT) | Output Generation
 :-------------------------:|:-------------------------: