pglo commited on
Commit
5813888
1 Parent(s): ded606b

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -3
README.md CHANGED
@@ -71,15 +71,14 @@ Zamba2-2.7B utilizes and extends our original Zamba hybrid SSM-attention archite
71
 
72
  Zamba2-2.7B achieves leading and state-of-the-art performance among models of <3B parameters and is competitive with some models of significantly greater size. Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer based models.
73
 
 
 
74
  Zamba2-2.7B's high performance and small inference compute and memory footprint renders it an ideal generalist model for on-device applications.
75
 
76
  <center>
77
  <img src="https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/U7VD9PYLj3XcEjgV08sP5.png" width="700" alt="Zamba performance">
78
  </center>
79
 
80
- <center>
81
- <img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/3u8k7tcRi-oC_ltGhdHAk.png" width="800" alt="Zamba performance">
82
- </center>
83
 
84
  Time to First Token (TTFT) | Output Generation
85
  :-------------------------:|:-------------------------:
 
71
 
72
  Zamba2-2.7B achieves leading and state-of-the-art performance among models of <3B parameters and is competitive with some models of significantly greater size. Moreover, due to its unique hybrid SSM architecture, Zamba2-2.7B achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer based models.
73
 
74
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/64e40335c0edca443ef8af3e/wXFMLXZA2-xz2PDyUMwTI.png" width="600"/>
75
+
76
  Zamba2-2.7B's high performance and small inference compute and memory footprint renders it an ideal generalist model for on-device applications.
77
 
78
  <center>
79
  <img src="https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/U7VD9PYLj3XcEjgV08sP5.png" width="700" alt="Zamba performance">
80
  </center>
81
 
 
 
 
82
 
83
  Time to First Token (TTFT) | Output Generation
84
  :-------------------------:|:-------------------------: