Commit 247e309 by mobicham · verified · 1 Parent(s): 80ef501

Update README.md

  1. README.md +4 -3
README.md CHANGED
@@ -19,10 +19,11 @@ This is an <a href="https://github.com/mobiusml/hqq/">HQQ</a> all 4-bit (group-s
  ## Model Decoding Speed
  | Models | fp16| HQQ 4-bit/gs-64|
  |:-------------------:|:--------:|:----------------:|
- | Decoding* - short seq (tokens/sec)| OOM | 10.7 (tokens/sec) |
- | Decoding* - long seq (tokens/sec)| OOM | 9.7 (tokens/sec)|
+ | Decoding - short seq (tokens/sec)| 10.5 (tokens/sec)** | 10.7 (tokens/sec)* |
+ | Decoding - long seq (tokens/sec)| 9.5 (tokens/sec)** | 9.7 (tokens/sec)*|
 
- *: 1xA100 80GB
+ **: 2xA100 80GB<br>
+ *: 1xA100 80GB
 
  ## Performance
 
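
Reviewer note: the README does not say how the tokens/sec figures were measured. As a rough illustration only, a decoding-throughput number of this kind can be obtained by timing one generation call and dividing the number of new tokens by the elapsed time. The sketch below assumes a hypothetical `generate` callable that returns the newly generated token ids; `dummy_generate` is a stand-in that sleeps per token and is not the model or benchmark script used for this commit.

```python
import time
from typing import Callable, List, Sequence


def tokens_per_second(generate: Callable[[int], Sequence[int]],
                      max_new_tokens: int) -> float:
    """Time a single call to `generate` and return new tokens per second.

    `generate` is a hypothetical callable (an assumption of this sketch):
    it takes the number of tokens to produce and returns their ids.
    """
    start = time.perf_counter()
    new_tokens = generate(max_new_tokens)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return len(new_tokens) / elapsed


def dummy_generate(n: int) -> List[int]:
    """Stand-in for a real model: pretends each token takes about 1 ms."""
    out: List[int] = []
    for i in range(n):
        time.sleep(0.001)  # simulated per-token decoding latency
        out.append(i)
    return out


if __name__ == "__main__":
    rate = tokens_per_second(dummy_generate, 50)
    print(f"{rate:.1f} tokens/sec")
```

With a real model, a warm-up call before timing and an average over several runs would likely give steadier numbers, and on a GPU the device should be synchronized before reading the clock.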