jartine committed on
Commit
d04aff4
1 Parent(s): 3b8b17a

Update README.md

Files changed (1)
  1. README.md +54 -54
README.md CHANGED
@@ -75,60 +75,60 @@ pass the `-c 0` flag. The default temperature for these llamafiles is
 
  | hardware | model\_filename | size | test | t/s |
  | :----------------------------------------- | :--------------------------------------- | ---------: | ------------: | --------------: |
- | NVIDIA GeForce RTX 4090 (cuBLAS) | F16 | 13.50 GiB | pp512 | 7264.74 |
- | NVIDIA GeForce RTX 4090 (cuBLAS) | F16 | 13.50 GiB | tg16 | 58.27 |
- | NVIDIA GeForce RTX 4090 (cuBLAS) | Q6\_K | 5.54 GiB | pp512 | 4236.95 |
- | NVIDIA GeForce RTX 4090 (cuBLAS) | Q6\_K | 5.54 GiB | tg16 | 114.65 |
- | NVIDIA GeForce RTX 4090 (tinyBLAS) | Q6\_K | 5.54 GiB | pp512 | 3457.31 |
- | NVIDIA GeForce RTX 4090 (tinyBLAS) | Q6\_K | 5.54 GiB | tg16 | 85.20 |
- | NVIDIA GeForce RTX 4090 (tinyBLAS) | F16 | 13.50 GiB | pp512 | 1284.87 |
- | NVIDIA GeForce RTX 4090 (tinyBLAS) | F16 | 13.50 GiB | tg16 | 49.76 |
- | AMD Radeon RX 7900 XTX (hipBLAS) | F16 | 13.50 GiB | pp512 | 3239.27 |
- | AMD Radeon RX 7900 XTX (hipBLAS) | F16 | 13.50 GiB | tg16 | 37.41 |
- | AMD Radeon RX 7900 XTX (hipBLAS) | Q6\_K | 5.54 GiB | pp512 | 2647.72 |
- | AMD Radeon RX 7900 XTX (hipBLAS) | Q6\_K | 5.54 GiB | tg16 | 85.42 |
- | AMD Radeon RX 7900 XTX (tinyBLAS) | Q6\_K | 5.54 GiB | pp512 | 1226.20 |
- | AMD Radeon RX 7900 XTX (tinyBLAS) | Q6\_K | 5.54 GiB | tg16 | 76.29 |
- | AMD Radeon RX 7900 XTX (tinyBLAS) | F16 | 13.50 GiB | pp512 | 1033.91 |
- | AMD Radeon RX 7900 XTX (tinyBLAS) | F16 | 13.50 GiB | tg16 | 35.41 |
- | Apple M2 Ultra (60-core Metal GPU) | Q6\_K | 5.54 GiB | pp512 | 761.88 |
- | Apple M2 Ultra (60-core Metal GPU) | Q6\_K | 5.54 GiB | tg16 | 64.15 |
- | Apple M2 Ultra (ARMv8+fp16+dotprod) | F16 | 13.50 GiB | pp512 | 109.18 |
- | Apple M2 Ultra (ARMv8+fp16+dotprod) | F16 | 13.50 GiB | tg16 | 15.17 |
- | Intel Core i9-14900K (alderlake) | Q6\_K | 5.54 GiB | pp512 | 95.87 |
- | Intel Core i9-14900K (alderlake) | Q6\_K | 5.54 GiB | tg16 | 12.66 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | BF16 | 13.50 GiB | pp512 | 759.25 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | BF16 | 13.50 GiB | tg16 | 19.29 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | F16 | 13.50 GiB | pp512 | 559.94 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | F16 | 13.50 GiB | tg16 | 19.26 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q8\_0 | 7.17 GiB | pp512 | 518.76 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q8\_0 | 7.17 GiB | tg16 | 26.31 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q6\_K | 5.54 GiB | pp512 | 726.13 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q6\_K | 5.54 GiB | tg16 | 38.65 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_1 | 5.07 GiB | pp512 | 534.04 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_1 | 5.07 GiB | tg16 | 38.68 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_K\_M | 4.78 GiB | pp512 | 723.25 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_K\_M | 4.78 GiB | tg16 | 41.13 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_0 | 4.65 GiB | pp512 | 536.67 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_0 | 4.65 GiB | tg16 | 42.46 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_K\_S | 4.65 GiB | pp512 | 651.05 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q5\_K\_S | 4.65 GiB | tg16 | 42.14 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_1 | 4.24 GiB | pp512 | 572.67 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_1 | 4.24 GiB | tg16 | 43.19 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_K\_M | 4.07 GiB | pp512 | 728.48 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_K\_M | 4.07 GiB | tg16 | 44.29 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_K\_S | 3.86 GiB | pp512 | 666.82 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_K\_S | 3.86 GiB | tg16 | 45.18 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_0 | 3.83 GiB | pp512 | 562.96 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q4\_0 | 3.83 GiB | tg16 | 48.02 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q3\_K\_L | 3.56 GiB | pp512 | 706.64 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q3\_K\_L | 3.56 GiB | tg16 | 46.82 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q3\_K\_M | 3.28 GiB | pp512 | 715.62 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q3\_K\_M | 3.28 GiB | tg16 | 48.29 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q3\_K\_S | 2.95 GiB | pp512 | 722.11 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q3\_K\_S | 2.95 GiB | tg16 | 49.76 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q2\_K | 2.53 GiB | pp512 | 739.28 |
- | AMD Ryzen Threadripper PRO 7995WX (znver4) | Q2\_K | 2.53 GiB | tg16 | 53.01 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 7264.74 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 58.27 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 4236.95 |
+ | NVIDIA GeForce RTX 4090 (cuBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 114.65 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 3457.31 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 85.20 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 1284.87 |
+ | NVIDIA GeForce RTX 4090 (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 49.76 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 3239.27 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 37.41 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 2647.72 |
+ | AMD Radeon RX 7900 XTX (hipBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 85.42 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 1226.20 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 76.29 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 1033.91 |
+ | AMD Radeon RX 7900 XTX (tinyBLAS) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 35.41 |
+ | Apple M2 Ultra (60-core Metal GPU) | mistral-7b-instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 761.88 |
+ | Apple M2 Ultra (60-core Metal GPU) | mistral-7b-instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 64.15 |
+ | Apple M2 Ultra (ARMv8+fp16+dotprod) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | pp512 | 109.18 |
+ | Apple M2 Ultra (ARMv8+fp16+dotprod) | Mistral-7B-Instruct-v0.3.F16 | 13.50 GiB | tg16 | 15.17 |
+ | Intel Core i9-14900K (alderlake) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 95.87 |
+ | Intel Core i9-14900K (alderlake) | Mistral-7B-Instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 12.66 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.BF16 | 13.50 GiB | pp512 | 759.25 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.BF16 | 13.50 GiB | tg16 | 19.29 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.F16 | 13.50 GiB | pp512 | 559.94 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.F16 | 13.50 GiB | tg16 | 19.26 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q8\_0 | 7.17 GiB | pp512 | 518.76 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q8\_0 | 7.17 GiB | tg16 | 26.31 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q6\_K | 5.54 GiB | pp512 | 726.13 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q6\_K | 5.54 GiB | tg16 | 38.65 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_1 | 5.07 GiB | pp512 | 534.04 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_1 | 5.07 GiB | tg16 | 38.68 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_M | 4.78 GiB | pp512 | 723.25 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_M | 4.78 GiB | tg16 | 41.13 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_0 | 4.65 GiB | pp512 | 536.67 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_0 | 4.65 GiB | tg16 | 42.46 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_S | 4.65 GiB | pp512 | 651.05 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q5\_K\_S | 4.65 GiB | tg16 | 42.14 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_1 | 4.24 GiB | pp512 | 572.67 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_1 | 4.24 GiB | tg16 | 43.19 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_M | 4.07 GiB | pp512 | 728.48 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_M | 4.07 GiB | tg16 | 44.29 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_S | 3.86 GiB | pp512 | 666.82 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_K\_S | 3.86 GiB | tg16 | 45.18 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_0 | 3.83 GiB | pp512 | 562.96 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q4\_0 | 3.83 GiB | tg16 | 48.02 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_L | 3.56 GiB | pp512 | 706.64 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_L | 3.56 GiB | tg16 | 46.82 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_M | 3.28 GiB | pp512 | 715.62 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_M | 3.28 GiB | tg16 | 48.29 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_S | 2.95 GiB | pp512 | 722.11 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q3\_K\_S | 2.95 GiB | tg16 | 49.76 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q2\_K | 2.53 GiB | pp512 | 739.28 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | mistral-7b-instruct-v0.3.Q2\_K | 2.53 GiB | tg16 | 53.01 |
 
  ## About llamafile