Kabumbus committed on
Commit
56d7c99
1 Parent(s): 59929e0

GGML models: f16 runs at 41.68 ms per token and q8_0 at 23.76 ms per token, both giving good results
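The per-token latencies in the commit message can be turned into throughput with a little arithmetic. The sketch below is illustrative only: the dictionary keys and the `tokens_per_second` helper are hypothetical names, and it assumes the "q8" figure refers to the `ggml-model-q8_0.bin` file in this commit.

```python
# Latencies as quoted in the commit message (milliseconds per token).
# Assumption: "q8" in the message refers to ggml-model-q8_0.bin.
MS_PER_TOKEN = {"f16": 41.68, "q8_0": 23.76}


def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    return 1000.0 / ms_per_token


for name, ms in MS_PER_TOKEN.items():
    print(f"{name}: {tokens_per_second(ms):.2f} tokens/s")

# Ratio of the two latencies: how much less time q8_0 needs per token.
speedup = MS_PER_TOKEN["f16"] / MS_PER_TOKEN["q8_0"]
print(f"q8_0 speedup over f16: {speedup:.2f}x")
```

These figures only restate the numbers in the commit message; the hardware and prompt used to measure them are not given.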

Files changed (2)
  1. ggml-model-f16.bin +3 -0
  2. ggml-model-q8_0.bin +3 -0
ggml-model-f16.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b12534933201810c5cc5b6eb033f07ff6232e03f1b3cc4820b0e18566d113f3e
+size 2623816724
ggml-model-q8_0.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4af6467cf42a8e5341c471fe0b370e5d038e521877b4612284c3ce8abbd26f4a
+size 1394525204
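Both added files are Git LFS pointers, so the repository stores only the `version`, `oid`, and `size` fields rather than the model weights. As a minimal sketch of how a downloaded file could be checked against such a pointer, the Python below parses the pointer text and compares size and SHA-256 digest. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not part of Git LFS or any library.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex-digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte model files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the diff above for ggml-model-q8_0.bin.
POINTER_Q8_0 = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4af6467cf42a8e5341c471fe0b370e5d038e521877b4612284c3ce8abbd26f4a
size 1394525204
"""

pointer = parse_lfs_pointer(POINTER_Q8_0)
print(pointer["algo"], pointer["size"])
```

Assuming the model has been downloaded to a local path, `matches_pointer("ggml-model-q8_0.bin", pointer)` would report whether the file matches the committed pointer. The size check runs first because it is cheap and avoids hashing a truncated download.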