Kabumbus
GGML models: f16 runs at 41.68 ms per token and q8 at 23.76 ms per token, both giving good results
56d7c99