xformAI/opt-125m-gqa-ub-6-best-for-KV-cache
Text Generation
• Updated • 220
xformAI/facebook-opt-125m-qcqa-ub-6-best-for-KV-cache
Text Generation
• Updated • 223
Composio/mixtral_tensorrt_a100_tp2_w2_paged_kvcache
xformAI/opt-6.7b-ub-16-qcqa-best-for-KV-cache
Updated
saarvajanik/facebook-opt-6.7b-qcqa-ub-16-best-for-KV-cache
Text Generation
• Updated • 217
saarvajanik/facebook-opt-6.7b-gqa-ub-16-best-for-KV-cache
Text Generation
• Updated • 219
riczhou/Llama-3-70B-Instruct-awq-int8-kv-cache-trt-llm
riczhou/Llama-3-70B-Instruct-awq-int8-kv-cache-trt-llm-compiled
nm-testing/TinyLlama-1.1B-compressed-tensors-kv-cache-scheme
Text Generation
• 1B • Updated • 304
anthonymikinka/gpt2_coreml_kv_cache_try1
Text Generation
• Updated • 2
horheynm/Phi-3-mini-4k-instruct-kv_cache
4B • Updated • 4
nintwentydo/pixtral-12b-FP8-dynamic-FP8-KV-cache
Image-Text-to-Text
• 13B • Updated • 1
• 1
KVCache-ai/DeepSeek-V3-GGML-FP8-Hybrid
KVCache-ai/DeepSeek-R1-GGML-FP8-Hybrid
Yi30/inc-tp16-ep16-smoke-kvcache
Updated
Yi30/inc-tp16-ep16-full-kvcache
Updated
Yi30/inc-tp8-ep8-full-kvcache-from-tp16-ep16
Updated
Yi30/inc-tp16-ep16-full-bf16-kvcache
Updated
Yi30/inc-tp16-ep16-full-bf16-kvcache-smoke
Updated
Yi30/inc-unified-tp16-ep16-full-bf16-kvcache-smoke
Updated
Yi30/inc-2nodes-fp8-kvcache
Updated
MintLemon/inc-2nodes-tc-fp8-kvcache
Updated
nm-testing/Meta-Llama-3-8B-Instruct-FP8-channel-output-activation-kv_cache-qkv_proj
8B • Updated • 5
KVCache-ai/Qwen3-30BA3B-GGUF
31B • Updated • 9
• 1
aileedakey/HyperCLOVAX-SEED-Instruct-0.5B-CoreML-KVcache-FP32
manueldeprada/sampling_with_kvcache
Text Generation
• 0.1B • Updated • 9
Salesforce/moirai-1.5-llama-kvcache
Text Generation
• 12.2M • Updated
manueldeprada/sampling_with_kvcache_hf_helpers
Text Generation
• 0.1B • Updated • 11
KVCache-ai/Kimi-K2-Instruct-GGUF
1T • Updated • 57
• 19
lawrencefeng17/sampling_with_kvcache
1.0B • Updated • 2