This is the KVLink5 model from the paper "KVLink: Accelerating LLMs via Efficient KV Cache Reuse."

- Model size: 1.5B params
- Tensor type: BF16
- Format: Safetensors

Model tree for Shiyu-Lab/Llama1B-KVLink5

- Quantizations: 1 model