Add model card for LServe
#1 opened by nielsr (HF staff)
README.md ADDED
@@ -0,0 +1,15 @@
+---
+pipeline_tag: text-generation
+library_name: transformers
+license: apache-2.0
+---
+
+# LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
+
+This repository contains the LServe model, as presented in the paper [LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention](https://hf.co/papers/2502.14866). LServe is an efficient system that accelerates long-sequence LLM serving via hybrid sparse attention.
+
+**Paper:** [LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention](https://hf.co/papers/2502.14866)
+
+**Code:** [https://github.com/mit-han-lab/omniserve](https://github.com/mit-han-lab/omniserve)
+
+**Project Page:** [https://hanlab.mit.edu/projects/lserve](https://hanlab.mit.edu/projects/lserve)