mit-han-lab/Yi-34B-QServe-g128

Text Generation

text-generation-inference


Add model card for LServe

#1

by nielsr HF staff - opened Feb 22

base: refs/heads/main ← from: refs/pr/1

README.md +15 -0

README.md ADDED

	@@ -0,0 +1,15 @@

+---
+pipeline_tag: text-generation
+library_name: transformers
+license: apache-2.0
+---
+
+# LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
+
+This repository contains the LServe model, as presented in the paper [LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention](https://hf.co/papers/2502.14866). LServe is an efficient system that accelerates long-sequence LLM serving via hybrid sparse attention.
+
+**Paper:** [LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention](https://hf.co/papers/2502.14866)
+
+**Code:** [https://github.com/mit-han-lab/omniserve](https://github.com/mit-han-lab/omniserve)
+
+**Project Page:** [https://hanlab.mit.edu/projects/lserve](https://hanlab.mit.edu/projects/lserve)
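
The YAML front matter at the top of the added README (`pipeline_tag`, `library_name`, `license`) is the metadata block of the model card. As a minimal sketch of how that block separates from the Markdown body, the snippet below uses a hypothetical helper, `split_front_matter`, written with the Python standard library only. It assumes flat `key: value` pairs, which is all this README uses, and is not the Hub's actual parser.

```python
# Hypothetical helper (not part of any Hugging Face library): split a model
# card into its YAML front matter and its Markdown body.
# Assumption: the front matter holds only flat "key: value" pairs.

def split_front_matter(text):
    lines = text.splitlines()
    # Front matter must start on the very first line with '---'.
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing '---' delimiter.
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1)
            if line.strip() == "---"
        )
    except StopIteration:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)  # split on the first colon only
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return meta, body


README = """---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
---
# LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
"""

metadata, body = split_front_matter(README)
print(metadata)
print(body)
```

For real model cards, a full YAML parser is the safer choice, since front matter can contain lists and nested mappings that this sketch does not handle.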