Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 31
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29 • 40
Article MInference 1.0: 10x Faster Million Context Inference with a Single GPU By liyucheng • Jul 11 • 12
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Paper • 2407.02490 • Published Jul 2 • 23
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression Paper • 2310.06839 • Published Oct 10, 2023 • 3
LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models Paper • 2310.05736 • Published Oct 9, 2023 • 4
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Paper • 2403.12968 • Published Mar 19 • 24