---
title: Llmlingua 2
emoji: 💻
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 4.21.0
app_file: app.py
pinned: false
license: cc-by-nc-sa-4.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

LLMLingua-2 is one branch of the LLMLingua series. See the links below for more information.


LLMLingua Series | Effectively Deliver Information to LLMs via Prompt Compression

| Project Page | LLMLingua | LongLLMLingua | LLMLingua-2 | LLMLingua Demo | LLMLingua-2 Demo |

Brief Introduction

LLMLingua utilizes a compact, well-trained language model (e.g., GPT2-small, LLaMA-7B) to identify and remove non-essential tokens in prompts. This approach enables efficient inference with large language models (LLMs), achieving up to 20x compression with minimal performance loss.
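To make the idea of removing non-essential tokens concrete, here is a minimal sketch. It is an illustration only, not LLMLingua's implementation: a smoothed unigram frequency table (the hypothetical `reference` counts) stands in for the compact language model, and the helper names `surprisal_scores` and `prune_prompt` are assumptions made for this example. Tokens the stand-in model finds highly predictable get a low score and are dropped first.

```python
import math
from collections import Counter


def surprisal_scores(tokens, reference_counts):
    """Score each token by -log2 p(token) under an add-one smoothed unigram model.

    The unigram table is a toy stand-in for a real small language model.
    """
    total = sum(reference_counts.values())
    vocab = len(reference_counts) + 1  # one extra slot for unseen tokens
    return [
        -math.log2((reference_counts.get(tok, 0) + 1) / (total + vocab))
        for tok in tokens
    ]


def prune_prompt(tokens, scores, keep_ratio):
    """Keep the highest-scoring tokens, preserving their original order."""
    n_keep = max(1, math.ceil(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept]


# Hypothetical reference text: frequent function words become "predictable".
reference = Counter("the the the the of of of a a to to is is model prompt".split())
prompt = "the cost of the prompt is the number of tokens".split()

scores = surprisal_scores(prompt, reference)
compressed = prune_prompt(prompt, scores, keep_ratio=0.4)
ratio = len(prompt) / len(compressed)

print(" ".join(compressed))
print(f"{ratio:.1f}x compression")
```

With these toy counts the function words are pruned and the content words survive. A real compressor would score tokens with an actual language model rather than a frequency table.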

LongLLMLingua mitigates the 'lost in the middle' issue in LLMs, enhancing long-context information processing. It reduces costs and boosts efficiency with prompt compression, improving RAG performance by up to 21.4% using only 1/4 of the tokens.

LLMLingua-2 is a small yet powerful prompt compression method, trained via data distillation from GPT-4 for token classification with a BERT-level encoder, that excels in task-agnostic compression. It surpasses LLMLingua in handling out-of-domain data, offering 3x-6x faster performance.
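Framing compression as token classification can be sketched as follows. This is a simplified illustration under stated assumptions, not the released model's code: the `keep_probs` values are hand-written placeholders for the per-token "keep" probabilities an encoder would predict, and `select_tokens` is a hypothetical helper name.

```python
def select_tokens(tokens, keep_probs, rate):
    """Keep the fraction `rate` of tokens with the highest keep probability.

    Original token order is preserved so the compressed prompt stays readable.
    """
    if len(tokens) != len(keep_probs):
        raise ValueError("tokens and keep_probs must have the same length")
    n_keep = max(1, round(len(tokens) * rate))
    ranked = sorted(range(len(tokens)), key=lambda i: keep_probs[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept]


tokens = ["Please", "kindly", "summarize", "the", "meeting", "notes", "for", "Monday"]
# Placeholder probabilities; a trained classifier would produce these.
keep_probs = [0.30, 0.05, 0.95, 0.10, 0.90, 0.85, 0.20, 0.80]

compressed_tokens = select_tokens(tokens, keep_probs, rate=0.5)
print(" ".join(compressed_tokens))
```

Because each token is scored in a single encoder pass and the selection is a simple ranking, this style of compression needs no task-specific prompt or question, which is the sense in which it is task-agnostic.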