---
title: LLMLingua
emoji: 📝
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 3.47.1
app_file: app.py
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

| LLMLingua Paper | LongLLMLingua Paper | HF Space Demo |
## TL;DR

LLMLingua uses a well-trained small language model after alignment, such as GPT2-small or LLaMA-7B, to detect unimportant tokens in the prompt and enable inference with the compressed prompt in black-box LLMs, achieving up to 20x compression with minimal performance loss.

[LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models](https://arxiv.org/abs/2310.05736) (EMNLP 2023).
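
To make the token-pruning idea in the TL;DR concrete, here is a toy sketch. It is not the LLMLingua implementation: it swaps the small language model for a smoothed unigram frequency table, and the function `compress_prompt`, its `reference_text` and `keep_ratio` parameters, and the sample strings are all hypothetical names chosen for illustration. It only shows the core intuition, which is to score each token by how predictable it is and drop the most predictable ones while preserving the original order.

```python
import math
from collections import Counter


def compress_prompt(prompt: str, reference_text: str, keep_ratio: float = 0.5) -> str:
    """Toy pruning: keep the least predictable tokens, in their original order.

    Assumption: a unigram model built from `reference_text` stands in for the
    small language model that the real method uses to score tokens.
    """
    tokens = prompt.split()
    if not tokens:
        return ""

    # Unigram counts with add-one smoothing (one extra bucket for unseen words).
    counts = Counter(reference_text.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1

    def surprisal(tok: str) -> float:
        # Higher value means the token is less predictable, so more informative.
        return -math.log((counts[tok.lower()] + 1) / (total + vocab))

    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Rank token positions from most to least surprising (stable for ties).
    ranked = sorted(range(len(tokens)), key=lambda i: surprisal(tokens[i]), reverse=True)
    # Restore the original order of the kept positions.
    keep = sorted(ranked[:n_keep])
    return " ".join(tokens[i] for i in keep)


reference = "the the the the of of of is is a a to to"
print(compress_prompt("the capital of France is Paris", reference, keep_ratio=0.5))
# prints: capital France Paris
```

A real compressor would score tokens with a causal language model conditioned on the preceding context rather than with context-free word counts, so this sketch should be read only as an illustration of the keep-the-informative-tokens principle.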