Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment Paper • 2401.12474 • Published Jan 23 • 33
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Paper • 2403.12968 • Published Mar 19 • 23
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models Paper • 2310.00746 • Published Oct 1, 2023 • 1
LESS: Selecting Influential Data for Targeted Instruction Tuning Paper • 2402.04333 • Published Feb 6 • 3
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only Paper • 2306.01116 • Published Jun 1, 2023 • 29
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published 8 days ago • 71
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published 5 days ago • 69