CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues Paper • 2404.03820 • Published Apr 4 • 21 • 4
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 102 • 5