arXiv:2406.10996

THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation

Published on Jun 16 · Submitted by ktio on Jun 18
Abstract

Large language models (LLMs) can process lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on removing outdated memories, we argue that such memories can provide contextual cues that help dialogue systems understand the development of past events and, therefore, benefit response generation. We present Theanine, a framework that augments LLMs' response generation with memory timelines -- series of memories that demonstrate the development and causality of relevant past events. Along with Theanine, we introduce TeaFarm, a counterfactual-driven question-answering pipeline addressing the limitation of G-Eval in long-term conversations. Supplementary videos of our methods and the TeaBag dataset for TeaFarm evaluation are available at https://theanine-693b0.web.app/.
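To make the timeline idea concrete, here is a minimal, hypothetical sketch of how memories linked by temporal or causal relations could be expanded into a timeline and serialized as context for response generation. The `MemoryNode` structure and the `link`/`timeline_of` helpers are our own illustrative names under assumed semantics, not Theanine's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch (not the paper's code): each memory keeps links to
# its temporal/causal predecessor and successor, so one retrieved memory
# can be expanded into the full development of the event.

@dataclass
class MemoryNode:
    text: str                           # summarized event from a past session
    prev: "MemoryNode | None" = None    # earlier related memory
    next: "MemoryNode | None" = None    # later related memory

def link(earlier: MemoryNode, later: MemoryNode) -> None:
    earlier.next = later
    later.prev = earlier

def timeline_of(node: MemoryNode) -> list[str]:
    # Walk back to the oldest linked memory, then forward to the newest,
    # yielding the whole event history instead of a single snapshot.
    cur = node
    while cur.prev is not None:
        cur = cur.prev
    out: list[str] = []
    while cur is not None:
        out.append(cur.text)
        cur = cur.next
    return out

# Toy usage: the "outdated" memory m1 is not discarded; it contextualizes
# the newest state of the event (the dog was adopted, got sick, recovered).
m1 = MemoryNode("User adopted a puppy named Mocha.")
m2 = MemoryNode("Mocha was sick and visited the vet.")
m3 = MemoryNode("Mocha recovered and is healthy now.")
link(m1, m2)
link(m2, m3)

prompt_context = "\n".join(timeline_of(m2))
print(prompt_context)  # full timeline, oldest to newest
```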
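Similarly, a rough sketch of the counterfactual-QA idea behind TeaFarm: the evaluator plants a fabricated premise about a past session and checks whether the system recalls the true fact instead of echoing the bait. The function names, prompt wording, and stub system below are assumptions for illustration, not the paper's pipeline.

```python
from typing import Callable

def teafarm_check(ask_system: Callable[[str], str], question: str,
                  correct: str, counterfactual: str) -> bool:
    """Ask a memory-dependent question whose premise nudges the system
    toward a fabricated (counterfactual) answer. The system passes only
    if its reply contains the true past fact rather than the bait."""
    reply = ask_system(f"I think {counterfactual}, right? {question}")
    return correct.lower() in reply.lower()

# Toy usage with a stubbed dialogue system; a real system would condition
# its answer on long-term conversational memory here.
def stub_system(prompt: str) -> str:
    return "No, your dog is named Mocha, not Latte."

passed = teafarm_check(
    stub_system,
    question="What is my dog's name?",
    correct="Mocha",
    counterfactual="my dog is named Latte",
)
print("recall ok" if passed else "recall failed")
```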

