arxiv:2312.05491

Using Captum to Explain Generative Language Models

Published on Dec 9, 2023
· Featured in Daily Papers on Dec 12, 2023
Abstract

Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users' understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We provide an overview of the available functionalities and example applications that demonstrate their potential for understanding learned associations within generative language models.
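The attribution methods the paper discusses are largely perturbation-based: each input token's importance is measured by how much the model's score for a generated token changes when that input is removed. As a minimal sketch of the idea, the toy `score()` function below stands in for a real model's log-probability of the next token (an assumption for illustration only; Captum itself wraps actual PyTorch models and tokenizers):

```python
def score(tokens):
    """Toy stand-in for log p(next_token | tokens): rewards cue words.

    This hand-built weighting is a hypothetical example, not a real model.
    """
    weights = {"captum": 1.5, "is": 0.0, "great": 2.0}
    return sum(weights.get(t, 0.0) for t in tokens)

def feature_ablation(tokens, baseline="[PAD]"):
    """Attribution of each input token = score drop when it is ablated,
    i.e. replaced by an uninformative baseline token."""
    full = score(tokens)
    attributions = {}
    for i, tok in enumerate(tokens):
        ablated = tokens[:i] + [baseline] + tokens[i + 1:]
        attributions[tok] = full - score(ablated)
    return attributions

prompt = ["captum", "is", "great"]
print(feature_ablation(prompt))
# → {'captum': 1.5, 'is': 0.0, 'great': 2.0}
```

Casting generation as a sequence of such per-token scoring steps, the same ablation loop yields an attribution matrix: one row per generated token, one column per input token.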

Community

Great work! I previously used Captum to compute Shapley Values (SV) to explain natural language models on classification tasks, and it is great to see new features for explaining generation. In our work (forgive the shameless self-plug: https://arxiv.org/pdf/2305.19998.pdf), we find that 1) the choice of random seed can noticeably influence the explanation results, and 2) computing SV for a large language model is costly if you want a stable explanation with a large sample size. Casting generation as a sequence of classification tasks, I believe both issues still apply. We developed an amortized model that achieves a better stability-efficiency trade-off, even allowing online computation of SV for LLMs. I would love to chat about incorporating my work into Captum or extending it to the generation setting :-)
