arXiv:2211.12821

Explainable AI for Pre-Trained Code Models: What Do They Learn? When They Do Not Work?

Published on Nov 23, 2022

Abstract

In recent years, there has been wide interest in designing deep neural network-based models that automate downstream software engineering tasks on source code, such as code document generation, code search, and program repair. Although the main objective of these studies is to improve the effectiveness of the downstream task, many of them simply adopt the latest neural network model without an in-depth analysis of why a particular solution works, or fails, on particular tasks or scenarios. In this paper, using an example eXplainable AI (XAI) method (the attention mechanism), we study two recent large language models (LLMs) for code, CodeBERT and GraphCodeBERT, on a set of downstream software engineering tasks: code document generation (CDG), code refinement (CR), and code translation (CT). Through quantitative and qualitative studies, we identify what CodeBERT and GraphCodeBERT learn on these tasks, that is, which source code token types they put the highest attention on. We also show common patterns when the models do not work as expected (i.e., perform poorly even on easy problems) and suggest recommendations that may alleviate the observed challenges.
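
The XAI lens used here is the attention mechanism: inspecting which source code tokens a pre-trained model attends to most. The sketch below illustrates that general technique, not the paper's exact pipeline. It assumes the Hugging Face transformers library and the public microsoft/codebert-base checkpoint, and the layer/head averaging scheme is an illustrative choice, not the paper's aggregation method.

```python
# A minimal sketch of attention-based inspection for a pre-trained code
# model. NOT the paper's exact pipeline; it only illustrates the general
# technique: run a code snippet through CodeBERT and rank input tokens
# by the self-attention they receive, averaged over layers and heads.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")
model.eval()

code = "def add(a, b): return a + b"  # toy example input
inputs = tokenizer(code, return_tensors="pt")

with torch.no_grad():
    # output_attentions=True yields one (batch, heads, seq, seq)
    # attention tensor per layer.
    outputs = model(**inputs, output_attentions=True)

# Stack the per-layer tensors, then average over layers and heads
# (an illustrative aggregation) to get one (seq, seq) attention map.
attn = torch.stack(outputs.attentions).mean(dim=(0, 2)).squeeze(0)

# Column sums give the total attention each token *receives*.
received = attn.sum(dim=0)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, score in sorted(zip(tokens, received.tolist()),
                         key=lambda p: -p[1]):
    print(f"{tok:>12s}  {score:.3f}")
```

Grouping the per-token scores by token type (identifiers, keywords, operators, and so on) would then approximate the kind of token-type attention analysis the abstract describes.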
