🔍 Today's pick in Interpretability & Analysis of LMs: AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers by
@RedOneAI
et al.
This work proposes extending the LRP feature attribution framework to handle Transformer-specific layers. In particular, the authors:
1. Propose a generalised approach to softmax linearisation, designing a distribution rule that incorporates bias terms and absorbs a portion of the relevance.
2. Propose decomposing the element-wise matrix multiplication in the attention operation as a sequence of epsilon and uniform distribution rules to ensure conservation (i.e. the sum of relevance stays constant across layers).
3. Propose handling normalisation layers with an identity distribution rule (a sketch of rules 2 and 3 follows below).
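A minimal NumPy sketch of one plausible reading of rules 2 and 3 (function names, toy shapes, and the stabilisation details are my own, not the authors' implementation; see the paper for the exact derivation, including the softmax rule):

```python
import numpy as np

def matmul_epsilon_uniform(A, V, R_out, eps=1e-9):
    """Relevance for Z = A @ V via sequential epsilon + uniform rules.

    Epsilon rule: distribute R_out over the summands A[i, p] * V[p, j]
    proportionally to their signed contribution to Z[i, j].
    Uniform rule: split each summand's relevance equally between its
    two factors, so total relevance is conserved across the layer.
    """
    Z = A @ V
    S = R_out / (Z + eps * np.sign(Z))                        # stabilised ratio, shape (i, j)
    R_terms = A[:, :, None] * V[None, :, :] * S[:, None, :]   # per-summand relevance, shape (i, p, j)
    R_A = 0.5 * R_terms.sum(axis=2)                           # half of each term to A
    R_V = 0.5 * R_terms.sum(axis=0)                           # half of each term to V
    return R_A, R_V

def norm_identity(R_out):
    """Identity rule: normalisation layers pass relevance through unchanged."""
    return R_out

# Toy check that relevance is conserved (up to the epsilon stabiliser).
A, V = np.random.randn(4, 5), np.random.randn(5, 3)
R_out = np.random.randn(4, 3)
R_A, R_V = matmul_epsilon_uniform(A, V, R_out)
assert np.isclose(R_A.sum() + R_V.sum(), R_out.sum(), atol=1e-6)
```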
Through extensive experiments, the authors show that AttnLRP:
1. Is significantly more faithful than other popular gradient- and attention-based attribution approaches on CV and NLP tasks using large transformer models.
2. Runs in O(1) time (a single backward pass), requiring O(sqrt(num_layers)) memory, as opposed to perturbation-based approaches requiring O(seq_len) time (see the sketch after this list).
3. Can be used alongside activation maximisation to explain the contribution of granular model components in driving models' predictions.
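For intuition on point 2, a toy PyTorch comparison of the number of passes each family needs. Gradient × input stands in here for a generic backward-pass attribution method; the model and shapes are invented, and this is not AttnLRP itself:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, hidden = 16, 8
# Hypothetical toy model standing in for a large Transformer.
model = nn.Sequential(nn.Linear(hidden, hidden), nn.Tanh(), nn.Linear(hidden, 1))
x = torch.randn(seq_len, hidden, requires_grad=True)

# Backward-pass attribution (gradient x input): one backward pass in total
# scores every position at once, independent of seq_len.
model(x).sum().backward()
backward_attr = (x * x.grad).sum(dim=-1).detach()  # one score per position

# Perturbation-based attribution: one extra forward pass per position,
# so the number of passes grows as O(seq_len).
with torch.no_grad():
    base = model(x).sum()
    pert_attr = torch.empty(seq_len)
    for i in range(seq_len):
        x_pert = x.detach().clone()
        x_pert[i] = 0.0                      # zero out one position
        pert_attr[i] = base - model(x_pert).sum()
```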
📄 Paper: AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers (2402.05602)
🔍 All daily picks in LM interpretability: gsarti/daily-picks-in-interpretability-and-analysis-of-lms-65ae3339949c5675d25de2f9