# Attention Rollout -- RoBERTa

In this demo, we use the RoBERTa language model (optimized for masked language modelling and finetuned
for sentiment analysis). The model predicts whether a given sentence expresses a positive,
negative or neutral sentiment. But how does it arrive at its classification? This is, perhaps
surprisingly, very difficult to determine.
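
The classification step itself is straightforward with the `transformers` library. A minimal sketch follows; the checkpoint name is an assumption, as any RoBERTa model finetuned for three-way sentiment would do:

```python
# Minimal sketch of the sentiment prediction; the checkpoint name is an
# assumption -- any RoBERTa model finetuned for 3-way sentiment works here.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)
print(classifier("What a wonderful movie!"))
# e.g. [{'label': 'positive', 'score': 0.99}]
```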

Abnar & Zuidema (2020) proposed a method for Transformers called **Attention Rollout**, which was further
refined by Chefer et al. (2021) into **Gradient-weighted Attention Rollout**. Here we compare them to
another popular method called **Integrated Gradients**.
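
To make the comparison concrete, below is a minimal sketch of plain Attention Rollout (the checkpoint name is again an assumption): the head-averaged attention map of each layer is mixed with the identity matrix to account for the residual connection, re-normalized, and the results are multiplied across layers.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint; any RoBERTa classifier that returns attentions works.
MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, output_attentions=True
)

def attention_rollout(attentions):
    """Multiply head-averaged attention maps across layers, adding the
    identity at each layer to model the residual connection
    (Abnar & Zuidema, 2020)."""
    rollout = None
    for layer_attention in attentions:                 # (batch, heads, seq, seq)
        attn = layer_attention.mean(dim=1)             # average over heads
        attn = attn + torch.eye(attn.size(-1))         # residual connection
        attn = attn / attn.sum(dim=-1, keepdim=True)   # re-normalize rows
        rollout = attn if rollout is None else attn @ rollout
    return rollout

inputs = tokenizer("What a wonderful movie!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The row for the <s> (CLS) token attributes the prediction to input tokens.
scores = attention_rollout(outputs.attentions)[0, 0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for token, score in zip(tokens, scores):
    print(f"{token:>12}  {float(score):.3f}")
```

Roughly, the gradient-weighted variant of Chefer et al. (2021) replaces the plain head average with attention maps weighted by their gradients with respect to the target class score, so that heads which actually influence the prediction dominate the rollout.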