wzuidema committed
Commit 4b018a1
Parent: b0bf43a

Update app.py

Files changed (1):
  app.py (+4 -3)
app.py CHANGED
@@ -267,7 +267,7 @@ def sentiment_explanation_hila(input_text, layer):
 
     return show_explanation(model, input_ids, attention_mask, start_layer=int(layer))
 
-layer_slider = gradio.Slider(minimum=0, maximum=12, value=8, step=1, label="Select rollout layer")
+layer_slider = gradio.Slider(minimum=0, maximum=12, value=8, step=1, label="Select layer")
 hila = gradio.Interface(
     fn=sentiment_explanation_hila,
     inputs=["text", layer_slider],
@@ -281,7 +281,7 @@ lig = gradio.Interface(
 )
 
 iface = gradio.Parallel(hila, lig,
-                        title="RoBERTa Explainability",
+                        title="Attention Rollout -- RoBERTa",
                         description="""
 In this demo, we use the RoBERTa language model (optimized for masked language modelling and finetuned for sentiment analysis).
 The model predicts for a given sentence whether it expresses a positive, negative or neutral sentiment.
@@ -289,7 +289,8 @@ But how does it arrive at its classification? This is, surprisingly perhaps, ve
 A range of so-called "attribution methods" have been developed that attempt to determine the importance of the words in the input for the final prediction;
 they provide a very limited form of "explanation" -- and often disagree -- but sometimes provide good initial hypotheses nevertheless that can be further explored with other methods.
 
-Two key attribution methods for Transformers are "Attention Rollout" (Abnar & Zuidema, 2020) and (layer) Integrated Gradient. Here we show:
+Abnar & Zuidema (2020) proposed a method for Transformers called "Attention Rollout", which was further refined by Chefer et al. (2021) into Gradient-weighted Rollout.
+Here we compare it to another popular method called Integrated Gradient.
 
 * Gradient-weighted attention rollout, as defined by [Hila Chefer](https://github.com/hila-chefer)
 [(Transformer-MM_explainability)](https://github.com/hila-chefer/Transformer-MM-Explainability/), with rollout recursion up to the selected layer
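
For reference, Attention Rollout (Abnar & Zuidema, 2020), which the new description names, propagates attention through the network by recursively multiplying the per-layer attention maps, mixing in the identity matrix to account for residual connections. A minimal NumPy sketch of the idea, not the code of this Space: it assumes attention maps already averaged over heads, and the `start_layer` argument mirrors the slider wired into `sentiment_explanation_hila` above.

```python
import numpy as np

def attention_rollout(attentions, start_layer=0):
    """Minimal sketch of Attention Rollout (Abnar & Zuidema, 2020).

    `attentions`: list of (seq_len, seq_len) arrays, one per layer,
    already averaged over attention heads.
    """
    seq_len = attentions[0].shape[-1]
    rollout = np.eye(seq_len)
    for attn in attentions[start_layer:]:
        # Model the residual connection as half attention, half identity,
        # then re-normalise rows so each stays a probability distribution.
        attn = 0.5 * attn + 0.5 * np.eye(seq_len)
        attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout
    # Row i gives the rolled-out attention of token i over the input tokens.
    return rollout
```

The gradient-weighted variant by Chefer et al. that this demo exposes replaces each attention map with the positive part of its elementwise product with the gradient of the prediction, averaged over heads, before running the same recursion.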
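The second interface in the `Parallel`, `lig`, computes the (layer) Integrated Gradients baseline the description mentions. A typical way to do this for a RoBERTa classifier is with Captum's `LayerIntegratedGradients`; the checkpoint name and target class index below are illustrative assumptions, not necessarily what this Space loads.

```python
import torch
from captum.attr import LayerIntegratedGradients
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative checkpoint; the Space may load a different RoBERTa model.
name = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

def forward_func(input_ids, attention_mask):
    return model(input_ids, attention_mask=attention_mask).logits

# Attribute the class logit to the embedding layer.
lig = LayerIntegratedGradients(forward_func, model.roberta.embeddings)

enc = tokenizer("I love this movie!", return_tensors="pt")
baselines = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)
attributions = lig.attribute(
    enc["input_ids"],
    baselines=baselines,
    additional_forward_args=(enc["attention_mask"],),
    target=2,  # assumed index of the "positive" class in this checkpoint
)
# Sum over embedding dimensions to get one attribution score per token.
token_scores = attributions.sum(dim=-1).squeeze(0)
```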