Martijn van Beers committed on
Commit cd3f110
1 Parent(s): 4b018a1

Move description to external file

Files changed (2):
  1. app.py +5 -16
  2. description.md +12 -0
app.py CHANGED
@@ -280,22 +280,10 @@ lig = gradio.Interface(
     outputs="html",
 )
 
-iface = gradio.Parallel(hila, lig,
-    title="Attention Rollout -- RoBERTa",
-    description="""
-In this demo, we use the RoBERTa language model (optimized for masked language modelling and finetuned for sentiment analysis).
-The model predicts for a given sentences whether it expresses a positive, negative or neutral sentiment.
-But how does it arrive at its classification? This is, surprisingly perhaps, very difficult to determine.
-A range of so-called "attribution methods" have been developed that attempt to determine the importance of the words in the input for the final prediction;
-they provide a very limited form of "explanation" -- and often disagree -- but sometimes provide good initial hypotheses nevertheless that can be further explored with other methods.
-
-Abnar & Zuidema (2020) proposed a method for Transformers called "Attention Rollout", which was further refined by Chefer et al. (2021) into Gradient-weighted Rollout.
-Here we compare it to another popular method called Integrated Gradient.
-
-* Gradient-weighted attention rollout, as defined by [Hila Chefer](https://github.com/hila-chefer)
-[(Transformer-MM_explainability)](https://github.com/hila-chefer/Transformer-MM-Explainability/), with rollout recursion upto selected layer
-* Layer IG, as implemented in [Captum](https://captum.ai/)(LayerIntegratedGradients), based on gradient w.r.t. selected layer.
-""",
+with open("description.md", "r") as fh:
+    description = fh.read()
+
+iface = gradio.Parallel(hila, lig, title="RoBERTa Explainability", description=description,
     examples=[
         [
             "This movie was the best movie I have ever seen! some scenes were ridiculous, but acting was great",
@@ -323,4 +311,5 @@ Here we compare it to another popular method called Integrated Gradient.
         ]
     ],
 )
+
 iface.launch()
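The refactor boils down to reading the Markdown file once at startup and passing its contents to Gradio's `description=` argument. Below is a minimal sketch of that pattern; the helper name `load_description` is invented here for illustration, and the `gradio.Parallel(...)` call itself appears only as a comment:

```python
from pathlib import Path

def load_description(path="description.md"):
    # Read the interface description from an external Markdown file,
    # mirroring the `with open(...)` block this commit adds to app.py.
    return Path(path).read_text(encoding="utf-8")

# The resulting string would then be handed to Gradio, e.g.:
# iface = gradio.Parallel(hila, lig, title="RoBERTa Explainability",
#                         description=load_description(), ...)
```

Keeping the description in its own file lets the Markdown be edited without touching the application code.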
description.md ADDED
@@ -0,0 +1,12 @@
+In this demo, we use the RoBERTa language model (optimized for masked language modelling and finetuned for sentiment analysis).
+The model predicts for a given sentence whether it expresses a positive, negative or neutral sentiment.
+But how does it arrive at its classification? This is, perhaps surprisingly, very difficult to determine.
+A range of so-called "attribution methods" have been developed that attempt to determine the importance of the words in the input for the final prediction;
+they provide a very limited form of "explanation" -- and often disagree -- but nevertheless sometimes offer good initial hypotheses that can be further explored with other methods.
+
+Abnar & Zuidema (2020) proposed a method for Transformers called "Attention Rollout", which was further refined by Chefer et al. (2021) into Gradient-weighted Rollout.
+Here we compare it to another popular method called Integrated Gradients.
+
+* Gradient-weighted attention rollout, as defined by [Hila Chefer](https://github.com/hila-chefer)
+[(Transformer-MM-Explainability)](https://github.com/hila-chefer/Transformer-MM-Explainability/), with rollout recursion up to the selected layer
+* Layer IG, as implemented in [Captum](https://captum.ai/) (`LayerIntegratedGradients`), based on the gradient w.r.t. the selected layer.
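The "Attention Rollout" the description refers to can be sketched as follows: average each layer's attention over heads, mix in the identity matrix to account for residual connections, and multiply the per-layer matrices together. This is a minimal NumPy sketch based on the published recipe, not the code this Space actually runs:

```python
import numpy as np

def attention_rollout(attentions, residual=True):
    # `attentions`: list of per-layer attention matrices, each of shape
    # (seq_len, seq_len), already averaged over heads.
    seq_len = attentions[0].shape[0]
    rollout = np.eye(seq_len)
    for attn in attentions:
        if residual:
            # Mix in the identity for the residual connection,
            # then re-normalise so each row still sums to 1.
            attn = 0.5 * attn + 0.5 * np.eye(seq_len)
            attn = attn / attn.sum(axis=-1, keepdims=True)
        # Compose this layer's attention with the rollout so far.
        rollout = attn @ rollout
    return rollout
```

The gradient-weighted variant additionally scales each layer's attention by its gradient before the multiplication, which is what the demo's "rollout recursion up to the selected layer" option controls.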