dar-tau committed on
Commit
9f98ca2
1 Parent(s): ee8bd6d

Update app.py

Files changed (1)
  1. app.py +19 -18
app.py CHANGED
@@ -182,23 +182,24 @@ with gr.Blocks(theme=gr.themes.Default(), css=css) as demo:
     with gr.Row():
         with gr.Column(scale=5):
             gr.Markdown('# 😎 Self-Interpreting Models')
-            with gr.Accordion(
-                label='👾 This space is a simple introduction to the emerging trend of models interpreting their OWN hidden states in free form natural language!!👾',
-                elem_classes=['explanation_accordion']
-            ):
-                gr.Markdown(
-                    '''This idea was investigated in the paper **Patchscopes** ([Ghandeharioun et al., 2024](https://arxiv.org/abs/2401.06102)) and was further explored in **SelfIE** ([Chen et al., 2024](https://arxiv.org/abs/2403.10949)).
-                    An honorary mention goes to **Speaking Probes** ([Dar, 2023](https://towardsdatascience.com/speaking-probes-self-interpreting-models-7a3dc6cb33d6) - my own work 🥳), which was less mature but had the same idea in mind.
-                    We will follow the SelfIE implementation in this space for concreteness. Patchscopes are so general that they encompass many other interpretation techniques too!!!
-                    ''', line_breaks=True)
-
-            with gr.Accordion(label='👾 The idea is really simple: models are able to understand their own hidden states by nature! 👾',
-                              elem_classes=['explanation_accordion']):
-                gr.Markdown(
-                    '''According to the residual stream view ([nostalgebraist, 2020](https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens)), internal representations from different layers are transferable between layers.
-                    So we can inject a representation from (roughly) any layer into any layer! If I give a model a prompt of the form ``User: [X] Assistant: Sure, I'll repeat your message`` and replace the internal representation of ``[X]`` *during computation* with the hidden state we want to understand,
-                    we expect to get back a summary of the information that exists inside the hidden state. Since the model uses a roughly common latent space, it can understand representations from different layers and different runs!! How cool is that! 😯😯😯
-                    ''', line_breaks=True)
+            gr.Markdown(
+                '**👾 This space is a simple introduction to the emerging trend of models interpreting their OWN hidden states in free form natural language!!👾**',
+                # elem_classes=['explanation_accordion']
+            )
+            gr.Markdown(
+                '''This idea was investigated in the paper **Patchscopes** ([Ghandeharioun et al., 2024](https://arxiv.org/abs/2401.06102)) and was further explored in **SelfIE** ([Chen et al., 2024](https://arxiv.org/abs/2403.10949)).
+                An honorary mention goes to **Speaking Probes** ([Dar, 2023](https://towardsdatascience.com/speaking-probes-self-interpreting-models-7a3dc6cb33d6) - my own work 🥳), which was less mature but had the same idea in mind.
+                We will follow the SelfIE implementation in this space for concreteness. Patchscopes are so general that they encompass many other interpretation techniques too!!!
+                ''', line_breaks=True)
+
+            gr.Markdown('**👾 The idea is really simple: models are able to understand their own hidden states by nature! 👾**',
+                        # elem_classes=['explanation_accordion']
+                        )
+            gr.Markdown(
+                '''According to the residual stream view ([nostalgebraist, 2020](https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens)), internal representations from different layers are transferable between layers.
+                So we can inject a representation from (roughly) any layer into any layer! If I give a model a prompt of the form ``User: [X] Assistant: Sure, I'll repeat your message`` and replace the internal representation of ``[X]`` *during computation* with the hidden state we want to understand,
+                we expect to get back a summary of the information that exists inside the hidden state. Since the model uses a roughly common latent space, it can understand representations from different layers and different runs!! How cool is that! 😯😯😯
+                ''', line_breaks=True)
 
         with gr.Column(scale=1):
             gr.Markdown('<span style="font-size:180px;">🤔</span>')
@@ -229,7 +230,7 @@ with gr.Blocks(theme=gr.themes.Default(), css=css) as demo:
         for i in range(MAX_PROMPT_TOKENS):
            btn = gr.Button('', visible=False, elem_classes=['token_btn'])
            tokens_container.append(btn)
-        use_gpu = gr.Checkbox(value=False, label='Use GPU')
+        use_gpu = False  # gr.Checkbox(value=False, label='Use GPU')
         progress_dummy = gr.Markdown('', elem_id='progress_dummy')
 
         interpretation_bubbles = [gr.Textbox('', container=False, visible=False, elem_classes=['bubble',
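
The rewritten Markdown above only describes the injection trick in prose. As a rough illustration of that idea (a sketch, not this space's actual code), the snippet below captures a hidden state from one forward pass of a small causal LM and patches it into the residual stream at a placeholder token during a second, SelfIE-style "repeat" pass. The model name (gpt2), the layer indices, and the " X" placeholder are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM whose blocks live under model.transformer.h
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# 1) Capture the hidden state we want the model to explain (here: layer 6, last token).
src_ids = tok("The Eiffel Tower is located in", return_tensors="pt").input_ids
with torch.no_grad():
    hidden_states = model(src_ids, output_hidden_states=True).hidden_states
captured = hidden_states[6][0, -1]  # shape: (hidden_dim,)

# 2) Build a "repeat my message" prompt with a placeholder token to overwrite.
interp_ids = tok("User: X Assistant: Sure, I'll repeat your message:",
                 return_tensors="pt").input_ids
placeholder_id = tok(" X", add_special_tokens=False).input_ids[0]
patch_pos = interp_ids[0].tolist().index(placeholder_id)

# 3) Replace the placeholder's residual-stream vector at an early layer with the
#    captured state. The shape guard makes the patch fire only on the full-prompt
#    (prefill) pass, not on later single-token steps that reuse the KV cache.
def patch_hook(module, inputs, output):
    hs = output[0] if isinstance(output, tuple) else output
    if hs.shape[1] > patch_pos:
        hs[:, patch_pos] = captured.to(hs.dtype)
    return (hs, *output[1:]) if isinstance(output, tuple) else hs

handle = model.transformer.h[2].register_forward_hook(patch_hook)
try:
    out = model.generate(interp_ids, max_new_tokens=20, do_sample=False,
                         pad_token_id=tok.eos_token_id)
finally:
    handle.remove()

print(tok.decode(out[0][interp_ids.shape[1]:]))

Guarding on the sequence length keeps the hook compatible with KV-cached generation: the placeholder is overwritten once on the prompt pass, and the keys/values computed from the injected state are reused for every generated token.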