Commit 41db03f (1 parent: dd6e1b6) · update

app.py CHANGED
```diff
@@ -19,7 +19,7 @@ from examples import run_example_1, run_example_2, run_example_3, run_example_4,
 from functools import partial
 
 # Load original app constants
-APP_TITLE = '<div class="app-title"><span class="brand">AttnTrace</span><span class="subtitle">Attention-based Context Traceback for Long-Context LLMs</span></div>'
+APP_TITLE = '<div class="app-title"><span class="brand">AttnTrace: </span><span class="subtitle">Attention-based Context Traceback for Long-Context LLMs</span></div>'
 APP_DESCRIPTION = """AttnTrace traces a model's generated statements back to specific parts of the context using attention-based traceback. Try it out with Meta-Llama-3.1-8B-Instruct here! See the [[paper](https://arxiv.org/abs/2506.04202)] and [[code](https://github.com/Wang-Yanting/TracLLM-Kit)] for more!
 Maintained by the AttnTrace team."""
 # NEW_TEXT = """Long-context large language models (LLMs), such as Gemini-2.5-Pro and Claude-Sonnet-4, are increasingly used to empower advanced AI systems, including retrieval-augmented generation (RAG) pipelines and autonomous agents. In these systems, an LLM receives an instruction along with a context—often consisting of texts retrieved from a knowledge database or memory—and generates a response that is contextually grounded by following the instruction. Recent studies have designed solutions to trace back to a subset of texts in the context that contributes most to the response generated by the LLM. These solutions have numerous real-world applications, including performing post-attack forensic analysis and improving the interpretability and trustworthiness of LLM outputs. While significant efforts have been made, state-of-the-art solutions such as TracLLM often lead to a high computation cost, e.g., it takes TracLLM hundreds of seconds to perform traceback for a single response-context pair. In this work, we propose {\name}, a new context traceback method based on the attention weights produced by an LLM for a prompt. To effectively utilize attention weights, we introduce two techniques designed to enhance the effectiveness of {\name}, and we provide theoretical insights for our design choice. %Moreover, we perform both theoretical analysis and empirical evaluation to demonstrate their effectiveness.
```
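The commented-out NEW_TEXT abstract above summarizes the core idea: trace a response back to the context texts that contribute most to it, using the attention weights the LLM produces for the prompt. As a rough illustration of that idea only, not the paper's actual AttnTrace algorithm (which adds two further techniques), a naive attention-scoring baseline could look like the sketch below; the function name, chunking scheme, and layer/head averaging are all illustrative assumptions:

```python
# Minimal sketch of attention-based context traceback: score each context
# chunk by the attention that the response tokens pay to that chunk's tokens.
# NOT the paper's method; purely an illustration of the general idea.
import torch


def attention_traceback(model, tokenizer, context_chunks, question, response, top_k=3):
    # Tokenize each piece separately so chunk boundaries are exact token
    # offsets (ignores cross-boundary token merges; fine for a sketch).
    chunk_ids = [tokenizer(c, add_special_tokens=False).input_ids for c in context_chunks]
    question_ids = tokenizer(question, add_special_tokens=False).input_ids
    response_ids = tokenizer(response, add_special_tokens=False).input_ids
    flat = [t for ids in chunk_ids for t in ids] + question_ids + response_ids
    input_ids = torch.tensor([flat], device=model.device)

    # Forward pass with attention outputs (may require loading the model
    # with attn_implementation="eager" for attentions to be returned).
    with torch.no_grad():
        out = model(input_ids, output_attentions=True)

    # Average over layers and heads -> one (seq_len, seq_len) matrix.
    attn = torch.stack(out.attentions).mean(dim=(0, 2))[0].float()

    n_prompt = len(flat) - len(response_ids)
    # Rows: response-token query positions; columns: prompt-token keys.
    resp_to_prompt = attn[n_prompt:, :n_prompt]

    # Sum the attention mass landing on each chunk's token span.
    scores, start = [], 0
    for ids in chunk_ids:
        scores.append(resp_to_prompt[:, start:start + len(ids)].sum().item())
        start += len(ids)

    # Indices of the top-k most influential context chunks.
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:top_k]
```

A real traceback method needs considerably more care (normalization across chunk lengths, which layers to trust, and the two techniques the abstract mentions); this baseline only shows where attention weights enter the picture.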
```diff
@@ -850,7 +850,7 @@ theme = gr.themes.Citrus(
     text_size="lg",
     spacing_size="md",
 )
-with gr.Blocks(theme=theme) as demo:
+with gr.Blocks(theme=theme, css=custom_css) as demo:
     gr.Markdown(f"# {APP_TITLE}")
     gr.Markdown(APP_DESCRIPTION, elem_classes="app-description")
     # gr.Markdown(NEW_TEXT, elem_classes="app-description-2")
```
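The second hunk wires a custom stylesheet into the `gr.Blocks` constructor. The `custom_css` variable is defined elsewhere in app.py and is not part of this diff; the definition below is a hypothetical stand-in, assumed only to make the change concrete, styling the CSS classes that APP_TITLE and the Markdown components declare:

```python
# Hypothetical sketch: custom_css is referenced by the new gr.Blocks(...) call
# but defined outside the hunks shown here, so this body is an assumption.
import gradio as gr

custom_css = """
.app-title .brand { font-weight: 700; }
.app-title .subtitle { opacity: 0.8; margin-left: 0.4rem; }
.app-description { line-height: 1.5; }
"""

APP_TITLE = '<div class="app-title"><span class="brand">AttnTrace: </span><span class="subtitle">Attention-based Context Traceback for Long-Context LLMs</span></div>'

theme = gr.themes.Citrus(text_size="lg", spacing_size="md")

# gr.Blocks accepts a `css` string that is injected into the rendered page,
# which is what lets the elem_classes above take effect.
with gr.Blocks(theme=theme, css=custom_css) as demo:
    gr.Markdown(f"# {APP_TITLE}")

demo.launch()
```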
|