antenmanuuel committed on
Commit efa23c6 · verified · 1 Parent(s): e83a9ae

Upload folder using huggingface_hub

Files changed (2)
  1. .gitignore +2 -1
  2. README.md +27 -10
.gitignore CHANGED
@@ -1 +1,2 @@
-venv
+venv/
+__pycache__/
README.md CHANGED
@@ -43,10 +43,10 @@ uvicorn app:app --reload
 This API powers the BERT Attention Visualizer, which helps researchers and practitioners understand how transformer models like BERT attend to different tokens and how attention patterns change with different inputs.
 
 # todo:
-1. ~~word replacement UI & it's back end~~ ✓ (Implemented via `/attention_comparison` endpoint)
-2. extened the attetion page when it have more word.
-3. backend maybe not correct for attention heatmap and flow
-
+
+1. ~~word replacement UI & its back end~~ ✓ (implemented via the `/attention_comparison` endpoint)
+2. extend the attention page when it has more words
+3. the backend may not be correct for the attention heatmap and flow views
 
 # PyTorch Backend
 
@@ -67,12 +67,14 @@ This backend provides a FastAPI service for tokenization, attention visualizatio
 ## Installation
 
 1. Create a virtual environment (recommended):
+
 ```bash
 python -m venv venv
 source venv/bin/activate  # On Windows: venv\Scripts\activate
 ```
 
 2. Install the required packages:
+
 ```bash
 pip install -r requirements.txt
 ```
@@ -80,8 +82,9 @@ pip install -r requirements.txt
 ## Running the Server
 
 Start the server with:
+
 ```bash
-python app.py
+python main.py
 ```
 
 This will launch the server at `http://localhost:8000`.
@@ -89,12 +92,15 @@ This will launch the server at `http://localhost:8000`.
 ## API Endpoints
 
 ### GET /models
+
 Returns a list of available models.
 
 ### POST /tokenize
+
 Tokenizes input text using the specified model.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
@@ -103,21 +109,24 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "tokens": [
-    {"text": "[CLS]", "index": 0},
-    {"text": "the", "index": 1},
-    {"text": "cat", "index": 2},
+    { "text": "[CLS]", "index": 0 },
+    { "text": "the", "index": 1 },
+    { "text": "cat", "index": 2 }
     // ...other tokens
   ]
 }
 ```
 
 ### POST /predict_masked
+
 Predicts masked tokens using the specified model.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
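To illustrate the `/tokenize` response shape documented above, here is a minimal offline sketch. The response dict is a canned example matching the documented schema (not a live call), and `surface_tokens` is a hypothetical helper, not part of the backend:

```python
# Canned /tokenize-style response matching the documented schema (no live call).
sample_response = {
    "tokens": [
        {"text": "[CLS]", "index": 0},
        {"text": "the", "index": 1},
        {"text": "cat", "index": 2},
        {"text": "[SEP]", "index": 3},
    ]
}

def surface_tokens(response):
    """Drop BERT special tokens and keep the remaining token texts in order."""
    specials = {"[CLS]", "[SEP]", "[MASK]", "[PAD]"}
    return [t["text"] for t in response["tokens"] if t["text"] not in specials]

print(surface_tokens(sample_response))  # ['the', 'cat']
```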
@@ -128,20 +137,23 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "predictions": [
-    {"word": "the", "score": 0.9},
-    {"word": "a", "score": 0.05},
+    { "word": "the", "score": 0.9 },
+    { "word": "a", "score": 0.05 }
     // ...other predictions
   ]
 }
 ```
 
 ### POST /attention
+
 Retrieves attention matrices for visualizing attention patterns between tokens.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
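A typical consumer of `/predict_masked` just wants the top-scoring candidate. A small offline sketch, again using a canned response in the documented shape (`best_prediction` is a hypothetical helper):

```python
# Canned /predict_masked-style response matching the documented schema.
sample_response = {
    "predictions": [
        {"word": "the", "score": 0.9},
        {"word": "a", "score": 0.05},
    ]
}

def best_prediction(response):
    """Pick the highest-scoring candidate for the masked position."""
    return max(response["predictions"], key=lambda p: p["score"])["word"]

print(best_prediction(sample_response))  # the
```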
@@ -150,6 +162,7 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "attention_data": {
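To make the attention matrices concrete: each row of a head's matrix is a softmax distribution, so it sums to 1, while column sums show how much attention each token receives. A toy sketch with illustrative values (not real model output):

```python
# A toy 3-token attention matrix: row i is how token i distributes its
# attention over all tokens, so each row sums to 1 (softmax output).
attn = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]

def received_attention(matrix):
    """Column sums: total attention each token receives from the others."""
    return [round(sum(row[j] for row in matrix), 2) for j in range(len(matrix[0]))]

# Every row is a valid probability distribution.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in attn)
print(received_attention(attn))  # [1.1, 1.3, 0.6]
```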
@@ -180,9 +193,11 @@ Response:
 ```
 
 ### POST /attention_comparison
+
 Compares attention patterns before and after replacing a word in the input text. This is useful for analyzing how word replacements affect the model's attention distribution.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
@@ -193,6 +208,7 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "before_attention": {
@@ -218,6 +234,7 @@ RoBERTa tokens are automatically cleaned to remove the leading 'Ġ' character (w
 The backend communicates with the frontend through these API endpoints. The `/attention` endpoint is particularly important for the attention visualization features, including the matrix view, parallel view, and attention distribution bar charts.
 
 The `/attention_comparison` endpoint enables a comparative analysis feature in the frontend, allowing users to see how attention patterns change when a word is replaced. This can be used to:
+
 - Analyze semantic shifts in the model's understanding
 - Compare attention flows before and after word replacements
 - Visualize how different word choices affect contextual relationships
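The before/after comparison boils down to element-wise differences between the two attention matrices. A minimal sketch with illustrative rows (the values and the `attention_shift` helper are assumptions, not backend code):

```python
# Toy before/after attention rows for one token, in the spirit of the
# /attention_comparison response; values are illustrative only.
before_row = [0.6, 0.3, 0.1]
after_row = [0.2, 0.5, 0.3]

def attention_shift(before, after):
    """Per-token change in attention weight after the word replacement."""
    return [round(a - b, 2) for b, a in zip(before, after)]

print(attention_shift(before_row, after_row))  # [-0.4, 0.2, 0.2]
```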
@@ -230,4 +247,4 @@ For debugging purposes, the backend includes extensive logging for token process
 
 - Models are loaded dynamically upon first request and cached for subsequent requests
 - The server supports both CPU and CUDA (GPU) execution if available
-- For large texts, attention matrices can become quite large, so consider limiting input length for better performance
+- For large texts, attention matrices can become quite large, so consider limiting input length for better performance
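The note about large texts can be enforced client-side: an n-token input produces an n×n matrix per layer and head, so capping n bounds the payload. A hypothetical guard (`MAX_TOKENS` is an illustrative limit, not a value from the backend):

```python
# Hypothetical client-side guard: cap the token count before requesting
# /attention so the n x n attention matrices stay manageable.
MAX_TOKENS = 128  # illustrative limit, not a backend setting

def truncate_tokens(tokens, limit=MAX_TOKENS):
    """Keep at most `limit` tokens."""
    return tokens[:limit]

tokens = [f"tok{i}" for i in range(300)]
print(len(truncate_tokens(tokens)))  # 128
```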