antenmanuuel committed on
Commit efa23c6 · verified · 1 Parent(s): e83a9ae

Upload folder using huggingface_hub

Files changed (2)
  1. .gitignore +2 -1
  2. README.md +27 -10
.gitignore CHANGED
@@ -1 +1,2 @@
-venv
+venv/
+__pycache__/
README.md CHANGED
@@ -43,10 +43,10 @@ uvicorn app:app --reload
 This API powers the BERT Attention Visualizer, which helps researchers and practitioners understand how transformer models like BERT attend to different tokens and how attention patterns change with different inputs.
 
 # todo:
-1. ~~word replacement UI & it's back end~~ ✓ (Implemented via `/attention_comparison` endpoint)
-2. extened the attetion page when it have more word.
-3. backend maybe not correct for attention heatmap and flow
-
+
+1. ~~word replacement UI & its back end~~ ✓ (implemented via the `/attention_comparison` endpoint)
+2. extend the attention page when it has more words
+3. the backend may not be correct for the attention heatmap and flow views
 
 # PyTorch Backend
 
@@ -67,12 +67,14 @@ This backend provides a FastAPI service for tokenization, attention visualizatio
 ## Installation
 
 1. Create a virtual environment (recommended):
+
 ```bash
 python -m venv venv
 source venv/bin/activate  # On Windows: venv\Scripts\activate
 ```
 
 2. Install the required packages:
+
 ```bash
 pip install -r requirements.txt
 ```
@@ -80,8 +82,9 @@ pip install -r requirements.txt
 ## Running the Server
 
 Start the server with:
+
 ```bash
-python app.py
+python main.py
 ```
 
 This will launch the server at `http://localhost:8000`.
@@ -89,12 +92,15 @@ This will launch the server at `http://localhost:8000`.
 ## API Endpoints
 
 ### GET /models
+
 Returns a list of available models.
 
 ### POST /tokenize
+
 Tokenizes input text using the specified model.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
@@ -103,21 +109,24 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "tokens": [
-    {"text": "[CLS]", "index": 0},
-    {"text": "the", "index": 1},
-    {"text": "cat", "index": 2},
+    { "text": "[CLS]", "index": 0 },
+    { "text": "the", "index": 1 },
+    { "text": "cat", "index": 2 }
     // ...other tokens
   ]
 }
 ```
 
 ### POST /predict_masked
+
 Predicts masked tokens using the specified model.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
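To illustrate the `/tokenize` response shape documented above, here is a minimal offline sketch. The response dict is a canned example matching the documented schema (not a live call), and `surface_tokens` is a hypothetical helper, not part of the backend:

```python
# Canned /tokenize-style response matching the documented schema (no live call).
sample_response = {
    "tokens": [
        {"text": "[CLS]", "index": 0},
        {"text": "the", "index": 1},
        {"text": "cat", "index": 2},
        {"text": "[SEP]", "index": 3},
    ]
}

def surface_tokens(response):
    """Drop BERT special tokens and keep the remaining token texts in order."""
    specials = {"[CLS]", "[SEP]", "[MASK]", "[PAD]"}
    return [t["text"] for t in response["tokens"] if t["text"] not in specials]

print(surface_tokens(sample_response))  # ['the', 'cat']
```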
@@ -128,20 +137,23 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "predictions": [
-    {"word": "the", "score": 0.9},
-    {"word": "a", "score": 0.05},
+    { "word": "the", "score": 0.9 },
+    { "word": "a", "score": 0.05 }
     // ...other predictions
   ]
 }
 ```
 
 ### POST /attention
+
 Retrieves attention matrices for visualizing attention patterns between tokens.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
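A typical consumer of `/predict_masked` just wants the top-scoring candidate. A small offline sketch, again using a canned response in the documented shape (`best_prediction` is a hypothetical helper):

```python
# Canned /predict_masked-style response matching the documented schema.
sample_response = {
    "predictions": [
        {"word": "the", "score": 0.9},
        {"word": "a", "score": 0.05},
    ]
}

def best_prediction(response):
    """Pick the highest-scoring candidate for the masked position."""
    return max(response["predictions"], key=lambda p: p["score"])["word"]

print(best_prediction(sample_response))  # the
```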
@@ -150,6 +162,7 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "attention_data": {
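To make the attention matrices concrete: each row of a head's matrix is a softmax distribution, so it sums to 1, while column sums show how much attention each token receives. A toy sketch with illustrative values (not real model output):

```python
# A toy 3-token attention matrix: row i is how token i distributes its
# attention over all tokens, so each row sums to 1 (softmax output).
attn = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]

def received_attention(matrix):
    """Column sums: total attention each token receives from the others."""
    return [round(sum(row[j] for row in matrix), 2) for j in range(len(matrix[0]))]

# Every row is a valid probability distribution.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in attn)
print(received_attention(attn))  # [1.1, 1.3, 0.6]
```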
@@ -180,9 +193,11 @@ Response:
 ```
 
 ### POST /attention_comparison
+
 Compares attention patterns before and after replacing a word in the input text. This is useful for analyzing how word replacements affect the model's attention distribution.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
@@ -193,6 +208,7 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "before_attention": {
@@ -218,6 +234,7 @@ RoBERTa tokens are automatically cleaned to remove the leading 'Ġ' character (w
 The backend communicates with the frontend through these API endpoints. The `/attention` endpoint is particularly important for the attention visualization features, including the matrix view, parallel view, and attention distribution bar charts.
 
 The `/attention_comparison` endpoint enables a comparative analysis feature in the frontend, allowing users to see how attention patterns change when a word is replaced. This can be used to:
+
 - Analyze semantic shifts in the model's understanding
 - Compare attention flows before and after word replacements
 - Visualize how different word choices affect contextual relationships
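The before/after comparison boils down to element-wise differences between the two attention matrices. A minimal sketch with illustrative rows (the values and the `attention_shift` helper are assumptions, not backend code):

```python
# Toy before/after attention rows for one token, in the spirit of the
# /attention_comparison response; values are illustrative only.
before_row = [0.6, 0.3, 0.1]
after_row = [0.2, 0.5, 0.3]

def attention_shift(before, after):
    """Per-token change in attention weight after the word replacement."""
    return [round(a - b, 2) for b, a in zip(before, after)]

print(attention_shift(before_row, after_row))  # [-0.4, 0.2, 0.2]
```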
@@ -230,4 +247,4 @@ For debugging purposes, the backend includes extensive logging for token process
 
 - Models are loaded dynamically upon first request and cached for subsequent requests
 - The server supports both CPU and CUDA (GPU) execution if available
-- For large texts, attention matrices can become quite large, so consider limiting input length for better performance
+- For large texts, attention matrices can become quite large, so consider limiting input length for better performance
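The note about large texts can be enforced client-side: an n-token input produces an n×n matrix per layer and head, so capping n bounds the payload. A hypothetical guard (`MAX_TOKENS` is an illustrative limit, not a value from the backend):

```python
# Hypothetical client-side guard: cap the token count before requesting
# /attention so the n x n attention matrices stay manageable.
MAX_TOKENS = 128  # illustrative limit, not a backend setting

def truncate_tokens(tokens, limit=MAX_TOKENS):
    """Keep at most `limit` tokens."""
    return tokens[:limit]

tokens = [f"tok{i}" for i in range(300)]
print(len(truncate_tokens(tokens)))  # 128
```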