Upload folder using huggingface_hub

- .gitignore +2 -1
- README.md +27 -10
.gitignore CHANGED

````diff
@@ -1 +1,2 @@
-venv
+venv/
+__pycache__/
````
README.md CHANGED

````diff
@@ -43,10 +43,10 @@ uvicorn app:app --reload
 This API powers the BERT Attention Visualizer, which helps researchers and practitioners understand how transformer models like BERT attend to different tokens and how attention patterns change with different inputs.
 
 # todo:
-1. ~~word replacement UI & its back end~~ ✓ (Implemented via `/attention_comparison` endpoint)
-2. extend the attention page when it has more words
-3. backend may not be correct for attention heatmap and flow
+
+1. ~~word replacement UI & its back end~~ ✓ (Implemented via `/attention_comparison` endpoint)
+2. extend the attention page when it has more words
+3. backend may not be correct for attention heatmap and flow
 
-
 # PyTorch Backend
 
````
````diff
@@ -67,12 +67,14 @@ This backend provides a FastAPI service for tokenization, attention visualizatio
 ## Installation
 
 1. Create a virtual environment (recommended):
+
 ```bash
 python -m venv venv
 source venv/bin/activate  # On Windows: venv\Scripts\activate
 ```
 
 2. Install the required packages:
+
 ```bash
 pip install -r requirements.txt
 ```
````
````diff
@@ -80,8 +82,9 @@ pip install -r requirements.txt
 ## Running the Server
 
 Start the server with:
+
 ```bash
-python
+python main.py
 ```
 
 This will launch the server at `http://localhost:8000`.
````
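The hunk above corrects the start command to `python main.py`. Once the server is up, a quick smoke test against the `GET /models` endpoint documented below could look like this (a sketch, not part of the commit; it assumes the default port 8000 and the third-party `requests` package):

```python
# Smoke test: assumes the API is running at http://localhost:8000
# and that `requests` is installed (pip install requests).
import requests

BASE_URL = "http://localhost:8000"

resp = requests.get(f"{BASE_URL}/models")
resp.raise_for_status()
print(resp.json())  # should list the available models, per GET /models
```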
````diff
@@ -89,12 +92,15 @@ This will launch the server at `http://localhost:8000`.
 ## API Endpoints
 
 ### GET /models
+
 Returns a list of available models.
 
 ### POST /tokenize
+
 Tokenizes input text using the specified model.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
````
````diff
@@ -103,21 +109,24 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "tokens": [
-    {"text": "[CLS]", "index": 0},
-    {"text": "the", "index": 1},
-    {"text": "cat", "index": 2}
+    { "text": "[CLS]", "index": 0 },
+    { "text": "the", "index": 1 },
+    { "text": "cat", "index": 2 }
     // ...other tokens
   ]
 }
 ```
 
 ### POST /predict_masked
+
 Predicts masked tokens using the specified model.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
````
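For context on the `/tokenize` contract shown above, here is a minimal client sketch. Only the `text` field is visible before the hunk cuts off, so the `model_name` field and its value are assumptions:

```python
# Sketch of a POST /tokenize call. Only "text" appears in the truncated
# request body above; "model_name" and its value are assumed here.
import requests

payload = {
    "text": "The cat sat on the mat",
    "model_name": "bert-base-uncased",  # assumed field name and value
}
resp = requests.post("http://localhost:8000/tokenize", json=payload)
resp.raise_for_status()
for token in resp.json()["tokens"]:
    print(token["index"], token["text"])  # e.g. 0 [CLS], 1 the, 2 cat, ...
```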
````diff
@@ -128,20 +137,23 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "predictions": [
-    {"word": "the", "score": 0.9},
-    {"word": "a", "score": 0.05}
+    { "word": "the", "score": 0.9 },
+    { "word": "a", "score": 0.05 }
     // ...other predictions
   ]
 }
 ```
 
 ### POST /attention
+
 Retrieves attention matrices for visualizing attention patterns between tokens.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
````
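A matching sketch for `/predict_masked`, assuming the mask is written as BERT's `[MASK]` token directly in the input text (the truncated request body does not show how the mask position is actually specified):

```python
# Sketch of a POST /predict_masked call. Assumes the mask is marked with
# BERT's [MASK] token in the text; the API may use another scheme.
import requests

payload = {
    "text": "The cat sat on the [MASK]",
    "model_name": "bert-base-uncased",  # assumed, as in the /tokenize sketch
}
resp = requests.post("http://localhost:8000/predict_masked", json=payload)
resp.raise_for_status()
for pred in resp.json()["predictions"]:
    print(f'{pred["word"]}: {pred["score"]:.3f}')  # word/score pairs
```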
````diff
@@ -150,6 +162,7 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "attention_data": {
````
````diff
@@ -180,9 +193,11 @@ Response:
 ```
 
 ### POST /attention_comparison
+
 Compares attention patterns before and after replacing a word in the input text. This is useful for analyzing how word replacements affect the model's attention distribution.
 
 Request body:
+
 ```json
 {
   "text": "The cat sat on the mat",
````
````diff
@@ -193,6 +208,7 @@ Request body:
 ```
 
 Response:
+
 ```json
 {
   "before_attention": {
````
````diff
@@ -218,6 +234,7 @@ RoBERTa tokens are automatically cleaned to remove the leading 'Ġ' character (w
 The backend communicates with the frontend through these API endpoints. The `/attention` endpoint is particularly important for the attention visualization features, including the matrix view, parallel view, and attention distribution bar charts.
 
 The `/attention_comparison` endpoint enables a comparative analysis feature in the frontend, allowing users to see how attention patterns change when a word is replaced. This can be used to:
+
 - Analyze semantic shifts in the model's understanding
 - Compare attention flows before and after word replacements
 - Visualize how different word choices affect contextual relationships
````
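To make the comparison flow concrete, a hedged client sketch for `/attention_comparison`; only `text` is visible in the truncated request body, so every other field name below is an assumption:

```python
# Sketch of a POST /attention_comparison call. Only "text" is visible in
# the truncated request body; the other field names below are assumed.
import requests

payload = {
    "text": "The cat sat on the mat",
    "target_word": "cat",               # assumed field name
    "replacement_word": "dog",          # assumed field name
    "model_name": "bert-base-uncased",  # assumed
}
resp = requests.post("http://localhost:8000/attention_comparison", json=payload)
resp.raise_for_status()
data = resp.json()
# The response carries attention for both versions; only "before_attention"
# is visible in the truncated response body above.
print(list(data["before_attention"]))
```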
````diff
@@ -230,4 +247,4 @@ For debugging purposes, the backend includes extensive logging for token process
 
 - Models are loaded dynamically upon first request and cached for subsequent requests
 - The server supports both CPU and CUDA (GPU) execution if available
-- For large texts, attention matrices can become quite large, so consider limiting input length for better performance
+- For large texts, attention matrices can become quite large, so consider limiting input length for better performance
````
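The size warning in the last bullet is easy to quantify. A rough back-of-the-envelope, assuming a bert-base-style model (12 layers × 12 heads) and raw float32 storage before any JSON serialization overhead:

```python
# Back-of-the-envelope for attention size: layers x heads x seq_len^2 floats.
# Assumes a bert-base-style model (12 layers, 12 heads) and float32.
layers, heads, bytes_per_float = 12, 12, 4

for seq_len in (32, 128, 512):
    n_bytes = layers * heads * seq_len**2 * bytes_per_float
    print(f"{seq_len:>4} tokens -> {n_bytes / 2**20:7.1f} MiB")
# 512 tokens is already ~144 MiB raw (JSON is larger still), hence the
# advice to limit input length.
```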