Dev Seth committed on
Commit 50aa037 · 1 Parent(s): c764f92

init space

.DS_Store ADDED
Binary file (6.15 kB).
 
.gitattributes CHANGED
@@ -32,3 +32,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.hdf filter=lfs diff=lfs merge=lfs -text
+ *.ttf filter=lfs diff=lfs merge=lfs -text
README.ipynb ADDED
@@ -0,0 +1,265 @@
+ {
+  "cells": [
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Welcome!\n",
+     "to the repo for\n",
+     "\n",
+     "*Learning the Legibility of Visual Text Perturbations* (EACL 2023)\n",
+     "\n",
+     "by Dev Seth, Rickard Stureborg, Danish Pruthi and Bhuwan Dhingra"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "### A `LEGIT` Introduction\n",
+     "This notebook provides a helpful starting point to interact with the datasets and models presented in the Learning Legibility paper.\n",
+     "\n",
+     "All assets are hosted on the HuggingFace Hub and can be used with the `transformers` and `datasets` libraries: \n",
+     " - TrOCR-MT Model: https://huggingface.co/dvsth/LEGIT-TrOCR-MT \n",
+     " - LEGIT Dataset: https://huggingface.co/datasets/dvsth/LEGIT\n",
+     " - Perturbed Jigsaw Dataset: https://huggingface.co/datasets/dvsth/LEGIT-VIPER-Jigsaw-Toxic-Comment-Perturbed"
+    ]
+   },
+   {
+    "attachments": {},
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "**For an interactive preview of the perturbation process and legibility assessment model, run `demo.py` using the command `python demo.py` (will open a browser-based interface). The demo allows you to perturb a word with your chosen attack parameters, then see the model's legibility estimate for the generated perturbations.**"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "##### Setup"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 1,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# external imports -- use pip or conda to install these packages\n",
+     "import torch\n",
+     "from transformers import TrOCRProcessor, AutoModel, TrainingArguments\n",
+     "from datasets import load_dataset\n",
+     "\n",
+     "# local imports\n",
+     "from classes.LegibilityModel import LegibilityModel\n",
+     "from classes.Trainer import MultiTaskTrainer\n",
+     "from classes.Metrics import binary_classification_metric, ranking_metric"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "#### Loading the Model and Dataset"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 2,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "# load the model schema and pretrained weights\n",
+     "# (this may take some time to download)\n",
+     "model = AutoModel.from_pretrained(\"dvsth/LEGIT-TrOCR-MT\", revision='main', trust_remote_code=True)"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "Interactive dataset preview available [here](https://huggingface.co/datasets/dvsth/LEGIT/viewer/dvsth--LEGIT/test)."
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 3,
+    "metadata": {},
+    "outputs": [
+     {
+      "name": "stderr",
+      "output_type": "stream",
+      "text": [
+       "Using custom data configuration dvsth--LEGIT-d84a4d72774d3652\n",
+       "Found cached dataset parquet (/Users/dvsth/.cache/huggingface/datasets/dvsth___parquet/dvsth--LEGIT-d84a4d72774d3652/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)\n"
+      ]
+     },
+     {
+      "data": {
+       "application/vnd.jupyter.widget-view+json": {
+        "model_id": "22f8a468229a4760bd2829ef894a5472",
+        "version_major": 2,
+        "version_minor": 0
+       },
+       "text/plain": [
+        "  0%|          | 0/3 [00:00<?, ?it/s]"
+       ]
+      },
+      "metadata": {},
+      "output_type": "display_data"
+     }
+    ],
+    "source": [
+     "dataset = load_dataset('dvsth/LEGIT').with_format('torch')"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "#### Training/Eval Loop"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "##### Trainer setup"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 4,
+    "metadata": {},
+    "outputs": [
+     {
+      "name": "stderr",
+      "output_type": "stream",
+      "text": [
+       "Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.\n"
+      ]
+     }
+    ],
+    "source": [
+     "# preprocessor provides image normalization and resizing\n",
+     "preprocessor = TrOCRProcessor.from_pretrained(\n",
+     "    \"microsoft/trocr-base-handwritten\")\n",
+     "\n",
+     "# apply preprocessing batch-wise\n",
+     "def collate_fn(data):\n",
+     "    return {\n",
+     "        'choice': torch.tensor([d['choice'].item() for d in data]),\n",
+     "        'img0': preprocessor([d['img0'] for d in data], return_tensors='pt')['pixel_values'],\n",
+     "        'img1': preprocessor([d['img1'] for d in data], return_tensors='pt')['pixel_values']\n",
+     "    }\n",
+     "\n",
+     "\n",
+     "train_args = TrainingArguments(\n",
+     "    output_dir=f'runs', # change this to a unique path for each run, e.g. f'runs/{run_id}'\n",
+     "    overwrite_output_dir=True,\n",
+     "    num_train_epochs=5, # we found 3 epochs to be sufficient for convergence on the base models\n",
+     "    per_device_train_batch_size=26, # fits on 1 x NVIDIA A6000, 48GB VRAM\n",
+     "    per_device_eval_batch_size=26, # can be increased to 32\n",
+     "    gradient_accumulation_steps=2, # increase this to fit on a smaller GPU\n",
+     "    warmup_steps=0, \n",
+     "    weight_decay=0.0,\n",
+     "    learning_rate=1e-5, # we found this to be the best initial learning rate for the base models\n",
+     "    save_strategy=\"steps\",\n",
+     "    save_steps=200,\n",
+     "    eval_steps=200,\n",
+     "    evaluation_strategy=\"steps\",\n",
+     "    logging_strategy='steps',\n",
+     "    logging_steps=50,\n",
+     "    fp16=False, \n",
+     "    load_best_model_at_end=True, # load the best model at the end of training based on validation F1\n",
+     "    metric_for_best_model='f1_score')\n",
+     "\n",
+     "trainer = MultiTaskTrainer(\n",
+     "    model=model,\n",
+     "    compute_metrics=binary_classification_metric, # check out metrics.py for a list of metrics\n",
+     "    args=train_args,\n",
+     "    data_collator=collate_fn,\n",
+     "    train_dataset=dataset['train'],\n",
+     "    eval_dataset=dataset['valid'])\n"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "##### Generate predictions and compute metrics"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 5,
+    "metadata": {},
+    "outputs": [
+     {
+      "name": "stderr",
+      "output_type": "stream",
+      "text": [
+       "Parameter 'indices'=range(0, 100) of the transform datasets.arrow_dataset.Dataset.select couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.\n",
+       "The following columns in the test set don't have a corresponding argument in `LegibilityModel.forward` and have been ignored: k, word1, word0, n, model1, word, n1, k1, model0. If k, word1, word0, n, model1, word, n1, k1, model0 are not expected by `LegibilityModel.forward`, you can safely ignore this message.\n",
+       "***** Running Prediction *****\n",
+       "  Num examples = 100\n",
+       "  Batch size = 26\n"
+      ]
+     },
+     {
+      "data": {
+       "application/vnd.jupyter.widget-view+json": {
+        "model_id": "1e9f7a59e7624c129d2dae1c85c2d11a",
+        "version_major": 2,
+        "version_minor": 0
+       },
+       "text/plain": [
+        "  0%|          | 0/4 [00:00<?, ?it/s]"
+       ]
+      },
+      "metadata": {},
+      "output_type": "display_data"
+     },
+     {
+      "name": "stdout",
+      "output_type": "stream",
+      "text": [
+       "{'test_loss': 0.5344929695129395, 'test_precision': 0.9479166567925349, 'test_recall': 0.8921568539984622, 'test_accuracy': 0.8787878721303949, 'test_f1_score': 0.9191914103665608, 'test_runtime': 47.6671, 'test_samples_per_second': 2.098, 'test_steps_per_second': 0.084}\n"
+      ]
+     }
+    ],
+    "source": [
+     "predictions = trainer.predict(dataset['test'].select(range(100))) # takes ~1-2 minutes on a laptop CPU\n",
+     "print(predictions.metrics)"
+    ]
+   }
+  ],
+  "metadata": {
+   "kernelspec": {
+    "display_name": "Python 3.9.12 ('base')",
+    "language": "python",
+    "name": "python3"
+   },
+   "language_info": {
+    "codemirror_mode": {
+     "name": "ipython",
+     "version": 3
+    },
+    "file_extension": ".py",
+    "mimetype": "text/x-python",
+    "name": "python",
+    "nbconvert_exporter": "python",
+    "pygments_lexer": "ipython3",
+    "version": "3.9.12"
+   },
+   "orig_nbformat": 4,
+   "vscode": {
+    "interpreter": {
+     "hash": "a9a33fd02dcd74fd53701f10c0433ded41be0a0f53c9699722a73f690e69c2bc"
+    }
+   }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 2
+ }
README.md CHANGED
@@ -1,12 +1,9 @@
- ---
- title: Learning Legibility 2023
- emoji: 🌍
- colorFrom: green
- colorTo: purple
- sdk: gradio
- sdk_version: 3.20.1
- app_file: app.py
- pinned: false
- ---
+ # learning-legibility-2023
+ This is the repo for *Learning the Legibility of Visual Text Perturbations* (EACL 2023)
  
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ Dev Seth, Rickard Stureborg, Danish Pruthi and Bhuwan Dhingra
+
+ Dataset available [here](https://huggingface.co/datasets/dvsth/LEGIT).
+ Model available [here](https://huggingface.co/dvsth/LEGIT-TrOCR-MT).
+
+ Run `README.ipynb` for a quick start.
classes/LegibilityModel.py ADDED
@@ -0,0 +1,27 @@
+ import torch.nn as nn
+ from transformers import VisionEncoderDecoderModel, PreTrainedModel, AutoConfig
+
+ class LegibilityModel(PreTrainedModel):
+     def __init__(self, config):
+         # the passed-in config is replaced by the TrOCR base config
+         config = AutoConfig.from_pretrained("microsoft/trocr-base-handwritten")
+         super(LegibilityModel, self).__init__(config=config)
+
+         # base model architecture: only the TrOCR vision encoder is kept
+         self.model = VisionEncoderDecoderModel(config).encoder
+
+         # scoring head; dropout is disabled (p=0), raise it here for training
+         self.stack = nn.Sequential(
+             nn.Dropout(0),
+             nn.Linear(768, 768),
+             nn.ReLU(),
+             nn.Dropout(0),
+             nn.Linear(768, 1)
+         )
+
+     # choice, img0, img1 are not used by the model, but are passed by the trainer
+     def forward(self, img_batch, choice=None, img0=None, img1=None):
+         output = self.model(img_batch)
+         # average the output of the last hidden layer over the token dimension
+         output = output.last_hidden_state.mean(dim=1)
+         # one raw (pre-sigmoid) legibility score per image
+         scores = self.stack(output)
+         return scores.squeeze()
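The pooling step in `forward` averages the encoder's token vectors into a single vector before the scoring head. A minimal sketch of that averaging in plain Python, assuming the hidden states are given as a list of equal-length lists (the helper name `mean_pool` and the toy values are illustrative, not part of the repo):

```python
def mean_pool(hidden_states):
    """Average a list of token vectors into one vector of the same width."""
    num_tokens = len(hidden_states)
    dim = len(hidden_states[0])
    # column-wise mean: one average per feature dimension
    return [sum(tok[j] for tok in hidden_states) / num_tokens for j in range(dim)]


# three hypothetical 2-dimensional token vectors
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool(tokens)
print(pooled)
```

In the model itself the same reduction is done by `last_hidden_state.mean(dim=1)` on a batched tensor.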
classes/LegibilityPlot.py ADDED
@@ -0,0 +1,73 @@
+ import matplotlib.font_manager as fm
+ import matplotlib.pyplot as plt
+ import matplotlib as mpl
+ import numpy as np
+
+ class LegibilityPlot:
+     def __init__(self):
+         # register the bundled Unifont so perturbed characters can be drawn
+         fe = fm.FontEntry(fname='unifont.ttf', name='unifont')
+         fm.fontManager.ttflist.append(fe)
+
+     @staticmethod
+     def gradient_crim_to_darkg(mix):
+         c1 = '#3B724B'  # dark green, returned at mix=1
+         c2 = '#9A2C44'  # crimson, returned at mix=0
+         c1 = np.array(mpl.colors.to_rgb(c1))
+         c2 = np.array(mpl.colors.to_rgb(c2))
+         # linear interpolation between the two colors
+         return mpl.colors.to_hex(mix * c1 + (1 - mix) * c2)
+
+     def plot(self, scores, perturbations):
+         # convert the raw scores to probabilities
+         scores = [1 / (1 + np.exp(-x)) for x in scores]
+         fig, ax = plt.subplots(1, 1, figsize=(7, 10), dpi=600)
+
+         # add a horizontal bar for each probability, showing the difference between the probability and 0.5
+         # each bar starts at 0.5 and is 0.1 high
+         # the bar color runs from crimson (low probability) to dark green (high probability)
+         # with a fixed opacity of 0.9
+         ax.barh(range(len(scores)), [x-0.5 for x in scores], height=0.1, color=[self.gradient_crim_to_darkg(x) for x in scores], left=0.5, alpha=0.9)
+
+         # mark each probability with a white triangle
+         ax.scatter(scores, range(len(scores)), s=40, color='white', marker='v', alpha=1.0, zorder=10000, edgecolors='black', linewidths=0.5)
+         # place the perturbed word in a tinted box just below its marker
+         for i in range(len(perturbations)):
+             ax.text(scores[i], i-0.25, perturbations[i], horizontalalignment='center', verticalalignment='top', fontsize=26, fontfamily='unifont', bbox=dict(facecolor=self.gradient_crim_to_darkg(scores[i]), edgecolor='white', alpha=0.3, boxstyle='round,pad=0.3'))
+
+         ax.set_xlim(0, 1)
+         # make ticks from 0 to 1 spaced by 0.2
+         ax.set_xticks(np.arange(0, 1.1, 0.20))
+         ax.set_ylim(-1, len(scores)-0.5)
+         ax.set_yticks([])
+         # show a line at 0.5
+         ax.axvline(0.5, color='black', linestyle=':', linewidth=1, alpha=0.5)
+         # disable grid lines
+         ax.grid(False)
+         # make the background white
+         ax.set_facecolor('white')
+         # show the x axis line
+         ax.spines['bottom'].set_visible(True)
+         # make the x axis line black with 0.5 alpha
+         ax.spines['bottom'].set_color('black')
+         ax.spines['bottom'].set_alpha(0.5)
+         # add x axis label
+         ax.set_xlabel('Legibility Score', fontsize=22)
+         # axis ticklabel font size
+         ax.tick_params(axis='x', labelsize=18)
+         # aspect ratio
+         ax.set_aspect(0.25)
+         # x tick marks
+         ax.tick_params(axis='x', which='both', bottom=True, top=False, labelbottom=True)
+         # embolden the x axis label
+         ax.xaxis.label.set_fontweight('bold')
+         # add some padding above the x axis label
+         # ax.xaxis.labelpad = 15
+
+         # remove the left and right spines
+         ax.spines['left'].set_visible(False)
+         ax.spines['right'].set_visible(False)
+         # remove the top spine
+         ax.spines['top'].set_visible(False)
+
+         return fig
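The plot maps each raw score to a probability with a sigmoid and then picks a color by linearly mixing crimson and dark green. A rough standalone sketch of those two steps without `matplotlib`, assuming the same two hex colors as `gradient_crim_to_darkg` (the helper names `sigmoid`, `hex_to_rgb` and `mix_colors` are made up for this illustration):

```python
import math


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hex_to_rgb(h):
    """Convert '#RRGGBB' to three floats in [0, 1]."""
    h = h.lstrip('#')
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def mix_colors(mix, c1='#3B724B', c2='#9A2C44'):
    """Linear blend: mix=1 gives c1 (dark green), mix=0 gives c2 (crimson)."""
    a, b = hex_to_rgb(c1), hex_to_rgb(c2)
    blended = [mix * x + (1 - mix) * y for x, y in zip(a, b)]
    return '#' + ''.join(f'{round(v * 255):02x}' for v in blended)


# a raw score of 0 sits exactly on the 0.5 decision line
print(sigmoid(0.0), mix_colors(sigmoid(0.0)))
```

Scores above zero therefore land right of the dotted 0.5 line and shade toward green; scores below zero shade toward crimson.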
classes/Metrics.py ADDED
@@ -0,0 +1,87 @@
+ import numpy as np
+
+ def ranking_metric(evalpred):
+     scores0 = evalpred[0][0]
+     scores1 = evalpred[0][1]
+     labels = evalpred[1]
+
+     # labels:
+     # 0 or 1: word 0 or 1 is more legible, other unknown
+     # 2: both words are equally legible
+     # 3: neither word is legible
+
+     pairs_evaluated = 0
+     pairs_correct = 0
+     # convert the raw scores to probabilities
+     scores0 = 1 / (1 + np.exp(-scores0))
+     scores1 = 1 / (1 + np.exp(-scores1))
+     for i in range(scores0.shape[0]):
+         # only pairs with a strict preference (label 0 or 1) are ranked
+         if labels[i] < 2:
+             pairs_evaluated += 1
+         if labels[i] == 0:
+             if scores0[i] >= scores1[i]:
+                 pairs_correct += 1
+         elif labels[i] == 1:
+             if scores1[i] >= scores0[i]:
+                 pairs_correct += 1
+
+     # guard against a batch with no rankable pairs
+     accuracy = pairs_correct / pairs_evaluated if pairs_evaluated else 0.0
+     return {'accuracy': accuracy}
+
+
+ def binary_classification_metric(evalpred):
+     scores0 = evalpred[0][0]
+     scores1 = evalpred[0][1]
+     labels = evalpred[1]
+
+     # labels:
+     # 0 or 1: word 0 or 1 is more legible, other unknown
+     # 2: both words are equally legible
+     # 3: neither word is legible
+
+     words_evaluated = 0
+     true_positives = 0
+     false_positives = 0
+     false_negatives = 0
+     true_negatives = 0
+     # convert the raw scores to probabilities
+     scores0 = 1 / (1 + np.exp(-scores0))
+     scores1 = 1 / (1 + np.exp(-scores1))
+     for i in range(scores0.shape[0]):
+         # labels 0 and 1 judge one word of the pair, labels 2 and 3 judge both
+         if labels[i] < 2:
+             words_evaluated += 1
+         else:
+             words_evaluated += 2
+         if labels[i] == 0:
+             if scores0[i] > 0.5:
+                 true_positives += 1
+             else:
+                 false_negatives += 1
+         elif labels[i] == 1:
+             if scores1[i] > 0.5:
+                 true_positives += 1
+             else:
+                 false_negatives += 1
+         elif labels[i] == 2:
+             if scores0[i] > 0.5:
+                 true_positives += 1
+             else:
+                 false_negatives += 1
+             if scores1[i] > 0.5:
+                 true_positives += 1
+             else:
+                 false_negatives += 1
+         elif labels[i] == 3:
+             if scores0[i] < 0.5:
+                 true_negatives += 1
+             else:
+                 false_positives += 1
+             if scores1[i] < 0.5:
+                 true_negatives += 1
+             else:
+                 false_positives += 1
+
+     # calculate precision, recall, accuracy and f1 score (1e-6 avoids division by zero)
+     precision = true_positives / (true_positives + false_positives + 1e-6)
+     recall = true_positives / (true_positives + false_negatives + 1e-6)
+     accuracy = (true_positives + true_negatives) / (words_evaluated + 1e-6)
+     f1_score = 2 * precision * recall / (precision + recall + 1e-6)
+     return {'precision': precision, 'recall': recall, 'accuracy': accuracy, 'f1_score': f1_score}
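The `1e-6` terms in `binary_classification_metric` keep the ratios defined when a count is zero, at the cost of values that sit very slightly below the exact fractions. A small sketch of that behavior on made-up confusion counts (the function name `prf` and the counts are illustrative assumptions):

```python
def prf(tp, fp, fn, eps=1e-6):
    """Precision, recall and F1 from confusion counts, smoothed by eps."""
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return precision, recall, f1


# ordinary case: 8 true positives, 2 false positives, 2 false negatives
p, r, f = prf(8, 2, 2)
print(p, r, f)

# degenerate case: no positives at all, which would otherwise divide by zero
print(prf(0, 0, 0))
```

This explains why the reported notebook metrics carry long decimal tails rather than clean fractions.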
classes/Perturber.py ADDED
@@ -0,0 +1,42 @@
+ from classes.Similarity import SimHelper
+ import numpy as np
+
+ class Perturber:
+     def __init__(self, model: str = 'trocr', max_k: int = 30):
+         try:
+             # load the feature file that matches the requested model
+             self.sim_space = SimHelper.create_sim_space(
+                 model, f'features/{model}.hdf', num_nearest=max_k)
+             self.model = model
+             self.max_k = max_k
+         except Exception as e:
+             raise Exception(
+                 f"Could not load similarity space for model {model}. Make sure {model} is a valid model name. Valid model names: imgdot, trocr, detr, beit, clip") from e
+
+     def perturb_word(self, word: str, k: int = 10, n: float = 0.5):
+         if k > self.max_k:
+             raise Exception(
+                 f"Cannot use k={k} for model {self.model}. Maximum k for this model is {self.max_k}.")
+         if n > 1 or n < 0:
+             raise Exception(
+                 f"Cannot use n={n} for model {self.model}. n must be between 0 and 1.")
+
+         metadata = {}
+         metadata['original'] = word
+         metadata['model'] = self.model
+         metadata['k'] = k
+         metadata['substitutions'] = []
+
+         word = list(word)
+         length = len(word)
+         # choose a fraction n of the character positions, without repeats
+         idx_to_replace = np.random.choice(
+             np.arange(length), size=int(length * n), replace=False)
+         for i in idx_to_replace:
+             # draw a neighbor rank uniformly from 1..k and take the neighbor at that rank
+             rand_k = np.random.randint(1, k+1)
+             neighbor = self.sim_space.topk_neighbors(ord(word[i]), rand_k)[-1]
+             neighbor = chr(int(neighbor))
+             metadata['substitutions'].append({'from': word[i], 'to': neighbor, 'k': rand_k})
+             word[i] = neighbor
+
+         perturbation = ''.join(word)
+
+         return perturbation, metadata
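The core of `perturb_word` is: pick a fraction `n` of positions, then swap each chosen character for one of its `k` nearest visual neighbors. A toy sketch of that idea using only the standard library; the `NEIGHBORS` table below is a hand-written stand-in for the learned similarity space, and leaving characters without an entry unchanged is an assumption of this sketch, not the repo's behavior:

```python
import random

# Hypothetical neighbor table: look-alikes ordered from most to least similar.
NEIGHBORS = {
    'a': ['á', 'à', 'ä'],
    'e': ['é', 'è', 'ë'],
    'o': ['ó', 'ò', 'ö'],
}


def perturb_word(word, k=2, n=0.5, rng=None):
    """Replace a fraction n of characters with one of their top-k neighbors."""
    rng = rng or random.Random()
    chars = list(word)
    # sample int(len * n) distinct positions, mirroring the original's rounding down
    positions = rng.sample(range(len(chars)), int(len(chars) * n))
    for i in positions:
        options = NEIGHBORS.get(chars[i])
        if options:
            # neighbor rank drawn uniformly from 1..k (capped by the table size)
            rank = rng.randint(1, min(k, len(options)))
            chars[i] = options[rank - 1]
    return ''.join(chars)


print(perturb_word('aeo', k=1, n=1.0))
```

Larger `k` admits more distant (less similar) substitutes, and larger `n` changes more of the word, which is how the two demo sliders control attack strength.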
classes/Renderer.py ADDED
@@ -0,0 +1,18 @@
+ from PIL import Image, ImageDraw, ImageFont
+
+ class Renderer:
+     def __init__(self, fontpath='unifont.ttf'):
+         # load the font at 32 px
+         self.font = ImageFont.truetype(fontpath, 32)
+
+     def render_image(self, corrupted, original):
+         # the rendered text is the perturbed word, a space, then the original word
+         text = corrupted + ' ' + original
+         # measure the text width (font.getsize requires Pillow < 10)
+         text_length_px = self.font.getsize(text)[0]
+         # create a new image with height slightly larger than the font size
+         img = Image.new('RGB', (text_length_px + 20, 40), color='white')
+         # create a drawing context
+         draw = ImageDraw.Draw(img)
+         # draw the text with a 10 px left margin
+         draw.text((10, 0), text, font=self.font, fill='black')
+         # return the image
+         return img
classes/Similarity.py ADDED
@@ -0,0 +1,73 @@
+ import numpy as np
+ import pandas as pd
+ import torch
+ import pickle
+
+ class SimilaritySpace:
+     '''
+     PyTorch tensor implementation of a similarity space.
+     Much faster than sklearn's NearestNeighbors.
+     '''
+
+     def __init__(self, desc: str, feature_vectors: pd.DataFrame, num_nearest=10) -> None:
+         self.desc = desc
+         self.device = torch.device(
+             'cuda' if torch.cuda.is_available() else 'cpu')
+         print("using device", self.device)
+         self.idx_to_codepoint = np.array(
+             feature_vectors.codepoint, dtype=np.int64)
+         self.codepoint_to_idx = {codepoint: idx for idx,
+                                  codepoint in enumerate(self.idx_to_codepoint)}
+         feature_vectors = torch.tensor(
+             np.vstack(feature_vectors.features), dtype=torch.float32, device=self.device)
+         # create a pairwise distance matrix
+         distance_matrix = self.matrix_cosine_distance(
+             feature_vectors)
+         # calculate the num_nearest nearest neighbors for each codepoint
+         distances, indices = torch.topk(
+             distance_matrix, k=num_nearest, dim=1, largest=False)
+         self.distances = distances.cpu().numpy()
+         self.indices = indices.cpu().numpy().astype(np.int64)
+         for row in self.indices:
+             # replace every element of row of indices with the corresponding codepoint
+             row[:] = self.idx_to_codepoint[row]
+
+     @staticmethod
+     def cosine_distance(x, y) -> float:
+         return 1 - np.dot(x, y) / ((np.linalg.norm(x) * np.linalg.norm(y)) + 1e-6)
+
+     @staticmethod
+     def matrix_cosine_distance(X: torch.Tensor) -> torch.Tensor:
+         '''
+         Compute the pairwise cosine distance between all rows of X.
+         X is a tensor of shape (n_samples, n_features)
+         '''
+         norm = torch.norm(X, dim=1, keepdim=True)
+         # 1e-6 avoids division by zero for all-zero rows, as in cosine_distance
+         return 1 - (X @ X.T) / (norm @ norm.T + 1e-6)
+
+     def topk_neighbors(self, codepoint: int, k: int):
+         # returns k+1 entries; the nearest one is normally the codepoint itself
+         return self.indices[self.codepoint_to_idx[codepoint]][:k+1]
+
+     def topk_distances(self, codepoint: int, k: int):
+         return self.distances[self.codepoint_to_idx[codepoint]][:k+1]
+
+     def set_desc(self, desc: str) -> None:
+         self.desc = desc
+
+
+ class SimHelper:
+     @staticmethod
+     def create_sim_space(desc: str, path: str, key: str = 'df', num_nearest: int = 10) -> SimilaritySpace:
+         '''
+         Creates a similarity space from a feature vector HDF file stored at `path` with key `key`.
+         '''
+         df = pd.read_hdf(path, key)
+         return SimilaritySpace(desc=desc, feature_vectors=df, num_nearest=num_nearest)
+
+     @staticmethod
+     def load_sim_space(name: str):
+         with open(name + '.pkl', 'rb') as f:
+             return pickle.load(f)
+
+     @staticmethod
+     def save_sim_space(sim_space: SimilaritySpace, name: str) -> None:
+         with open(name + '.pkl', 'wb') as f:
+             pickle.dump(sim_space, f)
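`SimilaritySpace` ranks characters by cosine distance between feature vectors and keeps the closest few. A minimal pure-Python sketch of the same ranking on three made-up 2-dimensional vectors; the keys and values in `features` are invented, and this loop-based version is an illustration rather than the batched tensor computation used above:

```python
import math


def cosine_distance(x, y, eps=1e-6):
    """1 - cosine similarity, with eps guarding against zero-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1 - dot / (norm + eps)


def topk_neighbors(features, query, k):
    """Keys of the k entries closest to `query`, nearest first (query included)."""
    ranked = sorted(
        (cosine_distance(features[query], vec), key) for key, vec in features.items()
    )
    return [key for _, key in ranked[:k]]


# hypothetical feature vectors: 'b' points almost the same way as 'a', 'c' is orthogonal
features = {'a': [1.0, 0.0], 'b': [0.9, 0.1], 'c': [0.0, 1.0]}
print(topk_neighbors(features, 'a', 2))
```

Because the query is at distance (almost) zero from itself, it appears first in its own neighbor list, which is why the class slices `k+1` entries to expose `k` genuine neighbors.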
classes/Trainer.py ADDED
@@ -0,0 +1,64 @@
+ import torch
+ import torch.nn.functional as F
+ from transformers import Trainer
+ from classes import LegibilityModel
+
+ class MultiTaskTrainer(Trainer):
+     '''
+     Overrides the default HuggingFace Trainer class
+     '''
+
+     def training_step(self, model: LegibilityModel.LegibilityModel, inputs):
+         # get the input images and target label from the data dictionary
+         imgs0, imgs1, labels = inputs['img0'], inputs['img1'], inputs['choice']
+         # run the model on each image of the pair
+         scores0 = model(imgs0)
+         scores1 = model(imgs1)
+         # compute the loss and backpropagate
+         loss = self.compute_loss(model, scores0, scores1, labels)
+         loss.backward()
+
+         return loss.detach()
+
+     def eval_step(self, model: LegibilityModel.LegibilityModel, inputs):
+         with torch.no_grad():
+             # get the input images and target label from the data dictionary
+             imgs0, imgs1, labels = inputs['img0'], inputs['img1'], inputs['choice']
+             # run the model
+             scores0 = model(imgs0)
+             scores1 = model(imgs1)
+             # compute the loss
+             loss = self.compute_loss(model, scores0, scores1, labels)
+             return loss
+
+     def prediction_step(self, model: LegibilityModel.LegibilityModel, inputs, prediction_loss_only=True, ignore_keys=None):
+         with torch.no_grad():
+             # get the input images and target label from the data dictionary
+             imgs0, imgs1, labels = inputs['img0'], inputs['img1'], inputs['choice']
+             # run the model
+             scores0 = model(imgs0)
+             scores1 = model(imgs1)
+             # compute the loss
+             loss = self.compute_loss(model, scores0, scores1, labels)
+             # predictions are returned as the pair of score tensors
+             return loss, [scores0, scores1], labels
+
+     def compute_loss(self, model: LegibilityModel.LegibilityModel, scores0: torch.Tensor, scores1: torch.Tensor, labels: torch.Tensor, return_outputs=False):
+         # labels:
+         # 0 or 1: word 0 or 1 is more legible, other unknown
+         # 2: both words are equally legible
+         # 3: neither word is legible
+
+         # reduction='none' keeps one loss value per example so the masks below apply element-wise
+         contrastive_term = F.binary_cross_entropy_with_logits(
+             scores0 - scores1, (labels == 0).type(torch.float), reduction='none')
+         word0_term = F.binary_cross_entropy_with_logits(
+             scores0, torch.logical_or(labels == 0, labels == 2).type(torch.float), reduction='none')
+         word1_term = F.binary_cross_entropy_with_logits(
+             scores1, torch.logical_or(labels == 1, labels == 2).type(torch.float), reduction='none')
+
+         # mask out terms which are not relevant for the loss
+         mask_c = labels < 2
+         mask_0 = torch.logical_or(torch.logical_or(labels == 0, labels == 2), labels == 3)
+         mask_1 = labels > 0
+
+         # compute the loss
+         loss = mask_c * contrastive_term + mask_0 * word0_term + mask_1 * word1_term
+         return loss.mean()
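The loss in `compute_loss` combines three binary cross-entropy terms and switches each one on or off depending on the label. A single-example sketch in plain Python, written under the assumption that the masks are meant to apply per example; `bce_with_logits` and `pair_loss` are illustrative helper names, not repo functions:

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy on a raw logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def pair_loss(score0, score1, label):
    """Multi-task loss for one image pair, following the label scheme above."""
    contrastive = bce_with_logits(score0 - score1, float(label == 0))
    word0 = bce_with_logits(score0, float(label in (0, 2)))
    word1 = bce_with_logits(score1, float(label in (1, 2)))
    loss = 0.0
    if label < 2:              # a strict preference exists: use the ranking term
        loss += contrastive
    if label in (0, 2, 3):     # word 0's legibility is known
        loss += word0
    if label > 0:              # word 1's legibility is known
        loss += word1
    return loss


# label 0 says word 0 is more legible: agreeing scores give a much lower loss
print(pair_loss(5.0, -5.0, 0), pair_loss(-5.0, 5.0, 0))
```

For label 0 only the ranking term and word 0's classification term contribute, because the annotation says nothing about whether word 1 is legible on its own.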
demo.py ADDED
@@ -0,0 +1,60 @@
+ import gradio as gr
+ from classes.Perturber import Perturber
+ from classes.Renderer import Renderer
+ from classes.LegibilityPlot import LegibilityPlot
+
+ from transformers import TrOCRProcessor, AutoModel
+
+ # preprocessor provides image normalization and resizing
+ preprocessor = TrOCRProcessor.from_pretrained(
+     "microsoft/trocr-base-handwritten")
+
+ # load the model schema and pretrained weights
+ # (this may take some time to download)
+ model = AutoModel.from_pretrained("dvsth/LEGIT-TrOCR-MT", revision='main', trust_remote_code=True)
+
+ perturber = Perturber('trocr', 50)
+ renderer = Renderer('unifont.ttf')
+ plotter = LegibilityPlot()
+
+ def demo(word_to_perturb, k, n):
+     if ' ' in word_to_perturb:
+         # the interface has two outputs (text, plot), so return a value for each
+         return 'Please enter a single word.', None
+
+     perturbations, metadatas, images, scores = [], [], [], []
+     for _ in range(10):
+         perturbation, metadata = perturber.perturb_word(word_to_perturb, k, n)
+         # the model scores an image of the perturbed word next to the original word
+         inputimg = renderer.render_image(perturbation, word_to_perturb)
+         score = model(preprocessor(inputimg, return_tensors='pt').pixel_values).item()
+
+         metadata['score'] = score
+         outputimg = renderer.render_image(perturbation, '')
+
+         perturbations.append(perturbation)
+         images.append(outputimg)
+         metadatas.append(metadata)
+         scores.append(score)
+
+     # sort all four lists by score, using one shared ordering so they stay aligned
+     order = sorted(range(len(scores)), key=lambda i: scores[i])
+     perturbations = [perturbations[i] for i in order]
+     images = [images[i] for i in order]
+     metadatas = [metadatas[i] for i in order]
+     scores = [scores[i] for i in order]
+
+     # return as a single string in the format
+     # perturbation1 (score1)
+     # perturbation2 (score2)
+     # ...
+     # perturbationN (scoreN)
+     # with all scores rounded to 2 decimal places
+     # (scores are raw logits, so 0 corresponds to a probability of 0.5)
+     ret_str = ''
+     for i in range(len(perturbations)):
+         ret_str += f'{perturbations[i]} ({round(scores[i], 2)}) -- ' + ("legible" if scores[i] > 0 else "not legible") + '\n'
+
+     # plot the perturbations and scores
+     fig = plotter.plot(scores, perturbations)
+     return ret_str, fig
+
+ interface = gr.Interface(fn=demo, inputs=["text", gr.Slider(1, 50, 20, step=1), gr.Slider(0., 1., 0.5)], outputs=["text", "plot"], allow_flagging='never')
+
+ interface.launch(inbrowser=True)
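The demo keeps perturbations, images, metadata and scores in parallel lists and sorts them by score. One way to keep such lists aligned is to compute a single ordering of indices and apply it to every list; a small sketch with made-up values (the helper name `sort_together` is hypothetical):

```python
def sort_together(scores, *others):
    """Sort `scores` ascending and reorder every other list the same way."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    sorted_scores = [scores[i] for i in order]
    sorted_others = [[seq[i] for i in order] for seq in others]
    return (sorted_scores, *sorted_others)


scores = [0.3, -1.2, 2.0]
words = ['b', 'a', 'c']
sorted_scores, sorted_words = sort_together(scores, words)
print(sorted_scores, sorted_words)
```

Sorting each list separately against a `scores` list that has already been reordered would pair items with the wrong scores, so the shared index order matters.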
features/beit.hdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:363347be388b8731dd54bbf537f1c2e459dcbec06e86a32e4ba7d24906200864
+ size 35072936
features/clip.hdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a94ed9b51b0b3a25e94939f7cf08b4a4b1a68edeba4e33ed45a5b904742aee24
+ size 35072936
features/detr.hdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:260b70c631fcefade5ef9d521c1619c3e4298adc5d8e298cecffb47d00faa3de
+ size 12761192
features/imgdot.hdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:930ebdf7588aa49dc70fd6c4c4beded7aba31e6d103abc7c25841e0ef16933f0
+ size 12761192
features/trocr.hdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f17b863064d15f02807549e654d4088966d2422a0f39d7bbf1100440afa58d58
+ size 35072936
requirements.txt ADDED
@@ -0,0 +1,197 @@
+ absl-py @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_e3n1cffagz/croot/absl-py_1666362938899/work
+ -e git+ssh://git@github.com/acl-org/aclpubcheck.git@1050a4fbca1602f39150e07f303627d36a452938#egg=aclpubcheck
+ aiofiles==23.1.0
+ aiohttp @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_4c_8pz93lf/croot/aiohttp_1670009562783/work
+ aiosignal @ file:///tmp/build/80754af9/aiosignal_1637843061372/work
+ altair==4.2.2
+ anyio @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/croot-t_zs64wy/anyio_1644482593257/work/dist
+ appnope @ file:///Users/ktietz/demo/mc3/conda-bld/appnope_1629146036738/work
+ argon2-cffi @ file:///opt/conda/conda-bld/argon2-cffi_1645000214183/work
+ argon2-cffi-bindings @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/croot-wbf5edig/argon2-cffi-bindings_1644845754377/work
+ asttokens @ file:///opt/conda/conda-bld/asttokens_1646925590279/work
+ async-timeout @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_19fv27ehgp/croots/recipe/async-timeout_1664876371666/work
+ attrs @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_978y9aqcd7/croot/attrs_1668696180911/work
+ autopep8 @ file:///opt/conda/conda-bld/autopep8_1650463822033/work
+ Babel @ file:///tmp/build/80754af9/babel_1620871417480/work
+ backcall @ file:///home/ktietz/src/ci/backcall_1611930011877/work
+ beautifulsoup4 @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-15cbtalq/beautifulsoup4_1650462161715/work
+ bibtexparser==1.4.0
+ bleach @ file:///opt/conda/conda-bld/bleach_1641577558959/work
+ blinker==1.4
+ Bottleneck @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_07078715-3ab7-4562-8d3d-d56b0eaa0f7dp504n_ny/croots/recipe/bottleneck_1657175566567/work
+ brotlipy==0.7.0
+ cachetools @ file:///tmp/build/80754af9/cachetools_1619597386817/work
+ certifi @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_5d968ni_yn/croot/certifi_1671487774636/work/certifi
+ cffi @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/croot-jgj0vmyy/cffi_1642701117808/work
+ charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
+ click @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-quwchbn8/click_1646123324461/work
+ conda==23.1.0
+ conda-content-trust @ file:///tmp/build/80754af9/conda-content-trust_1617045594566/work
+ conda-package-handling @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-4sc96bd_/conda-package-handling_1649105290173/work
+ cryptography @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_5871d1ea-0250-4cd7-ac89-4b1e60514f5daqk8t0ow/croots/recipe/cryptography_1652101128666/work
+ cycler @ file:///tmp/build/80754af9/cycler_1637851556182/work
+ datasets @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_f0bj9iam3j/croot/datasets_1668181516280/work
+ debugpy @ file:///Users/builder/miniconda3/envs/prefect/conda-bld/debugpy_1637092214173/work
+ decorator @ file:///opt/conda/conda-bld/decorator_1643638310831/work
+ defusedxml @ file:///tmp/build/80754af9/defusedxml_1615228127516/work
+ dill @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_7eaenha9af/croot/dill_1667919539340/work
+ entrypoints @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-jb01gaox/entrypoints_1650293758411/work
+ et-xmlfile==1.1.0
+ executing @ file:///opt/conda/conda-bld/executing_1646925071911/work
+ fastapi==0.94.0
+ fastjsonschema @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_43a0jaiddu/croots/recipe/python-fastjsonschema_1661368628129/work
+ ffmpy==0.3.0
+ filelock @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_20tv0lrp8x/croot/filelock_1672387134240/work
+ flit_core @ file:///opt/conda/conda-bld/flit-core_1644941570762/work/source/flit_core
+ fonttools==4.25.0
+ frozenlist @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_38d0xkqt7k/croot/frozenlist_1670004509129/work
+ fsspec @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_c8b8kofgq9/croot/fsspec_1670336600001/work
+ google-auth @ file:///opt/conda/conda-bld/google-auth_1646735974934/work
+ google-auth-oauthlib==0.4.1
+ gradio==3.20.1
+ grpcio @ file:///Users/builder/miniconda3/envs/prefect/conda-bld/grpcio_1637592348692/work
+ h11==0.14.0
+ httpcore==0.16.3
+ httpx==0.23.3
+ huggingface-hub==0.11.1
+ idna @ file:///tmp/build/80754af9/idna_1637925883363/work
+ importlib-metadata @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-5pqd2z6f/importlib-metadata_1648710902288/work
+ ipykernel @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_7boejo_bxp/croots/recipe/ipykernel_1662361806503/work
+ ipython @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_813bgy7kg6/croot/ipython_1668088123202/work
+ ipython-genutils @ file:///tmp/build/80754af9/ipython_genutils_1606773439826/work
+ ipywidgets @ file:///tmp/build/80754af9/ipywidgets_1634143127070/work
+ jedi @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/croot-f1t6hma6/jedi_1644315882177/work
+ Jinja2 @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_9fjgzv9ant/croot/jinja2_1666908141308/work
+ json5 @ file:///tmp/build/80754af9/json5_1624432770122/work
+ jsonschema @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_e5_l6coyjh/croots/recipe/jsonschema_1663375475589/work
+ jupyter @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_c96hs6nzjt/croots/recipe/jupyter_1659349054648/work
+ jupyter-console @ file:///opt/conda/conda-bld/jupyter_console_1647002188872/work
+ jupyter-server @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_20576569-56d2-46ec-9455-d266af658edbzbqhavnh/croots/recipe/jupyter_server_1658754492228/work
+ jupyter_client @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_3bl8w897rj/croot/jupyter_client_1669040279222/work
+ jupyter_core @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_9bflj8byfz/croot/jupyter_core_1668084444072/work
+ jupyterlab @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_ebr25akxuh/croot/jupyterlab_1669368478002/work
+ jupyterlab-pygments @ file:///tmp/build/80754af9/jupyterlab_pygments_1601490720602/work
+ jupyterlab-widgets @ file:///tmp/build/80754af9/jupyterlab_widgets_1609884341231/work
+ jupyterlab_server @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_bfh24ty__4/croot/jupyterlab_server_1669363632509/work
+ kiwisolver @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_69de275f-e01d-4033-8ea6-9caf70c31072d5uirqak/croots/recipe/kiwisolver_1653292042660/work
+ latexcodec==2.0.1
+ linkify-it-py==2.0.0
+ lxml @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_71bcfe2b-fe7b-414a-9d7e-4f32bdd95f6d2vxca0jd/croots/recipe/lxml_1657545136492/work
+ Markdown @ file:///Users/ktietz/demo/mc3/conda-bld/markdown_1629712599908/work
+ markdown-it-py==2.2.0
+ MarkupSafe @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_12c133f5-0720-4727-9c18-599a3af825723lzwham3/croots/recipe/markupsafe_1654597866058/work
+ matplotlib @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_5fi9l6u9e6/croots/recipe/matplotlib-suite_1660167931049/work
+ matplotlib-inline @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_f6fdc0hldi/croots/recipe/matplotlib-inline_1662014472341/work
+ mdit-py-plugins==0.3.3
+ mdurl==0.1.2
+ mistune @ file:///Users/ktietz/demo/mc3/conda-bld/mistune_1629356075445/work
+ mock @ file:///tmp/build/80754af9/mock_1607622725907/work
+ multidict @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_bd3x0mcjpv/croot/multidict_1665674237951/work
+ multiprocess @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_c71oabygq3/croot/multiprocess_1668006436256/work
+ munkres==1.1.4
+ nbclassic @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_3btejl1i1y/croot/nbclassic_1668174982725/work
+ nbclient @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-08wgx75f/nbclient_1650373566605/work
+ nbconvert @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_36au3u9s44/croot/nbconvert_1668450648628/work
+ nbformat @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_edm2jl54tz/croots/recipe/nbformat_1663744962951/work
+ nest-asyncio @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-xymukih3/nest-asyncio_1649931465456/work
+ notebook @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_92n14lq88x/croot/notebook_1668179891126/work
+ notebook_shim @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_f8mr1gjfb7/croot/notebook-shim_1668160580414/work
+ numexpr @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_08e1f2b1-735c-4635-9755-5afc6d3eb18ew8ilpa0t/croots/recipe/numexpr_1656940301129/work
+ numpy @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_9bff6b9b-6adc-40a2-b632-076fce306026zq15ylrw/croots/recipe/numpy_and_numpy_base_1653915521605/work
+ oauthlib @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_4aroovdrpz/croot/oauthlib_1665490908808/work
+ openpyxl==3.0.10
+ orjson==3.8.7
+ packaging @ file:///tmp/build/80754af9/packaging_1637314298585/work
+ pandas==1.4.4
+ pandocfilters @ file:///opt/conda/conda-bld/pandocfilters_1643405455980/work
+ parso @ file:///opt/conda/conda-bld/parso_1641458642106/work
+ pdfminer.six==20221105
+ pdfplumber==0.7.6
+ pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
+ pickleshare @ file:///tmp/build/80754af9/pickleshare_1606932040724/work
+ Pillow==9.2.0
+ pluggy @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_croot-w6jyveby/pluggy_1648109277227/work
+ ply==3.11
+ prometheus-client @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_25sgeyk0j5/croots/recipe/prometheus_client_1659455103277/work
+ prompt-toolkit @ file:///tmp/build/80754af9/prompt-toolkit_1633440160888/work
+ protobuf==3.20.1
+ psutil @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_1310b568-21f4-4cb0-b0e3-2f3d31e39728k9coaga5/croots/recipe/psutil_1656431280844/work
+ ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1609355006118/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
+ pure-eval @ file:///opt/conda/conda-bld/pure_eval_1646925070566/work
+ pyarrow==8.0.0
+ pyasn1 @ file:///Users/ktietz/demo/mc3/conda-bld/pyasn1_1629708007385/work
+ pyasn1-modules==0.2.8
+ pybtex==0.24.0
+ pycodestyle @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_5b2mq44vl0/croot/pycodestyle_1674267228581/work
+ pycosat==0.6.3
+ pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
+ pycryptodome==3.17
+ pydantic==1.10.6
+ pydub==0.25.1
+ Pygments @ file:///opt/conda/conda-bld/pygments_1644249106324/work
+ PyJWT @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_83311740-c3bc-48a8-9688-62d7e2d625f5s6jo920w/croots/recipe/pyjwt_1657544592579/work
+ pylatexenc==2.10
+ pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
+ pyparsing @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_3b_3vxnd07/croots/recipe/pyparsing_1661452540919/work
+ PyQt5-sip==12.11.0
+ pyrsistent @ file:///Users/ktietz/demo/mc3/conda-bld/pyrsistent_1628941062930/work
+ PySocks @ file:///Users/ktietz/Code/oss/ci_pkgs/pysocks_1626781349491/work
+ python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
+ python-multipart==0.0.6
+ pytz @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_830e8540-4c59-42fe-9515-a6f79c24ff6dh653xa67/croots/recipe/pytz_1654762630628/work
+ PyYAML @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_8dd_9u21zz/croot/pyyaml_1670514759576/work
+ pyzmq @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_8599562e-e9e5-443b-91db-7f7c0ba6aad3mrdoyvz4/croots/recipe/pyzmq_1657724196154/work
+ qtconsole @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_213etb6fb4/croots/recipe/qtconsole_1662018260717/work
+ QtPy @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_90hl8ymlpx/croots/recipe/qtpy_1662014534092/work
+ rebiber==1.1.3
+ regex==2022.10.31
+ requests @ file:///opt/conda/conda-bld/requests_1641824580448/work
+ requests-oauthlib==1.3.0
+ responses @ file:///tmp/build/80754af9/responses_1619800270522/work
+ rfc3986==1.5.0
+ rsa @ file:///tmp/build/80754af9/rsa_1614366226499/work
+ ruamel-yaml-conda @ file:///Users/ktietz/demo/mc3/conda-bld/ruamel_yaml_1629464769899/work
+ ruamel.yaml @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_dby9tad7ud/croots/recipe/ruamel.yaml_1664988526357/work
+ ruamel.yaml.clib @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_f64xdg2rww/croot/ruamel.yaml.clib_1666302244208/work
+ seaborn @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_bfw6gzh1ye/croot/seaborn_1669625743104/work
+ Send2Trash @ file:///tmp/build/80754af9/send2trash_1632406701022/work
+ sip @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_fbqiv4bzwo/croots/recipe/sip_1659012372184/work
+ six @ file:///tmp/build/80754af9/six_1644875935023/work
+ sniffio @ file:///Users/ktietz/demo/mc3/conda-bld/sniffio_1629145892482/work
+ soupsieve @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_d2jpk7eoyp/croot/soupsieve_1666296398381/work
+ stack-data @ file:///opt/conda/conda-bld/stack_data_1646927590127/work
+ starlette==0.26.0.post1
+ tables @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_19ckz1b_eh/croot/pytables_1673967665080/work
+ tensorboard @ file:///tmp/build/80754af9/tensorboard_1633093581375/work/tensorboard-2.6.0-py3-none-any.whl
+ tensorboard-data-server @ file:///var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_91lm1y0ip6/croot/tensorboard-data-server_1670853587549/work/tensorboard_data_server-0.6.1-py3-none-macosx_11_0_arm64.whl
+ tensorboard-plugin-wit==1.6.0
+ termcolor==2.2.0
+ terminado @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/croot-n1puqh63/terminado_1644395131327/work
+ timm==0.6.12
+ tinycss2 @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_fcw5_i306t/croot/tinycss2_1668168825117/work
+ tokenizers==0.13.2
+ toml @ file:///tmp/build/80754af9/toml_1616166611790/work
+ tomli @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_d0e5ffbf-5cf1-45be-8693-c5dff8108a2awhthtjlq/croots/recipe/tomli_1657175508477/work
+ toolz @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_362wyqvvgy/croot/toolz_1667464079070/work
+ torch==1.13.1
+ torchvision==0.14.1
+ tornado @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_a61b4xoie9/croots/recipe/tornado_1662061692951/work
+ tqdm @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_d8374dd1-2388-4771-b0ef-a205a18076f0p40h0nwh/croots/recipe/tqdm_1650891080348/work
+ traitlets @ file:///tmp/build/80754af9/traitlets_1636710298902/work
+ transformers==4.25.1
+ tsv==1.2
+ typing_extensions @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_10a8_u2ijw/croot/typing_extensions_1669923788997/work
+ uc-micro-py==1.0.1
+ Unidecode==1.3.6
+ urllib3 @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_dc7bd08d-3aa9-4e45-9124-1c432b2bee1dmp2r0ell/croots/recipe/urllib3_1650640006820/work
+ uvicorn==0.21.0
+ Wand==0.6.11
+ wcwidth @ file:///Users/ktietz/demo/mc3/conda-bld/wcwidth_1629357192024/work
+ webencodings==0.5.1
+ websocket-client @ file:///Users/ktietz/demo/mc3/conda-bld/websocket-client_1629357070578/work
+ websockets==10.4
+ Werkzeug @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_54t7yp6vfo/croot/werkzeug_1671215998207/work
+ widgetsnbextension @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/croot-ppn8qoma/widgetsnbextension_1645010005457/work
+ xxhash @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_5dctqdmbn8/croot/python-xxhash_1667919512545/work
+ yarl @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_a5sx8bz_hq/croots/recipe/yarl_1661437083380/work
+ zipp @ file:///private/var/folders/nz/j6p8yfhx1mv_0grj5xl4650h0000gp/T/abs_66c2c5f2-5dd5-4946-a16a-72af650ebd6cnmz4ou0f/croots/recipe/zipp_1652343960956/work
unifont.ttf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c632666c659ccdfdb841f151aa6cc48cb987e093b90806f5af3d5a4bea7c54a5
+ size 12273956