ronig committed on
Commit
d40c4dd
1 Parent(s): 081edaa

updating model peptriever_2023-06-23T16:07:24.508460

Files changed (3)
  1. special_tokens_map.json +3 -0
  2. tokenizer.json +2094 -0
  3. tokenizer_config.json +6 -0
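For orientation, the `tokenizer.json` added in this commit declares a `BPE` model whose `merges` list is applied in file order to sequences of single-letter residues. The following is a minimal sketch of that merge loop, not the `tokenizers` library's implementation. It assumes a small subset of the file's merges and vocab (`MERGES` and `VOCAB` are illustrative names), and it skips the `ByteLevel` pre-tokenizer, which would also prepend the `Ġ` prefix-space marker.

```python
# Hypothetical subset of the "merges" list in tokenizer.json, kept in file order.
MERGES = [("A", "A"), ("G", "G"), ("L", "L"), ("A", "G"), ("A", "L"), ("V", "L")]

# Matching subset of the "vocab" mapping from the same file.
VOCAB = {
    "A": 3, "G": 9, "L": 13, "V": 23,
    "AA": 29, "GG": 30, "LL": 31, "AG": 32, "AL": 33, "VL": 34,
}


def bpe_merge(sequence, merges):
    """Greedily apply BPE merges: lowest-ranked pair first, leftmost on ties."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(sequence)  # start from single characters
    while len(symbols) > 1:
        # Collect every adjacent pair that has a merge rule, with its position.
        candidates = [
            (ranks[pair], i)
            for i, pair in enumerate(zip(symbols, symbols[1:]))
            if pair in ranks
        ]
        if not candidates:
            break  # no applicable merge left
        _, i = min(candidates)
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols


tokens = bpe_merge("AAGVL", MERGES)
ids = [VOCAB[token] for token in tokens]
print(tokens, ids)
```

With the full merges list the result would differ, since later rules such as `AA G` combine these tokens further; the subset is only meant to show how rank order drives the merging.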
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
+ {
+ "pad_token": "<pad>"
+ }
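The `pad_token` registered here pairs with the `padding` block of `tokenizer.json` in the same commit (`strategy: BatchLongest`, `direction: Right`, `pad_id: 1`). As a rough illustration of what those settings imply, here is a plain-Python sketch that right-pads a batch to its longest sequence. The helper name `pad_batch` and the sample ids are assumptions for the example, not part of the repository.

```python
import json

# Contents of special_tokens_map.json as added in this commit.
SPECIAL_TOKENS_MAP = json.loads('{"pad_token": "<pad>"}')

# Id of "<pad>" according to the "padding" block of tokenizer.json.
PAD_ID = 1


def pad_batch(batch, pad_id=PAD_ID):
    """Right-pad every sequence to the longest one and build attention masks."""
    longest = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in batch]
    attention_mask = [
        [1] * len(seq) + [0] * (longest - len(seq)) for seq in batch
    ]
    return input_ids, attention_mask


# Two already-encoded sequences of different lengths (illustrative ids).
input_ids, attention_mask = pad_batch([[29, 9, 34], [3]])
print(SPECIAL_TOKENS_MAP["pad_token"], input_ids, attention_mask)
```

Padding on the right keeps real tokens at the same positions across the batch, and the mask lets a model ignore the padded slots.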
tokenizer.json ADDED
@@ -0,0 +1,2094 @@
+ {
+ "version": "1.0",
+ "truncation": null,
+ "padding": {
+ "strategy": "BatchLongest",
+ "direction": "Right",
+ "pad_to_multiple_of": null,
+ "pad_id": 1,
+ "pad_type_id": 0,
+ "pad_token": "<pad>"
+ },
+ "added_tokens": [
+ {
+ "id": 0,
+ "content": "<unk>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 1,
+ "content": "<pad>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 2,
+ "content": "<mask>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ }
+ ],
+ "normalizer": {
+ "type": "NFKC"
+ },
+ "pre_tokenizer": {
+ "type": "ByteLevel",
+ "add_prefix_space": true,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "post_processor": {
+ "type": "ByteLevel",
+ "add_prefix_space": true,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "decoder": {
+ "type": "ByteLevel",
+ "add_prefix_space": true,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "model": {
+ "type": "BPE",
+ "dropout": null,
+ "unk_token": "<unk>",
+ "continuing_subword_prefix": null,
+ "end_of_word_suffix": null,
+ "fuse_unk": false,
+ "byte_fallback": false,
+ "vocab": {
71
+ "<unk>": 0,
72
+ "<pad>": 1,
73
+ "<mask>": 2,
74
+ "A": 3,
75
+ "B": 4,
76
+ "C": 5,
77
+ "D": 6,
78
+ "E": 7,
79
+ "F": 8,
80
+ "G": 9,
81
+ "H": 10,
82
+ "I": 11,
83
+ "K": 12,
84
+ "L": 13,
85
+ "M": 14,
86
+ "N": 15,
87
+ "O": 16,
88
+ "P": 17,
89
+ "Q": 18,
90
+ "R": 19,
91
+ "S": 20,
92
+ "T": 21,
93
+ "U": 22,
94
+ "V": 23,
95
+ "W": 24,
96
+ "X": 25,
97
+ "Y": 26,
98
+ "Z": 27,
99
+ "Ġ": 28,
100
+ "AA": 29,
101
+ "GG": 30,
102
+ "LL": 31,
103
+ "AG": 32,
104
+ "AL": 33,
105
+ "VL": 34,
106
+ "EL": 35,
107
+ "SL": 36,
108
+ "SG": 37,
109
+ "TL": 38,
110
+ "KL": 39,
111
+ "AV": 40,
112
+ "DL": 41,
113
+ "EE": 42,
114
+ "RL": 43,
115
+ "VV": 44,
116
+ "TG": 45,
117
+ "IL": 46,
118
+ "SS": 47,
119
+ "XX": 48,
120
+ "AE": 49,
121
+ "KK": 50,
122
+ "GL": 51,
123
+ "TV": 52,
124
+ "NL": 53,
125
+ "AI": 54,
126
+ "CC": 55,
127
+ "AR": 56,
128
+ "DG": 57,
129
+ "AK": 58,
130
+ "PL": 59,
131
+ "AS": 60,
132
+ "QL": 61,
133
+ "DI": 62,
134
+ "HH": 63,
135
+ "EK": 64,
136
+ "DV": 65,
137
+ "EV": 66,
138
+ "SV": 67,
139
+ "PG": 68,
140
+ "AT": 69,
141
+ "FL": 70,
142
+ "EG": 71,
143
+ "EI": 72,
144
+ "RV": 73,
145
+ "PV": 74,
146
+ "RG": 75,
147
+ "YL": 76,
148
+ "TI": 77,
149
+ "KV": 78,
150
+ "AD": 79,
151
+ "NG": 80,
152
+ "SI": 81,
153
+ "RI": 82,
154
+ "AQ": 83,
155
+ "CG": 84,
156
+ "KG": 85,
157
+ "TT": 86,
158
+ "NV": 87,
159
+ "KI": 88,
160
+ "FG": 89,
161
+ "NI": 90,
162
+ "AP": 91,
163
+ "ST": 92,
164
+ "DP": 93,
165
+ "FV": 94,
166
+ "RR": 95,
167
+ "SP": 96,
168
+ "DE": 97,
169
+ "QG": 98,
170
+ "KE": 99,
171
+ "QV": 100,
172
+ "XXXX": 101,
173
+ "YG": 102,
174
+ "AF": 103,
175
+ "UU": 104,
176
+ "TP": 105,
177
+ "NP": 106,
178
+ "SK": 107,
179
+ "SR": 108,
180
+ "SE": 109,
181
+ "YV": 110,
182
+ "SF": 111,
183
+ "QI": 112,
184
+ "ĠM": 113,
185
+ "DF": 114,
186
+ "HL": 115,
187
+ "DK": 116,
188
+ "AN": 117,
189
+ "AC": 118,
190
+ "RE": 119,
191
+ "AM": 120,
192
+ "IG": 121,
193
+ "RK": 122,
194
+ "TE": 123,
195
+ "SN": 124,
196
+ "AY": 125,
197
+ "PE": 126,
198
+ "ML": 127,
199
+ "IV": 128,
200
+ "TF": 129,
201
+ "DD": 130,
202
+ "TK": 131,
203
+ "RF": 132,
204
+ "SQ": 133,
205
+ "UG": 134,
206
+ "SD": 135,
207
+ "SY": 136,
208
+ "NN": 137,
209
+ "RP": 138,
210
+ "TQ": 139,
211
+ "RD": 140,
212
+ "AH": 141,
213
+ "TD": 142,
214
+ "PP": 143,
215
+ "RQ": 144,
216
+ "NK": 145,
217
+ "NE": 146,
218
+ "YI": 147,
219
+ "VG": 148,
220
+ "NF": 149,
221
+ "HG": 150,
222
+ "GGG": 151,
223
+ "PI": 152,
224
+ "YF": 153,
225
+ "QE": 154,
226
+ "MV": 155,
227
+ "MG": 156,
228
+ "QK": 157,
229
+ "TY": 158,
230
+ "FI": 159,
231
+ "SH": 160,
232
+ "KD": 161,
233
+ "HHHH": 162,
234
+ "CL": 163,
235
+ "TR": 164,
236
+ "AAG": 165,
237
+ "WL": 166,
238
+ "XXXXXXXX": 167,
239
+ "QQ": 168,
240
+ "HV": 169,
241
+ "TN": 170,
242
+ "FE": 171,
243
+ "IE": 172,
244
+ "YK": 173,
245
+ "TS": 174,
246
+ "YE": 175,
247
+ "AGG": 176,
248
+ "PK": 177,
249
+ "PD": 178,
250
+ "IK": 179,
251
+ "PF": 180,
252
+ "SM": 181,
253
+ "RN": 182,
254
+ "YD": 183,
255
+ "VE": 184,
256
+ "ID": 185,
257
+ "UC": 186,
258
+ "AW": 187,
259
+ "TH": 188,
260
+ "RY": 189,
261
+ "FD": 190,
262
+ "FK": 191,
263
+ "II": 192,
264
+ "QD": 193,
265
+ "ND": 194,
266
+ "ED": 195,
267
+ "VK": 196,
268
+ "YY": 197,
269
+ "ME": 198,
270
+ "QR": 199,
271
+ "QP": 200,
272
+ "WG": 201,
273
+ "NY": 202,
274
+ "QF": 203,
275
+ "MK": 204,
276
+ "IP": 205,
277
+ "RT": 206,
278
+ "RH": 207,
279
+ "NT": 208,
280
+ "RS": 209,
281
+ "VD": 210,
282
+ "VP": 211,
283
+ "AAL": 212,
284
+ "NQ": 213,
285
+ "ALL": 214,
286
+ "KP": 215,
287
+ "SC": 216,
288
+ "SW": 217,
289
+ "VI": 218,
290
+ "UGG": 219,
291
+ "MI": 220,
292
+ "YP": 221,
293
+ "FP": 222,
294
+ "ER": 223,
295
+ "ACC": 224,
296
+ "HI": 225,
297
+ "HP": 226,
298
+ "TC": 227,
299
+ "TM": 228,
300
+ "NR": 229,
301
+ "EF": 230,
302
+ "NS": 231,
303
+ "YR": 232,
304
+ "YQ": 233,
305
+ "HK": 234,
306
+ "GV": 235,
307
+ "MD": 236,
308
+ "FF": 237,
309
+ "EQ": 238,
310
+ "GI": 239,
311
+ "TW": 240,
312
+ "LLL": 241,
313
+ "KR": 242,
314
+ "AU": 243,
315
+ "SGL": 244,
316
+ "HE": 245,
317
+ "XXXXXXXXXXXXXXXX": 246,
318
+ "YN": 247,
319
+ "KN": 248,
320
+ "HHHHHH": 249,
321
+ "KT": 250,
322
+ "AVL": 251,
323
+ "CGG": 252,
324
+ "EP": 253,
325
+ "CV": 254,
326
+ "FR": 255,
327
+ "MP": 256,
328
+ "KQ": 257,
329
+ "FS": 258,
330
+ "FN": 259,
331
+ "YT": 260,
332
+ "ĠG": 261,
333
+ "YS": 262,
334
+ "DN": 263,
335
+ "DS": 264,
336
+ "DR": 265,
337
+ "DT": 266,
338
+ "QN": 267,
339
+ "AGL": 268,
340
+ "QT": 269,
341
+ "WV": 270,
342
+ "QS": 271,
343
+ "IN": 272,
344
+ "FT": 273,
345
+ "IR": 274,
346
+ "CCG": 275,
347
+ "GK": 276,
348
+ "ASL": 277,
349
+ "EN": 278,
350
+ "DY": 279,
351
+ "HF": 280,
352
+ "VR": 281,
353
+ "AEL": 282,
354
+ "IS": 283,
355
+ "QM": 284,
356
+ "ET": 285,
357
+ "ALG": 286,
358
+ "ES": 287,
359
+ "PR": 288,
360
+ "IT": 289,
361
+ "AGC": 290,
362
+ "ATL": 291,
363
+ "KS": 292,
364
+ "UAA": 293,
365
+ "KY": 294,
366
+ "MR": 295,
367
+ "ACG": 296,
368
+ "KF": 297,
369
+ "HR": 298,
370
+ "ATG": 299,
371
+ "MT": 300,
372
+ "WI": 301,
373
+ "DQ": 302,
374
+ "GE": 303,
375
+ "PQ": 304,
376
+ "PT": 305,
377
+ "MN": 306,
378
+ "ASG": 307,
379
+ "ADL": 308,
380
+ "FQ": 309,
381
+ "FY": 310,
382
+ "IQ": 311,
383
+ "AAAA": 312,
384
+ "VT": 313,
385
+ "GD": 314,
386
+ "HQ": 315,
387
+ "EY": 316,
388
+ "AUG": 317,
389
+ "VS": 318,
390
+ "AUU": 319,
391
+ "AEE": 320,
392
+ "LLG": 321,
393
+ "NH": 322,
394
+ "AKL": 323,
395
+ "IF": 324,
396
+ "ARL": 325,
397
+ "HD": 326,
398
+ "VN": 327,
399
+ "AIL": 328,
400
+ "RW": 329,
401
+ "MQ": 330,
402
+ "AKK": 331,
403
+ "GGL": 332,
404
+ "VQ": 333,
405
+ "CP": 334,
406
+ "UCC": 335,
407
+ "PY": 336,
408
+ "MS": 337,
409
+ "WE": 338,
410
+ "PS": 339,
411
+ "WK": 340,
412
+ "HT": 341,
413
+ "NM": 342,
414
+ "DM": 343,
415
+ "HY": 344,
416
+ "APG": 345,
417
+ "VLG": 346,
418
+ "VF": 347,
419
+ "SLG": 348,
420
+ "NC": 349,
421
+ "EM": 350,
422
+ "ADG": 351,
423
+ "TGL": 352,
424
+ "DC": 353,
425
+ "QY": 354,
426
+ "RM": 355,
427
+ "IY": 356,
428
+ "ATV": 357,
429
+ "RC": 358,
430
+ "HS": 359,
431
+ "CAA": 360,
432
+ "WD": 361,
433
+ "SGG": 362,
434
+ "UUG": 363,
435
+ "ALE": 364,
436
+ "MY": 365,
437
+ "ADV": 366,
438
+ "PN": 367,
439
+ "SLL": 368,
440
+ "ADI": 369,
441
+ "MF": 370,
442
+ "GF": 371,
443
+ "AAV": 372,
444
+ "SGV": 373,
445
+ "APL": 374,
446
+ "EH": 375,
447
+ "UCG": 376,
448
+ "GR": 377,
449
+ "ASS": 378,
450
+ "CI": 379,
451
+ "LLK": 380,
452
+ "ANL": 381,
453
+ "CT": 382,
454
+ "AQL": 383,
455
+ "AGV": 384,
456
+ "WQ": 385,
457
+ "CK": 386,
458
+ "LLE": 387,
459
+ "AFL": 388,
460
+ "PH": 389,
461
+ "SSL": 390,
462
+ "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX": 391,
463
+ "CE": 392,
464
+ "AVG": 393,
465
+ "KH": 394,
466
+ "DLG": 395,
467
+ "IH": 396,
468
+ "ALV": 397,
469
+ "DH": 398,
470
+ "DW": 399,
471
+ "ILG": 400,
472
+ "SAA": 401,
473
+ "TLG": 402,
474
+ "EW": 403,
475
+ "NW": 404,
476
+ "UAG": 405,
477
+ "FH": 406,
478
+ "AAC": 407,
479
+ "AIG": 408,
480
+ "ALK": 409,
481
+ "CD": 410,
482
+ "AVV": 411,
483
+ "ELG": 412,
484
+ "TAA": 413,
485
+ "QH": 414,
486
+ "VY": 415,
487
+ "VVG": 416,
488
+ "EC": 417,
489
+ "AYL": 418,
490
+ "EEG": 419,
491
+ "RLL": 420,
492
+ "ĠMG": 421,
493
+ "AGGG": 422,
494
+ "WY": 423,
495
+ "TLL": 424,
496
+ "IC": 425,
497
+ "AUC": 426,
498
+ "EEL": 427,
499
+ "HN": 428,
500
+ "VLD": 429,
501
+ "ANG": 430,
502
+ "PM": 431,
503
+ "WN": 432,
504
+ "RLG": 433,
505
+ "EKL": 434,
506
+ "AEV": 435,
507
+ "ALR": 436,
508
+ "VLV": 437,
509
+ "CAG": 438,
510
+ "AEG": 439,
511
+ "STL": 440,
512
+ "WR": 441,
513
+ "DLV": 442,
514
+ "DGV": 443,
515
+ "MM": 444,
516
+ "GGV": 445,
517
+ "SAL": 446,
518
+ "DLL": 447,
519
+ "YH": 448,
520
+ "WT": 449,
521
+ "EEV": 450,
522
+ "RLV": 451,
523
+ "SSG": 452,
524
+ "AAGG": 453,
525
+ "PC": 454,
526
+ "YM": 455,
527
+ "ASV": 456,
528
+ "APV": 457,
529
+ "VLK": 458,
530
+ "ELL": 459,
531
+ "REL": 460,
532
+ "AFV": 461,
533
+ "RGL": 462,
534
+ "ILK": 463,
535
+ "CF": 464,
536
+ "SDL": 465,
537
+ "SKL": 466,
538
+ "AAE": 467,
539
+ "AKG": 468,
540
+ "AFG": 469,
541
+ "WF": 470,
542
+ "PLG": 471,
543
+ "IM": 472,
544
+ "TLV": 473,
545
+ "ELK": 474,
546
+ "YC": 475,
547
+ "KKG": 476,
548
+ "RAL": 477,
549
+ "ARG": 478,
550
+ "KLG": 479,
551
+ "AAK": 480,
552
+ "PVG": 481,
553
+ "EAL": 482,
554
+ "WM": 483,
555
+ "AAI": 484,
556
+ "UGAA": 485,
557
+ "ELV": 486,
558
+ "AEI": 487,
559
+ "AGI": 488,
560
+ "PW": 489,
561
+ "ANV": 490,
562
+ "LLV": 491,
563
+ "RVL": 492,
564
+ "TVL": 493,
565
+ "GLG": 494,
566
+ "VVV": 495,
567
+ "AAP": 496,
568
+ "SGT": 497,
569
+ "SRL": 498,
570
+ "KM": 499,
571
+ "CCC": 500,
572
+ "LLD": 501,
573
+ "SLV": 502,
574
+ "CN": 503,
575
+ "KLV": 504,
576
+ "GGC": 505,
577
+ "TVG": 506,
578
+ "PLV": 507,
579
+ "QW": 508,
580
+ "CR": 509,
581
+ "EEI": 510,
582
+ "AVK": 511,
583
+ "TAL": 512,
584
+ "EKI": 513,
585
+ "ANI": 514,
586
+ "ELE": 515,
587
+ "CH": 516,
588
+ "ANP": 517,
589
+ "RTL": 518,
590
+ "TGK": 519,
591
+ "TKL": 520,
592
+ "GLV": 521,
593
+ "FM": 522,
594
+ "RVG": 523,
595
+ "VLE": 524,
596
+ "AEK": 525,
597
+ "CQ": 526,
598
+ "SLK": 527,
599
+ "SLI": 528,
600
+ "TGI": 529,
601
+ "SDI": 530,
602
+ "AYV": 531,
603
+ "TAT": 532,
604
+ "KKV": 533,
605
+ "TGG": 534,
606
+ "EKG": 535,
607
+ "SVG": 536,
608
+ "RLI": 537,
609
+ "SSV": 538,
610
+ "LLI": 539,
611
+ "AAR": 540,
612
+ "ELI": 541,
613
+ "UAC": 542,
614
+ "CGC": 543,
615
+ "AGK": 544,
616
+ "DAV": 545,
617
+ "TKV": 546,
618
+ "AHL": 547,
619
+ "WS": 548,
620
+ "YSL": 549,
621
+ "EEE": 550,
622
+ "AKV": 551,
623
+ "LLQ": 552,
624
+ "EGV": 553,
625
+ "DKL": 554,
626
+ "UGGG": 555,
627
+ "FLG": 556,
628
+ "DVV": 557,
629
+ "DLK": 558,
630
+ "TVV": 559,
631
+ "YW": 560,
632
+ "ILE": 561,
633
+ "SLE": 562,
634
+ "DAL": 563,
635
+ "SVL": 564,
636
+ "AGE": 565,
637
+ "RKL": 566,
638
+ "UGC": 567,
639
+ "HM": 568,
640
+ "DGK": 569,
641
+ "AML": 570,
642
+ "ARV": 571,
643
+ "PGV": 572,
644
+ "TTL": 573,
645
+ "DLI": 574,
646
+ "CCGG": 575,
647
+ "AIV": 576,
648
+ "AYG": 577,
649
+ "TAV": 578,
650
+ "ASI": 579,
651
+ "FC": 580,
652
+ "TLK": 581,
653
+ "TIG": 582,
654
+ "RGV": 583,
655
+ "EKV": 584,
656
+ "EEK": 585,
657
+ "KVG": 586,
658
+ "TNL": 587,
659
+ "EIV": 588,
660
+ "SIL": 589,
661
+ "QLL": 590,
662
+ "KLI": 591,
663
+ "AAD": 592,
664
+ "KLK": 593,
665
+ "RRL": 594,
666
+ "TLE": 595,
667
+ "ASQ": 596,
668
+ "CS": 597,
669
+ "VLP": 598,
670
+ "AGD": 599,
671
+ "TVK": 600,
672
+ "NGL": 601,
673
+ "STG": 602,
674
+ "MH": 603,
675
+ "ALQ": 604,
676
+ "EVI": 605,
677
+ "QLG": 606,
678
+ "ILV": 607,
679
+ "DLE": 608,
680
+ "NLG": 609,
681
+ "KKI": 610,
682
+ "RVV": 611,
683
+ "EVG": 612,
684
+ "ALD": 613,
685
+ "ALT": 614,
686
+ "EGK": 615,
687
+ "ACCG": 616,
688
+ "TVP": 617,
689
+ "ALP": 618,
690
+ "FW": 619,
691
+ "DLP": 620,
692
+ "RAV": 621,
693
+ "TEL": 622,
694
+ "SIV": 623,
695
+ "QLV": 624,
696
+ "QIG": 625,
697
+ "ALI": 626,
698
+ "AIK": 627,
699
+ "NSL": 628,
700
+ "NLL": 629,
701
+ "NSG": 630,
702
+ "SEL": 631,
703
+ "QEL": 632,
704
+ "YVG": 633,
705
+ "TEE": 634,
706
+ "RIG": 635,
707
+ "RLE": 636,
708
+ "SVV": 637,
709
+ "HHHHHHS": 638,
710
+ "SAS": 639,
711
+ "AUGG": 640,
712
+ "NLK": 641,
713
+ "TGV": 642,
714
+ "QLK": 643,
715
+ "TAG": 644,
716
+ "TSL": 645,
717
+ "NGV": 646,
718
+ "SAT": 647,
719
+ "SPL": 648,
720
+ "WP": 649,
721
+ "HC": 650,
722
+ "QGL": 651,
723
+ "AAAG": 652,
724
+ "VLI": 653,
725
+ "EII": 654,
726
+ "NTL": 655,
727
+ "ARE": 656,
728
+ "SAV": 657,
729
+ "RLK": 658,
730
+ "VVI": 659,
731
+ "YLG": 660,
732
+ "SEE": 661,
733
+ "ARK": 662,
734
+ "SSK": 663,
735
+ "RGG": 664,
736
+ "RDL": 665,
737
+ "QC": 666,
738
+ "SEG": 667,
739
+ "PPG": 668,
740
+ "DAA": 669,
741
+ "RAA": 670,
742
+ "SAG": 671,
743
+ "GLK": 672,
744
+ "DIG": 673,
745
+ "FLE": 674,
746
+ "FVG": 675,
747
+ "AKE": 676,
748
+ "FLV": 677,
749
+ "SKK": 678,
750
+ "SNL": 679,
751
+ "SKI": 680,
752
+ "QLI": 681,
753
+ "AVE": 682,
754
+ "AVI": 683,
755
+ "DVG": 684,
756
+ "SGF": 685,
757
+ "GLI": 686,
758
+ "TSV": 687,
759
+ "DEL": 688,
760
+ "RVI": 689,
761
+ "DGI": 690,
762
+ "TPL": 691,
763
+ "SIG": 692,
764
+ "RIL": 693,
765
+ "RSL": 694,
766
+ "PLL": 695,
767
+ "DEV": 696,
768
+ "ACGG": 697,
769
+ "AQG": 698,
770
+ "EIG": 699,
771
+ "QGV": 700,
772
+ "SNI": 701,
773
+ "RTG": 702,
774
+ "SGI": 703,
775
+ "ARI": 704,
776
+ "AQK": 705,
777
+ "CY": 706,
778
+ "UUC": 707,
779
+ "TAD": 708,
780
+ "GGI": 709,
781
+ "EIK": 710,
782
+ "VVK": 711,
783
+ "HW": 712,
784
+ "AII": 713,
785
+ "TTG": 714,
786
+ "NGK": 715,
787
+ "KGV": 716,
788
+ "AGF": 717,
789
+ "PGD": 718,
790
+ "NVL": 719,
791
+ "NKI": 720,
792
+ "KKK": 721,
793
+ "TAS": 722,
794
+ "NLV": 723,
795
+ "SGK": 724,
796
+ "SLP": 725,
797
+ "FGV": 726,
798
+ "EGI": 727,
799
+ "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX": 728,
800
+ "TAE": 729,
801
+ "GLE": 730,
802
+ "QAA": 731,
803
+ "LLP": 732,
804
+ "VPRG": 733,
805
+ "RAE": 734,
806
+ "DRL": 735,
807
+ "RDI": 736,
808
+ "EVE": 737,
809
+ "SHM": 738,
810
+ "DVI": 739,
811
+ "TSP": 740,
812
+ "VVD": 741,
813
+ "KLE": 742,
814
+ "QLE": 743,
815
+ "ĠMGSS": 744,
816
+ "SVF": 745,
817
+ "AGAA": 746,
818
+ "NDL": 747,
819
+ "SVI": 748,
820
+ "SKG": 749,
821
+ "NVK": 750,
822
+ "AGCC": 751,
823
+ "QAL": 752,
824
+ "RAR": 753,
825
+ "DGE": 754,
826
+ "SPG": 755,
827
+ "GGK": 756,
828
+ "TTV": 757,
829
+ "SAI": 758,
830
+ "EVK": 759,
831
+ "SAE": 760,
832
+ "QVL": 761,
833
+ "YGV": 762,
834
+ "SDE": 763,
835
+ "FLK": 764,
836
+ "TSG": 765,
837
+ "TLP": 766,
838
+ "RLF": 767,
839
+ "ELP": 768,
840
+ "NKL": 769,
841
+ "TLI": 770,
842
+ "RIV": 771,
843
+ "AVD": 772,
844
+ "GGGG": 773,
845
+ "SSI": 774,
846
+ "DIV": 775,
847
+ "SEV": 776,
848
+ "PVP": 777,
849
+ "STV": 778,
850
+ "CCCG": 779,
851
+ "NAA": 780,
852
+ "SAK": 781,
853
+ "LLR": 782,
854
+ "TIL": 783,
855
+ "QSL": 784,
856
+ "DII": 785,
857
+ "KGK": 786,
858
+ "AIE": 787,
859
+ "NIL": 788,
860
+ "DKK": 789,
861
+ "EKE": 790,
862
+ "QAV": 791,
863
+ "TLD": 792,
864
+ "TDL": 793,
865
+ "VLF": 794,
866
+ "AAF": 795,
867
+ "SII": 796,
868
+ "SKD": 797,
869
+ "KIV": 798,
870
+ "NRL": 799,
871
+ "TKG": 800,
872
+ "AKI": 801,
873
+ "SKV": 802,
874
+ "KIG": 803,
875
+ "TLF": 804,
876
+ "NLYF": 805,
877
+ "TEK": 806,
878
+ "EGE": 807,
879
+ "GLP": 808,
880
+ "TGE": 809,
881
+ "AQV": 810,
882
+ "UAGG": 811,
883
+ "TPV": 812,
884
+ "SYL": 813,
885
+ "TVI": 814,
886
+ "RQL": 815,
887
+ "ATK": 816,
888
+ "QKL": 817,
889
+ "ASK": 818,
890
+ "RAG": 819,
891
+ "AMG": 820,
892
+ "APE": 821,
893
+ "DVL": 822,
894
+ "DVK": 823,
895
+ "NAL": 824,
896
+ "SEK": 825,
897
+ "VVE": 826,
898
+ "ELF": 827,
899
+ "GGF": 828,
900
+ "RSG": 829,
901
+ "SRV": 830,
902
+ "KGI": 831,
903
+ "ILP": 832,
904
+ "NVG": 833,
905
+ "SFL": 834,
906
+ "STI": 835,
907
+ "HHHHHHSSGL": 836,
908
+ "TPG": 837,
909
+ "NLE": 838,
910
+ "HLL": 839,
911
+ "NTV": 840,
912
+ "TTI": 841,
913
+ "RGK": 842,
914
+ "MW": 843,
915
+ "SNG": 844,
916
+ "KVI": 845,
917
+ "TST": 846,
918
+ "NEE": 847,
919
+ "RYL": 848,
920
+ "REE": 849,
921
+ "SNV": 850,
922
+ "TDP": 851,
923
+ "KVK": 852,
924
+ "SQL": 853,
925
+ "PLP": 854,
926
+ "SSP": 855,
927
+ "TRL": 856,
928
+ "RDG": 857,
929
+ "UACC": 858,
930
+ "TAK": 859,
931
+ "RNL": 860,
932
+ "SAR": 861,
933
+ "AAQ": 862,
934
+ "DDL": 863,
935
+ "TVE": 864,
936
+ "KII": 865,
937
+ "KLF": 866,
938
+ "RII": 867,
939
+ "CLV": 868,
940
+ "ADE": 869,
941
+ "UAAG": 870,
942
+ "RGI": 871,
943
+ "TKK": 872,
944
+ "NIV": 873,
945
+ "DIL": 874,
946
+ "IVG": 875,
947
+ "ATI": 876,
948
+ "ADY": 877,
949
+ "TAR": 878,
950
+ "FGG": 879,
951
+ "AGCG": 880,
952
+ "TTE": 881,
953
+ "SAP": 882,
954
+ "WH": 883,
955
+ "AGY": 884,
956
+ "RRV": 885,
957
+ "RFL": 886,
958
+ "ELD": 887,
959
+ "REK": 888,
960
+ "DIE": 889,
961
+ "TSS": 890,
962
+ "QVG": 891,
963
+ "NSS": 892,
964
+ "HHHHHHSSGLVPRG": 893,
965
+ "REI": 894,
966
+ "QRL": 895,
967
+ "PVI": 896,
968
+ "YLK": 897,
969
+ "PGG": 898,
970
+ "KGE": 899,
971
+ "SDG": 900,
972
+ "PGK": 901,
973
+ "DIK": 902,
974
+ "MC": 903,
975
+ "PAA": 904,
976
+ "YLL": 905,
977
+ "PAL": 906,
978
+ "SGE": 907,
979
+ "TDE": 908,
980
+ "DGL": 909,
981
+ "ILI": 910,
982
+ "VLN": 911,
983
+ "SDP": 912,
984
+ "TIK": 913,
985
+ "TFG": 914,
986
+ "SDV": 915,
987
+ "DEE": 916,
988
+ "TEI": 917,
989
+ "ATN": 918,
990
+ "YTL": 919,
991
+ "TDG": 920,
992
+ "NSV": 921,
993
+ "TSN": 922,
994
+ "NGI": 923,
995
+ "AHG": 924,
996
+ "PVV": 925,
997
+ "RVK": 926,
998
+ "ASE": 927,
999
+ "UUGG": 928,
1000
+ "PGL": 929,
1001
+ "SPE": 930,
1002
+ "NAI": 931,
1003
+ "RRK": 932,
1004
+ "YYC": 933,
1005
+ "PGI": 934,
1006
+ "SRI": 935,
1007
+ "UGGGG": 936,
1008
+ "QDL": 937,
1009
+ "ASF": 938,
1010
+ "FSG": 939,
1011
+ "RVE": 940,
1012
+ "WW": 941,
1013
+ "PGQ": 942,
1014
+ "NII": 943,
1015
+ "NVI": 944,
1016
+ "TIE": 945,
1017
+ "SSE": 946,
1018
+ "SFG": 947,
1019
+ "NVV": 948,
1020
+ "TEV": 949,
1021
+ "NLP": 950,
1022
+ "STT": 951,
1023
+ "ADK": 952,
1024
+ "REG": 953,
1025
+ "RIK": 954,
1026
+ "UUUU": 955,
1027
+ "RFG": 956,
1028
+ "PLE": 957,
1029
+ "DPE": 958,
1030
+ "DIP": 959,
1031
+ "AGAG": 960,
1032
+ "CGCG": 961,
1033
+ "NIG": 962,
1034
+ "QVK": 963,
1035
+ "EEM": 964,
1036
+ "NSI": 965,
1037
+ "ELY": 966,
1038
+ "TAI": 967,
1039
+ "TPE": 968,
1040
+ "NLI": 969,
1041
+ "PGE": 970,
1042
+ "EID": 971,
1043
+ "AWL": 972,
1044
+ "IGI": 973,
1045
+ "PLI": 974,
1046
+ "SKE": 975,
1047
+ "SAQ": 976,
1048
+ "SVK": 977,
1049
+ "SLY": 978,
1050
+ "HLG": 979,
1051
+ "APK": 980,
1052
+ "MAA": 981,
1053
+ "TFP": 982,
1054
+ "SEI": 983,
1055
+ "RDV": 984,
1056
+ "SVE": 985,
1057
+ "TSI": 986,
1058
+ "QVI": 987,
1059
+ "RLP": 988,
1060
+ "ĠMGSSHHHHHHSSGLVPRG": 989,
1061
+ "ACL": 990,
1062
+ "YGL": 991,
1063
+ "ILD": 992,
1064
+ "QNL": 993,
1065
+ "TKE": 994,
1066
+ "AMK": 995,
1067
+ "QKV": 996,
1068
+ "YLE": 997,
1069
+ "ATE": 998,
1070
+ "PVE": 999,
1071
+ "VVP": 1000,
1072
+ "SRK": 1001,
1073
+ "NKK": 1002,
1074
+ "PLK": 1003,
1075
+ "NAK": 1004,
1076
+ "YVD": 1005,
1077
+ "RKI": 1006,
1078
+ "DPK": 1007,
1079
+ "UGCC": 1008,
1080
+ "RLH": 1009,
1081
+ "NDI": 1010,
1082
+ "QSG": 1011,
1083
+ "TEG": 1012,
1084
+ "PEG": 1013,
1085
+ "QIV": 1014,
1086
+ "YLV": 1015,
1087
+ "SRG": 1016,
1088
+ "SLD": 1017,
1089
+ "AYI": 1018,
1090
+ "TIV": 1019,
1091
+ "DDD": 1020,
1092
+ "AID": 1021,
1093
+ "SAD": 1022,
1094
+ "TRV": 1023
1095
+ },
1096
+ "merges": [
1097
+ "A A",
1098
+ "G G",
1099
+ "L L",
1100
+ "A G",
1101
+ "A L",
1102
+ "V L",
1103
+ "E L",
1104
+ "S L",
1105
+ "S G",
1106
+ "T L",
1107
+ "K L",
1108
+ "A V",
1109
+ "D L",
1110
+ "E E",
1111
+ "R L",
1112
+ "V V",
1113
+ "T G",
1114
+ "I L",
1115
+ "S S",
1116
+ "X X",
1117
+ "A E",
1118
+ "K K",
1119
+ "G L",
1120
+ "T V",
1121
+ "N L",
1122
+ "A I",
1123
+ "C C",
1124
+ "A R",
1125
+ "D G",
1126
+ "A K",
1127
+ "P L",
1128
+ "A S",
1129
+ "Q L",
1130
+ "D I",
1131
+ "H H",
1132
+ "E K",
1133
+ "D V",
1134
+ "E V",
1135
+ "S V",
1136
+ "P G",
1137
+ "A T",
1138
+ "F L",
1139
+ "E G",
1140
+ "E I",
1141
+ "R V",
1142
+ "P V",
1143
+ "R G",
1144
+ "Y L",
1145
+ "T I",
1146
+ "K V",
1147
+ "A D",
1148
+ "N G",
1149
+ "S I",
1150
+ "R I",
1151
+ "A Q",
1152
+ "C G",
1153
+ "K G",
1154
+ "T T",
1155
+ "N V",
1156
+ "K I",
1157
+ "F G",
1158
+ "N I",
1159
+ "A P",
1160
+ "S T",
1161
+ "D P",
1162
+ "F V",
1163
+ "R R",
1164
+ "S P",
1165
+ "D E",
1166
+ "Q G",
1167
+ "K E",
1168
+ "Q V",
1169
+ "XX XX",
1170
+ "Y G",
1171
+ "A F",
1172
+ "U U",
1173
+ "T P",
1174
+ "N P",
1175
+ "S K",
1176
+ "S R",
1177
+ "S E",
1178
+ "Y V",
1179
+ "S F",
1180
+ "Q I",
1181
+ "Ġ M",
1182
+ "D F",
1183
+ "H L",
1184
+ "D K",
1185
+ "A N",
1186
+ "A C",
1187
+ "R E",
1188
+ "A M",
1189
+ "I G",
1190
+ "R K",
1191
+ "T E",
1192
+ "S N",
1193
+ "A Y",
1194
+ "P E",
1195
+ "M L",
1196
+ "I V",
1197
+ "T F",
1198
+ "D D",
1199
+ "T K",
1200
+ "R F",
1201
+ "S Q",
1202
+ "U G",
1203
+ "S D",
1204
+ "S Y",
1205
+ "N N",
1206
+ "R P",
1207
+ "T Q",
1208
+ "R D",
1209
+ "A H",
1210
+ "T D",
1211
+ "P P",
1212
+ "R Q",
1213
+ "N K",
1214
+ "N E",
1215
+ "Y I",
1216
+ "V G",
1217
+ "N F",
1218
+ "H G",
1219
+ "GG G",
1220
+ "P I",
1221
+ "Y F",
1222
+ "Q E",
1223
+ "M V",
1224
+ "M G",
1225
+ "Q K",
1226
+ "T Y",
1227
+ "F I",
1228
+ "S H",
1229
+ "K D",
1230
+ "HH HH",
1231
+ "C L",
1232
+ "T R",
1233
+ "AA G",
1234
+ "W L",
1235
+ "XXXX XXXX",
1236
+ "Q Q",
1237
+ "H V",
1238
+ "T N",
1239
+ "F E",
1240
+ "I E",
1241
+ "Y K",
1242
+ "T S",
1243
+ "Y E",
1244
+ "A GG",
1245
+ "P K",
1246
+ "P D",
1247
+ "I K",
1248
+ "P F",
1249
+ "S M",
1250
+ "R N",
1251
+ "Y D",
1252
+ "V E",
1253
+ "I D",
1254
+ "U C",
1255
+ "A W",
1256
+ "T H",
1257
+ "R Y",
1258
+ "F D",
1259
+ "F K",
1260
+ "I I",
1261
+ "Q D",
1262
+ "N D",
1263
+ "E D",
1264
+ "V K",
1265
+ "Y Y",
1266
+ "M E",
1267
+ "Q R",
1268
+ "Q P",
1269
+ "W G",
1270
+ "N Y",
1271
+ "Q F",
1272
+ "M K",
1273
+ "I P",
1274
+ "R T",
1275
+ "R H",
1276
+ "N T",
1277
+ "R S",
1278
+ "V D",
1279
+ "V P",
1280
+ "AA L",
1281
+ "N Q",
1282
+ "A LL",
1283
+ "K P",
1284
+ "S C",
1285
+ "S W",
1286
+ "V I",
1287
+ "U GG",
1288
+ "M I",
1289
+ "Y P",
1290
+ "F P",
1291
+ "E R",
1292
+ "A CC",
1293
+ "H I",
1294
+ "H P",
1295
+ "T C",
1296
+ "T M",
1297
+ "N R",
1298
+ "E F",
1299
+ "N S",
1300
+ "Y R",
1301
+ "Y Q",
1302
+ "H K",
1303
+ "G V",
1304
+ "M D",
1305
+ "F F",
1306
+ "E Q",
1307
+ "G I",
1308
+ "T W",
1309
+ "LL L",
1310
+ "K R",
1311
+ "A U",
1312
+ "SG L",
1313
+ "H E",
1314
+ "XXXXXXXX XXXXXXXX",
1315
+ "Y N",
1316
+ "K N",
1317
+ "HHHH HH",
1318
+ "K T",
1319
+ "A VL",
1320
+ "C GG",
1321
+ "E P",
1322
+ "C V",
1323
+ "F R",
1324
+ "M P",
1325
+ "K Q",
1326
+ "F S",
1327
+ "F N",
1328
+ "Y T",
1329
+ "Ġ G",
1330
+ "Y S",
1331
+ "D N",
1332
+ "D S",
1333
+ "D R",
1334
+ "D T",
1335
+ "Q N",
1336
+ "AG L",
1337
+ "Q T",
1338
+ "W V",
1339
+ "Q S",
1340
+ "I N",
1341
+ "F T",
1342
+ "I R",
1343
+ "CC G",
1344
+ "G K",
1345
+ "A SL",
1346
+ "E N",
1347
+ "D Y",
1348
+ "H F",
1349
+ "V R",
1350
+ "A EL",
1351
+ "I S",
1352
+ "Q M",
1353
+ "E T",
1354
+ "AL G",
1355
+ "E S",
1356
+ "P R",
1357
+ "I T",
1358
+ "AG C",
1359
+ "A TL",
1360
+ "K S",
1361
+ "U AA",
1362
+ "K Y",
1363
+ "M R",
1364
+ "A CG",
1365
+ "K F",
1366
+ "H R",
1367
+ "A TG",
1368
+ "M T",
1369
+ "W I",
1370
+ "D Q",
1371
+ "G E",
1372
+ "P Q",
1373
+ "P T",
1374
+ "M N",
1375
+ "A SG",
1376
+ "A DL",
1377
+ "F Q",
1378
+ "F Y",
1379
+ "I Q",
1380
+ "AA AA",
1381
+ "V T",
1382
+ "G D",
1383
+ "H Q",
1384
+ "E Y",
1385
+ "A UG",
1386
+ "V S",
1387
+ "A UU",
1388
+ "A EE",
1389
+ "LL G",
1390
+ "N H",
1391
+ "A KL",
1392
+ "I F",
1393
+ "A RL",
1394
+ "H D",
1395
+ "V N",
1396
+ "A IL",
1397
+ "R W",
1398
+ "M Q",
1399
+ "A KK",
1400
+ "GG L",
1401
+ "V Q",
1402
+ "C P",
1403
+ "U CC",
1404
+ "P Y",
1405
+ "M S",
1406
+ "W E",
1407
+ "P S",
1408
+ "W K",
1409
+ "H T",
1410
+ "N M",
1411
+ "D M",
1412
+ "H Y",
1413
+ "A PG",
1414
+ "VL G",
1415
+ "V F",
1416
+ "SL G",
1417
+ "N C",
1418
+ "E M",
1419
+ "A DG",
1420
+ "TG L",
1421
+ "D C",
1422
+ "Q Y",
1423
+ "R M",
1424
+ "I Y",
1425
+ "A TV",
1426
+ "R C",
1427
+ "H S",
1428
+ "C AA",
1429
+ "W D",
1430
+ "S GG",
1431
+ "UU G",
1432
+ "AL E",
1433
+ "M Y",
1434
+ "A DV",
1435
+ "P N",
1436
+ "S LL",
1437
+ "A DI",
1438
+ "M F",
1439
+ "G F",
1440
+ "AA V",
1441
+ "SG V",
1442
+ "A PL",
1443
+ "E H",
1444
+ "U CG",
1445
+ "G R",
1446
+ "A SS",
1447
+ "C I",
1448
+ "LL K",
1449
+ "A NL",
1450
+ "C T",
1451
+ "A QL",
1452
+ "AG V",
1453
+ "W Q",
1454
+ "C K",
1455
+ "LL E",
1456
+ "A FL",
1457
+ "P H",
1458
+ "S SL",
1459
+ "XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX",
1460
+ "C E",
1461
+ "AV G",
1462
+ "K H",
1463
+ "DL G",
1464
+ "I H",
1465
+ "AL V",
1466
+ "D H",
1467
+ "D W",
1468
+ "IL G",
1469
+ "S AA",
1470
+ "TL G",
1471
+ "E W",
1472
+ "N W",
1473
+ "U AG",
1474
+ "F H",
1475
+ "AA C",
1476
+ "AI G",
1477
+ "AL K",
1478
+ "C D",
1479
+ "AV V",
1480
+ "EL G",
1481
+ "T AA",
1482
+ "Q H",
1483
+ "V Y",
1484
+ "VV G",
1485
+ "E C",
1486
+ "A YL",
1487
+ "EE G",
1488
+ "R LL",
1489
+ "ĠM G",
1490
+ "A GGG",
1491
+ "W Y",
1492
+ "T LL",
1493
+ "I C",
1494
+ "A UC",
1495
+ "E EL",
1496
+ "H N",
1497
+ "VL D",
1498
+ "A NG",
1499
+ "P M",
1500
+ "W N",
1501
+ "RL G",
1502
+ "E KL",
1503
+ "AE V",
1504
+ "AL R",
1505
+ "VL V",
1506
+ "C AG",
1507
+ "AE G",
1508
+ "S TL",
1509
+ "W R",
1510
+ "DL V",
1511
+ "DG V",
1512
+ "M M",
1513
+ "GG V",
1514
+ "S AL",
1515
+ "D LL",
1516
+ "Y H",
1517
+ "W T",
1518
+ "EE V",
1519
+ "RL V",
1520
+ "S SG",
1521
+ "AA GG",
1522
+ "P C",
1523
+ "Y M",
1524
+ "AS V",
1525
+ "A PV",
1526
+ "VL K",
1527
+ "E LL",
1528
+ "R EL",
1529
+ "A FV",
1530
+ "R GL",
1531
+ "IL K",
1532
+ "C F",
1533
+ "S DL",
1534
+ "S KL",
1535
+ "AA E",
1536
+ "AK G",
1537
+ "A FG",
1538
+ "W F",
1539
+ "PL G",
1540
+ "I M",
1541
+ "TL V",
1542
+ "EL K",
1543
+ "Y C",
1544
+ "KK G",
1545
+ "R AL",
1546
+ "AR G",
1547
+ "KL G",
1548
+ "AA K",
1549
+ "PV G",
1550
+ "E AL",
1551
+ "W M",
1552
+ "AA I",
1553
+ "UG AA",
1554
+ "EL V",
1555
+ "AE I",
1556
+ "AG I",
1557
+ "P W",
1558
+ "A NV",
1559
+ "LL V",
1560
+ "R VL",
1561
+ "T VL",
1562
+ "GL G",
1563
+ "VV V",
1564
+ "AA P",
1565
+ "SG T",
1566
+ "S RL",
1567
+ "K M",
1568
+ "CC C",
1569
+ "LL D",
1570
+ "SL V",
1571
+ "C N",
1572
+ "KL V",
1573
+ "GG C",
1574
+ "TV G",
1575
+ "PL V",
1576
+ "Q W",
1577
+ "C R",
1578
+ "EE I",
1579
+ "AV K",
1580
+ "T AL",
1581
+ "EK I",
1582
+ "A NI",
1583
+ "EL E",
1584
+ "C H",
1585
+ "A NP",
1586
+ "R TL",
1587
+ "TG K",
1588
+ "T KL",
1589
+ "GL V",
1590
+ "F M",
1591
+ "RV G",
1592
+ "VL E",
1593
+ "AE K",
1594
+ "C Q",
1595
+ "SL K",
1596
+ "SL I",
1597
+ "TG I",
1598
+ "S DI",
1599
+ "A YV",
1600
+ "T AT",
1601
+ "KK V",
1602
+ "T GG",
1603
+ "EK G",
1604
+ "SV G",
1605
+ "RL I",
1606
+ "SS V",
1607
+ "LL I",
1608
+ "AA R",
1609
+ "EL I",
1610
+ "U AC",
1611
+ "CG C",
1612
+ "AG K",
1613
+ "D AV",
1614
+ "T KV",
1615
+ "A HL",
1616
+ "W S",
1617
+ "Y SL",
1618
+ "EE E",
1619
+ "AK V",
1620
+ "LL Q",
1621
+ "EG V",
1622
+ "D KL",
1623
+ "U GGG",
1624
+ "FL G",
1625
+ "D VV",
1626
+ "DL K",
1627
+ "T VV",
1628
+ "Y W",
1629
+ "IL E",
1630
+ "SL E",
1631
+ "D AL",
1632
+ "S VL",
1633
+ "AG E",
1634
+ "R KL",
1635
+ "UG C",
1636
+ "H M",
1637
+ "DG K",
1638
+ "AM L",
1639
+ "AR V",
1640
+ "PG V",
1641
+ "T TL",
1642
+ "DL I",
1643
+ "CC GG",
1644
+ "AI V",
1645
+ "A YG",
1646
+ "T AV",
1647
+ "AS I",
1648
+ "F C",
1649
+ "TL K",
1650
+ "TI G",
1651
+ "RG V",
1652
+ "EK V",
1653
+ "EE K",
1654
+ "KV G",
1655
+ "T NL",
1656
+ "EI V",
1657
+ "S IL",
1658
+ "Q LL",
1659
+ "KL I",
1660
+ "AA D",
1661
+ "KL K",
1662
+ "R RL",
1663
+ "TL E",
1664
+ "AS Q",
1665
+ "C S",
1666
+ "VL P",
1667
+ "AG D",
1668
+ "TV K",
1669
+ "N GL",
1670
+ "S TG",
1671
+ "M H",
1672
+ "AL Q",
1673
+ "EV I",
1674
+ "QL G",
1675
+ "IL V",
1676
+ "DL E",
1677
+ "NL G",
1678
+ "KK I",
1679
+ "R VV",
1680
+ "EV G",
1681
+ "AL D",
1682
+ "AL T",
1683
+ "EG K",
1684
+ "ACC G",
1685
+ "TV P",
1686
+ "AL P",
1687
+ "F W",
1688
+ "DL P",
1689
+ "R AV",
1690
+ "T EL",
1691
+ "SI V",
1692
+ "QL V",
1693
+ "QI G",
1694
+ "AL I",
1695
+ "AI K",
1696
+ "N SL",
1697
+ "N LL",
1698
+ "N SG",
1699
+ "S EL",
1700
+ "Q EL",
1701
+ "YV G",
1702
+ "T EE",
1703
+ "RI G",
1704
+ "RL E",
1705
+ "S VV",
1706
+ "HHHHHH S",
1707
+ "S AS",
1708
+ "A UGG",
1709
+ "NL K",
1710
+ "TG V",
1711
+ "QL K",
1712
+ "T AG",
1713
+ "T SL",
1714
+ "NG V",
1715
+ "S AT",
1716
+ "S PL",
1717
+ "W P",
1718
+ "H C",
1719
+ "Q GL",
1720
+ "AA AG",
1721
+ "VL I",
1722
+ "EI I",
1723
+ "N TL",
1724
+ "AR E",
1725
+ "S AV",
1726
+ "RL K",
1727
+ "VV I",
1728
+ "YL G",
1729
+ "S EE",
1730
+ "AR K",
1731
+ "SS K",
1732
+ "R GG",
1733
+ "R DL",
1734
+ "Q C",
1735
+ "S EG",
1736
+ "P PG",
1737
+ "D AA",
1738
+ "R AA",
1739
+ "S AG",
1740
+ "GL K",
1741
+ "DI G",
1742
+ "FL E",
1743
+ "FV G",
1744
+ "AK E",
1745
+ "FL V",
1746
+ "S KK",
1747
+ "S NL",
1748
+ "S KI",
1749
+ "QL I",
1750
+ "AV E",
1751
+ "AV I",
1752
+ "DV G",
1753
+ "SG F",
1754
+ "GL I",
1755
+ "T SV",
1756
+ "D EL",
1757
+ "RV I",
1758
+ "DG I",
1759
+ "T PL",
1760
+ "SI G",
1761
+ "R IL",
1762
+ "R SL",
1763
+ "P LL",
1764
+ "D EV",
1765
+ "AC GG",
1766
+ "AQ G",
1767
+ "EI G",
1768
+ "QG V",
1769
+ "S NI",
1770
+ "R TG",
1771
+ "SG I",
1772
+ "AR I",
1773
+ "AQ K",
1774
+ "C Y",
1775
+ "UU C",
1776
+ "T AD",
1777
+ "GG I",
1778
+ "EI K",
1779
+ "VV K",
1780
+ "H W",
1781
+ "AI I",
1782
+ "T TG",
1783
+ "NG K",
1784
+ "KG V",
1785
+ "AG F",
1786
+ "PG D",
1787
+ "N VL",
1788
+ "N KI",
1789
+ "KK K",
1790
+ "T AS",
1791
+ "NL V",
1792
+ "SG K",
1793
+ "SL P",
1794
+ "FG V",
1795
+ "EG I",
1796
+ "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
1797
+ "T AE",
1798
+ "GL E",
1799
+ "Q AA",
1800
+ "LL P",
1801
+ "VP RG",
1802
+ "R AE",
1803
+ "D RL",
1804
+ "R DI",
1805
+ "EV E",
1806
+ "SH M",
1807
+ "DV I",
1808
+ "T SP",
1809
+ "VV D",
1810
+ "KL E",
1811
+ "QL E",
1812
+ "ĠMG SS",
1813
+ "SV F",
1814
+ "AG AA",
1815
+ "N DL",
1816
+ "SV I",
1817
+ "S KG",
1818
+ "NV K",
1819
+ "AG CC",
1820
+ "Q AL",
1821
+ "R AR",
1822
+ "DG E",
1823
+ "S PG",
1824
+ "GG K",
1825
+ "T TV",
1826
+ "S AI",
1827
+ "EV K",
1828
+ "S AE",
1829
+ "Q VL",
1830
+ "YG V",
1831
+ "S DE",
1832
+ "FL K",
1833
+ "T SG",
1834
+ "TL P",
1835
+ "RL F",
1836
+ "EL P",
1837
+ "N KL",
1838
+ "TL I",
1839
+ "RI V",
1840
+ "AV D",
1841
+ "GG GG",
1842
+ "SS I",
1843
+ "DI V",
1844
+ "S EV",
1845
+ "PV P",
1846
+ "S TV",
1847
+ "CC CG",
1848
+ "N AA",
1849
+ "S AK",
1850
+ "LL R",
1851
+ "T IL",
1852
+ "Q SL",
1853
+ "DI I",
1854
+ "KG K",
1855
+ "AI E",
1856
+ "N IL",
1857
+ "D KK",
1858
+ "EK E",
1859
+ "Q AV",
1860
+ "TL D",
1861
+ "T DL",
1862
+ "VL F",
1863
+ "AA F",
1864
+ "SI I",
1865
+ "SK D",
1866
+ "KI V",
1867
+ "N RL",
1868
+ "T KG",
1869
+ "AK I",
1870
+ "S KV",
1871
+ "KI G",
1872
+ "TL F",
1873
+ "NL YF",
1874
+ "T EK",
1875
+ "EG E",
1876
+ "GL P",
1877
+ "TG E",
1878
+ "AQ V",
1879
+ "U AGG",
1880
+ "T PV",
1881
+ "S YL",
1882
+ "TV I",
1883
+ "R QL",
1884
+ "AT K",
1885
+ "Q KL",
1886
+ "AS K",
1887
+ "R AG",
1888
+ "AM G",
1889
+ "AP E",
1890
+ "D VL",
1891
+ "DV K",
1892
+ "N AL",
1893
+ "S EK",
1894
+ "VV E",
1895
+ "EL F",
1896
+ "GG F",
1897
+ "R SG",
1898
+ "S RV",
1899
+ "KG I",
1900
+ "IL P",
1901
+ "NV G",
1902
+ "S FL",
1903
+ "S TI",
1904
+ "HHHHHHS SGL",
1905
+ "T PG",
1906
+ "NL E",
1907
+ "H LL",
1908
+ "N TV",
1909
+ "T TI",
1910
+ "RG K",
1911
+ "M W",
1912
+ "S NG",
1913
+ "KV I",
1914
+ "T ST",
1915
+ "N EE",
1916
+ "R YL",
1917
+ "R EE",
1918
+ "S NV",
1919
+ "T DP",
1920
+ "KV K",
1921
+ "S QL",
1922
+ "PL P",
1923
+ "SS P",
1924
+ "T RL",
1925
+ "R DG",
1926
+ "U ACC",
1927
+ "T AK",
1928
+ "R NL",
1929
+ "S AR",
1930
+ "AA Q",
1931
+ "D DL",
1932
+ "TV E",
1933
+ "KI I",
1934
+ "KL F",
1935
+ "RI I",
1936
+ "CL V",
1937
+ "AD E",
1938
+ "U AAG",
1939
+ "RG I",
1940
+ "T KK",
1941
+ "NI V",
1942
+ "D IL",
1943
+ "IV G",
1944
+ "AT I",
1945
+ "AD Y",
1946
+ "T AR",
1947
+ "F GG",
1948
+ "AG CG",
1949
+ "TT E",
1950
+ "S AP",
1951
+ "W H",
1952
+ "AG Y",
1953
+ "R RV",
1954
+ "R FL",
1955
+ "EL D",
1956
+ "R EK",
1957
+ "DI E",
1958
+ "T SS",
1959
+ "QV G",
1960
+ "N SS",
1961
+ "HHHHHHSSGL VPRG",
1962
+ "R EI",
1963
+ "Q RL",
1964
+ "PV I",
1965
+ "YL K",
1966
+ "P GG",
1967
+ "KG E",
1968
+ "S DG",
1969
+ "PG K",
1970
+ "DI K",
1971
+ "M C",
1972
+ "P AA",
1973
+ "Y LL",
1974
+ "P AL",
1975
+ "SG E",
1976
+ "T DE",
1977
+ "D GL",
1978
+ "IL I",
1979
+ "VL N",
1980
+ "S DP",
1981
+ "TI K",
1982
+ "T FG",
1983
+ "S DV",
1984
+ "D EE",
1985
+ "T EI",
1986
+ "AT N",
1987
+ "Y TL",
1988
+ "T DG",
1989
+ "N SV",
1990
+ "T SN",
1991
+ "NG I",
1992
+ "AH G",
1993
+ "P VV",
1994
+ "RV K",
1995
+ "AS E",
1996
+ "UU GG",
1997
+ "P GL",
1998
+ "SP E",
1999
+ "N AI",
2000
+ "RR K",
2001
+ "YY C",
2002
+ "PG I",
2003
+ "S RI",
2004
+ "UGG GG",
2005
+ "Q DL",
2006
+ "AS F",
2007
+ "F SG",
2008
+ "RV E",
2009
+ "W W",
2010
+ "PG Q",
2011
+ "NI I",
2012
+ "NV I",
2013
+ "TI E",
2014
+ "SS E",
2015
+ "S FG",
2016
+ "N VV",
2017
+ "T EV",
2018
+ "NL P",
2019
+ "S TT",
2020
+ "AD K",
2021
+ "R EG",
2022
+ "RI K",
2023
+ "UU UU",
2024
+ "R FG",
2025
+ "PL E",
2026
+ "DP E",
2027
+ "DI P",
2028
+ "AG AG",
2029
+ "CG CG",
2030
+ "NI G",
2031
+ "QV K",
2032
+ "EE M",
2033
+ "N SI",
2034
+ "EL Y",
2035
+ "T AI",
2036
+ "TP E",
2037
+ "NL I",
2038
+ "PG E",
2039
+ "EI D",
2040
+ "A WL",
2041
+ "IG I",
2042
+ "PL I",
2043
+ "S KE",
2044
+ "S AQ",
2045
+ "SV K",
2046
+ "SL Y",
2047
+ "HL G",
2048
+ "AP K",
2049
+ "M AA",
2050
+ "TF P",
2051
+ "S EI",
2052
+ "R DV",
2053
+ "SV E",
2054
+ "T SI",
2055
+ "QV I",
2056
+ "RL P",
2057
+ "ĠMGSS HHHHHHSSGLVPRG",
2058
+ "AC L",
2059
+ "Y GL",
2060
+ "IL D",
2061
+ "Q NL",
2062
+ "T KE",
2063
+ "AM K",
2064
+ "Q KV",
2065
+ "YL E",
2066
+ "AT E",
2067
+ "PV E",
2068
+ "VV P",
2069
+ "SR K",
2070
+ "N KK",
2071
+ "PL K",
2072
+ "N AK",
2073
+ "YV D",
2074
+ "R KI",
2075
+ "DP K",
2076
+ "UG CC",
2077
+ "RL H",
2078
+ "N DI",
2079
+ "Q SG",
2080
+ "T EG",
2081
+ "P EG",
2082
+ "QI V",
2083
+ "YL V",
2084
+ "S RG",
2085
+ "SL D",
2086
+ "AY I",
2087
+ "TI V",
2088
+ "DD D",
2089
+ "AI D",
2090
+ "S AD",
2091
+ "T RV"
2092
+ ]
2093
+ }
2094
+ }
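Reviewer note: each entry in the merges list above is a space-separated pair of symbols, and an entry's position in the list is its priority (earlier entries merge first). The sketch below shows how such a list might be applied to a symbol sequence. It is a simplified illustration, not the `tokenizers` library's implementation: `apply_merges` is a hypothetical helper, and the input assumes that `VL` was already produced by an earlier merge that falls outside this part of the diff.

```python
def apply_merges(symbols, merges):
    """Greedily apply BPE merges, lowest rank (earliest list position) first."""
    # Rank of each pair = its index in the merges list.
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(symbols)
    while len(symbols) > 1:
        # Collect every adjacent pair that has a merge rule, with its position.
        candidates = [
            (ranks[(left, right)], idx)
            for idx, (left, right) in enumerate(zip(symbols, symbols[1:]))
            if (left, right) in ranks
        ]
        if not candidates:
            break
        # Merge the best-ranked pair (leftmost on ties) into one symbol.
        _, idx = min(candidates)
        symbols[idx:idx + 2] = [symbols[idx] + symbols[idx + 1]]
    return symbols


# Three entries copied from the merges list in this diff.
merges = [tuple(entry.split(" ")) for entry in ["VL D", "W N", "HHHHHH S"]]

# Assumption: "VL" is the output of an earlier merge not shown in this chunk.
print(apply_merges(["VL", "D", "W", "N"], merges))
```

Keeping the rank lookup in a dictionary means each pass only has to scan adjacent pairs, which is enough for short peptide-length inputs.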
tokenizer_config.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "clean_up_tokenization_spaces": true,
3
+ "model_max_length": 1000000000000000019884624838656,
4
+ "pad_token": "<pad>",
5
+ "tokenizer_class": "PreTrainedTokenizerFast"
6
+ }
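Reviewer note: the `model_max_length` value above equals `int(1e30)`, which the `transformers` library uses as a placeholder when no maximum length is configured, so it should probably not be read as a real sequence limit. A minimal sketch of how a consumer of this file might detect that, using only the standard library (the name `has_real_limit` is illustrative, not part of any API):

```python
import json

# The tokenizer_config.json contents added in this commit.
config_text = """
{
  "clean_up_tokenization_spaces": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "<pad>",
  "tokenizer_class": "PreTrainedTokenizerFast"
}
"""

config = json.loads(config_text)

# int(1e30) is the "no limit set" placeholder; anything smaller is a real limit.
has_real_limit = config["model_max_length"] < int(1e30)

print(config["tokenizer_class"], config["pad_token"], has_real_limit)
```

If a true maximum input length applies to this model, it would have to come from the model configuration or be passed explicitly when tokenizing.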