anforsm committed on
Commit
6282ca8
1 Parent(s): 223284c

Upload tokenizer

added_tokens.json ADDED
@@ -0,0 +1,1026 @@
+ {
+ "audio_token_0": 50257,
+ "audio_token_1": 50258,
+ "audio_token_10": 50267,
+ "audio_token_100": 50357,
+ "audio_token_1000": 51257,
+ "audio_token_1001": 51258,
+ "audio_token_1002": 51259,
+ "audio_token_1003": 51260,
+ "audio_token_1004": 51261,
+ "audio_token_1005": 51262,
+ "audio_token_1006": 51263,
+ "audio_token_1007": 51264,
+ "audio_token_1008": 51265,
+ "audio_token_1009": 51266,
+ "audio_token_101": 50358,
+ "audio_token_1010": 51267,
+ "audio_token_1011": 51268,
+ "audio_token_1012": 51269,
+ "audio_token_1013": 51270,
+ "audio_token_1014": 51271,
+ "audio_token_1015": 51272,
+ "audio_token_1016": 51273,
+ "audio_token_1017": 51274,
+ "audio_token_1018": 51275,
+ "audio_token_1019": 51276,
+ "audio_token_102": 50359,
+ "audio_token_1020": 51277,
+ "audio_token_1021": 51278,
+ "audio_token_1022": 51279,
+ "audio_token_1023": 51280,
+ "audio_token_103": 50360,
+ "audio_token_104": 50361,
+ "audio_token_105": 50362,
+ "audio_token_106": 50363,
+ "audio_token_107": 50364,
+ "audio_token_108": 50365,
+ "audio_token_109": 50366,
+ "audio_token_11": 50268,
+ "audio_token_110": 50367,
+ "audio_token_111": 50368,
+ "audio_token_112": 50369,
+ "audio_token_113": 50370,
+ "audio_token_114": 50371,
+ "audio_token_115": 50372,
+ "audio_token_116": 50373,
+ "audio_token_117": 50374,
+ "audio_token_118": 50375,
+ "audio_token_119": 50376,
+ "audio_token_12": 50269,
+ "audio_token_120": 50377,
+ "audio_token_121": 50378,
+ "audio_token_122": 50379,
+ "audio_token_123": 50380,
+ "audio_token_124": 50381,
+ "audio_token_125": 50382,
+ "audio_token_126": 50383,
+ "audio_token_127": 50384,
+ "audio_token_128": 50385,
+ "audio_token_129": 50386,
+ "audio_token_13": 50270,
+ "audio_token_130": 50387,
+ "audio_token_131": 50388,
+ "audio_token_132": 50389,
+ "audio_token_133": 50390,
+ "audio_token_134": 50391,
+ "audio_token_135": 50392,
+ "audio_token_136": 50393,
+ "audio_token_137": 50394,
+ "audio_token_138": 50395,
+ "audio_token_139": 50396,
+ "audio_token_14": 50271,
+ "audio_token_140": 50397,
+ "audio_token_141": 50398,
+ "audio_token_142": 50399,
+ "audio_token_143": 50400,
+ "audio_token_144": 50401,
+ "audio_token_145": 50402,
+ "audio_token_146": 50403,
+ "audio_token_147": 50404,
+ "audio_token_148": 50405,
+ "audio_token_149": 50406,
+ "audio_token_15": 50272,
+ "audio_token_150": 50407,
+ "audio_token_151": 50408,
+ "audio_token_152": 50409,
+ "audio_token_153": 50410,
+ "audio_token_154": 50411,
+ "audio_token_155": 50412,
+ "audio_token_156": 50413,
+ "audio_token_157": 50414,
+ "audio_token_158": 50415,
+ "audio_token_159": 50416,
+ "audio_token_16": 50273,
+ "audio_token_160": 50417,
+ "audio_token_161": 50418,
+ "audio_token_162": 50419,
+ "audio_token_163": 50420,
+ "audio_token_164": 50421,
+ "audio_token_165": 50422,
+ "audio_token_166": 50423,
+ "audio_token_167": 50424,
+ "audio_token_168": 50425,
+ "audio_token_169": 50426,
+ "audio_token_17": 50274,
+ "audio_token_170": 50427,
+ "audio_token_171": 50428,
+ "audio_token_172": 50429,
+ "audio_token_173": 50430,
+ "audio_token_174": 50431,
+ "audio_token_175": 50432,
+ "audio_token_176": 50433,
+ "audio_token_177": 50434,
+ "audio_token_178": 50435,
+ "audio_token_179": 50436,
+ "audio_token_18": 50275,
+ "audio_token_180": 50437,
+ "audio_token_181": 50438,
+ "audio_token_182": 50439,
+ "audio_token_183": 50440,
+ "audio_token_184": 50441,
+ "audio_token_185": 50442,
+ "audio_token_186": 50443,
+ "audio_token_187": 50444,
+ "audio_token_188": 50445,
+ "audio_token_189": 50446,
+ "audio_token_19": 50276,
+ "audio_token_190": 50447,
+ "audio_token_191": 50448,
+ "audio_token_192": 50449,
+ "audio_token_193": 50450,
+ "audio_token_194": 50451,
+ "audio_token_195": 50452,
+ "audio_token_196": 50453,
+ "audio_token_197": 50454,
+ "audio_token_198": 50455,
+ "audio_token_199": 50456,
+ "audio_token_2": 50259,
+ "audio_token_20": 50277,
+ "audio_token_200": 50457,
+ "audio_token_201": 50458,
+ "audio_token_202": 50459,
+ "audio_token_203": 50460,
+ "audio_token_204": 50461,
+ "audio_token_205": 50462,
+ "audio_token_206": 50463,
+ "audio_token_207": 50464,
+ "audio_token_208": 50465,
+ "audio_token_209": 50466,
+ "audio_token_21": 50278,
+ "audio_token_210": 50467,
+ "audio_token_211": 50468,
+ "audio_token_212": 50469,
+ "audio_token_213": 50470,
+ "audio_token_214": 50471,
+ "audio_token_215": 50472,
+ "audio_token_216": 50473,
+ "audio_token_217": 50474,
+ "audio_token_218": 50475,
+ "audio_token_219": 50476,
+ "audio_token_22": 50279,
+ "audio_token_220": 50477,
+ "audio_token_221": 50478,
+ "audio_token_222": 50479,
+ "audio_token_223": 50480,
+ "audio_token_224": 50481,
+ "audio_token_225": 50482,
+ "audio_token_226": 50483,
+ "audio_token_227": 50484,
+ "audio_token_228": 50485,
+ "audio_token_229": 50486,
+ "audio_token_23": 50280,
+ "audio_token_230": 50487,
+ "audio_token_231": 50488,
+ "audio_token_232": 50489,
+ "audio_token_233": 50490,
+ "audio_token_234": 50491,
+ "audio_token_235": 50492,
+ "audio_token_236": 50493,
+ "audio_token_237": 50494,
+ "audio_token_238": 50495,
+ "audio_token_239": 50496,
+ "audio_token_24": 50281,
+ "audio_token_240": 50497,
+ "audio_token_241": 50498,
+ "audio_token_242": 50499,
+ "audio_token_243": 50500,
+ "audio_token_244": 50501,
+ "audio_token_245": 50502,
+ "audio_token_246": 50503,
+ "audio_token_247": 50504,
+ "audio_token_248": 50505,
+ "audio_token_249": 50506,
+ "audio_token_25": 50282,
+ "audio_token_250": 50507,
+ "audio_token_251": 50508,
+ "audio_token_252": 50509,
+ "audio_token_253": 50510,
+ "audio_token_254": 50511,
+ "audio_token_255": 50512,
+ "audio_token_256": 50513,
+ "audio_token_257": 50514,
+ "audio_token_258": 50515,
+ "audio_token_259": 50516,
+ "audio_token_26": 50283,
+ "audio_token_260": 50517,
+ "audio_token_261": 50518,
+ "audio_token_262": 50519,
+ "audio_token_263": 50520,
+ "audio_token_264": 50521,
+ "audio_token_265": 50522,
+ "audio_token_266": 50523,
+ "audio_token_267": 50524,
+ "audio_token_268": 50525,
+ "audio_token_269": 50526,
+ "audio_token_27": 50284,
+ "audio_token_270": 50527,
+ "audio_token_271": 50528,
+ "audio_token_272": 50529,
+ "audio_token_273": 50530,
+ "audio_token_274": 50531,
+ "audio_token_275": 50532,
+ "audio_token_276": 50533,
+ "audio_token_277": 50534,
+ "audio_token_278": 50535,
+ "audio_token_279": 50536,
+ "audio_token_28": 50285,
+ "audio_token_280": 50537,
+ "audio_token_281": 50538,
+ "audio_token_282": 50539,
+ "audio_token_283": 50540,
+ "audio_token_284": 50541,
+ "audio_token_285": 50542,
+ "audio_token_286": 50543,
+ "audio_token_287": 50544,
+ "audio_token_288": 50545,
+ "audio_token_289": 50546,
+ "audio_token_29": 50286,
+ "audio_token_290": 50547,
+ "audio_token_291": 50548,
+ "audio_token_292": 50549,
+ "audio_token_293": 50550,
+ "audio_token_294": 50551,
+ "audio_token_295": 50552,
+ "audio_token_296": 50553,
+ "audio_token_297": 50554,
+ "audio_token_298": 50555,
+ "audio_token_299": 50556,
+ "audio_token_3": 50260,
+ "audio_token_30": 50287,
+ "audio_token_300": 50557,
+ "audio_token_301": 50558,
+ "audio_token_302": 50559,
+ "audio_token_303": 50560,
+ "audio_token_304": 50561,
+ "audio_token_305": 50562,
+ "audio_token_306": 50563,
+ "audio_token_307": 50564,
+ "audio_token_308": 50565,
+ "audio_token_309": 50566,
+ "audio_token_31": 50288,
+ "audio_token_310": 50567,
+ "audio_token_311": 50568,
+ "audio_token_312": 50569,
+ "audio_token_313": 50570,
+ "audio_token_314": 50571,
+ "audio_token_315": 50572,
+ "audio_token_316": 50573,
+ "audio_token_317": 50574,
+ "audio_token_318": 50575,
+ "audio_token_319": 50576,
+ "audio_token_32": 50289,
+ "audio_token_320": 50577,
+ "audio_token_321": 50578,
+ "audio_token_322": 50579,
+ "audio_token_323": 50580,
+ "audio_token_324": 50581,
+ "audio_token_325": 50582,
+ "audio_token_326": 50583,
+ "audio_token_327": 50584,
+ "audio_token_328": 50585,
+ "audio_token_329": 50586,
+ "audio_token_33": 50290,
+ "audio_token_330": 50587,
+ "audio_token_331": 50588,
+ "audio_token_332": 50589,
+ "audio_token_333": 50590,
+ "audio_token_334": 50591,
+ "audio_token_335": 50592,
+ "audio_token_336": 50593,
+ "audio_token_337": 50594,
+ "audio_token_338": 50595,
+ "audio_token_339": 50596,
+ "audio_token_34": 50291,
+ "audio_token_340": 50597,
+ "audio_token_341": 50598,
+ "audio_token_342": 50599,
+ "audio_token_343": 50600,
+ "audio_token_344": 50601,
+ "audio_token_345": 50602,
+ "audio_token_346": 50603,
+ "audio_token_347": 50604,
+ "audio_token_348": 50605,
+ "audio_token_349": 50606,
+ "audio_token_35": 50292,
+ "audio_token_350": 50607,
+ "audio_token_351": 50608,
+ "audio_token_352": 50609,
+ "audio_token_353": 50610,
+ "audio_token_354": 50611,
+ "audio_token_355": 50612,
+ "audio_token_356": 50613,
+ "audio_token_357": 50614,
+ "audio_token_358": 50615,
+ "audio_token_359": 50616,
+ "audio_token_36": 50293,
+ "audio_token_360": 50617,
+ "audio_token_361": 50618,
+ "audio_token_362": 50619,
+ "audio_token_363": 50620,
+ "audio_token_364": 50621,
+ "audio_token_365": 50622,
+ "audio_token_366": 50623,
+ "audio_token_367": 50624,
+ "audio_token_368": 50625,
+ "audio_token_369": 50626,
+ "audio_token_37": 50294,
+ "audio_token_370": 50627,
+ "audio_token_371": 50628,
+ "audio_token_372": 50629,
+ "audio_token_373": 50630,
+ "audio_token_374": 50631,
+ "audio_token_375": 50632,
+ "audio_token_376": 50633,
+ "audio_token_377": 50634,
+ "audio_token_378": 50635,
+ "audio_token_379": 50636,
+ "audio_token_38": 50295,
+ "audio_token_380": 50637,
+ "audio_token_381": 50638,
+ "audio_token_382": 50639,
+ "audio_token_383": 50640,
+ "audio_token_384": 50641,
+ "audio_token_385": 50642,
+ "audio_token_386": 50643,
+ "audio_token_387": 50644,
+ "audio_token_388": 50645,
+ "audio_token_389": 50646,
+ "audio_token_39": 50296,
+ "audio_token_390": 50647,
+ "audio_token_391": 50648,
+ "audio_token_392": 50649,
+ "audio_token_393": 50650,
+ "audio_token_394": 50651,
+ "audio_token_395": 50652,
+ "audio_token_396": 50653,
+ "audio_token_397": 50654,
+ "audio_token_398": 50655,
+ "audio_token_399": 50656,
+ "audio_token_4": 50261,
+ "audio_token_40": 50297,
+ "audio_token_400": 50657,
+ "audio_token_401": 50658,
+ "audio_token_402": 50659,
+ "audio_token_403": 50660,
+ "audio_token_404": 50661,
+ "audio_token_405": 50662,
+ "audio_token_406": 50663,
+ "audio_token_407": 50664,
+ "audio_token_408": 50665,
+ "audio_token_409": 50666,
+ "audio_token_41": 50298,
+ "audio_token_410": 50667,
+ "audio_token_411": 50668,
+ "audio_token_412": 50669,
+ "audio_token_413": 50670,
+ "audio_token_414": 50671,
+ "audio_token_415": 50672,
+ "audio_token_416": 50673,
+ "audio_token_417": 50674,
+ "audio_token_418": 50675,
+ "audio_token_419": 50676,
+ "audio_token_42": 50299,
+ "audio_token_420": 50677,
+ "audio_token_421": 50678,
+ "audio_token_422": 50679,
+ "audio_token_423": 50680,
+ "audio_token_424": 50681,
+ "audio_token_425": 50682,
+ "audio_token_426": 50683,
+ "audio_token_427": 50684,
+ "audio_token_428": 50685,
+ "audio_token_429": 50686,
+ "audio_token_43": 50300,
+ "audio_token_430": 50687,
+ "audio_token_431": 50688,
+ "audio_token_432": 50689,
+ "audio_token_433": 50690,
+ "audio_token_434": 50691,
+ "audio_token_435": 50692,
+ "audio_token_436": 50693,
+ "audio_token_437": 50694,
+ "audio_token_438": 50695,
+ "audio_token_439": 50696,
+ "audio_token_44": 50301,
+ "audio_token_440": 50697,
+ "audio_token_441": 50698,
+ "audio_token_442": 50699,
+ "audio_token_443": 50700,
+ "audio_token_444": 50701,
+ "audio_token_445": 50702,
+ "audio_token_446": 50703,
+ "audio_token_447": 50704,
+ "audio_token_448": 50705,
+ "audio_token_449": 50706,
+ "audio_token_45": 50302,
+ "audio_token_450": 50707,
+ "audio_token_451": 50708,
+ "audio_token_452": 50709,
+ "audio_token_453": 50710,
+ "audio_token_454": 50711,
+ "audio_token_455": 50712,
+ "audio_token_456": 50713,
+ "audio_token_457": 50714,
+ "audio_token_458": 50715,
+ "audio_token_459": 50716,
+ "audio_token_46": 50303,
+ "audio_token_460": 50717,
+ "audio_token_461": 50718,
+ "audio_token_462": 50719,
+ "audio_token_463": 50720,
+ "audio_token_464": 50721,
+ "audio_token_465": 50722,
+ "audio_token_466": 50723,
+ "audio_token_467": 50724,
+ "audio_token_468": 50725,
+ "audio_token_469": 50726,
+ "audio_token_47": 50304,
+ "audio_token_470": 50727,
+ "audio_token_471": 50728,
+ "audio_token_472": 50729,
+ "audio_token_473": 50730,
+ "audio_token_474": 50731,
+ "audio_token_475": 50732,
+ "audio_token_476": 50733,
+ "audio_token_477": 50734,
+ "audio_token_478": 50735,
+ "audio_token_479": 50736,
+ "audio_token_48": 50305,
+ "audio_token_480": 50737,
+ "audio_token_481": 50738,
+ "audio_token_482": 50739,
+ "audio_token_483": 50740,
+ "audio_token_484": 50741,
+ "audio_token_485": 50742,
+ "audio_token_486": 50743,
+ "audio_token_487": 50744,
+ "audio_token_488": 50745,
+ "audio_token_489": 50746,
+ "audio_token_49": 50306,
+ "audio_token_490": 50747,
+ "audio_token_491": 50748,
+ "audio_token_492": 50749,
+ "audio_token_493": 50750,
+ "audio_token_494": 50751,
+ "audio_token_495": 50752,
+ "audio_token_496": 50753,
+ "audio_token_497": 50754,
+ "audio_token_498": 50755,
+ "audio_token_499": 50756,
+ "audio_token_5": 50262,
+ "audio_token_50": 50307,
+ "audio_token_500": 50757,
+ "audio_token_501": 50758,
+ "audio_token_502": 50759,
+ "audio_token_503": 50760,
+ "audio_token_504": 50761,
+ "audio_token_505": 50762,
+ "audio_token_506": 50763,
+ "audio_token_507": 50764,
+ "audio_token_508": 50765,
+ "audio_token_509": 50766,
+ "audio_token_51": 50308,
+ "audio_token_510": 50767,
+ "audio_token_511": 50768,
+ "audio_token_512": 50769,
+ "audio_token_513": 50770,
+ "audio_token_514": 50771,
+ "audio_token_515": 50772,
+ "audio_token_516": 50773,
+ "audio_token_517": 50774,
+ "audio_token_518": 50775,
+ "audio_token_519": 50776,
+ "audio_token_52": 50309,
+ "audio_token_520": 50777,
+ "audio_token_521": 50778,
+ "audio_token_522": 50779,
+ "audio_token_523": 50780,
+ "audio_token_524": 50781,
+ "audio_token_525": 50782,
+ "audio_token_526": 50783,
+ "audio_token_527": 50784,
+ "audio_token_528": 50785,
+ "audio_token_529": 50786,
+ "audio_token_53": 50310,
+ "audio_token_530": 50787,
+ "audio_token_531": 50788,
+ "audio_token_532": 50789,
+ "audio_token_533": 50790,
+ "audio_token_534": 50791,
+ "audio_token_535": 50792,
+ "audio_token_536": 50793,
+ "audio_token_537": 50794,
+ "audio_token_538": 50795,
+ "audio_token_539": 50796,
+ "audio_token_54": 50311,
+ "audio_token_540": 50797,
+ "audio_token_541": 50798,
+ "audio_token_542": 50799,
+ "audio_token_543": 50800,
+ "audio_token_544": 50801,
+ "audio_token_545": 50802,
+ "audio_token_546": 50803,
+ "audio_token_547": 50804,
+ "audio_token_548": 50805,
+ "audio_token_549": 50806,
+ "audio_token_55": 50312,
+ "audio_token_550": 50807,
+ "audio_token_551": 50808,
+ "audio_token_552": 50809,
+ "audio_token_553": 50810,
+ "audio_token_554": 50811,
+ "audio_token_555": 50812,
+ "audio_token_556": 50813,
+ "audio_token_557": 50814,
+ "audio_token_558": 50815,
+ "audio_token_559": 50816,
+ "audio_token_56": 50313,
+ "audio_token_560": 50817,
+ "audio_token_561": 50818,
+ "audio_token_562": 50819,
+ "audio_token_563": 50820,
+ "audio_token_564": 50821,
+ "audio_token_565": 50822,
+ "audio_token_566": 50823,
+ "audio_token_567": 50824,
+ "audio_token_568": 50825,
+ "audio_token_569": 50826,
+ "audio_token_57": 50314,
+ "audio_token_570": 50827,
+ "audio_token_571": 50828,
+ "audio_token_572": 50829,
+ "audio_token_573": 50830,
+ "audio_token_574": 50831,
+ "audio_token_575": 50832,
+ "audio_token_576": 50833,
+ "audio_token_577": 50834,
+ "audio_token_578": 50835,
+ "audio_token_579": 50836,
+ "audio_token_58": 50315,
+ "audio_token_580": 50837,
+ "audio_token_581": 50838,
+ "audio_token_582": 50839,
+ "audio_token_583": 50840,
+ "audio_token_584": 50841,
+ "audio_token_585": 50842,
+ "audio_token_586": 50843,
+ "audio_token_587": 50844,
+ "audio_token_588": 50845,
+ "audio_token_589": 50846,
+ "audio_token_59": 50316,
+ "audio_token_590": 50847,
+ "audio_token_591": 50848,
+ "audio_token_592": 50849,
+ "audio_token_593": 50850,
+ "audio_token_594": 50851,
+ "audio_token_595": 50852,
+ "audio_token_596": 50853,
+ "audio_token_597": 50854,
+ "audio_token_598": 50855,
+ "audio_token_599": 50856,
+ "audio_token_6": 50263,
+ "audio_token_60": 50317,
+ "audio_token_600": 50857,
+ "audio_token_601": 50858,
+ "audio_token_602": 50859,
+ "audio_token_603": 50860,
+ "audio_token_604": 50861,
+ "audio_token_605": 50862,
+ "audio_token_606": 50863,
+ "audio_token_607": 50864,
+ "audio_token_608": 50865,
+ "audio_token_609": 50866,
+ "audio_token_61": 50318,
+ "audio_token_610": 50867,
+ "audio_token_611": 50868,
+ "audio_token_612": 50869,
+ "audio_token_613": 50870,
+ "audio_token_614": 50871,
+ "audio_token_615": 50872,
+ "audio_token_616": 50873,
+ "audio_token_617": 50874,
+ "audio_token_618": 50875,
+ "audio_token_619": 50876,
+ "audio_token_62": 50319,
+ "audio_token_620": 50877,
+ "audio_token_621": 50878,
+ "audio_token_622": 50879,
+ "audio_token_623": 50880,
+ "audio_token_624": 50881,
+ "audio_token_625": 50882,
+ "audio_token_626": 50883,
+ "audio_token_627": 50884,
+ "audio_token_628": 50885,
+ "audio_token_629": 50886,
+ "audio_token_63": 50320,
+ "audio_token_630": 50887,
+ "audio_token_631": 50888,
+ "audio_token_632": 50889,
+ "audio_token_633": 50890,
+ "audio_token_634": 50891,
+ "audio_token_635": 50892,
+ "audio_token_636": 50893,
+ "audio_token_637": 50894,
+ "audio_token_638": 50895,
+ "audio_token_639": 50896,
+ "audio_token_64": 50321,
+ "audio_token_640": 50897,
+ "audio_token_641": 50898,
+ "audio_token_642": 50899,
+ "audio_token_643": 50900,
+ "audio_token_644": 50901,
+ "audio_token_645": 50902,
+ "audio_token_646": 50903,
+ "audio_token_647": 50904,
+ "audio_token_648": 50905,
+ "audio_token_649": 50906,
+ "audio_token_65": 50322,
+ "audio_token_650": 50907,
+ "audio_token_651": 50908,
+ "audio_token_652": 50909,
+ "audio_token_653": 50910,
+ "audio_token_654": 50911,
+ "audio_token_655": 50912,
+ "audio_token_656": 50913,
+ "audio_token_657": 50914,
+ "audio_token_658": 50915,
+ "audio_token_659": 50916,
+ "audio_token_66": 50323,
+ "audio_token_660": 50917,
+ "audio_token_661": 50918,
+ "audio_token_662": 50919,
+ "audio_token_663": 50920,
+ "audio_token_664": 50921,
+ "audio_token_665": 50922,
+ "audio_token_666": 50923,
+ "audio_token_667": 50924,
+ "audio_token_668": 50925,
+ "audio_token_669": 50926,
+ "audio_token_67": 50324,
+ "audio_token_670": 50927,
+ "audio_token_671": 50928,
+ "audio_token_672": 50929,
+ "audio_token_673": 50930,
+ "audio_token_674": 50931,
+ "audio_token_675": 50932,
+ "audio_token_676": 50933,
+ "audio_token_677": 50934,
+ "audio_token_678": 50935,
+ "audio_token_679": 50936,
+ "audio_token_68": 50325,
+ "audio_token_680": 50937,
+ "audio_token_681": 50938,
+ "audio_token_682": 50939,
+ "audio_token_683": 50940,
+ "audio_token_684": 50941,
+ "audio_token_685": 50942,
+ "audio_token_686": 50943,
+ "audio_token_687": 50944,
+ "audio_token_688": 50945,
+ "audio_token_689": 50946,
+ "audio_token_69": 50326,
+ "audio_token_690": 50947,
+ "audio_token_691": 50948,
+ "audio_token_692": 50949,
+ "audio_token_693": 50950,
+ "audio_token_694": 50951,
+ "audio_token_695": 50952,
+ "audio_token_696": 50953,
+ "audio_token_697": 50954,
+ "audio_token_698": 50955,
+ "audio_token_699": 50956,
+ "audio_token_7": 50264,
+ "audio_token_70": 50327,
+ "audio_token_700": 50957,
+ "audio_token_701": 50958,
+ "audio_token_702": 50959,
+ "audio_token_703": 50960,
+ "audio_token_704": 50961,
+ "audio_token_705": 50962,
+ "audio_token_706": 50963,
+ "audio_token_707": 50964,
+ "audio_token_708": 50965,
+ "audio_token_709": 50966,
+ "audio_token_71": 50328,
+ "audio_token_710": 50967,
+ "audio_token_711": 50968,
+ "audio_token_712": 50969,
+ "audio_token_713": 50970,
+ "audio_token_714": 50971,
+ "audio_token_715": 50972,
+ "audio_token_716": 50973,
+ "audio_token_717": 50974,
+ "audio_token_718": 50975,
+ "audio_token_719": 50976,
+ "audio_token_72": 50329,
+ "audio_token_720": 50977,
+ "audio_token_721": 50978,
+ "audio_token_722": 50979,
+ "audio_token_723": 50980,
+ "audio_token_724": 50981,
+ "audio_token_725": 50982,
+ "audio_token_726": 50983,
+ "audio_token_727": 50984,
+ "audio_token_728": 50985,
+ "audio_token_729": 50986,
+ "audio_token_73": 50330,
+ "audio_token_730": 50987,
+ "audio_token_731": 50988,
+ "audio_token_732": 50989,
+ "audio_token_733": 50990,
+ "audio_token_734": 50991,
+ "audio_token_735": 50992,
+ "audio_token_736": 50993,
+ "audio_token_737": 50994,
+ "audio_token_738": 50995,
+ "audio_token_739": 50996,
+ "audio_token_74": 50331,
+ "audio_token_740": 50997,
+ "audio_token_741": 50998,
+ "audio_token_742": 50999,
+ "audio_token_743": 51000,
+ "audio_token_744": 51001,
+ "audio_token_745": 51002,
+ "audio_token_746": 51003,
+ "audio_token_747": 51004,
+ "audio_token_748": 51005,
+ "audio_token_749": 51006,
+ "audio_token_75": 50332,
+ "audio_token_750": 51007,
+ "audio_token_751": 51008,
+ "audio_token_752": 51009,
+ "audio_token_753": 51010,
+ "audio_token_754": 51011,
+ "audio_token_755": 51012,
+ "audio_token_756": 51013,
+ "audio_token_757": 51014,
+ "audio_token_758": 51015,
+ "audio_token_759": 51016,
+ "audio_token_76": 50333,
+ "audio_token_760": 51017,
+ "audio_token_761": 51018,
+ "audio_token_762": 51019,
+ "audio_token_763": 51020,
+ "audio_token_764": 51021,
+ "audio_token_765": 51022,
+ "audio_token_766": 51023,
+ "audio_token_767": 51024,
+ "audio_token_768": 51025,
+ "audio_token_769": 51026,
+ "audio_token_77": 50334,
+ "audio_token_770": 51027,
+ "audio_token_771": 51028,
+ "audio_token_772": 51029,
+ "audio_token_773": 51030,
+ "audio_token_774": 51031,
+ "audio_token_775": 51032,
+ "audio_token_776": 51033,
+ "audio_token_777": 51034,
+ "audio_token_778": 51035,
+ "audio_token_779": 51036,
+ "audio_token_78": 50335,
+ "audio_token_780": 51037,
+ "audio_token_781": 51038,
+ "audio_token_782": 51039,
+ "audio_token_783": 51040,
+ "audio_token_784": 51041,
+ "audio_token_785": 51042,
+ "audio_token_786": 51043,
+ "audio_token_787": 51044,
+ "audio_token_788": 51045,
+ "audio_token_789": 51046,
+ "audio_token_79": 50336,
+ "audio_token_790": 51047,
+ "audio_token_791": 51048,
+ "audio_token_792": 51049,
+ "audio_token_793": 51050,
+ "audio_token_794": 51051,
+ "audio_token_795": 51052,
+ "audio_token_796": 51053,
+ "audio_token_797": 51054,
+ "audio_token_798": 51055,
+ "audio_token_799": 51056,
+ "audio_token_8": 50265,
+ "audio_token_80": 50337,
+ "audio_token_800": 51057,
+ "audio_token_801": 51058,
+ "audio_token_802": 51059,
+ "audio_token_803": 51060,
+ "audio_token_804": 51061,
+ "audio_token_805": 51062,
+ "audio_token_806": 51063,
+ "audio_token_807": 51064,
+ "audio_token_808": 51065,
+ "audio_token_809": 51066,
+ "audio_token_81": 50338,
+ "audio_token_810": 51067,
+ "audio_token_811": 51068,
+ "audio_token_812": 51069,
+ "audio_token_813": 51070,
+ "audio_token_814": 51071,
+ "audio_token_815": 51072,
+ "audio_token_816": 51073,
+ "audio_token_817": 51074,
+ "audio_token_818": 51075,
+ "audio_token_819": 51076,
+ "audio_token_82": 50339,
+ "audio_token_820": 51077,
+ "audio_token_821": 51078,
+ "audio_token_822": 51079,
+ "audio_token_823": 51080,
+ "audio_token_824": 51081,
+ "audio_token_825": 51082,
+ "audio_token_826": 51083,
+ "audio_token_827": 51084,
+ "audio_token_828": 51085,
+ "audio_token_829": 51086,
+ "audio_token_83": 50340,
+ "audio_token_830": 51087,
+ "audio_token_831": 51088,
+ "audio_token_832": 51089,
+ "audio_token_833": 51090,
+ "audio_token_834": 51091,
+ "audio_token_835": 51092,
+ "audio_token_836": 51093,
+ "audio_token_837": 51094,
+ "audio_token_838": 51095,
+ "audio_token_839": 51096,
+ "audio_token_84": 50341,
+ "audio_token_840": 51097,
+ "audio_token_841": 51098,
+ "audio_token_842": 51099,
+ "audio_token_843": 51100,
+ "audio_token_844": 51101,
+ "audio_token_845": 51102,
+ "audio_token_846": 51103,
+ "audio_token_847": 51104,
+ "audio_token_848": 51105,
+ "audio_token_849": 51106,
+ "audio_token_85": 50342,
+ "audio_token_850": 51107,
+ "audio_token_851": 51108,
+ "audio_token_852": 51109,
+ "audio_token_853": 51110,
+ "audio_token_854": 51111,
+ "audio_token_855": 51112,
+ "audio_token_856": 51113,
+ "audio_token_857": 51114,
+ "audio_token_858": 51115,
+ "audio_token_859": 51116,
+ "audio_token_86": 50343,
+ "audio_token_860": 51117,
+ "audio_token_861": 51118,
+ "audio_token_862": 51119,
+ "audio_token_863": 51120,
+ "audio_token_864": 51121,
+ "audio_token_865": 51122,
+ "audio_token_866": 51123,
+ "audio_token_867": 51124,
+ "audio_token_868": 51125,
+ "audio_token_869": 51126,
+ "audio_token_87": 50344,
+ "audio_token_870": 51127,
+ "audio_token_871": 51128,
+ "audio_token_872": 51129,
+ "audio_token_873": 51130,
+ "audio_token_874": 51131,
+ "audio_token_875": 51132,
+ "audio_token_876": 51133,
+ "audio_token_877": 51134,
+ "audio_token_878": 51135,
+ "audio_token_879": 51136,
+ "audio_token_88": 50345,
+ "audio_token_880": 51137,
+ "audio_token_881": 51138,
+ "audio_token_882": 51139,
+ "audio_token_883": 51140,
+ "audio_token_884": 51141,
+ "audio_token_885": 51142,
+ "audio_token_886": 51143,
+ "audio_token_887": 51144,
+ "audio_token_888": 51145,
+ "audio_token_889": 51146,
+ "audio_token_89": 50346,
+ "audio_token_890": 51147,
+ "audio_token_891": 51148,
+ "audio_token_892": 51149,
+ "audio_token_893": 51150,
+ "audio_token_894": 51151,
+ "audio_token_895": 51152,
+ "audio_token_896": 51153,
+ "audio_token_897": 51154,
+ "audio_token_898": 51155,
+ "audio_token_899": 51156,
+ "audio_token_9": 50266,
+ "audio_token_90": 50347,
+ "audio_token_900": 51157,
+ "audio_token_901": 51158,
+ "audio_token_902": 51159,
+ "audio_token_903": 51160,
+ "audio_token_904": 51161,
+ "audio_token_905": 51162,
+ "audio_token_906": 51163,
+ "audio_token_907": 51164,
+ "audio_token_908": 51165,
+ "audio_token_909": 51166,
+ "audio_token_91": 50348,
+ "audio_token_910": 51167,
+ "audio_token_911": 51168,
+ "audio_token_912": 51169,
+ "audio_token_913": 51170,
+ "audio_token_914": 51171,
+ "audio_token_915": 51172,
+ "audio_token_916": 51173,
+ "audio_token_917": 51174,
+ "audio_token_918": 51175,
+ "audio_token_919": 51176,
+ "audio_token_92": 50349,
+ "audio_token_920": 51177,
+ "audio_token_921": 51178,
+ "audio_token_922": 51179,
+ "audio_token_923": 51180,
+ "audio_token_924": 51181,
+ "audio_token_925": 51182,
+ "audio_token_926": 51183,
+ "audio_token_927": 51184,
+ "audio_token_928": 51185,
+ "audio_token_929": 51186,
+ "audio_token_93": 50350,
+ "audio_token_930": 51187,
+ "audio_token_931": 51188,
+ "audio_token_932": 51189,
+ "audio_token_933": 51190,
+ "audio_token_934": 51191,
+ "audio_token_935": 51192,
+ "audio_token_936": 51193,
+ "audio_token_937": 51194,
+ "audio_token_938": 51195,
+ "audio_token_939": 51196,
+ "audio_token_94": 50351,
+ "audio_token_940": 51197,
+ "audio_token_941": 51198,
+ "audio_token_942": 51199,
+ "audio_token_943": 51200,
+ "audio_token_944": 51201,
+ "audio_token_945": 51202,
+ "audio_token_946": 51203,
+ "audio_token_947": 51204,
+ "audio_token_948": 51205,
+ "audio_token_949": 51206,
+ "audio_token_95": 50352,
+ "audio_token_950": 51207,
+ "audio_token_951": 51208,
+ "audio_token_952": 51209,
+ "audio_token_953": 51210,
+ "audio_token_954": 51211,
+ "audio_token_955": 51212,
+ "audio_token_956": 51213,
+ "audio_token_957": 51214,
+ "audio_token_958": 51215,
+ "audio_token_959": 51216,
+ "audio_token_96": 50353,
+ "audio_token_960": 51217,
+ "audio_token_961": 51218,
+ "audio_token_962": 51219,
+ "audio_token_963": 51220,
+ "audio_token_964": 51221,
+ "audio_token_965": 51222,
+ "audio_token_966": 51223,
+ "audio_token_967": 51224,
+ "audio_token_968": 51225,
+ "audio_token_969": 51226,
+ "audio_token_97": 50354,
+ "audio_token_970": 51227,
+ "audio_token_971": 51228,
+ "audio_token_972": 51229,
+ "audio_token_973": 51230,
+ "audio_token_974": 51231,
+ "audio_token_975": 51232,
+ "audio_token_976": 51233,
+ "audio_token_977": 51234,
+ "audio_token_978": 51235,
+ "audio_token_979": 51236,
+ "audio_token_98": 50355,
+ "audio_token_980": 51237,
+ "audio_token_981": 51238,
+ "audio_token_982": 51239,
+ "audio_token_983": 51240,
+ "audio_token_984": 51241,
+ "audio_token_985": 51242,
+ "audio_token_986": 51243,
+ "audio_token_987": 51244,
+ "audio_token_988": 51245,
+ "audio_token_989": 51246,
+ "audio_token_99": 50356,
+ "audio_token_990": 51247,
+ "audio_token_991": 51248,
+ "audio_token_992": 51249,
+ "audio_token_993": 51250,
+ "audio_token_994": 51251,
+ "audio_token_995": 51252,
+ "audio_token_996": 51253,
+ "audio_token_997": 51254,
+ "audio_token_998": 51255,
+ "audio_token_999": 51256
+ }
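The mapping in added_tokens.json follows a regular pattern: audio_token_i is assigned id 50257 + i for i from 0 to 1023. The sketch below rebuilds that mapping with the Python standard library only. It is an illustration, not the script used for this commit; the names BASE_VOCAB_SIZE, NUM_AUDIO_TOKENS, and build_added_tokens are hypothetical, and the offset 50257 is taken from the smallest id in the diff (assumed to be the first id after the base GPT-2 vocabulary).

```python
import json

# Assumption: the base distilgpt2 vocabulary occupies ids 0..50256, so the
# first free id is 50257 (the smallest id that appears in the diff).
BASE_VOCAB_SIZE = 50257
NUM_AUDIO_TOKENS = 1024  # audio_token_0 .. audio_token_1023, as in the diff


def build_added_tokens(base_size=BASE_VOCAB_SIZE, count=NUM_AUDIO_TOKENS):
    """Return a name -> id mapping with one contiguous id per audio token."""
    return {f"audio_token_{i}": base_size + i for i in range(count)}


added = build_added_tokens()

# sort_keys=True orders keys as strings, which is why "audio_token_10"
# comes before "audio_token_2" in the file.
text = json.dumps(added, indent=2, sort_keys=True)
```

Serialized this way, the text has one line per token plus the two braces, which agrees with the 1026 lines reported in the hunk header.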
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "bos_token": "<|endoftext|>",
+ "eos_token": "<|endoftext|>",
+ "pad_token": "<|endoftext|>",
+ "unk_token": "<|endoftext|>"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "add_prefix_space": false,
+ "bos_token": "<|endoftext|>",
+ "eos_token": "<|endoftext|>",
+ "model_max_length": 1024,
+ "name_or_path": "distilgpt2",
+ "special_tokens_map_file": null,
+ "tokenizer_class": "GPT2Tokenizer",
+ "unk_token": "<|endoftext|>"
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff