bclavie committed on
Commit
c612297
1 Parent(s): 6ce93fe

Upload tokenizer

README.md ADDED
@@ -0,0 +1,199 @@
+ ---
+ library_name: transformers
+ tags: []
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
added_tokens.json ADDED
@@ -0,0 +1,770 @@
+ {
+   "[CLS]": 32001,
+   "[MASK]": 32004,
+   "[PAD]": 32003,
+   "[SEP]": 32002,
+   "[UNK]": 32000,
+   "[unused0]": 32005,
+   "[unused100]": 32105,
+   "[unused101]": 32106,
+   "[unused102]": 32107,
+   "[unused103]": 32108,
+   "[unused104]": 32109,
+   "[unused105]": 32110,
+   "[unused106]": 32111,
+   "[unused107]": 32112,
+   "[unused108]": 32113,
+   "[unused109]": 32114,
+   "[unused10]": 32015,
+   "[unused110]": 32115,
+   "[unused111]": 32116,
+   "[unused112]": 32117,
+   "[unused113]": 32118,
+   "[unused114]": 32119,
+   "[unused115]": 32120,
+   "[unused116]": 32121,
+   "[unused117]": 32122,
+   "[unused118]": 32123,
+   "[unused119]": 32124,
+   "[unused11]": 32016,
+   "[unused120]": 32125,
+   "[unused121]": 32126,
+   "[unused122]": 32127,
+   "[unused123]": 32128,
+   "[unused124]": 32129,
+   "[unused125]": 32130,
+   "[unused126]": 32131,
+   "[unused127]": 32132,
+   "[unused128]": 32133,
+   "[unused129]": 32134,
+   "[unused12]": 32017,
+   "[unused130]": 32135,
+   "[unused131]": 32136,
+   "[unused132]": 32137,
+   "[unused133]": 32138,
+   "[unused134]": 32139,
+   "[unused135]": 32140,
+   "[unused136]": 32141,
+   "[unused137]": 32142,
+   "[unused138]": 32143,
+   "[unused139]": 32144,
+   "[unused13]": 32018,
+   "[unused140]": 32145,
+   "[unused141]": 32146,
+   "[unused142]": 32147,
+   "[unused143]": 32148,
+   "[unused144]": 32149,
+   "[unused145]": 32150,
+   "[unused146]": 32151,
+   "[unused147]": 32152,
+   "[unused148]": 32153,
+   "[unused149]": 32154,
+   "[unused14]": 32019,
+   "[unused150]": 32155,
+   "[unused151]": 32156,
+   "[unused152]": 32157,
+   "[unused153]": 32158,
+   "[unused154]": 32159,
+   "[unused155]": 32160,
+   "[unused156]": 32161,
+   "[unused157]": 32162,
+   "[unused158]": 32163,
+   "[unused159]": 32164,
+   "[unused15]": 32020,
+   "[unused160]": 32165,
+   "[unused161]": 32166,
+   "[unused162]": 32167,
+   "[unused163]": 32168,
+   "[unused164]": 32169,
+   "[unused165]": 32170,
+   "[unused166]": 32171,
+   "[unused167]": 32172,
+   "[unused168]": 32173,
+   "[unused169]": 32174,
+   "[unused16]": 32021,
+   "[unused170]": 32175,
+   "[unused171]": 32176,
+   "[unused172]": 32177,
+   "[unused173]": 32178,
+   "[unused174]": 32179,
+   "[unused175]": 32180,
+   "[unused176]": 32181,
+   "[unused177]": 32182,
+   "[unused178]": 32183,
+   "[unused179]": 32184,
+   "[unused17]": 32022,
+   "[unused180]": 32185,
+   "[unused181]": 32186,
+   "[unused182]": 32187,
+   "[unused183]": 32188,
+   "[unused184]": 32189,
+   "[unused185]": 32190,
+   "[unused186]": 32191,
+   "[unused187]": 32192,
+   "[unused188]": 32193,
+   "[unused189]": 32194,
+   "[unused18]": 32023,
+   "[unused190]": 32195,
+   "[unused191]": 32196,
+   "[unused192]": 32197,
+   "[unused193]": 32198,
+   "[unused194]": 32199,
+   "[unused195]": 32200,
+   "[unused196]": 32201,
+   "[unused197]": 32202,
+   "[unused198]": 32203,
+   "[unused199]": 32204,
+   "[unused19]": 32024,
+   "[unused1]": 32006,
+   "[unused200]": 32205,
+   "[unused201]": 32206,
+   "[unused202]": 32207,
+   "[unused203]": 32208,
+   "[unused204]": 32209,
+   "[unused205]": 32210,
+   "[unused206]": 32211,
+   "[unused207]": 32212,
+   "[unused208]": 32213,
+   "[unused209]": 32214,
+   "[unused20]": 32025,
+   "[unused210]": 32215,
+   "[unused211]": 32216,
+   "[unused212]": 32217,
+   "[unused213]": 32218,
+   "[unused214]": 32219,
+   "[unused215]": 32220,
+   "[unused216]": 32221,
+   "[unused217]": 32222,
+   "[unused218]": 32223,
+   "[unused219]": 32224,
+   "[unused21]": 32026,
+   "[unused220]": 32225,
+   "[unused221]": 32226,
+   "[unused222]": 32227,
+   "[unused223]": 32228,
+   "[unused224]": 32229,
+   "[unused225]": 32230,
+   "[unused226]": 32231,
+   "[unused227]": 32232,
+   "[unused228]": 32233,
+   "[unused229]": 32234,
+   "[unused22]": 32027,
+   "[unused230]": 32235,
+   "[unused231]": 32236,
+   "[unused232]": 32237,
+   "[unused233]": 32238,
+   "[unused234]": 32239,
+   "[unused235]": 32240,
+   "[unused236]": 32241,
+   "[unused237]": 32242,
+   "[unused238]": 32243,
+   "[unused239]": 32244,
+   "[unused23]": 32028,
+   "[unused240]": 32245,
+   "[unused241]": 32246,
+   "[unused242]": 32247,
+   "[unused243]": 32248,
+   "[unused244]": 32249,
+   "[unused245]": 32250,
+   "[unused246]": 32251,
+   "[unused247]": 32252,
+   "[unused248]": 32253,
+   "[unused249]": 32254,
+   "[unused24]": 32029,
+   "[unused250]": 32255,
+   "[unused251]": 32256,
+   "[unused252]": 32257,
+   "[unused253]": 32258,
+   "[unused254]": 32259,
+   "[unused255]": 32260,
+   "[unused256]": 32261,
+   "[unused257]": 32262,
+   "[unused258]": 32263,
+   "[unused259]": 32264,
+   "[unused25]": 32030,
+   "[unused260]": 32265,
+   "[unused261]": 32266,
+   "[unused262]": 32267,
+   "[unused263]": 32268,
+   "[unused264]": 32269,
+   "[unused265]": 32270,
+   "[unused266]": 32271,
+   "[unused267]": 32272,
+   "[unused268]": 32273,
+   "[unused269]": 32274,
+   "[unused26]": 32031,
+   "[unused270]": 32275,
+   "[unused271]": 32276,
+   "[unused272]": 32277,
+   "[unused273]": 32278,
+   "[unused274]": 32279,
+   "[unused275]": 32280,
+   "[unused276]": 32281,
+   "[unused277]": 32282,
+   "[unused278]": 32283,
+   "[unused279]": 32284,
+   "[unused27]": 32032,
+   "[unused280]": 32285,
+   "[unused281]": 32286,
+   "[unused282]": 32287,
+   "[unused283]": 32288,
+   "[unused284]": 32289,
+   "[unused285]": 32290,
+   "[unused286]": 32291,
+   "[unused287]": 32292,
+   "[unused288]": 32293,
+   "[unused289]": 32294,
+   "[unused28]": 32033,
+   "[unused290]": 32295,
+   "[unused291]": 32296,
+   "[unused292]": 32297,
+   "[unused293]": 32298,
+   "[unused294]": 32299,
+   "[unused295]": 32300,
+   "[unused296]": 32301,
+   "[unused297]": 32302,
+   "[unused298]": 32303,
+   "[unused299]": 32304,
+   "[unused29]": 32034,
+   "[unused2]": 32007,
+   "[unused300]": 32305,
+   "[unused301]": 32306,
+   "[unused302]": 32307,
+   "[unused303]": 32308,
+   "[unused304]": 32309,
+   "[unused305]": 32310,
+   "[unused306]": 32311,
+   "[unused307]": 32312,
+   "[unused308]": 32313,
+   "[unused309]": 32314,
+   "[unused30]": 32035,
+   "[unused310]": 32315,
+   "[unused311]": 32316,
+   "[unused312]": 32317,
+   "[unused313]": 32318,
+   "[unused314]": 32319,
+   "[unused315]": 32320,
+   "[unused316]": 32321,
+   "[unused317]": 32322,
+   "[unused318]": 32323,
+   "[unused319]": 32324,
+   "[unused31]": 32036,
+   "[unused320]": 32325,
+   "[unused321]": 32326,
+   "[unused322]": 32327,
+   "[unused323]": 32328,
+   "[unused324]": 32329,
+   "[unused325]": 32330,
+   "[unused326]": 32331,
+   "[unused327]": 32332,
+   "[unused328]": 32333,
+   "[unused329]": 32334,
+   "[unused32]": 32037,
+   "[unused330]": 32335,
+   "[unused331]": 32336,
+   "[unused332]": 32337,
+   "[unused333]": 32338,
+   "[unused334]": 32339,
+   "[unused335]": 32340,
+   "[unused336]": 32341,
+   "[unused337]": 32342,
+   "[unused338]": 32343,
+   "[unused339]": 32344,
+   "[unused33]": 32038,
+   "[unused340]": 32345,
+   "[unused341]": 32346,
+   "[unused342]": 32347,
+   "[unused343]": 32348,
+   "[unused344]": 32349,
+   "[unused345]": 32350,
+   "[unused346]": 32351,
+   "[unused347]": 32352,
+   "[unused348]": 32353,
+   "[unused349]": 32354,
+   "[unused34]": 32039,
+   "[unused350]": 32355,
+   "[unused351]": 32356,
+   "[unused352]": 32357,
+   "[unused353]": 32358,
+   "[unused354]": 32359,
+   "[unused355]": 32360,
+   "[unused356]": 32361,
+   "[unused357]": 32362,
+   "[unused358]": 32363,
+   "[unused359]": 32364,
+   "[unused35]": 32040,
+   "[unused360]": 32365,
+   "[unused361]": 32366,
+   "[unused362]": 32367,
+   "[unused363]": 32368,
+   "[unused364]": 32369,
+   "[unused365]": 32370,
+   "[unused366]": 32371,
+   "[unused367]": 32372,
+   "[unused368]": 32373,
+   "[unused369]": 32374,
+   "[unused36]": 32041,
+   "[unused370]": 32375,
+   "[unused371]": 32376,
+   "[unused372]": 32377,
+   "[unused373]": 32378,
+   "[unused374]": 32379,
+   "[unused375]": 32380,
+   "[unused376]": 32381,
+   "[unused377]": 32382,
+   "[unused378]": 32383,
+   "[unused379]": 32384,
+   "[unused37]": 32042,
+   "[unused380]": 32385,
+   "[unused381]": 32386,
+   "[unused382]": 32387,
+   "[unused383]": 32388,
+   "[unused384]": 32389,
+   "[unused385]": 32390,
+   "[unused386]": 32391,
+   "[unused387]": 32392,
+   "[unused388]": 32393,
+   "[unused389]": 32394,
+   "[unused38]": 32043,
+   "[unused390]": 32395,
+   "[unused391]": 32396,
+   "[unused392]": 32397,
+   "[unused393]": 32398,
+   "[unused394]": 32399,
+   "[unused395]": 32400,
+   "[unused396]": 32401,
+   "[unused397]": 32402,
+   "[unused398]": 32403,
+   "[unused399]": 32404,
+   "[unused39]": 32044,
+   "[unused3]": 32008,
+   "[unused400]": 32405,
+   "[unused401]": 32406,
+   "[unused402]": 32407,
+   "[unused403]": 32408,
+   "[unused404]": 32409,
+   "[unused405]": 32410,
+   "[unused406]": 32411,
+   "[unused407]": 32412,
+   "[unused408]": 32413,
+   "[unused409]": 32414,
+   "[unused40]": 32045,
+   "[unused410]": 32415,
+   "[unused411]": 32416,
+   "[unused412]": 32417,
+   "[unused413]": 32418,
+   "[unused414]": 32419,
+   "[unused415]": 32420,
+   "[unused416]": 32421,
+   "[unused417]": 32422,
+   "[unused418]": 32423,
+   "[unused419]": 32424,
+   "[unused41]": 32046,
+   "[unused420]": 32425,
+   "[unused421]": 32426,
+   "[unused422]": 32427,
+   "[unused423]": 32428,
+   "[unused424]": 32429,
+   "[unused425]": 32430,
+   "[unused426]": 32431,
+   "[unused427]": 32432,
+   "[unused428]": 32433,
+   "[unused429]": 32434,
+   "[unused42]": 32047,
+   "[unused430]": 32435,
+   "[unused431]": 32436,
+   "[unused432]": 32437,
+   "[unused433]": 32438,
+   "[unused434]": 32439,
+   "[unused435]": 32440,
+   "[unused436]": 32441,
+   "[unused437]": 32442,
+   "[unused438]": 32443,
+   "[unused439]": 32444,
+   "[unused43]": 32048,
+   "[unused440]": 32445,
+   "[unused441]": 32446,
+   "[unused442]": 32447,
+   "[unused443]": 32448,
+   "[unused444]": 32449,
+   "[unused445]": 32450,
+   "[unused446]": 32451,
+   "[unused447]": 32452,
+   "[unused448]": 32453,
+   "[unused449]": 32454,
+   "[unused44]": 32049,
+   "[unused450]": 32455,
+   "[unused451]": 32456,
+   "[unused452]": 32457,
+   "[unused453]": 32458,
+   "[unused454]": 32459,
+   "[unused455]": 32460,
+   "[unused456]": 32461,
+   "[unused457]": 32462,
+   "[unused458]": 32463,
+   "[unused459]": 32464,
+   "[unused45]": 32050,
+   "[unused460]": 32465,
+   "[unused461]": 32466,
+   "[unused462]": 32467,
+   "[unused463]": 32468,
+   "[unused464]": 32469,
+   "[unused465]": 32470,
+   "[unused466]": 32471,
+   "[unused467]": 32472,
+   "[unused468]": 32473,
+   "[unused469]": 32474,
+   "[unused46]": 32051,
+   "[unused470]": 32475,
+   "[unused471]": 32476,
+   "[unused472]": 32477,
+   "[unused473]": 32478,
+   "[unused474]": 32479,
+   "[unused475]": 32480,
+   "[unused476]": 32481,
+   "[unused477]": 32482,
+   "[unused478]": 32483,
+   "[unused479]": 32484,
+   "[unused47]": 32052,
+   "[unused480]": 32485,
+   "[unused481]": 32486,
+   "[unused482]": 32487,
+   "[unused483]": 32488,
+   "[unused484]": 32489,
+   "[unused485]": 32490,
+   "[unused486]": 32491,
+   "[unused487]": 32492,
+   "[unused488]": 32493,
+   "[unused489]": 32494,
+   "[unused48]": 32053,
+   "[unused490]": 32495,
+   "[unused491]": 32496,
+   "[unused492]": 32497,
+   "[unused493]": 32498,
+   "[unused494]": 32499,
+   "[unused495]": 32500,
+   "[unused496]": 32501,
+   "[unused497]": 32502,
+   "[unused498]": 32503,
+   "[unused499]": 32504,
+   "[unused49]": 32054,
+   "[unused4]": 32009,
+   "[unused500]": 32505,
+   "[unused501]": 32506,
+   "[unused502]": 32507,
+   "[unused503]": 32508,
+   "[unused504]": 32509,
+   "[unused505]": 32510,
+   "[unused506]": 32511,
+   "[unused507]": 32512,
+   "[unused508]": 32513,
+   "[unused509]": 32514,
+   "[unused50]": 32055,
+   "[unused510]": 32515,
+   "[unused511]": 32516,
+   "[unused512]": 32517,
+   "[unused513]": 32518,
+   "[unused514]": 32519,
+   "[unused515]": 32520,
+   "[unused516]": 32521,
+   "[unused517]": 32522,
+   "[unused518]": 32523,
+   "[unused519]": 32524,
+   "[unused51]": 32056,
+   "[unused520]": 32525,
+   "[unused521]": 32526,
+   "[unused522]": 32527,
+   "[unused523]": 32528,
+   "[unused524]": 32529,
+   "[unused525]": 32530,
+   "[unused526]": 32531,
+   "[unused527]": 32532,
+   "[unused528]": 32533,
+   "[unused529]": 32534,
+   "[unused52]": 32057,
+   "[unused530]": 32535,
+   "[unused531]": 32536,
+   "[unused532]": 32537,
+   "[unused533]": 32538,
+   "[unused534]": 32539,
+   "[unused535]": 32540,
+   "[unused536]": 32541,
+   "[unused537]": 32542,
+   "[unused538]": 32543,
+   "[unused539]": 32544,
+   "[unused53]": 32058,
+   "[unused540]": 32545,
+   "[unused541]": 32546,
+   "[unused542]": 32547,
+   "[unused543]": 32548,
+   "[unused544]": 32549,
+   "[unused545]": 32550,
+   "[unused546]": 32551,
+   "[unused547]": 32552,
+   "[unused548]": 32553,
+   "[unused549]": 32554,
+   "[unused54]": 32059,
+   "[unused550]": 32555,
+   "[unused551]": 32556,
+   "[unused552]": 32557,
+   "[unused553]": 32558,
+   "[unused554]": 32559,
+   "[unused555]": 32560,
+   "[unused556]": 32561,
+   "[unused557]": 32562,
+   "[unused558]": 32563,
+   "[unused559]": 32564,
+   "[unused55]": 32060,
+   "[unused560]": 32565,
+   "[unused561]": 32566,
+   "[unused562]": 32567,
+   "[unused563]": 32568,
+   "[unused564]": 32569,
+   "[unused565]": 32570,
+   "[unused566]": 32571,
+   "[unused567]": 32572,
+   "[unused568]": 32573,
+   "[unused569]": 32574,
+   "[unused56]": 32061,
+   "[unused570]": 32575,
+   "[unused571]": 32576,
+   "[unused572]": 32577,
+   "[unused573]": 32578,
+   "[unused574]": 32579,
+   "[unused575]": 32580,
+   "[unused576]": 32581,
+   "[unused577]": 32582,
+   "[unused578]": 32583,
+   "[unused579]": 32584,
+   "[unused57]": 32062,
+   "[unused580]": 32585,
+   "[unused581]": 32586,
+   "[unused582]": 32587,
+   "[unused583]": 32588,
+   "[unused584]": 32589,
+   "[unused585]": 32590,
+   "[unused586]": 32591,
+   "[unused587]": 32592,
+   "[unused588]": 32593,
+   "[unused589]": 32594,
+   "[unused58]": 32063,
+   "[unused590]": 32595,
+   "[unused591]": 32596,
+   "[unused592]": 32597,
+   "[unused593]": 32598,
+   "[unused594]": 32599,
+   "[unused595]": 32600,
+   "[unused596]": 32601,
+   "[unused597]": 32602,
+   "[unused598]": 32603,
+   "[unused599]": 32604,
+   "[unused59]": 32064,
+   "[unused5]": 32010,
+   "[unused600]": 32605,
+   "[unused601]": 32606,
+   "[unused602]": 32607,
+   "[unused603]": 32608,
+   "[unused604]": 32609,
+   "[unused605]": 32610,
+   "[unused606]": 32611,
+   "[unused607]": 32612,
+   "[unused608]": 32613,
+   "[unused609]": 32614,
+   "[unused60]": 32065,
+   "[unused610]": 32615,
+   "[unused611]": 32616,
+   "[unused612]": 32617,
+   "[unused613]": 32618,
+   "[unused614]": 32619,
+   "[unused615]": 32620,
+   "[unused616]": 32621,
+   "[unused617]": 32622,
+   "[unused618]": 32623,
+   "[unused619]": 32624,
+   "[unused61]": 32066,
+   "[unused620]": 32625,
+   "[unused621]": 32626,
+   "[unused622]": 32627,
+   "[unused623]": 32628,
+   "[unused624]": 32629,
+   "[unused625]": 32630,
+   "[unused626]": 32631,
+   "[unused627]": 32632,
+   "[unused628]": 32633,
+   "[unused629]": 32634,
+   "[unused62]": 32067,
+   "[unused630]": 32635,
+   "[unused631]": 32636,
+   "[unused632]": 32637,
+   "[unused633]": 32638,
+   "[unused634]": 32639,
+   "[unused635]": 32640,
+   "[unused636]": 32641,
+   "[unused637]": 32642,
+   "[unused638]": 32643,
+   "[unused639]": 32644,
+   "[unused63]": 32068,
+   "[unused640]": 32645,
+   "[unused641]": 32646,
+   "[unused642]": 32647,
+   "[unused643]": 32648,
+   "[unused644]": 32649,
+   "[unused645]": 32650,
+   "[unused646]": 32651,
+   "[unused647]": 32652,
+   "[unused648]": 32653,
+   "[unused649]": 32654,
+   "[unused64]": 32069,
+   "[unused650]": 32655,
+   "[unused651]": 32656,
+   "[unused652]": 32657,
+   "[unused653]": 32658,
+   "[unused654]": 32659,
+   "[unused655]": 32660,
+   "[unused656]": 32661,
+   "[unused657]": 32662,
+   "[unused658]": 32663,
+   "[unused659]": 32664,
+   "[unused65]": 32070,
+   "[unused660]": 32665,
+   "[unused661]": 32666,
+   "[unused662]": 32667,
+   "[unused663]": 32668,
+   "[unused664]": 32669,
+   "[unused665]": 32670,
+   "[unused666]": 32671,
+   "[unused667]": 32672,
+   "[unused668]": 32673,
+   "[unused669]": 32674,
+   "[unused66]": 32071,
+   "[unused670]": 32675,
+   "[unused671]": 32676,
+   "[unused672]": 32677,
+   "[unused673]": 32678,
+   "[unused674]": 32679,
+   "[unused675]": 32680,
+   "[unused676]": 32681,
+   "[unused677]": 32682,
+   "[unused678]": 32683,
+   "[unused679]": 32684,
+   "[unused67]": 32072,
+   "[unused680]": 32685,
+   "[unused681]": 32686,
+   "[unused682]": 32687,
+   "[unused683]": 32688,
+   "[unused684]": 32689,
+   "[unused685]": 32690,
+   "[unused686]": 32691,
+   "[unused687]": 32692,
+   "[unused688]": 32693,
+   "[unused689]": 32694,
+   "[unused68]": 32073,
+   "[unused690]": 32695,
+   "[unused691]": 32696,
+   "[unused692]": 32697,
+   "[unused693]": 32698,
+   "[unused694]": 32699,
+   "[unused695]": 32700,
+   "[unused696]": 32701,
+   "[unused697]": 32702,
+   "[unused698]": 32703,
+   "[unused699]": 32704,
+   "[unused69]": 32074,
+   "[unused6]": 32011,
+   "[unused700]": 32705,
+   "[unused701]": 32706,
+   "[unused702]": 32707,
+   "[unused703]": 32708,
+   "[unused704]": 32709,
+   "[unused705]": 32710,
+   "[unused706]": 32711,
+   "[unused707]": 32712,
+   "[unused708]": 32713,
+   "[unused709]": 32714,
+   "[unused70]": 32075,
+   "[unused710]": 32715,
+   "[unused711]": 32716,
+   "[unused712]": 32717,
+   "[unused713]": 32718,
+   "[unused714]": 32719,
+   "[unused715]": 32720,
+   "[unused716]": 32721,
+   "[unused717]": 32722,
+   "[unused718]": 32723,
+   "[unused719]": 32724,
+   "[unused71]": 32076,
+   "[unused720]": 32725,
+   "[unused721]": 32726,
+   "[unused722]": 32727,
+   "[unused723]": 32728,
+   "[unused724]": 32729,
+   "[unused725]": 32730,
+   "[unused726]": 32731,
+   "[unused727]": 32732,
+   "[unused728]": 32733,
+   "[unused729]": 32734,
+   "[unused72]": 32077,
+   "[unused730]": 32735,
+   "[unused731]": 32736,
+   "[unused732]": 32737,
+   "[unused733]": 32738,
+   "[unused734]": 32739,
+   "[unused735]": 32740,
+   "[unused736]": 32741,
+   "[unused737]": 32742,
+   "[unused738]": 32743,
+   "[unused739]": 32744,
+   "[unused73]": 32078,
+   "[unused740]": 32745,
+   "[unused741]": 32746,
+   "[unused742]": 32747,
+   "[unused743]": 32748,
+   "[unused744]": 32749,
+   "[unused745]": 32750,
+   "[unused746]": 32751,
+   "[unused747]": 32752,
+   "[unused748]": 32753,
+   "[unused749]": 32754,
+   "[unused74]": 32079,
+   "[unused750]": 32755,
+   "[unused751]": 32756,
+   "[unused752]": 32757,
+   "[unused753]": 32758,
+   "[unused754]": 32759,
+   "[unused755]": 32760,
+   "[unused756]": 32761,
+   "[unused757]": 32762,
+   "[unused758]": 32763,
+   "[unused759]": 32764,
+   "[unused75]": 32080,
+   "[unused760]": 32765,
+   "[unused761]": 32766,
+   "[unused762]": 32767,
+   "[unused76]": 32081,
+   "[unused77]": 32082,
+   "[unused78]": 32083,
+   "[unused79]": 32084,
+   "[unused7]": 32012,
+   "[unused80]": 32085,
+   "[unused81]": 32086,
+   "[unused82]": 32087,
+   "[unused83]": 32088,
+   "[unused84]": 32089,
+   "[unused85]": 32090,
+   "[unused86]": 32091,
+   "[unused87]": 32092,
+   "[unused88]": 32093,
+   "[unused89]": 32094,
+   "[unused8]": 32013,
+   "[unused90]": 32095,
+   "[unused91]": 32096,
+   "[unused92]": 32097,
+   "[unused93]": 32098,
+   "[unused94]": 32099,
+   "[unused95]": 32100,
+   "[unused96]": 32101,
+   "[unused97]": 32102,
+   "[unused98]": 32103,
+   "[unused99]": 32104,
+   "[unused9]": 32014
+ }
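The ID layout in `added_tokens.json` is regular: the five bracketed special tokens occupy 32000 to 32004, and each `[unusedN]` maps to `32005 + N` for N from 0 to 762, so the last ID is 32767. A minimal Python sketch that rebuilds the mapping and reproduces the key order seen in the diff is given here; the function name `build_added_tokens` is hypothetical and not part of this commit, and nothing here is claimed about how the file was actually generated.

```python
import json


def build_added_tokens(num_unused: int = 763) -> dict[str, int]:
    """Rebuild the token -> id mapping shown in added_tokens.json (illustrative only)."""
    tokens = {
        "[UNK]": 32000,
        "[CLS]": 32001,
        "[SEP]": 32002,
        "[PAD]": 32003,
        "[MASK]": 32004,
    }
    # Each placeholder token [unusedN] sits at a fixed offset from 32005.
    for n in range(num_unused):
        tokens[f"[unused{n}]"] = 32005 + n
    return tokens


added = build_added_tokens()

# Sorting keys as plain strings yields the order in the diff:
# "[unused100]" sorts before "[unused10]" because "0" < "]".
serialized = json.dumps(added, indent=2, sort_keys=True)
```

The mapping has 768 entries covering the contiguous range 32000 to 32767. If the base vocabulary occupies IDs 0 to 31999 (an assumption, since the `tokenizer.json` diff is not rendered), the total vocabulary size would be 32768.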
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
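Of the seven roles in `special_tokens_map.json`, five (`[CLS]`, `[MASK]`, `[PAD]`, `[SEP]`, `[UNK]`) have IDs assigned in `added_tokens.json`, while `<s>` and `</s>` do not appear there. A small Python sketch of that cross-check follows; the helper `split_special_tokens` is hypothetical, and the inline dictionaries are trimmed copies of the values in the two diffs. Where `<s>` and `</s>` get their IDs is not visible in this commit view; presumably they come from the base vocabulary, but that is an assumption.

```python
def split_special_tokens(special_map, added_tokens):
    """Partition special-token roles by whether added_tokens assigns their ID."""
    with_ids, without_ids = {}, {}
    for role, spec in special_map.items():
        # Entries are objects with a "content" field in this commit.
        content = spec["content"]
        if content in added_tokens:
            with_ids[role] = added_tokens[content]
        else:
            without_ids[role] = content
    return with_ids, without_ids


# Trimmed copies of the data shown in the diffs (flags such as "lstrip" omitted).
special_map = {
    "bos_token": {"content": "<s>"},
    "cls_token": {"content": "[CLS]"},
    "eos_token": {"content": "</s>"},
    "mask_token": {"content": "[MASK]"},
    "pad_token": {"content": "[PAD]"},
    "sep_token": {"content": "[SEP]"},
    "unk_token": {"content": "[UNK]"},
}
added_tokens = {
    "[UNK]": 32000,
    "[CLS]": 32001,
    "[SEP]": 32002,
    "[PAD]": 32003,
    "[MASK]": 32004,
}

with_ids, without_ids = split_special_tokens(special_map, added_tokens)
```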
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+ size 499723
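The three lines added for `tokenizer.model` are a Git LFS pointer rather than the model file itself: each line is a key followed by a space and a value, giving the spec version, the SHA-256 object ID, and the size in bytes of the real file. A minimal Python sketch that parses such a pointer is shown here; `parse_lfs_pointer` is a hypothetical helper, not a Git LFS or Hugging Face API.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Parse a Git LFS pointer file into a key -> value dictionary."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347\n"
    "size 499723\n"
)

# The oid value is "<hash algorithm>:<hex digest>".
algorithm, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
```

A downloaded copy of `tokenizer.model` could be checked against this pointer by comparing its byte length with `size_bytes` and its SHA-256 hex digest with `digest`.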
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff