ChlorophyllChampion committed on
Commit
75af7e5
1 Parent(s): e811e3b

Upload 12 files

README.md CHANGED
@@ -1,3 +1,202 @@
  ---
- license: apache-2.0
+ library_name: peft
+ base_model: state-spaces/mamba-1.4b-hf
  ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+ ### Framework versions
+
+ - PEFT 0.10.0
adapter_config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "state-spaces/mamba-1.4b-hf",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 8,
+   "lora_dropout": 0.0,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 8,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "in_proj",
+     "embeddings",
+     "x_proj",
+     "out_proj"
+   ],
+   "task_type": "CAUSAL_LM",
+   "use_dora": false,
+   "use_rslora": false
+ }
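The adapter configuration above can be read back to see what it implies at load time. The following is a minimal sketch, not the author's code: it parses a hand-copied subset of the fields and computes the usual LoRA scaling factor `lora_alpha / r` (which applies here because `use_rslora` is false). The `load_adapter` helper and its `adapter_path` argument are hypothetical, since the commit does not name the repository, and the helper assumes `transformers` and `peft` are installed.

```python
import json

# Hand-copied subset of adapter_config.json from this commit.
ADAPTER_CONFIG = json.loads("""
{
  "base_model_name_or_path": "state-spaces/mamba-1.4b-hf",
  "lora_alpha": 8,
  "lora_dropout": 0.0,
  "peft_type": "LORA",
  "r": 8,
  "target_modules": ["in_proj", "embeddings", "x_proj", "out_proj"],
  "task_type": "CAUSAL_LM",
  "use_rslora": false
}
""")


def lora_scaling(config: dict) -> float:
    """Scaling applied to the low-rank update when use_rslora is false."""
    return config["lora_alpha"] / config["r"]


def load_adapter(adapter_path: str):
    """Hypothetical helper: attach the adapter to its base model.

    `adapter_path` is a placeholder for a local directory (or Hub repo id)
    that holds adapter_config.json and adapter_model.safetensors.
    """
    # Imported lazily so the rest of the sketch runs without these packages.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        ADAPTER_CONFIG["base_model_name_or_path"]
    )
    return PeftModel.from_pretrained(base, adapter_path)


if __name__ == "__main__":
    print(lora_scaling(ADAPTER_CONFIG))
```

With `r` and `lora_alpha` both set to 8, the scaling factor is 1.0, so the low-rank update is added to the frozen weights unscaled.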
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f495b010261b933752eca10301a312097565413fba9eec07c386452006f18b7d
+ size 33416152
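The pointer above records only a hash and a byte count; the weights themselves live in Git LFS. As a rough plausibility check, the sketch below parses the pointer and compares its size with a LoRA parameter count estimated from the adapter config. The layer dimensions are assumptions about `state-spaces/mamba-1.4b-hf` (hidden size 2048, inner size 4096, 48 layers, `x_proj` output width 160, vocabulary size 50280) and are not stated in this commit; float32 storage is also an assumption.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f495b010261b933752eca10301a312097565413fba9eec07c386452006f18b7d
size 33416152
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def lora_params(rank: int, in_features: int, out_features: int) -> int:
    """A LoRA pair holds A (rank x in) and B (out x rank) matrices."""
    return rank * (in_features + out_features)


RANK = 8  # "r" in adapter_config.json

# ASSUMED base-model dimensions; none of these appear in the commit.
HIDDEN, INNER, LAYERS, VOCAB = 2048, 4096, 48, 50280
X_PROJ_OUT = 160  # assumed: dt_rank 128 + 2 * state size 16

per_layer = (
    lora_params(RANK, HIDDEN, 2 * INNER)     # in_proj
    + lora_params(RANK, INNER, X_PROJ_OUT)   # x_proj
    + lora_params(RANK, INNER, HIDDEN)       # out_proj
)
# The embedding LoRA pair is counted once for the whole model.
total_params = LAYERS * per_layer + lora_params(RANK, VOCAB, HIDDEN)
estimated_bytes = 4 * total_params  # assumed float32 storage

pointer = parse_lfs_pointer(POINTER)
overhead = pointer["size"] - estimated_bytes
```

Under these assumptions the estimate comes to 33,377,536 bytes, leaving 38,616 bytes unaccounted for, which would be consistent with a safetensors header. This is a sanity check only; it does not verify the file's contents.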
optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:554db53941aba188854e9c707e676e0411d589460daf1fa3a6b7962863963c1a
+ size 66998458
rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e6bdf71b784d7fc380e872c2786bd835681df5b8b45ffd8a17dc718ecedc1d0
+ size 14244
scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3a4c73e8e3f62c1b386ed1b274c0c557f9f49722dcd6ef8e801f8e856dd2235a
+ size 1064
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
@@ -0,0 +1,1029 @@
1
+ {
2
+ "version": "1.0",
3
+ "truncation": {
4
+ "direction": "Right",
5
+ "max_length": 2048,
6
+ "strategy": "LongestFirst",
7
+ "stride": 0
8
+ },
9
+ "padding": null,
10
+ "added_tokens": [
11
+ {
12
+ "id": 0,
13
+ "content": "[UNK]",
14
+ "single_word": false,
15
+ "lstrip": false,
16
+ "rstrip": false,
17
+ "normalized": false,
18
+ "special": true
19
+ },
20
+ {
21
+ "id": 1,
22
+ "content": "[CLS]",
23
+ "single_word": false,
24
+ "lstrip": false,
25
+ "rstrip": false,
26
+ "normalized": false,
27
+ "special": true
28
+ },
29
+ {
30
+ "id": 2,
31
+ "content": "[SEP]",
32
+ "single_word": false,
33
+ "lstrip": false,
34
+ "rstrip": false,
35
+ "normalized": false,
36
+ "special": true
37
+ },
38
+ {
39
+ "id": 3,
40
+ "content": "[PAD]",
41
+ "single_word": false,
42
+ "lstrip": false,
43
+ "rstrip": false,
44
+ "normalized": false,
45
+ "special": true
46
+ },
47
+ {
48
+ "id": 4,
49
+ "content": "[MASK]",
50
+ "single_word": false,
51
+ "lstrip": false,
52
+ "rstrip": false,
53
+ "normalized": false,
54
+ "special": true
55
+ },
56
+ {
57
+ "id": 5,
58
+ "content": "\u0001",
59
+ "single_word": false,
60
+ "lstrip": false,
61
+ "rstrip": false,
62
+ "normalized": false,
63
+ "special": true
64
+ },
65
+ {
66
+ "id": 6,
67
+ "content": "\u0002",
68
+ "single_word": false,
69
+ "lstrip": false,
70
+ "rstrip": false,
71
+ "normalized": false,
72
+ "special": true
73
+ },
74
+ {
75
+ "id": 7,
76
+ "content": "\u0003",
77
+ "single_word": false,
78
+ "lstrip": false,
79
+ "rstrip": false,
80
+ "normalized": false,
81
+ "special": true
82
+ },
83
+ {
84
+ "id": 8,
85
+ "content": "\u0004",
86
+ "single_word": false,
87
+ "lstrip": false,
88
+ "rstrip": false,
89
+ "normalized": false,
90
+ "special": true
91
+ },
92
+ {
93
+ "id": 9,
94
+ "content": "\u0005",
95
+ "single_word": false,
96
+ "lstrip": false,
97
+ "rstrip": false,
98
+ "normalized": false,
99
+ "special": true
100
+ }
101
+ ],
102
+ "normalizer": {
103
+ "type": "BertNormalizer",
104
+ "clean_text": true,
105
+ "handle_chinese_chars": false,
106
+ "strip_accents": true,
107
+ "lowercase": false
108
+ },
109
+ "pre_tokenizer": {
110
+ "type": "BertPreTokenizer"
111
+ },
112
+ "post_processor": {
113
+ "type": "TemplateProcessing",
114
+ "single": [
115
+ {
116
+ "SpecialToken": {
117
+ "id": "[CLS]",
118
+ "type_id": 0
119
+ }
120
+ },
121
+ {
122
+ "Sequence": {
123
+ "id": "A",
124
+ "type_id": 0
125
+ }
126
+ },
127
+ {
128
+ "SpecialToken": {
129
+ "id": "[SEP]",
130
+ "type_id": 0
131
+ }
132
+ }
133
+ ],
134
+ "pair": [
135
+ {
136
+ "SpecialToken": {
137
+ "id": "[CLS]",
138
+ "type_id": 0
139
+ }
140
+ },
141
+ {
142
+ "Sequence": {
143
+ "id": "A",
144
+ "type_id": 0
145
+ }
146
+ },
147
+ {
148
+ "SpecialToken": {
149
+ "id": "[SEP]",
150
+ "type_id": 0
151
+ }
152
+ },
153
+ {
154
+ "Sequence": {
155
+ "id": "B",
156
+ "type_id": 1
157
+ }
158
+ },
159
+ {
160
+ "SpecialToken": {
161
+ "id": "[SEP]",
162
+ "type_id": 1
163
+ }
164
+ }
165
+ ],
166
+ "special_tokens": {
167
+ "[CLS]": {
168
+ "id": "[CLS]",
169
+ "ids": [
170
+ 1
171
+ ],
172
+ "tokens": [
173
+ "[CLS]"
174
+ ]
175
+ },
176
+ "[SEP]": {
177
+ "id": "[SEP]",
178
+ "ids": [
179
+ 2
180
+ ],
181
+ "tokens": [
182
+ "[SEP]"
183
+ ]
184
+ }
185
+ }
186
+ },
187
+ "decoder": {
188
+ "type": "WordPiece",
189
+ "prefix": "##",
190
+ "cleanup": true
191
+ },
192
+ "model": {
193
+ "type": "WordPiece",
194
+ "unk_token": "[UNK]",
195
+ "continuing_subword_prefix": "##",
196
+ "max_input_chars_per_word": 100,
197
+ "vocab": {
198
+ "[UNK]": 0,
199
+ "[CLS]": 1,
200
+ "[SEP]": 2,
201
+ "[PAD]": 3,
202
+ "[MASK]": 4,
203
+ "\u0001": 5,
204
+ "\u0002": 6,
205
+ "\u0003": 7,
206
+ "\u0004": 8,
207
+ "\u0005": 9,
208
+ "!": 10,
209
+ "\"": 11,
210
+ "#": 12,
211
+ "$": 13,
212
+ "%": 14,
213
+ "&": 15,
214
+ "'": 16,
215
+ "(": 17,
216
+ ")": 18,
217
+ "*": 19,
218
+ "+": 20,
219
+ ",": 21,
220
+ "-": 22,
221
+ ".": 23,
222
+ "/": 24,
223
+ "0": 25,
224
+ "1": 26,
225
+ "2": 27,
226
+ "3": 28,
227
+ "4": 29,
228
+ "5": 30,
229
+ "6": 31,
230
+ "7": 32,
231
+ "8": 33,
232
+ "9": 34,
233
+ ":": 35,
234
+ ";": 36,
235
+ "<": 37,
236
+ "=": 38,
237
+ ">": 39,
238
+ "?": 40,
239
+ "@": 41,
240
+ "[": 42,
241
+ "\\": 43,
242
+ "]": 44,
243
+ "^": 45,
244
+ "_": 46,
245
+ "`": 47,
246
+ "a": 48,
247
+ "b": 49,
248
+ "c": 50,
249
+ "d": 51,
250
+ "e": 52,
251
+ "f": 53,
252
+ "g": 54,
253
+ "h": 55,
254
+ "i": 56,
255
+ "j": 57,
256
+ "k": 58,
257
+ "l": 59,
258
+ "m": 60,
259
+ "n": 61,
260
+ "o": 62,
261
+ "p": 63,
262
+ "q": 64,
263
+ "r": 65,
264
+ "s": 66,
265
+ "t": 67,
266
+ "u": 68,
267
+ "v": 69,
268
+ "w": 70,
269
+ "x": 71,
270
+ "y": 72,
271
+ "z": 73,
272
+ "{": 74,
273
+ "|": 75,
274
+ "}": 76,
275
+ "~": 77,
276
+ "¡": 78,
277
+ "¢": 79,
278
+ "£": 80,
279
+ "¥": 81,
280
+ "§": 82,
281
+ "¯": 83,
282
+ "µ": 84,
283
+ "º": 85,
284
+ "»": 86,
285
+ "¿": 87,
286
+ "À": 88,
287
+ "Â": 89,
288
+ "Ã": 90,
289
+ "Ä": 91,
290
+ "Å": 92,
291
+ "Ç": 93,
292
+ "Ë": 94,
293
+ "Í": 95,
294
+ "Î": 96,
295
+ "Ï": 97,
296
+ "Ñ": 98,
297
+ "Ó": 99,
298
+ "Ø": 100,
299
+ "Ù": 101,
300
+ "Ú": 102,
301
+ "Ü": 103,
302
+ "ß": 104,
303
+ "à": 105,
304
+ "á": 106,
305
+ "â": 107,
306
+ "ã": 108,
307
+ "ä": 109,
308
+ "å": 110,
309
+ "æ": 111,
310
+ "ç": 112,
311
+ "è": 113,
312
+ "é": 114,
313
+ "ê": 115,
314
+ "ë": 116,
315
+ "ì": 117,
316
+ "í": 118,
317
+ "î": 119,
318
+ "ï": 120,
319
+ "ñ": 121,
320
+ "ò": 122,
321
+ "ó": 123,
322
+ "ô": 124,
323
+ "õ": 125,
324
+ "ö": 126,
325
+ "ø": 127,
326
+ "ù": 128,
327
+ "ú": 129,
328
+ "û": 130,
329
+ "ü": 131,
330
+ "ý": 132,
331
+ "þ": 133,
332
+ "ā": 134,
333
+ "ă": 135,
334
+ "ą": 136,
335
+ "ć": 137,
336
+ "č": 138,
337
+ "ď": 139,
338
+ "đ": 140,
339
+ "ē": 141,
340
+ "ĕ": 142,
341
+ "ė": 143,
342
+ "Ę": 144,
343
+ "ę": 145,
344
+ "ě": 146,
345
+ "ġ": 147,
346
+ "ģ": 148,
347
+ "ĩ": 149,
348
+ "ī": 150,
349
+ "ĭ": 151,
350
+ "İ": 152,
351
+ "ı": 153,
352
+ "ĵ": 154,
353
+ "ķ": 155,
354
+ "ĸ": 156,
355
+ "ĺ": 157,
356
+ "ł": 158,
357
+ "ń": 159,
358
+ "ň": 160,
359
+ "ʼn": 161,
360
+ "ŋ": 162,
361
+ "ō": 163,
362
+ "ŏ": 164,
363
+ "ő": 165,
364
+ "œ": 166,
365
+ "ŕ": 167,
366
+ "ŗ": 168,
367
+ "Ř": 169,
368
+ "ř": 170,
369
+ "Ś": 171,
370
+ "ś": 172,
371
+ "Ş": 173,
372
+ "ş": 174,
373
+ "š": 175,
374
+ "ţ": 176,
375
+ "Ť": 177,
376
+ "ť": 178,
377
+ "ũ": 179,
378
+ "ū": 180,
379
+ "ŭ": 181,
380
+ "ű": 182,
381
+ "ų": 183,
382
+ "ŵ": 184,
383
+ "Ÿ": 185,
384
+ "ż": 186,
385
+ "ž": 187,
386
+ "ƀ": 188,
387
+ "Ɓ": 189,
388
+ "Ƅ": 190,
389
+ "ƅ": 191,
390
+ "Ƈ": 192,
391
+ "ƒ": 193,
392
+ "ƙ": 194,
393
+ "ƞ": 195,
394
+ "Ƭ": 196,
395
+ "Ư": 197,
396
+ "Ƴ": 198,
397
+ "Ǐ": 199,
398
+ "Ƿ": 200,
399
+ "ǹ": 201,
400
+ "ȋ": 202,
401
+ "ș": 203,
402
+ "ț": 204,
403
+ "ȧ": 205,
404
+ "ȯ": 206,
405
+ "Ʌ": 207,
406
+ "ɑ": 208,
407
+ "ɗ": 209,
408
+ "ɠ": 210,
409
+ "ɡ": 211,
410
+ "ɢ": 212,
411
+ "ɣ": 213,
412
+ "ɩ": 214,
413
+ "ɪ": 215,
414
+ "ɭ": 216,
415
+ "ɯ": 217,
416
+ "ɱ": 218,
417
+ "ɳ": 219,
418
+ "ɴ": 220,
419
+ "ɺ": 221,
420
+ "ɼ": 222,
421
+ "ɾ": 223,
422
+ "ʀ": 224,
423
+ "ʂ": 225,
424
+ "ʄ": 226,
425
+ "ʋ": 227,
426
+ "ʌ": 228,
427
+ "ʍ": 229,
428
+ "ʏ": 230,
429
+ "ʙ": 231,
430
+ "ʜ": 232,
431
+ "ʝ": 233,
432
+ "ʟ": 234,
433
+ "ʨ": 235,
434
+ "˄": 236,
435
+ "Α": 237,
436
+ "Β": 238,
437
+ "Ε": 239,
438
+ "Ζ": 240,
439
+ "Η": 241,
440
+ "Ι": 242,
441
+ "Κ": 243,
442
+ "Μ": 244,
443
+ "Ν": 245,
444
+ "Ο": 246,
445
+ "Ρ": 247,
446
+ "Τ": 248,
447
+ "Υ": 249,
448
+ "Χ": 250,
449
+ "ί": 251,
450
+ "α": 252,
451
+ "β": 253,
452
+ "γ": 254,
453
+ "η": 255,
454
+ "ι": 256,
455
+ "κ": 257,
456
+ "μ": 258,
457
+ "ν": 259,
458
+ "ο": 260,
459
+ "π": 261,
460
+ "ρ": 262,
461
+ "σ": 263,
462
+ "τ": 264,
463
+ "υ": 265,
464
+ "χ": 266,
465
+ "ω": 267,
466
+ "ϲ": 268,
467
+ "ϳ": 269,
468
+ "Ϲ": 270,
469
+ "Ϻ": 271,
470
+ "Ѕ": 272,
471
+ "Ј": 273,
472
+ "А": 274,
473
+ "В": 275,
474
+ "Е": 276,
475
+ "З": 277,
476
+ "К": 278,
477
+ "М": 279,
478
+ "Н": 280,
479
+ "О": 281,
480
+ "Р": 282,
481
+ "С": 283,
482
+ "Т": 284,
483
+ "У": 285,
484
+ "Х": 286,
485
+ "Ь": 287,
486
+ "а": 288,
487
+ "в": 289,
488
+ "г": 290,
489
+ "д": 291,
490
+ "е": 292,
491
+ "и": 293,
492
+ "к": 294,
493
+ "л": 295,
494
+ "н": 296,
495
+ "о": 297,
496
+ "п": 298,
497
+ "р": 299,
498
+ "с": 300,
499
+ "т": 301,
500
+ "у": 302,
501
+ "х": 303,
502
+ "ч": 304,
503
+ "ш": 305,
504
+ "щ": 306,
505
+ "ѐ": 307,
506
+ "ё": 308,
507
+ "ѕ": 309,
508
+ "і": 310,
509
+ "ј": 311,
510
+ "џ": 312,
511
+ "ѡ": 313,
512
+ "Ѵ": 314,
513
+ "ѵ": 315,
514
+ "ҏ": 316,
515
+ "қ": 317,
516
+ "ҡ": 318,
517
+ "ң": 319,
518
+ "ҥ": 320,
519
+ "Ү": 321,
520
+ "ү": 322,
521
+ "ҳ": 323,
522
+ "һ": 324,
523
+ "ҽ": 325,
524
+ "ӏ": 326,
525
+ "ԁ": 327,
526
+ "ԛ": 328,
527
+ "Ա": 329,
528
+ "Ի": 330,
529
+ "Ս": 331,
530
+ "Տ": 332,
531
+ "Օ": 333,
532
+ "ա": 334,
533
+ "գ": 335,
534
+ "զ": 336,
535
+ "ժ": 337,
536
+ "հ": 338,
537
+ "յ": 339,
538
+ "ս": 340,
539
+ "օ": 341,
540
+ "Ⴍ": 342,
541
+ "Ⴓ": 343,
542
+ "Ⴝ": 344,
543
+ "Ꭰ": 345,
544
+ "Ꭲ": 346,
545
+ "Ꭵ": 347,
546
+ "Ꭺ": 348,
547
+ "Ꭻ": 349,
548
+ "Ꮃ": 350,
549
+ "Ꮇ": 351,
550
+ "Ꮋ": 352,
551
+ "Ꮐ": 353,
552
+ "Ꮓ": 354,
553
+ "Ꮢ": 355,
554
+ "Ꮩ": 356,
555
+ "Ꮪ": 357,
556
+ "Ꮮ": 358,
557
+ "Ꮯ": 359,
558
+ "Ꮲ": 360,
559
+ "Ꮶ": 361,
560
+ "Ᏼ": 362,
561
+ "ᚱ": 363,
562
+ "ᛁ": 364,
563
+ "ᛒ": 365,
564
+ "ᛕ": 366,
565
+ "ᛖ": 367,
566
+ "ᴄ": 368,
567
+ "ᴇ": 369,
568
+ "ᴋ": 370,
569
+ "ᴍ": 371,
570
+ "ᴏ": 372,
571
+ "ᴑ": 373,
572
+ "ᴜ": 374,
573
+ "ᴠ": 375,
574
+ "ᴡ": 376,
575
+ "ᴦ": 377,
576
+ "ᴨ": 378,
577
+ "ᴺ": 379,
578
+ "ᴼ": 380,
579
+ "ᴾ": 381,
580
+ "ᴿ": 382,
581
+ "ḟ": 383,
582
+ "ḱ": 384,
583
+ "ḿ": 385,
584
+ "ṁ": 386,
585
+ "ṅ": 387,
586
+ "Ṛ": 388,
587
+ "ṡ": 389,
588
+ "ẁ": 390,
589
+ "ẃ": 391,
590
+ "ẇ": 392,
591
+ "ἀ": 393,
592
+ "ἁ": 394,
593
+ "ἇ": 395,
594
+ "ἰ": 396,
595
+ "ἱ": 397,
596
+ "ἳ": 398,
597
+ "ὀ": 399,
598
+ "ὁ": 400,
599
+ "ὶ": 401,
600
+ "ί": 402,
601
+ "ῤ": 403,
602
+ "ῥ": 404,
603
+ "―": 405,
604
+ "₩": 406,
605
+ "€": 407,
606
+ "₿": 408,
607
+ "ℹ": 409,
608
+ "⋃": 410,
609
+ "𝘼": 411,
610
+ "𝘾": 412,
611
+ "𝘿": 413,
612
+ "𝙀": 414,
613
+ "𝙍": 415,
614
+ "𝙏": 416,
615
+ "##\u0001": 417,
616
+ "##\u0002": 418,
617
+ "##\u0003": 419,
618
+ "##\u0004": 420,
619
+ "##\u0005": 421,
620
+ "##!": 422,
621
+ "##\"": 423,
622
+ "###": 424,
623
+ "##$": 425,
624
+ "##%": 426,
625
+ "##&": 427,
626
+ "##'": 428,
627
+ "##(": 429,
628
+ "##)": 430,
629
+ "##*": 431,
630
+ "##+": 432,
631
+ "##,": 433,
632
+ "##-": 434,
633
+ "##.": 435,
634
+ "##/": 436,
635
+ "##0": 437,
636
+ "##1": 438,
637
+ "##2": 439,
638
+ "##3": 440,
639
+ "##4": 441,
640
+ "##5": 442,
641
+ "##6": 443,
642
+ "##7": 444,
643
+ "##8": 445,
644
+ "##9": 446,
645
+ "##:": 447,
646
+ "##;": 448,
647
+ "##<": 449,
648
+ "##=": 450,
649
+ "##>": 451,
650
+ "##?": 452,
651
+ "##@": 453,
652
+ "##[": 454,
653
+ "##\\": 455,
654
+ "##]": 456,
655
+ "##^": 457,
656
+ "##_": 458,
657
+ "##`": 459,
658
+ "##a": 460,
659
+ "##b": 461,
660
+ "##c": 462,
661
+ "##d": 463,
662
+ "##e": 464,
663
+ "##f": 465,
664
+ "##g": 466,
665
+ "##h": 467,
666
+ "##i": 468,
667
+ "##j": 469,
668
+ "##k": 470,
669
+ "##l": 471,
670
+ "##m": 472,
671
+ "##n": 473,
672
+ "##o": 474,
673
+ "##p": 475,
674
+ "##q": 476,
675
+ "##r": 477,
676
+ "##s": 478,
677
+ "##t": 479,
678
+ "##u": 480,
679
+ "##v": 481,
680
+ "##w": 482,
681
+ "##x": 483,
682
+ "##y": 484,
683
+ "##z": 485,
684
+ "##{": 486,
685
+ "##|": 487,
686
+ "##}": 488,
687
+ "##~": 489,
688
+ "##¡": 490,
689
+ "##¢": 491,
690
+ "##£": 492,
691
+ "##¥": 493,
692
+ "##§": 494,
693
+ "##¯": 495,
694
+ "##µ": 496,
695
+ "##º": 497,
696
+ "##»": 498,
697
+ "##¿": 499,
698
+ "##À": 500,
699
+ "##Â": 501,
700
+ "##Ã": 502,
701
+ "##Ä": 503,
702
+ "##Å": 504,
703
+ "##Ç": 505,
704
+ "##Ë": 506,
705
+ "##Í": 507,
706
+ "##Î": 508,
707
+ "##Ï": 509,
708
+ "##Ñ": 510,
709
+ "##Ó": 511,
710
+ "##Ø": 512,
711
+ "##Ù": 513,
712
+ "##Ú": 514,
713
+ "##Ü": 515,
714
+ "##ß": 516,
715
+ "##à": 517,
716
+ "##á": 518,
717
+ "##â": 519,
718
+ "##ã": 520,
719
+ "##ä": 521,
720
+ "##å": 522,
721
+ "##æ": 523,
722
+ "##ç": 524,
723
+ "##è": 525,
724
+ "##é": 526,
725
+ "##ê": 527,
726
+ "##ë": 528,
727
+ "##ì": 529,
728
+ "##í": 530,
729
+ "##î": 531,
730
+ "##ï": 532,
731
+ "##ñ": 533,
732
+ "##ò": 534,
733
+ "##ó": 535,
734
+ "##ô": 536,
735
+ "##õ": 537,
736
+ "##ö": 538,
737
+ "##ø": 539,
738
+ "##ù": 540,
739
+ "##ú": 541,
740
+ "##û": 542,
741
+ "##ü": 543,
742
+ "##ý": 544,
743
+ "##þ": 545,
744
+ "##ā": 546,
745
+ "##ă": 547,
746
+ "##ą": 548,
747
+ "##ć": 549,
748
+ "##č": 550,
749
+ "##ď": 551,
750
+ "##đ": 552,
751
+ "##ē": 553,
752
+ "##ĕ": 554,
753
+ "##ė": 555,
754
+ "##Ę": 556,
755
+ "##ę": 557,
756
+ "##ě": 558,
757
+ "##ġ": 559,
758
+ "##ģ": 560,
759
+ "##ĩ": 561,
760
+ "##ī": 562,
761
+ "##ĭ": 563,
762
+ "##İ": 564,
763
+ "##ı": 565,
764
+ "##ĵ": 566,
765
+ "##ķ": 567,
766
+ "##ĸ": 568,
767
+ "##ĺ": 569,
768
+ "##ł": 570,
769
+ "##ń": 571,
770
+ "##ň": 572,
771
+ "##ʼn": 573,
772
+ "##ŋ": 574,
773
+ "##ō": 575,
774
+ "##ŏ": 576,
775
+ "##ő": 577,
776
+ "##œ": 578,
777
+ "##ŕ": 579,
778
+ "##ŗ": 580,
779
+ "##Ř": 581,
780
+ "##ř": 582,
781
+ "##Ś": 583,
782
+ "##ś": 584,
783
+ "##Ş": 585,
784
+ "##ş": 586,
785
+ "##š": 587,
786
+ "##ţ": 588,
787
+ "##Ť": 589,
788
+ "##ť": 590,
789
+ "##ũ": 591,
790
+ "##ū": 592,
791
+ "##ŭ": 593,
792
+ "##ű": 594,
793
+ "##ų": 595,
794
+ "##ŵ": 596,
795
+ "##Ÿ": 597,
796
+ "##ż": 598,
797
+ "##ž": 599,
798
+ "##ƀ": 600,
799
+ "##Ɓ": 601,
800
+ "##Ƅ": 602,
801
+ "##ƅ": 603,
802
+ "##Ƈ": 604,
803
+ "##ƒ": 605,
804
+ "##ƙ": 606,
805
+ "##ƞ": 607,
806
+ "##Ƭ": 608,
807
+ "##Ư": 609,
808
+ "##Ƴ": 610,
809
+ "##Ǐ": 611,
810
+ "##Ƿ": 612,
811
+ "##ǹ": 613,
812
+ "##ȋ": 614,
813
+ "##ș": 615,
814
+ "##ț": 616,
815
+ "##ȧ": 617,
816
+ "##ȯ": 618,
817
+ "##Ʌ": 619,
818
+ "##ɑ": 620,
819
+ "##ɗ": 621,
820
+ "##ɠ": 622,
821
+ "##ɡ": 623,
822
+ "##ɢ": 624,
823
+ "##ɣ": 625,
824
+ "##ɩ": 626,
825
+ "##ɪ": 627,
826
+ "##ɭ": 628,
827
+ "##ɯ": 629,
828
+ "##ɱ": 630,
829
+ "##ɳ": 631,
830
+ "##ɴ": 632,
831
+ "##ɺ": 633,
832
+ "##ɼ": 634,
833
+ "##ɾ": 635,
834
+ "##ʀ": 636,
835
+ "##ʂ": 637,
836
+ "##ʄ": 638,
837
+ "##ʋ": 639,
838
+ "##ʌ": 640,
839
+ "##ʍ": 641,
840
+ "##ʏ": 642,
841
+ "##ʙ": 643,
842
+ "##ʜ": 644,
843
+ "##ʝ": 645,
844
+ "##ʟ": 646,
845
+ "##ʨ": 647,
846
+ "##˄": 648,
847
+ "##Α": 649,
848
+ "##Β": 650,
849
+ "##Ε": 651,
850
+ "##Ζ": 652,
851
+ "##Η": 653,
852
+ "##Ι": 654,
853
+ "##Κ": 655,
854
+ "##Μ": 656,
855
+ "##Ν": 657,
856
+ "##Ο": 658,
857
+ "##Ρ": 659,
858
+ "##Τ": 660,
859
+ "##Υ": 661,
860
+ "##Χ": 662,
861
+ "##ί": 663,
862
+ "##α": 664,
863
+ "##β": 665,
864
+ "##γ": 666,
865
+ "##η": 667,
866
+ "##ι": 668,
867
+ "##κ": 669,
868
+ "##μ": 670,
869
+ "##ν": 671,
870
+ "##ο": 672,
871
+ "##π": 673,
872
+ "##ρ": 674,
873
+ "##σ": 675,
874
+ "##τ": 676,
875
+ "##υ": 677,
876
+ "##χ": 678,
877
+ "##ω": 679,
878
+ "##ϲ": 680,
879
+ "##ϳ": 681,
880
+ "##Ϲ": 682,
881
+ "##Ϻ": 683,
882
+ "##Ѕ": 684,
883
+ "##Ј": 685,
884
+ "##А": 686,
885
+ "##В": 687,
886
+ "##Е": 688,
887
+ "##З": 689,
888
+ "##К": 690,
889
+ "##М": 691,
890
+ "##Н": 692,
891
+ "##О": 693,
892
+ "##Р": 694,
893
+ "##С": 695,
894
+ "##Т": 696,
895
+ "##У": 697,
896
+ "##Х": 698,
897
+ "##Ь": 699,
898
+ "##а": 700,
899
+ "##в": 701,
900
+ "##г": 702,
901
+ "##д": 703,
902
+ "##е": 704,
903
+ "##и": 705,
904
+ "##к": 706,
905
+ "##л": 707,
906
+ "##н": 708,
907
+ "##о": 709,
908
+ "##п": 710,
909
+ "##р": 711,
910
+ "##с": 712,
911
+ "##т": 713,
912
+ "##у": 714,
913
+ "##х": 715,
914
+ "##ч": 716,
915
+ "##ш": 717,
916
+ "##щ": 718,
917
+ "##ѐ": 719,
918
+ "##ё": 720,
919
+ "##ѕ": 721,
920
+ "##і": 722,
921
+ "##ј": 723,
922
+ "##џ": 724,
923
+ "##ѡ": 725,
924
+ "##Ѵ": 726,
925
+ "##ѵ": 727,
926
+ "##ҏ": 728,
927
+ "##қ": 729,
928
+ "##ҡ": 730,
929
+ "##ң": 731,
930
+ "##ҥ": 732,
931
+ "##Ү": 733,
932
+ "##ү": 734,
933
+ "##ҳ": 735,
934
+ "##һ": 736,
935
+ "##ҽ": 737,
936
+ "##ӏ": 738,
937
+ "##ԁ": 739,
938
+ "##ԛ": 740,
939
+ "##Ա": 741,
940
+ "##Ի": 742,
941
+ "##Ս": 743,
942
+ "##Տ": 744,
943
+ "##Օ": 745,
944
+ "##ա": 746,
945
+ "##գ": 747,
946
+ "##զ": 748,
947
+ "##ժ": 749,
948
+ "##հ": 750,
949
+ "##յ": 751,
950
+ "##ս": 752,
951
+ "##օ": 753,
952
+ "##Ⴍ": 754,
953
+ "##Ⴓ": 755,
954
+ "##Ⴝ": 756,
955
+ "##Ꭰ": 757,
956
+ "##Ꭲ": 758,
957
+ "##Ꭵ": 759,
958
+ "##Ꭺ": 760,
959
+ "##Ꭻ": 761,
960
+ "##Ꮃ": 762,
961
+ "##Ꮇ": 763,
962
+ "##Ꮋ": 764,
963
+ "##Ꮐ": 765,
964
+ "##Ꮓ": 766,
965
+ "##Ꮢ": 767,
966
+ "##Ꮩ": 768,
967
+ "##Ꮪ": 769,
968
+ "##Ꮮ": 770,
969
+ "##Ꮯ": 771,
970
+ "##Ꮲ": 772,
971
+ "##Ꮶ": 773,
972
+ "##Ᏼ": 774,
973
+ "##ᚱ": 775,
974
+ "##ᛁ": 776,
975
+ "##ᛒ": 777,
976
+ "##ᛕ": 778,
977
+ "##ᛖ": 779,
978
+ "##ᴄ": 780,
979
+ "##ᴇ": 781,
980
+ "##ᴋ": 782,
981
+ "##ᴍ": 783,
982
+ "##ᴏ": 784,
983
+ "##ᴑ": 785,
984
+ "##ᴜ": 786,
985
+ "##ᴠ": 787,
986
+ "##ᴡ": 788,
987
+ "##ᴦ": 789,
988
+ "##ᴨ": 790,
989
+ "##ᴺ": 791,
990
+ "##ᴼ": 792,
991
+ "##ᴾ": 793,
992
+ "##ᴿ": 794,
993
+ "##ḟ": 795,
994
+ "##ḱ": 796,
995
+ "##ḿ": 797,
996
+ "##ṁ": 798,
997
+ "##ṅ": 799,
998
+ "##Ṛ": 800,
999
+ "##ṡ": 801,
1000
+ "##ẁ": 802,
1001
+ "##ẃ": 803,
1002
+ "##ẇ": 804,
1003
+ "##ἀ": 805,
1004
+ "##ἁ": 806,
1005
+ "##ἇ": 807,
1006
+ "##ἰ": 808,
1007
+ "##ἱ": 809,
1008
+ "##ἳ": 810,
1009
+ "##ὀ": 811,
1010
+ "##ὁ": 812,
1011
+ "##ὶ": 813,
1012
+ "##ί": 814,
1013
+ "##ῤ": 815,
1014
+ "##ῥ": 816,
1015
+ "##―": 817,
1016
+ "##₩": 818,
1017
+ "##€": 819,
1018
+ "##₿": 820,
1019
+ "##ℹ": 821,
1020
+ "##⋃": 822,
1021
+ "##𝘼": 823,
1022
+ "##𝘾": 824,
1023
+ "##𝘿": 825,
1024
+ "##𝙀": 826,
1025
+ "##𝙍": 827,
1026
+ "##𝙏": 828
1027
+ }
1028
+ }
1029
+ }
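The `model` and `post_processor` entries above describe WordPiece tokenization wrapped in a `[CLS] ... [SEP]` template. Because the vocabulary holds only single characters and their `##`-prefixed continuations, a greedy longest-match split should reduce every word to characters. The sketch below is a simplified pure-Python illustration of that behaviour, not the `tokenizers` library's implementation: it skips the normalizer, the pre-tokenizer, and the `max_input_chars_per_word` limit, and `TINY_VOCAB` is a hand-picked subset of the real vocabulary.

```python
UNK = "[UNK]"
PREFIX = "##"  # "continuing_subword_prefix" in tokenizer.json

# Hand-picked subset of the vocab in tokenizer.json (ids copied from the diff).
TINY_VOCAB = {
    "[UNK]": 0, "[CLS]": 1, "[SEP]": 2,
    "a": 48, "b": 49, "c": 50,
    "##a": 460, "##b": 461, "##c": 462,
}


def wordpiece(word: str, vocab: dict) -> list:
    """Greedy longest-match-first split of one word into vocab pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocab.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = PREFIX + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [UNK]  # an unmatched piece makes the whole word unknown
        pieces.append(match)
        start = end
    return pieces


def encode_single(words: list, vocab: dict) -> list:
    """Apply the 'single' template from the post_processor: [CLS] A [SEP]."""
    tokens = ["[CLS]"]
    for word in words:
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return [vocab[token] for token in tokens]
```

For example, `wordpiece("cab", TINY_VOCAB)` yields `["c", "##a", "##b"]`, and a word containing a character missing from the vocabulary collapses to `["[UNK]"]`.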
tokenizer_config.json ADDED
@@ -0,0 +1,98 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "5": {
+       "content": "\u0001",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "6": {
+       "content": "\u0002",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "7": {
+       "content": "\u0003",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "8": {
+       "content": "\u0004",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "9": {
+       "content": "\u0005",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_text": false,
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": true,
+   "tokenize_chinese_chars": false,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
trainer_state.json ADDED
@@ -0,0 +1,56 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ {
+   "best_metric": null,
+   "best_model_checkpoint": null,
+   "epoch": 0.0001394353204167889,
+   "eval_steps": 500,
+   "global_step": 500,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.0,
+       "grad_norm": 50.64784240722656,
+       "learning_rate": 0.004999860564679583,
+       "loss": 1.2003,
+       "step": 100
+     },
+     {
+       "epoch": 0.0,
+       "grad_norm": 3.446443557739258,
+       "learning_rate": 0.004999721129359167,
+       "loss": 1.8397,
+       "step": 200
+     },
+     {
+       "epoch": 0.0,
+       "grad_norm": 0.034487757831811905,
+       "learning_rate": 0.0049995816940387496,
+       "loss": 0.6524,
+       "step": 300
+     },
+     {
+       "epoch": 0.0,
+       "grad_norm": 0.0324031226336956,
+       "learning_rate": 0.004999442258718333,
+       "loss": 0.0663,
+       "step": 400
+     },
+     {
+       "epoch": 0.0,
+       "grad_norm": 0.10035407543182373,
+       "learning_rate": 0.0049993028233979156,
+       "loss": 0.0657,
+       "step": 500
+     }
+   ],
+   "logging_steps": 100,
+   "max_steps": 3585892,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 1,
+   "save_steps": 500,
+   "total_flos": 735868440576000.0,
+   "train_batch_size": 16,
+   "trial_name": null,
+   "trial_params": null
+ }
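
The logged `learning_rate` values in trainer_state.json are consistent with a linear decay from 0.005 to zero over `max_steps`, with no warmup. Neither the base rate nor the scheduler type is stated in the file (training_args.bin is a binary LFS object), so both are inferences from the numbers. The sketch below checks that inference against the five logged steps and reproduces the top-level `epoch` value as `global_step / max_steps`. The names `base_lr` and `linear_lr` are illustrative.

```python
# Values copied from trainer_state.json.
max_steps = 3585892
global_step = 500
logged_lr = {
    100: 0.004999860564679583,
    200: 0.004999721129359167,
    300: 0.0049995816940387496,
    400: 0.004999442258718333,
    500: 0.0049993028233979156,
}

# Assumption: initial learning rate of 0.005, inferred from the log.
base_lr = 0.005


def linear_lr(step):
    """Linear decay from base_lr at step 0 to zero at max_steps (no warmup)."""
    return base_lr * (1 - step / max_steps)


for step, lr in logged_lr.items():
    assert abs(linear_lr(step) - lr) < 1e-12, step

# With num_train_epochs = 1, the fractional epoch is the share of steps done.
epoch = global_step / max_steps
assert abs(epoch - 0.0001394353204167889) < 1e-12

# The per-entry "epoch": 0.0 values match this fraction rounded to two decimals.
assert round(100 / max_steps, 2) == 0.0
```

At step 500 of 3,585,892 the run had completed about 0.014% of the planned epoch, so this checkpoint reflects a very early stage of training.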
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d1829ee33f8a86971af862a4f014ac34f0dc8be74772f3fd1a76051b529aaa14
+ size 4856
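
The three lines added for training_args.bin are a Git LFS pointer, not the binary itself: each line is a key, a space, and a value. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` is hypothetical, and the sketch only handles the three keys shown here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d1829ee33f8a86971af862a4f014ac34f0dc8be74772f3fd1a76051b529aaa14
size 4856
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])  # sha256 4856
```

The `size` field refers to the stored object (4856 bytes here), so the pointer can be used to verify a downloaded file by length and SHA-256 digest.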
vocab.txt ADDED
@@ -0,0 +1,829 @@
1
+ [UNK]
2
+ [CLS]
3
+ [SEP]
4
+ [PAD]
5
+ [MASK]
6
+ 
7
+ 
8
+ 
9
+ 
10
+ 
11
+ !
12
+ "
13
+ #
14
+ $
15
+ %
16
+ &
17
+ '
18
+ (
19
+ )
20
+ *
21
+ +
22
+ ,
23
+ -
24
+ .
25
+ /
26
+ 0
27
+ 1
28
+ 2
29
+ 3
30
+ 4
31
+ 5
32
+ 6
33
+ 7
34
+ 8
35
+ 9
36
+ :
37
+ ;
38
+ <
39
+ =
40
+ >
41
+ ?
42
+ @
43
+ [
44
+ \
45
+ ]
46
+ ^
47
+ _
48
+ `
49
+ a
50
+ b
51
+ c
52
+ d
53
+ e
54
+ f
55
+ g
56
+ h
57
+ i
58
+ j
59
+ k
60
+ l
61
+ m
62
+ n
63
+ o
64
+ p
65
+ q
66
+ r
67
+ s
68
+ t
69
+ u
70
+ v
71
+ w
72
+ x
73
+ y
74
+ z
75
+ {
76
+ |
77
+ }
78
+ ~
79
+ ¡
80
+ ¢
81
+ £
82
+ ¥
83
+ §
84
+ ¯
85
+ µ
86
+ º
87
+ »
88
+ ¿
89
+ À
90
+ Â
91
+ Ã
92
+ Ä
93
+ Å
94
+ Ç
95
+ Ë
96
+ Í
97
+ Î
98
+ Ï
99
+ Ñ
100
+ Ó
101
+ Ø
102
+ Ù
103
+ Ú
104
+ Ü
105
+ ß
106
+ à
107
+ á
108
+ â
109
+ ã
110
+ ä
111
+ å
112
+ æ
113
+ ç
114
+ è
115
+ é
116
+ ê
117
+ ë
118
+ ì
119
+ í
120
+ î
121
+ ï
122
+ ñ
123
+ ò
124
+ ó
125
+ ô
126
+ õ
127
+ ö
128
+ ø
129
+ ù
130
+ ú
131
+ û
132
+ ü
133
+ ý
134
+ þ
135
+ ā
136
+ ă
137
+ ą
138
+ ć
139
+ č
140
+ ď
141
+ đ
142
+ ē
143
+ ĕ
144
+ ė
145
+ Ę
146
+ ę
147
+ ě
148
+ ġ
149
+ ģ
150
+ ĩ
151
+ ī
152
+ ĭ
153
+ İ
154
+ ı
155
+ ĵ
156
+ ķ
157
+ ĸ
158
+ ĺ
159
+ ł
160
+ ń
161
+ ň
162
+ ʼn
163
+ ŋ
164
+ ō
165
+ ŏ
166
+ ő
167
+ œ
168
+ ŕ
169
+ ŗ
170
+ Ř
171
+ ř
172
+ Ś
173
+ ś
174
+ Ş
175
+ ş
176
+ š
177
+ ţ
178
+ Ť
179
+ ť
180
+ ũ
181
+ ū
182
+ ŭ
183
+ ű
184
+ ų
185
+ ŵ
186
+ Ÿ
187
+ ż
188
+ ž
189
+ ƀ
190
+ Ɓ
191
+ Ƅ
192
+ ƅ
193
+ Ƈ
194
+ ƒ
195
+ ƙ
196
+ ƞ
197
+ Ƭ
198
+ Ư
199
+ Ƴ
200
+ Ǐ
201
+ Ƿ
202
+ ǹ
203
+ ȋ
204
+ ș
205
+ ț
206
+ ȧ
207
+ ȯ
208
+ Ʌ
209
+ ɑ
210
+ ɗ
211
+ ɠ
212
+ ɡ
213
+ ɢ
214
+ ɣ
215
+ ɩ
216
+ ɪ
217
+ ɭ
218
+ ɯ
219
+ ɱ
220
+ ɳ
221
+ ɴ
222
+ ɺ
223
+ ɼ
224
+ ɾ
225
+ ʀ
226
+ ʂ
227
+ ʄ
228
+ ʋ
229
+ ʌ
230
+ ʍ
231
+ ʏ
232
+ ʙ
233
+ ʜ
234
+ ʝ
235
+ ʟ
236
+ ʨ
237
+ ˄
238
+ Α
239
+ Β
240
+ Ε
241
+ Ζ
242
+ Η
243
+ Ι
244
+ Κ
245
+ Μ
246
+ Ν
247
+ Ο
248
+ Ρ
249
+ Τ
250
+ Υ
251
+ Χ
252
+ ί
253
+ α
254
+ β
255
+ γ
256
+ η
257
+ ι
258
+ κ
259
+ μ
260
+ ν
261
+ ο
262
+ π
263
+ ρ
264
+ σ
265
+ τ
266
+ υ
267
+ χ
268
+ ω
269
+ ϲ
270
+ ϳ
271
+ Ϲ
272
+ Ϻ
273
+ Ѕ
274
+ Ј
275
+ А
276
+ В
277
+ Е
278
+ З
279
+ К
280
+ М
281
+ Н
282
+ О
283
+ Р
284
+ С
285
+ Т
286
+ У
287
+ Х
288
+ Ь
289
+ а
290
+ в
291
+ г
292
+ д
293
+ е
294
+ и
295
+ к
296
+ л
297
+ н
298
+ о
299
+ п
300
+ р
301
+ с
302
+ т
303
+ у
304
+ х
305
+ ч
306
+ ш
307
+ щ
308
+ ѐ
309
+ ё
310
+ ѕ
311
+ і
312
+ ј
313
+ џ
314
+ ѡ
315
+ Ѵ
316
+ ѵ
317
+ ҏ
318
+ қ
319
+ ҡ
320
+ ң
321
+ ҥ
322
+ Ү
323
+ ү
324
+ ҳ
325
+ һ
326
+ ҽ
327
+ ӏ
328
+ ԁ
329
+ ԛ
330
+ Ա
331
+ Ի
332
+ Ս
333
+ Տ
334
+ Օ
335
+ ա
336
+ գ
337
+ զ
338
+ ժ
339
+ հ
340
+ յ
341
+ ս
342
+ օ
343
+ Ⴍ
344
+ Ⴓ
345
+ Ⴝ
346
+ Ꭰ
347
+ Ꭲ
348
+ Ꭵ
349
+ Ꭺ
350
+ Ꭻ
351
+ Ꮃ
352
+ Ꮇ
353
+ Ꮋ
354
+ Ꮐ
355
+ Ꮓ
356
+ Ꮢ
357
+ Ꮩ
358
+ Ꮪ
359
+ Ꮮ
360
+ Ꮯ
361
+ Ꮲ
362
+ Ꮶ
363
+ Ᏼ
364
+ ᚱ
365
+ ᛁ
366
+ ᛒ
367
+ ᛕ
368
+ ᛖ
369
+ ᴄ
370
+ ᴇ
371
+ ᴋ
372
+ ᴍ
373
+ ᴏ
374
+ ᴑ
375
+ ᴜ
376
+ ᴠ
377
+ ᴡ
378
+ ᴦ
379
+ ᴨ
380
+ ᴺ
381
+ ᴼ
382
+ ᴾ
383
+ ᴿ
384
+ ḟ
385
+ ḱ
386
+ ḿ
387
+ ṁ
388
+ ṅ
389
+ Ṛ
390
+ ṡ
391
+ ẁ
392
+ ẃ
393
+ ẇ
394
+ ἀ
395
+ ἁ
396
+ ἇ
397
+ ἰ
398
+ ἱ
399
+ ἳ
400
+ ὀ
401
+ ὁ
402
+ ὶ
403
+ ί
404
+ ῤ
405
+ ῥ
406
+ ―
407
+ ₩
408
+ €
409
+ ₿
410
+ ℹ
411
+ ⋃
412
+ 𝘼
413
+ 𝘾
414
+ 𝘿
415
+ 𝙀
416
+ 𝙍
417
+ 𝙏
418
+ ##
419
+ ##
420
+ ##
421
+ ##
422
+ ##
423
+ ##!
424
+ ##"
425
+ ###
426
+ ##$
427
+ ##%
428
+ ##&
429
+ ##'
430
+ ##(
431
+ ##)
432
+ ##*
433
+ ##+
434
+ ##,
435
+ ##-
436
+ ##.
437
+ ##/
438
+ ##0
439
+ ##1
440
+ ##2
441
+ ##3
442
+ ##4
443
+ ##5
444
+ ##6
445
+ ##7
446
+ ##8
447
+ ##9
448
+ ##:
449
+ ##;
450
+ ##<
451
+ ##=
452
+ ##>
453
+ ##?
454
+ ##@
455
+ ##[
456
+ ##\
457
+ ##]
458
+ ##^
459
+ ##_
460
+ ##`
461
+ ##a
462
+ ##b
463
+ ##c
464
+ ##d
465
+ ##e
466
+ ##f
467
+ ##g
468
+ ##h
469
+ ##i
470
+ ##j
471
+ ##k
472
+ ##l
473
+ ##m
474
+ ##n
475
+ ##o
476
+ ##p
477
+ ##q
478
+ ##r
479
+ ##s
480
+ ##t
481
+ ##u
482
+ ##v
483
+ ##w
484
+ ##x
485
+ ##y
486
+ ##z
487
+ ##{
488
+ ##|
489
+ ##}
490
+ ##~
491
+ ##¡
492
+ ##¢
493
+ ##£
494
+ ##¥
495
+ ##§
496
+ ##¯
497
+ ##µ
498
+ ##º
499
+ ##»
500
+ ##¿
501
+ ##À
502
+ ##Â
503
+ ##Ã
504
+ ##Ä
505
+ ##Å
506
+ ##Ç
507
+ ##Ë
508
+ ##Í
509
+ ##Î
510
+ ##Ï
511
+ ##Ñ
512
+ ##Ó
513
+ ##Ø
514
+ ##Ù
515
+ ##Ú
516
+ ##Ü
517
+ ##ß
518
+ ##à
519
+ ##á
520
+ ##â
521
+ ##ã
522
+ ##ä
523
+ ##å
524
+ ##æ
525
+ ##ç
526
+ ##è
527
+ ##é
528
+ ##ê
529
+ ##ë
530
+ ##ì
531
+ ##í
532
+ ##î
533
+ ##ï
534
+ ##ñ
535
+ ##ò
536
+ ##ó
537
+ ##ô
538
+ ##õ
539
+ ##ö
540
+ ##ø
541
+ ##ù
542
+ ##ú
543
+ ##û
544
+ ##ü
545
+ ##ý
546
+ ##þ
547
+ ##ā
548
+ ##ă
549
+ ##ą
550
+ ##ć
551
+ ##č
552
+ ##ď
553
+ ##đ
554
+ ##ē
555
+ ##ĕ
556
+ ##ė
557
+ ##Ę
558
+ ##ę
559
+ ##ě
560
+ ##ġ
561
+ ##ģ
562
+ ##ĩ
563
+ ##ī
564
+ ##ĭ
565
+ ##İ
566
+ ##ı
567
+ ##ĵ
568
+ ##ķ
569
+ ##ĸ
570
+ ##ĺ
571
+ ##ł
572
+ ##ń
573
+ ##ň
574
+ ##ʼn
575
+ ##ŋ
576
+ ##ō
577
+ ##ŏ
578
+ ##ő
579
+ ##œ
580
+ ##ŕ
581
+ ##ŗ
582
+ ##Ř
583
+ ##ř
584
+ ##Ś
585
+ ##ś
586
+ ##Ş
587
+ ##ş
588
+ ##š
589
+ ##ţ
590
+ ##Ť
591
+ ##ť
592
+ ##ũ
593
+ ##ū
594
+ ##ŭ
595
+ ##ű
596
+ ##ų
597
+ ##ŵ
598
+ ##Ÿ
599
+ ##ż
600
+ ##ž
601
+ ##ƀ
602
+ ##Ɓ
603
+ ##Ƅ
604
+ ##ƅ
605
+ ##Ƈ
606
+ ##ƒ
607
+ ##ƙ
608
+ ##ƞ
609
+ ##Ƭ
610
+ ##Ư
611
+ ##Ƴ
612
+ ##Ǐ
613
+ ##Ƿ
614
+ ##ǹ
615
+ ##ȋ
616
+ ##ș
617
+ ##ț
618
+ ##ȧ
619
+ ##ȯ
620
+ ##Ʌ
621
+ ##ɑ
622
+ ##ɗ
623
+ ##ɠ
624
+ ##ɡ
625
+ ##ɢ
626
+ ##ɣ
627
+ ##ɩ
628
+ ##ɪ
629
+ ##ɭ
630
+ ##ɯ
631
+ ##ɱ
632
+ ##ɳ
633
+ ##ɴ
634
+ ##ɺ
635
+ ##ɼ
636
+ ##ɾ
637
+ ##ʀ
638
+ ##ʂ
639
+ ##ʄ
640
+ ##ʋ
641
+ ##ʌ
642
+ ##ʍ
643
+ ##ʏ
644
+ ##ʙ
645
+ ##ʜ
646
+ ##ʝ
647
+ ##ʟ
648
+ ##ʨ
649
+ ##˄
650
+ ##Α
651
+ ##Β
652
+ ##Ε
653
+ ##Ζ
654
+ ##Η
655
+ ##Ι
656
+ ##Κ
657
+ ##Μ
658
+ ##Ν
659
+ ##Ο
660
+ ##Ρ
661
+ ##Τ
662
+ ##Υ
663
+ ##Χ
664
+ ##ί
665
+ ##α
666
+ ##β
667
+ ##γ
668
+ ##η
669
+ ##ι
670
+ ##κ
671
+ ##μ
672
+ ##ν
673
+ ##ο
674
+ ##π
675
+ ##ρ
676
+ ##σ
677
+ ##τ
678
+ ##υ
679
+ ##χ
680
+ ##ω
681
+ ##ϲ
682
+ ##ϳ
683
+ ##Ϲ
684
+ ##Ϻ
685
+ ##Ѕ
686
+ ##Ј
687
+ ##А
688
+ ##В
689
+ ##Е
690
+ ##З
691
+ ##К
692
+ ##М
693
+ ##Н
694
+ ##О
695
+ ##Р
696
+ ##С
697
+ ##Т
698
+ ##У
699
+ ##Х
700
+ ##Ь
701
+ ##а
702
+ ##в
703
+ ##г
704
+ ##д
705
+ ##е
706
+ ##и
707
+ ##к
708
+ ##л
709
+ ##н
710
+ ##о
711
+ ##п
712
+ ##р
713
+ ##с
714
+ ##т
715
+ ##у
716
+ ##х
717
+ ##ч
718
+ ##ш
719
+ ##щ
720
+ ##ѐ
721
+ ##ё
722
+ ##ѕ
723
+ ##і
724
+ ##ј
725
+ ##џ
726
+ ##ѡ
727
+ ##Ѵ
728
+ ##ѵ
729
+ ##ҏ
730
+ ##қ
731
+ ##ҡ
732
+ ##ң
733
+ ##ҥ
734
+ ##Ү
735
+ ##ү
736
+ ##ҳ
737
+ ##һ
738
+ ##ҽ
739
+ ##ӏ
740
+ ##ԁ
741
+ ##ԛ
742
+ ##Ա
743
+ ##Ի
744
+ ##Ս
745
+ ##Տ
746
+ ##Օ
747
+ ##ա
748
+ ##գ
749
+ ##զ
750
+ ##ժ
751
+ ##հ
752
+ ##յ
753
+ ##ս
754
+ ##օ
755
+ ##Ⴍ
756
+ ##Ⴓ
757
+ ##Ⴝ
758
+ ##Ꭰ
759
+ ##Ꭲ
760
+ ##Ꭵ
761
+ ##Ꭺ
762
+ ##Ꭻ
763
+ ##Ꮃ
764
+ ##Ꮇ
765
+ ##Ꮋ
766
+ ##Ꮐ
767
+ ##Ꮓ
768
+ ##Ꮢ
769
+ ##Ꮩ
770
+ ##Ꮪ
771
+ ##Ꮮ
772
+ ##Ꮯ
773
+ ##Ꮲ
774
+ ##Ꮶ
775
+ ##Ᏼ
776
+ ##ᚱ
777
+ ##ᛁ
778
+ ##ᛒ
779
+ ##ᛕ
780
+ ##ᛖ
781
+ ##ᴄ
782
+ ##ᴇ
783
+ ##ᴋ
784
+ ##ᴍ
785
+ ##ᴏ
786
+ ##ᴑ
787
+ ##ᴜ
788
+ ##ᴠ
789
+ ##ᴡ
790
+ ##ᴦ
791
+ ##ᴨ
792
+ ##ᴺ
793
+ ##ᴼ
794
+ ##ᴾ
795
+ ##ᴿ
796
+ ##ḟ
797
+ ##ḱ
798
+ ##ḿ
799
+ ##ṁ
800
+ ##ṅ
801
+ ##Ṛ
802
+ ##ṡ
803
+ ##ẁ
804
+ ##ẃ
805
+ ##ẇ
806
+ ##ἀ
807
+ ##ἁ
808
+ ##ἇ
809
+ ##ἰ
810
+ ##ἱ
811
+ ##ἳ
812
+ ##ὀ
813
+ ##ὁ
814
+ ##ὶ
815
+ ##ί
816
+ ##ῤ
817
+ ##ῥ
818
+ ##―
819
+ ##₩
820
+ ##€
821
+ ##₿
822
+ ##ℹ
823
+ ##⋃
824
+ ##𝘼
825
+ ##𝘾
826
+ ##𝘿
827
+ ##𝙀
828
+ ##𝙍
829
+ ##𝙏