jettjaniak committed
Commit 4fcbc9c · verified · 1 Parent(s): a6f3951

Upload tokenizer

Files changed (4)
  1. README.md +199 -0
  2. special_tokens_map.json +5 -0
  3. tokenizer.json +1808 -0
  4. tokenizer_config.json +34 -0
README.md ADDED
@@ -0,0 +1,199 @@
+ ---
+ library_name: transformers
+ tags: []
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
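The "How to Get Started with the Model" section of this model card is still a placeholder. Because the commit adds only tokenizer files, a minimal sketch of loading them could look like the following. It assumes the `transformers` library is installed and that the repository is reachable on the Hub. `repo_id` is a hypothetical placeholder, since the repository name is not shown here, and both helper names are illustrative rather than part of the commit.

```python
def load_tokenizer(repo_id):
    """Load the tokenizer files from a Hub repository or a local directory.

    `repo_id` is a placeholder supplied by the caller; this page does not
    name the repository.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def describe_special_tokens(tokenizer):
    """Collect the three special tokens that special_tokens_map.json declares."""
    return {
        "bos_token": tokenizer.bos_token,
        "eos_token": tokenizer.eos_token,
        "pad_token": tokenizer.pad_token,
    }
```

With a real repository ID, `describe_special_tokens(load_tokenizer(repo_id))` would be expected to report `<bos>`, `<eos>` and `<pad>`, matching the special tokens map added in this commit.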
special_tokens_map.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "bos_token": "<bos>",
+ "eos_token": "<eos>",
+ "pad_token": "<pad>"
+ }
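The three special tokens above take ids 0 to 2 in the `added_tokens` list of `tokenizer.json`, and the byte-level alphabet follows from id 3. The sketch below shows how that base vocabulary lines up. It assumes the `ByteLevel` pre-tokenizer uses the GPT-2 style byte-to-character table and that byte symbols are numbered in code point order. Both assumptions are inferred from the ids in the diff, and the function names are illustrative.

```python
import json

# Contents of special_tokens_map.json as added in this commit.
SPECIAL_TOKENS_MAP = json.loads(
    '{"bos_token": "<bos>", "eos_token": "<eos>", "pad_token": "<pad>"}'
)


def bytes_to_unicode():
    """GPT-2 style byte-to-character table (assumed, not read from the file)."""
    # Bytes that already have a printable character keep it.
    printable = set(
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    mapping = {}
    shifted = 0
    for byte in range(256):
        if byte in printable:
            mapping[byte] = chr(byte)
        else:
            # Space, control bytes and the rest are moved to U+0100 and up.
            mapping[byte] = chr(256 + shifted)
            shifted += 1
    return mapping


def build_base_vocab(special_tokens):
    """Special tokens first, then the 256 byte symbols in code point order (assumed)."""
    vocab = {token: index for index, token in enumerate(special_tokens)}
    for symbol in sorted(bytes_to_unicode().values()):
        vocab[symbol] = len(vocab)
    return vocab


base_vocab = build_base_vocab(list(SPECIAL_TOKENS_MAP.values()))
space_symbol = bytes_to_unicode()[ord(" ")]  # the character "Ġ"

print(base_vocab["<bos>"], base_vocab["<eos>"], base_vocab["<pad>"])  # → 0 1 2
print(base_vocab["!"], base_vocab[space_symbol], len(base_vocab))  # → 3 223 259
```

Under these assumptions the result agrees with the diff: `"!"` is 3, `"Ġ"` (the stand-in for a leading space) is 223, and the first learned merge `"Ġt"` gets the next free id, 259.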
tokenizer.json ADDED
@@ -0,0 +1,1808 @@
+ {
+ "version": "1.0",
+ "truncation": null,
+ "padding": null,
+ "added_tokens": [
+ {
+ "id": 0,
+ "content": "<bos>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 1,
+ "content": "<eos>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ },
+ {
+ "id": 2,
+ "content": "<pad>",
+ "single_word": false,
+ "lstrip": false,
+ "rstrip": false,
+ "normalized": false,
+ "special": true
+ }
+ ],
+ "normalizer": null,
+ "pre_tokenizer": {
+ "type": "ByteLevel",
+ "add_prefix_space": false,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "post_processor": {
+ "type": "ByteLevel",
+ "add_prefix_space": true,
+ "trim_offsets": false,
+ "use_regex": true
+ },
+ "decoder": {
+ "type": "ByteLevel",
+ "add_prefix_space": true,
+ "trim_offsets": true,
+ "use_regex": true
+ },
+ "model": {
+ "type": "BPE",
+ "dropout": null,
+ "unk_token": null,
+ "continuing_subword_prefix": null,
+ "end_of_word_suffix": null,
+ "fuse_unk": false,
+ "byte_fallback": false,
+ "ignore_merges": false,
+ "vocab": {
63
+ "<bos>": 0,
64
+ "<eos>": 1,
65
+ "<pad>": 2,
66
+ "!": 3,
67
+ "\"": 4,
68
+ "#": 5,
69
+ "$": 6,
70
+ "%": 7,
71
+ "&": 8,
72
+ "'": 9,
73
+ "(": 10,
74
+ ")": 11,
75
+ "*": 12,
76
+ "+": 13,
77
+ ",": 14,
78
+ "-": 15,
79
+ ".": 16,
80
+ "/": 17,
81
+ "0": 18,
82
+ "1": 19,
83
+ "2": 20,
84
+ "3": 21,
85
+ "4": 22,
86
+ "5": 23,
87
+ "6": 24,
88
+ "7": 25,
89
+ "8": 26,
90
+ "9": 27,
91
+ ":": 28,
92
+ ";": 29,
93
+ "<": 30,
94
+ "=": 31,
95
+ ">": 32,
96
+ "?": 33,
97
+ "@": 34,
98
+ "A": 35,
99
+ "B": 36,
100
+ "C": 37,
101
+ "D": 38,
102
+ "E": 39,
103
+ "F": 40,
104
+ "G": 41,
105
+ "H": 42,
106
+ "I": 43,
107
+ "J": 44,
108
+ "K": 45,
109
+ "L": 46,
110
+ "M": 47,
111
+ "N": 48,
112
+ "O": 49,
113
+ "P": 50,
114
+ "Q": 51,
115
+ "R": 52,
116
+ "S": 53,
117
+ "T": 54,
118
+ "U": 55,
119
+ "V": 56,
120
+ "W": 57,
121
+ "X": 58,
122
+ "Y": 59,
123
+ "Z": 60,
124
+ "[": 61,
125
+ "\\": 62,
126
+ "]": 63,
127
+ "^": 64,
128
+ "_": 65,
129
+ "`": 66,
130
+ "a": 67,
131
+ "b": 68,
132
+ "c": 69,
133
+ "d": 70,
134
+ "e": 71,
135
+ "f": 72,
136
+ "g": 73,
137
+ "h": 74,
138
+ "i": 75,
139
+ "j": 76,
140
+ "k": 77,
141
+ "l": 78,
142
+ "m": 79,
143
+ "n": 80,
144
+ "o": 81,
145
+ "p": 82,
146
+ "q": 83,
147
+ "r": 84,
148
+ "s": 85,
149
+ "t": 86,
150
+ "u": 87,
151
+ "v": 88,
152
+ "w": 89,
153
+ "x": 90,
154
+ "y": 91,
155
+ "z": 92,
156
+ "{": 93,
157
+ "|": 94,
158
+ "}": 95,
159
+ "~": 96,
160
+ "¡": 97,
161
+ "¢": 98,
162
+ "£": 99,
163
+ "¤": 100,
164
+ "¥": 101,
165
+ "¦": 102,
166
+ "§": 103,
167
+ "¨": 104,
168
+ "©": 105,
169
+ "ª": 106,
170
+ "«": 107,
171
+ "¬": 108,
172
+ "®": 109,
173
+ "¯": 110,
174
+ "°": 111,
175
+ "±": 112,
176
+ "²": 113,
177
+ "³": 114,
178
+ "´": 115,
179
+ "µ": 116,
180
+ "¶": 117,
181
+ "·": 118,
182
+ "¸": 119,
183
+ "¹": 120,
184
+ "º": 121,
185
+ "»": 122,
186
+ "¼": 123,
187
+ "½": 124,
188
+ "¾": 125,
189
+ "¿": 126,
190
+ "À": 127,
191
+ "Á": 128,
192
+ "Â": 129,
193
+ "Ã": 130,
194
+ "Ä": 131,
195
+ "Å": 132,
196
+ "Æ": 133,
197
+ "Ç": 134,
198
+ "È": 135,
199
+ "É": 136,
200
+ "Ê": 137,
201
+ "Ë": 138,
202
+ "Ì": 139,
203
+ "Í": 140,
204
+ "Î": 141,
205
+ "Ï": 142,
206
+ "Ð": 143,
207
+ "Ñ": 144,
208
+ "Ò": 145,
209
+ "Ó": 146,
210
+ "Ô": 147,
211
+ "Õ": 148,
212
+ "Ö": 149,
213
+ "×": 150,
214
+ "Ø": 151,
215
+ "Ù": 152,
216
+ "Ú": 153,
217
+ "Û": 154,
218
+ "Ü": 155,
219
+ "Ý": 156,
220
+ "Þ": 157,
221
+ "ß": 158,
222
+ "à": 159,
223
+ "á": 160,
224
+ "â": 161,
225
+ "ã": 162,
226
+ "ä": 163,
227
+ "å": 164,
228
+ "æ": 165,
229
+ "ç": 166,
230
+ "è": 167,
231
+ "é": 168,
232
+ "ê": 169,
233
+ "ë": 170,
234
+ "ì": 171,
235
+ "í": 172,
236
+ "î": 173,
237
+ "ï": 174,
238
+ "ð": 175,
239
+ "ñ": 176,
240
+ "ò": 177,
241
+ "ó": 178,
242
+ "ô": 179,
243
+ "õ": 180,
244
+ "ö": 181,
245
+ "÷": 182,
246
+ "ø": 183,
247
+ "ù": 184,
248
+ "ú": 185,
249
+ "û": 186,
250
+ "ü": 187,
251
+ "ý": 188,
252
+ "þ": 189,
253
+ "ÿ": 190,
254
+ "Ā": 191,
255
+ "ā": 192,
256
+ "Ă": 193,
257
+ "ă": 194,
258
+ "Ą": 195,
259
+ "ą": 196,
260
+ "Ć": 197,
261
+ "ć": 198,
262
+ "Ĉ": 199,
263
+ "ĉ": 200,
264
+ "Ċ": 201,
265
+ "ċ": 202,
266
+ "Č": 203,
267
+ "č": 204,
268
+ "Ď": 205,
269
+ "ď": 206,
270
+ "Đ": 207,
271
+ "đ": 208,
272
+ "Ē": 209,
273
+ "ē": 210,
274
+ "Ĕ": 211,
275
+ "ĕ": 212,
276
+ "Ė": 213,
277
+ "ė": 214,
278
+ "Ę": 215,
279
+ "ę": 216,
280
+ "Ě": 217,
281
+ "ě": 218,
282
+ "Ĝ": 219,
283
+ "ĝ": 220,
284
+ "Ğ": 221,
285
+ "ğ": 222,
286
+ "Ġ": 223,
287
+ "ġ": 224,
288
+ "Ģ": 225,
289
+ "ģ": 226,
290
+ "Ĥ": 227,
291
+ "ĥ": 228,
292
+ "Ħ": 229,
293
+ "ħ": 230,
294
+ "Ĩ": 231,
295
+ "ĩ": 232,
296
+ "Ī": 233,
297
+ "ī": 234,
298
+ "Ĭ": 235,
299
+ "ĭ": 236,
300
+ "Į": 237,
301
+ "į": 238,
302
+ "İ": 239,
303
+ "ı": 240,
304
+ "Ĳ": 241,
305
+ "ĳ": 242,
306
+ "Ĵ": 243,
307
+ "ĵ": 244,
308
+ "Ķ": 245,
309
+ "ķ": 246,
310
+ "ĸ": 247,
311
+ "Ĺ": 248,
312
+ "ĺ": 249,
313
+ "Ļ": 250,
314
+ "ļ": 251,
315
+ "Ľ": 252,
316
+ "ľ": 253,
317
+ "Ŀ": 254,
318
+ "ŀ": 255,
319
+ "Ł": 256,
320
+ "ł": 257,
321
+ "Ń": 258,
322
+ "Ġt": 259,
323
+ "he": 260,
324
+ "Ġa": 261,
325
+ "Ġs": 262,
326
+ "Ġw": 263,
327
+ "nd": 264,
328
+ "Ġthe": 265,
329
+ "ed": 266,
330
+ "Ġb": 267,
331
+ "Ġto": 268,
332
+ "Ġand": 269,
333
+ "Ġh": 270,
334
+ "Ġf": 271,
335
+ "ĠT": 272,
336
+ "in": 273,
337
+ "Ġwa": 274,
338
+ "re": 275,
339
+ "it": 276,
340
+ "ou": 277,
341
+ "Ġl": 278,
342
+ "Ġd": 279,
343
+ "Ġc": 280,
344
+ "Ġp": 281,
345
+ "ay": 282,
346
+ "Ġm": 283,
347
+ "er": 284,
348
+ "Ġwas": 285,
349
+ "ĠThe": 286,
350
+ "om": 287,
351
+ "Ġhe": 288,
352
+ "is": 289,
353
+ "Ġn": 290,
354
+ "ar": 291,
355
+ "im": 292,
356
+ "on": 293,
357
+ "Ġsa": 294,
358
+ "id": 295,
359
+ "ll": 296,
360
+ "Ġha": 297,
361
+ "Ġg": 298,
362
+ "at": 299,
363
+ "ĠS": 300,
364
+ "ing": 301,
365
+ "ot": 302,
366
+ "en": 303,
367
+ "an": 304,
368
+ "le": 305,
369
+ "or": 306,
370
+ "ir": 307,
371
+ "am": 308,
372
+ "ĠH": 309,
373
+ "et": 310,
374
+ "Ġit": 311,
375
+ "Ġth": 312,
376
+ "ig": 313,
377
+ "ĠThey": 314,
378
+ "il": 315,
379
+ "Ġin": 316,
380
+ "Ġpl": 317,
381
+ "ĠHe": 318,
382
+ "Ġ\"": 319,
383
+ "ow": 320,
384
+ "ri": 321,
385
+ "ver": 322,
386
+ "ut": 323,
387
+ "Ġu": 324,
388
+ "Ġbe": 325,
389
+ "Ġplay": 326,
390
+ "Ġsaid": 327,
391
+ "ith": 328,
392
+ "pp": 329,
393
+ "Ġwith": 330,
394
+ "Ġday": 331,
395
+ "On": 332,
396
+ "Ġy": 333,
397
+ "Ġo": 334,
398
+ "oo": 335,
399
+ "ked": 336,
400
+ "Ġr": 337,
401
+ "ce": 338,
402
+ "ĠI": 339,
403
+ "Ġher": 340,
404
+ "ĠTim": 341,
405
+ "ĠShe": 342,
406
+ "Ġhis": 343,
407
+ "ld": 344,
408
+ "Ġst": 345,
409
+ "ke": 346,
410
+ "Ġe": 347,
411
+ "Ġbig": 348,
412
+ "nt": 349,
413
+ "ck": 350,
414
+ "Ġyou": 351,
415
+ "very": 352,
416
+ "st": 353,
417
+ "ve": 354,
418
+ "end": 355,
419
+ "Ġhapp": 356,
420
+ "Ġon": 357,
421
+ "un": 358,
422
+ "riend": 359,
423
+ "Ġfriend": 360,
424
+ "all": 361,
425
+ "ily": 362,
426
+ "ĠL": 363,
427
+ "Ġthey": 364,
428
+ "Ġnot": 365,
429
+ "Ġwe": 366,
430
+ "Ġhad": 367,
431
+ "Ġli": 368,
432
+ "Ġup": 369,
433
+ "Ġwant": 370,
434
+ "her": 371,
435
+ "Ġof": 372,
436
+ "itt": 373,
437
+ "ad": 374,
438
+ "ĠB": 375,
439
+ "se": 376,
440
+ "Ġdo": 377,
441
+ "Ġhappy": 378,
442
+ "ent": 379,
443
+ "Ġvery": 380,
444
+ "'s": 381,
445
+ "Ġsaw": 382,
446
+ "es": 383,
447
+ "ĠM": 384,
448
+ "One": 385,
449
+ "Ġthat": 386,
450
+ "ould": 387,
451
+ "Ġsh": 388,
452
+ "Ġmom": 389,
453
+ "Ġfor": 390,
454
+ "ittle": 391,
455
+ "Ġlittle": 392,
456
+ "Ġso": 393,
457
+ "Ġshe": 394,
458
+ ".\"": 395,
459
+ "ime": 396,
460
+ "ch": 397,
461
+ "Ġnam": 398,
462
+ "Ġtime": 399,
463
+ "Ġne": 400,
464
+ "Ġk": 401,
465
+ "ound": 402,
466
+ "Ġthere": 403,
467
+ "Ġnamed": 404,
468
+ "Ġbo": 405,
469
+ "Ġsm": 406,
470
+ "ĠLily": 407,
471
+ "Ġwere": 408,
472
+ "Ġwanted": 409,
473
+ "!\"": 410,
474
+ "Ġfriends": 411,
475
+ "Ġbut": 412,
476
+ "ird": 413,
477
+ "ĠTom": 414,
478
+ "out": 415,
479
+ "The": 416,
480
+ "ved": 417,
481
+ "ht": 418,
482
+ "Ġbird": 419,
483
+ "el": 420,
484
+ "Ġan": 421,
485
+ "al": 422,
486
+ "ake": 423,
487
+ "ĠIt": 424,
488
+ "Ġtoo": 425,
489
+ "ug": 426,
490
+ "ome": 427,
491
+ "Ġwent": 428,
492
+ "Ġhel": 429,
493
+ "ide": 430,
494
+ "Once": 431,
495
+ "Ġwh": 432,
496
+ "Ġhelp": 433,
497
+ "Ġall": 434,
498
+ "Ġis": 435,
499
+ "Ġloo": 436,
500
+ "ue": 437,
501
+ "Ġlo": 438,
502
+ "Ġupon": 439,
503
+ "ter": 440,
504
+ "ĠA": 441,
505
+ "ry": 442,
506
+ "ore": 443,
507
+ "ind": 444,
508
+ "Ġtoy": 445,
509
+ "get": 446,
510
+ "Ġfun": 447,
511
+ "ame": 448,
512
+ "ill": 449,
513
+ "Ġas": 450,
514
+ "Ġat": 451,
515
+ "Ġdid": 452,
516
+ "ra": 453,
517
+ "Ġj": 454,
518
+ "gether": 455,
519
+ "ur": 456,
520
+ "Ġtogether": 457,
521
+ "Ġre": 458,
522
+ "Ġse": 459,
523
+ "ack": 460,
524
+ "Ġtre": 461,
525
+ "Ġcat": 462,
526
+ "ly": 463,
527
+ "ood": 464,
528
+ "ic": 465,
529
+ "ted": 466,
530
+ "Ġcould": 467,
531
+ "Ġdog": 468,
532
+ "Ġcan": 469,
533
+ "ec": 470,
534
+ "Ġplayed": 471,
535
+ "ark": 472,
536
+ "Ġtheir": 473,
537
+ "ard": 474,
538
+ "Ġgir": 475,
539
+ "?\"": 476,
540
+ "Ġball": 477,
541
+ "Ġgirl": 478,
542
+ "Ġro": 479,
543
+ "way": 480,
544
+ "Ġhim": 481,
545
+ "hed": 482,
546
+ "Ġgo": 483,
547
+ "Ġle": 484,
548
+ "Ġfr": 485,
549
+ "Ġare": 486,
550
+ "um": 487,
551
+ "'t": 488,
552
+ "Ġout": 489,
553
+ "my": 490,
554
+ "ain": 491,
555
+ "Ġkn": 492,
556
+ "Ġsad": 493,
557
+ "hen": 494,
558
+ "ax": 495,
559
+ "Ġthem": 496,
560
+ "Ġboy": 497,
561
+ "Ġtree": 498,
562
+ "Ġman": 499,
563
+ "other": 500,
564
+ "ul": 501,
565
+ "Ġhave": 502,
566
+ "Ġcl": 503,
567
+ "Ġlooked": 504,
568
+ "Ġfound": 505,
569
+ "Ġloved": 506,
570
+ "oug": 507,
571
+ "Ġstar": 508,
572
+ "Ġsp": 509,
573
+ "one": 510,
574
+ "ĠSue": 511,
575
+ "hing": 512,
576
+ "Ġback": 513,
577
+ "Ġsc": 514,
578
+ "own": 515,
579
+ "ĠMax": 516,
580
+ "Ġlike": 517,
581
+ "are": 518,
582
+ "Ġme": 519,
583
+ "Ġbec": 520,
584
+ "side": 521,
585
+ "Ġcar": 522,
586
+ "ful": 523,
587
+ "ong": 524,
588
+ "Ġpark": 525,
589
+ "ight": 526,
590
+ "op": 527,
591
+ "Ġliked": 528,
592
+ "ĠOne": 529,
593
+ "elt": 530,
594
+ "Ġla": 531,
595
+ "Ġmake": 532,
596
+ "Ġfa": 533,
597
+ "Ġwould": 534,
598
+ "round": 535,
599
+ "ell": 536,
600
+ "You": 537,
601
+ "Ġfelt": 538,
602
+ "ĠBut": 539,
603
+ "omet": 540,
604
+ "Ġsee": 541,
605
+ "Ġno": 542,
606
+ "Ġnew": 543,
607
+ "ĠBen": 544,
608
+ "ĠW": 545,
609
+ "ared": 546,
610
+ "Ġasked": 547,
611
+ "Ġcame": 548,
612
+ "Ġstarted": 549,
613
+ "ag": 550,
614
+ "Ġother": 551,
615
+ "ice": 552,
616
+ "ĠSam": 553,
617
+ "ouse": 554,
618
+ "Ġal": 555,
619
+ "ought": 556,
620
+ "iled": 557,
621
+ "Ġgood": 558,
622
+ "Ġsomet": 559,
623
+ "Ġag": 560,
624
+ "Ġsmall": 561,
625
+ "ss": 562,
626
+ "ade": 563,
627
+ "Ġbr": 564,
628
+ "ried": 565,
629
+ "Ġsmiled": 566,
630
+ "ob": 567,
631
+ "Ġsay": 568,
632
+ "ings": 569,
633
+ "pot": 570,
634
+ "Ġfind": 571,
635
+ "Ġaway": 572,
636
+ "ty": 573,
637
+ "Ġmade": 574,
638
+ "Ġwor": 575,
639
+ "Ġsomething": 576,
640
+ "Ġex": 577,
641
+ "Ġput": 578,
642
+ "ia": 579,
643
+ "Ġhome": 580,
644
+ "ened": 581,
645
+ "Ġthought": 582,
646
+ "Ġwhat": 583,
647
+ "Ġfrom": 584,
648
+ "Ġco": 585,
649
+ "Ġplaying": 586,
650
+ "Ġmu": 587,
651
+ "Ġevery": 588,
652
+ "Ġwal": 589,
653
+ "ach": 590,
654
+ "uc": 591,
655
+ "arn": 592,
656
+ "ook": 593,
657
+ "ĠSpot": 594,
658
+ "Ġran": 595,
659
+ "ie": 596,
660
+ "ile": 597,
661
+ "Ġagain": 598,
662
+ "Ġfl": 599,
663
+ "ĠF": 600,
664
+ "Ġlaug": 601,
665
+ "Ġdown": 602,
666
+ "Ġhouse": 603,
667
+ "Ġtoys": 604,
668
+ "ave": 605,
669
+ "Ġscared": 606,
670
+ "dd": 607,
671
+ "Ġsome": 608,
672
+ "king": 609,
673
+ "Ġtook": 610,
674
+ "Ġpr": 611,
675
+ "ĠJ": 612,
676
+ "ure": 613,
677
+ "Ġlearn": 614,
678
+ "ĠYou": 615,
679
+ "ep": 616,
680
+ "Ġwill": 617,
681
+ "ret": 618,
682
+ "ny": 619,
683
+ "Ġbox": 620,
684
+ "Ġmy": 621,
685
+ "ab": 622,
686
+ "if": 623,
687
+ "ick": 624,
688
+ "Ġthings": 625,
689
+ "Ġyour": 626,
690
+ "oud": 627,
691
+ "Ġlived": 628,
692
+ "uck": 629,
693
+ "Ġbl": 630,
694
+ "Ġaround": 631,
695
+ "ish": 632,
696
+ "us": 633,
697
+ "Ġsun": 634,
698
+ ",\"": 635,
699
+ "Ġwhen": 636,
700
+ "Ġsw": 637,
701
+ "Ġthen": 638,
702
+ "Ġfe": 639,
703
+ "as": 640,
704
+ "pped": 641,
705
+ "ump": 642,
706
+ "Ġch": 643,
707
+ "Ġab": 644,
708
+ "ank": 645,
709
+ "ucy": 646,
710
+ "ĠMia": 647,
711
+ "Tim": 648,
712
+ "Th": 649,
713
+ "ist": 650,
714
+ "Ġlot": 651,
715
+ "oth": 652,
716
+ "Ġtried": 653,
717
+ "Ġget": 654,
718
+ "Ġgot": 655,
719
+ "Ġsays": 656,
720
+ "Ġknow": 657,
721
+ "ited": 658,
722
+ "ap": 659,
723
+ "Ġmany": 660,
724
+ "Ġkne": 661,
725
+ "Ġwho": 662,
726
+ "ust": 663,
727
+ "Ġint": 664,
728
+ "nder": 665,
729
+ "Ġdec": 666,
730
+ "Lily": 667,
731
+ "Ġpret": 668,
732
+ "Ġabout": 669,
733
+ "Ġany": 670,
734
+ "ĠD": 671,
735
+ "Ġred": 672,
736
+ "ous": 673,
737
+ "ive": 674,
738
+ "Ġmore": 675,
739
+ "Ġknew": 676,
740
+ "au": 677,
741
+ "ace": 678,
742
+ "ise": 679,
743
+ "ally": 680,
744
+ "Ġwater": 681,
745
+ "Ġcare": 682,
746
+ "Ġbecame": 683,
747
+ "Ġpic": 684,
748
+ "ĠLucy": 685,
749
+ "Ġpo": 686,
750
+ "ways": 687,
751
+ "ause": 688,
752
+ "fter": 689,
753
+ "Ġhug": 690,
754
+ "Ġlearned": 691,
755
+ "Ġalways": 692,
756
+ "Ġbest": 693,
757
+ "ĠBob": 694,
758
+ "Ġgre": 695,
759
+ "qu": 696,
760
+ "Ġlaughed": 697,
761
+ "Ġun": 698,
762
+ "Ġdecid": 699,
763
+ "urp": 700,
764
+ "Ġroom": 701,
765
+ "Ġop": 702,
766
+ "Ġv": 703,
767
+ "Ġdecided": 704,
768
+ "Ġho": 705,
769
+ "Ġbecause": 706,
770
+ "ĠSo": 707,
771
+ "Ġlook": 708,
772
+ "Ġpe": 709,
773
+ "Ġexc": 710,
774
+ "Ġinto": 711,
775
+ "Ġoutside": 712,
776
+ "Ġjump": 713,
777
+ "fe": 714,
778
+ "Ġboth": 715,
779
+ "Ġshow": 716,
780
+ "ant": 717,
781
+ "Ġeat": 718,
782
+ "ĠAnd": 719,
783
+ "ite": 720,
784
+ "ĠMom": 721,
785
+ "ers": 722,
786
+ "They": 723,
787
+ "Ġke": 724,
788
+ "Ġdad": 725,
789
+ "Ġone": 726,
790
+ "Ġnice": 727,
791
+ "udd": 728,
792
+ "Ġlong": 729,
793
+ "Tom": 730,
794
+ "Ġfast": 731,
795
+ "Ġthis": 732,
796
+ "Yes": 733,
797
+ "Ġrun": 734,
798
+ "ĠE": 735,
799
+ "Ġfeel": 736,
800
+ "Ġexcited": 737,
801
+ "nn": 738,
802
+ "Ġtr": 739,
803
+ "Ġam": 740,
804
+ "iny": 741,
805
+ "Ġtold": 742,
806
+ "Ġpretty": 743,
807
+ "ull": 744,
808
+ "urpr": 745,
809
+ "Ġsk": 746,
810
+ "Ġinside": 747,
811
+ "Ġmo": 748,
812
+ "Ġsurpr": 749,
813
+ "Ġsor": 750,
814
+ "our": 751,
815
+ "ink": 752,
816
+ "og": 753,
817
+ "Ġrock": 754,
818
+ "Ġtake": 755,
819
+ "ara": 756,
820
+ "Ġeach": 757,
821
+ "Ġmuch": 758,
822
+ "Wh": 759,
823
+ "Ġtow": 760,
824
+ "Ġsl": 761,
825
+ "lew": 762,
826
+ "Ġhow": 763,
827
+ "Ġgra": 764,
828
+ "Ġgave": 765,
829
+ "Ġstr": 766,
830
+ "imal": 767,
831
+ "But": 768,
832
+ "Ġanimal": 769,
833
+ "Ġneed": 770,
834
+ "Ġthan": 771,
835
+ "ged": 772,
836
+ "nna": 773,
837
+ "etter": 774,
838
+ "ĠC": 775,
839
+ "ven": 776,
840
+ "Ġunder": 777,
841
+ "Ġsorry": 778,
842
+ "ess": 779,
843
+ "Ġor": 780,
844
+ "Ġold": 781,
845
+ "ro": 782,
846
+ "urt": 783,
847
+ "Ġclo": 784,
848
+ "ised": 785,
849
+ "ge": 786,
850
+ "Ġfish": 787,
851
+ "Ġwalked": 788,
852
+ "Ġbear": 789,
853
+ "Ġcle": 790,
854
+ "ft": 791,
855
+ "Ġkind": 792,
856
+ "urn": 793,
857
+ "ast": 794,
858
+ "ase": 795,
859
+ "ĠHis": 796,
860
+ "Ġflow": 797,
861
+ "Ġhig": 798,
862
+ "Ġhappened": 799,
863
+ "Ġhand": 800,
864
+ "Ġte": 801,
865
+ "and": 802,
866
+ "Ġfood": 803,
867
+ "ving": 804,
868
+ "here": 805,
869
+ "Ġwat": 806,
870
+ "ine": 807,
871
+ "ĠWe": 808,
872
+ "Ġbug": 809,
873
+ "Ġlist": 810,
874
+ "Ġjust": 811,
875
+ "Ġide": 812,
876
+ "Ġsn": 813,
877
+ "Ġtry": 814,
878
+ "Ġanimals": 815,
879
+ "Ġnear": 816,
880
+ "gry": 817,
881
+ "Ġits": 818,
882
+ "Ġsky": 819,
883
+ "Ġdidn": 820,
884
+ "Ġidea": 821,
885
+ "ched": 822,
886
+ "pl": 823,
887
+ "rom": 824,
888
+ "Ġus": 825,
889
+ "Ġfi": 826,
890
+ "pec": 827,
891
+ "Ġbetter": 828,
892
+ "Ġshare": 829,
893
+ "ex": 830,
894
+ "Ġheard": 831,
895
+ "able": 832,
896
+ "ĠAmy": 833,
897
+ "Ġtw": 834,
898
+ "Ġen": 835,
899
+ "Ġlet": 836,
900
+ "more": 837,
901
+ "Ġfly": 838,
902
+ "Ġanymore": 839,
903
+ "ate": 840,
904
+ "Thank": 841,
905
+ "ff": 842,
906
+ "Ġflew": 843,
907
+ "Ġif": 844,
908
+ "Ġcareful": 845,
909
+ "Mom": 846,
910
+ "ĠTh": 847,
911
+ "ial": 848,
912
+ "Ġcom": 849,
913
+ "Ġbu": 850,
914
+ "lf": 851,
915
+ "Ġstor": 852,
916
+ "Ġspec": 853,
917
+ "Ġspecial": 854,
918
+ "ople": 855,
919
+ "ion": 856,
920
+ "Ġby": 857,
921
+ "Ġnever": 858,
922
+ "Ġlots": 859,
923
+ "Ġdan": 860,
924
+ "Ġlove": 861,
925
+ "ream": 862,
926
+ "Ġwind": 863,
927
+ "ort": 864,
928
+ "Ġshiny": 865,
929
+ "Ġfo": 866,
930
+ "Ġclean": 867,
931
+ "Ġtal": 868,
932
+ "ĠThen": 869,
933
+ "Ġend": 870,
934
+ "Ġmag": 871,
935
+ "bb": 872,
936
+ "Ġdon": 873,
937
+ "Ġgr": 874,
938
+ "Ġfore": 875,
939
+ "ak": 876,
940
+ "Ġpeople": 877,
941
+ "Ġeven": 878,
942
+ "Ġhard": 879,
943
+ "rm": 880,
944
+ "Ġover": 881,
945
+ "Ġhigh": 882,
946
+ "udden": 883,
947
+ "Ġturn": 884,
948
+ "Ġsafe": 885,
949
+ "pected": 886,
950
+ "Ġbad": 887,
951
+ "imb": 888,
952
+ "Ġcu": 889,
953
+ "Ġcake": 890,
954
+ "Ġcol": 891,
955
+ "Ġhurt": 892,
956
+ "Ġclimb": 893,
957
+ "Ġafter": 894,
958
+ "Ġfam": 895,
959
+ "Ġloud": 896,
960
+ "expected": 897,
961
+ "Ġunexpected": 898,
962
+ "ĠEvery": 899,
963
+ "Ġsurprised": 900,
964
+ "uddenly": 901,
965
+ "Ġground": 902,
966
+ "Let": 903,
967
+ "Ġbook": 904,
968
+ "Ġim": 905,
969
+ "ber": 906,
970
+ "ĠSara": 907,
971
+ "Ġpicked": 908,
972
+ "ail": 909,
973
+ "ild": 910,
974
+ "Ġgl": 911,
975
+ "Ġdoor": 912,
976
+ "Ġopened": 913,
977
+ "Ġcome": 914,
978
+ "Ġproud": 915,
979
+ "As": 916,
980
+ "ĠAnna": 917,
981
+ "arden": 918,
982
+ "Ġgarden": 919,
983
+ "ĠK": 920,
984
+ "Ġche": 921,
985
+ "ady": 922,
986
+ "'m": 923,
987
+ "Ġthanked": 924,
988
+ "Ġgive": 925,
989
+ "Ġfar": 926,
990
+ "Ġstill": 927,
991
+ "No": 928,
992
+ "Ġblue": 929,
993
+ "ip": 930,
994
+ "Ġcall": 931,
995
+ "Ġway": 932,
996
+ "Ġever": 933,
997
+ "Ġcolor": 934,
998
+ "ĠFrom": 935,
999
+ "Ġhugged": 936,
1000
+ "Ġjumped": 937,
1001
+ "Ġoff": 938,
1002
+ "ĠHer": 939,
1003
+ "iz": 940,
1004
+ "Ġmagic": 941,
1005
+ "ummy": 942,
1006
+ "itty": 943,
1007
+ "ĠWhen": 944,
1008
+ "Ġshould": 945,
1009
+ "ough": 946,
1010
+ "Ġplace": 947,
1011
+ "age": 948,
1012
+ "Ġkid": 949,
1013
+ "ool": 950,
1014
+ "Ġfamily": 951,
1015
+ "ĠIn": 952,
1016
+ "Ġpar": 953,
1017
+ "em": 954,
1018
+ "Ġsmile": 955,
1019
+ "kay": 956,
1020
+ "ct": 957,
1021
+ "Ġgreat": 958,
1022
+ "Ġnow": 959,
1023
+ "Ġwalk": 960,
1024
+ "Ġunt": 961,
1025
+ "uff": 962,
1026
+ "hes": 963,
1027
+ "Ġforest": 964,
1028
+ "Ġstrong": 965,
1029
+ "ĠP": 966,
1030
+ "Ġstay": 967,
1031
+ "Ġboat": 968,
1032
+ "ane": 969,
1033
+ "ture": 970,
1034
+ "Ġqu": 971,
1035
+ "Ġfrog": 972,
1036
+ "Ġuntil": 973,
1037
+ "Ġsto": 974,
1038
+ "lease": 975,
1039
+ "Ġdra": 976,
1040
+ "Ġbra": 977,
1041
+ "Ġhappily": 978,
1042
+ "Ġbro": 979,
1043
+ "dy": 980,
1044
+ "oon": 981,
1045
+ "les": 982,
1046
+ "xt": 983,
1047
+ "Ġnext": 984,
1048
+ "aut": 985,
1049
+ "Ġapp": 986,
1050
+ "Ġhelped": 987,
1051
+ "Ġimp": 988,
1052
+ "Ġstick": 989,
1053
+ "ress": 990,
1054
+ "Ġtown": 991,
1055
+ "Ġclos": 992,
1056
+ "ning": 993,
1057
+ "Ġlisten": 994,
1058
+ "Ġbeaut": 995,
1059
+ "Ġkids": 996,
1060
+ "aking": 997,
1061
+ "Ġsqu": 998,
1062
+ "Ġbeing": 999
1063
+ },
1064
+ "merges": [
1065
+ "Ġ t",
1066
+ "h e",
1067
+ "Ġ a",
1068
+ "Ġ s",
1069
+ "Ġ w",
1070
+ "n d",
1071
+ "Ġt he",
1072
+ "e d",
1073
+ "Ġ b",
1074
+ "Ġt o",
1075
+ "Ġa nd",
1076
+ "Ġ h",
1077
+ "Ġ f",
1078
+ "Ġ T",
1079
+ "i n",
1080
+ "Ġw a",
1081
+ "r e",
1082
+ "i t",
1083
+ "o u",
1084
+ "Ġ l",
1085
+ "Ġ d",
1086
+ "Ġ c",
1087
+ "Ġ p",
1088
+ "a y",
1089
+ "Ġ m",
1090
+ "e r",
1091
+ "Ġwa s",
1092
+ "ĠT he",
1093
+ "o m",
1094
+ "Ġ he",
1095
+ "i s",
1096
+ "Ġ n",
1097
+ "a r",
1098
+ "i m",
1099
+ "o n",
1100
+ "Ġs a",
1101
+ "i d",
1102
+ "l l",
1103
+ "Ġh a",
1104
+ "Ġ g",
1105
+ "a t",
1106
+ "Ġ S",
1107
+ "in g",
1108
+ "o t",
1109
+ "e n",
1110
+ "a n",
1111
+ "l e",
1112
+ "o r",
1113
+ "i r",
1114
+ "a m",
1115
+ "Ġ H",
1116
+ "e t",
1117
+ "Ġ it",
1118
+ "Ġt h",
1119
+ "i g",
1120
+ "ĠThe y",
1121
+ "i l",
1122
+ "Ġ in",
1123
+ "Ġp l",
1124
+ "ĠH e",
1125
+ "Ġ \"",
1126
+ "o w",
1127
+ "r i",
1128
+ "v er",
1129
+ "u t",
1130
+ "Ġ u",
1131
+ "Ġb e",
1132
+ "Ġpl ay",
1133
+ "Ġsa id",
1134
+ "it h",
1135
+ "p p",
1136
+ "Ġw ith",
1137
+ "Ġd ay",
1138
+ "O n",
1139
+ "Ġ y",
1140
+ "Ġ o",
1141
+ "o o",
1142
+ "k ed",
1143
+ "Ġ r",
1144
+ "c e",
1145
+ "Ġ I",
1146
+ "Ġhe r",
1147
+ "ĠT im",
1148
+ "ĠS he",
1149
+ "Ġh is",
1150
+ "l d",
1151
+ "Ġs t",
1152
+ "k e",
1153
+ "Ġ e",
1154
+ "Ġb ig",
1155
+ "n t",
1156
+ "c k",
1157
+ "Ġy ou",
1158
+ "ver y",
1159
+ "s t",
1160
+ "v e",
1161
+ "e nd",
1162
+ "Ġha pp",
1163
+ "Ġ on",
1164
+ "u n",
1165
+ "ri end",
1166
+ "Ġf riend",
1167
+ "a ll",
1168
+ "il y",
1169
+ "Ġ L",
1170
+ "Ġthe y",
1171
+ "Ġn ot",
1172
+ "Ġw e",
1173
+ "Ġha d",
1174
+ "Ġl i",
1175
+ "Ġu p",
1176
+ "Ġwa nt",
1177
+ "he r",
1178
+ "Ġo f",
1179
+ "it t",
1180
+ "a d",
1181
+ "Ġ B",
1182
+ "s e",
1183
+ "Ġd o",
1184
+ "Ġhapp y",
1185
+ "en t",
1186
+ "Ġ very",
1187
+ "' s",
1188
+ "Ġsa w",
1189
+ "e s",
1190
+ "Ġ M",
1191
+ "On e",
1192
+ "Ġth at",
1193
+ "ou ld",
1194
+ "Ġs h",
1195
+ "Ġm om",
1196
+ "Ġf or",
1197
+ "itt le",
1198
+ "Ġl ittle",
1199
+ "Ġs o",
1200
+ "Ġs he",
1201
+ ". \"",
1202
+ "im e",
1203
+ "c h",
1204
+ "Ġn am",
1205
+ "Ġt ime",
1206
+ "Ġn e",
1207
+ "Ġ k",
1208
+ "ou nd",
1209
+ "Ġthe re",
1210
+ "Ġnam ed",
1211
+ "Ġb o",
1212
+ "Ġs m",
1213
+ "ĠL ily",
1214
+ "Ġwe re",
1215
+ "Ġwant ed",
1216
+ "! \"",
1217
+ "Ġfriend s",
1218
+ "Ġb ut",
1219
+ "ir d",
1220
+ "ĠT om",
1221
+ "ou t",
1222
+ "T he",
1223
+ "v ed",
1224
+ "h t",
1225
+ "Ġb ird",
1226
+ "e l",
1227
+ "Ġa n",
1228
+ "a l",
1229
+ "a ke",
1230
+ "ĠI t",
1231
+ "Ġto o",
1232
+ "u g",
1233
+ "om e",
1234
+ "Ġw ent",
1235
+ "Ġhe l",
1236
+ "id e",
1237
+ "On ce",
1238
+ "Ġw h",
1239
+ "Ġhel p",
1240
+ "Ġa ll",
1241
+ "Ġ is",
1242
+ "Ġl oo",
1243
+ "u e",
1244
+ "Ġl o",
1245
+ "Ġup on",
1246
+ "t er",
1247
+ "Ġ A",
1248
+ "r y",
1249
+ "o re",
1250
+ "i nd",
1251
+ "Ġto y",
1252
+ "g et",
1253
+ "Ġf un",
1254
+ "am e",
1255
+ "i ll",
1256
+ "Ġa s",
1257
+ "Ġa t",
1258
+ "Ġd id",
1259
+ "r a",
1260
+ "Ġ j",
1261
+ "get her",
1262
+ "u r",
1263
+ "Ġto gether",
1264
+ "Ġ re",
1265
+ "Ġs e",
1266
+ "a ck",
1267
+ "Ġt re",
1268
+ "Ġc at",
1269
+ "l y",
1270
+ "oo d",
1271
+ "i c",
1272
+ "t ed",
1273
+ "Ġc ould",
1274
+ "Ġdo g",
1275
+ "Ġc an",
1276
+ "e c",
1277
+ "Ġplay ed",
1278
+ "ar k",
1279
+ "Ġthe ir",
1280
+ "ar d",
1281
+ "Ġg ir",
1282
+ "? \"",
1283
+ "Ġb all",
1284
+ "Ġgir l",
1285
+ "Ġr o",
1286
+ "w ay",
1287
+ "Ġh im",
1288
+ "he d",
1289
+ "Ġg o",
1290
+ "Ġl e",
1291
+ "Ġf r",
1292
+ "Ġa re",
1293
+ "u m",
1294
+ "' t",
1295
+ "Ġ out",
1296
+ "m y",
1297
+ "a in",
1298
+ "Ġk n",
1299
+ "Ġsa d",
1300
+ "he n",
1301
+ "a x",
1302
+ "Ġthe m",
1303
+ "Ġbo y",
1304
+ "Ġtre e",
1305
+ "Ġm an",
1306
+ "ot her",
1307
+ "u l",
1308
+ "Ġha ve",
1309
+ "Ġc l",
1310
+ "Ġloo ked",
1311
+ "Ġf ound",
1312
+ "Ġlo ved",
1313
+ "ou g",
1314
+ "Ġst ar",
1315
+ "Ġs p",
1316
+ "on e",
1317
+ "ĠS ue",
1318
+ "h ing",
1319
+ "Ġb ack",
1320
+ "Ġs c",
1321
+ "ow n",
1322
+ "ĠM ax",
1323
+ "Ġli ke",
1324
+ "a re",
1325
+ "Ġm e",
1326
+ "Ġbe c",
1327
+ "s ide",
1328
+ "Ġc ar",
1329
+ "f ul",
1330
+ "on g",
1331
+ "Ġp ark",
1332
+ "ig ht",
1333
+ "o p",
1334
+ "Ġli ked",
1335
+ "Ġ One",
1336
+ "el t",
1337
+ "Ġl a",
1338
+ "Ġm ake",
1339
+ "Ġf a",
1340
+ "Ġw ould",
1341
+ "r ound",
1342
+ "e ll",
1343
+ "Y ou",
1344
+ "Ġf elt",
1345
+ "ĠB ut",
1346
+ "om et",
1347
+ "Ġse e",
1348
+ "Ġn o",
1349
+ "Ġne w",
1350
+ "ĠB en",
1351
+ "Ġ W",
1352
+ "ar ed",
1353
+ "Ġas ked",
1354
+ "Ġc ame",
1355
+ "Ġstar ted",
1356
+ "a g",
1357
+ "Ġ other",
1358
+ "i ce",
1359
+ "ĠS am",
1360
+ "ou se",
1361
+ "Ġa l",
1362
+ "oug ht",
1363
+ "il ed",
1364
+ "Ġg ood",
1365
+ "Ġs omet",
1366
+ "Ġa g",
1367
+ "Ġsm all",
1368
+ "s s",
1369
+ "ad e",
1370
+ "Ġb r",
1371
+ "ri ed",
1372
+ "Ġsm iled",
1373
+ "o b",
1374
+ "Ġs ay",
1375
+ "ing s",
1376
+ "p ot",
1377
+ "Ġf ind",
1378
+ "Ġa way",
1379
+ "t y",
1380
+ "Ġm ade",
1381
+ "Ġw or",
1382
+ "Ġsomet hing",
1383
+ "Ġe x",
1384
+ "Ġp ut",
1385
+ "i a",
1386
+ "Ġh ome",
1387
+ "en ed",
1388
+ "Ġth ought",
1389
+ "Ġwh at",
1390
+ "Ġfr om",
1391
+ "Ġc o",
1392
+ "Ġplay ing",
1393
+ "Ġm u",
1394
+ "Ġe very",
1395
+ "Ġwa l",
1396
+ "a ch",
1397
+ "u c",
1398
+ "ar n",
1399
+ "oo k",
1400
+ "ĠS pot",
1401
+ "Ġr an",
1402
+ "i e",
1403
+ "i le",
1404
+ "Ġag ain",
1405
+ "Ġf l",
1406
+ "Ġ F",
1407
+ "Ġla ug",
1408
+ "Ġd own",
1409
+ "Ġh ouse",
1410
+ "Ġtoy s",
1411
+ "a ve",
1412
+ "Ġsc ared",
1413
+ "d d",
1414
+ "Ġs ome",
1415
+ "k ing",
1416
+ "Ġtoo k",
1417
+ "Ġp r",
1418
+ "Ġ J",
1419
+ "u re",
1420
+ "Ġle arn",
1421
+ "Ġ You",
1422
+ "e p",
1423
+ "Ġw ill",
1424
+ "re t",
1425
+ "n y",
1426
+ "Ġbo x",
1427
+ "Ġm y",
1428
+ "a b",
1429
+ "i f",
1430
+ "i ck",
1431
+ "Ġth ings",
1432
+ "Ġyou r",
1433
+ "ou d",
1434
+ "Ġli ved",
1435
+ "u ck",
1436
+ "Ġb l",
1437
+ "Ġa round",
1438
+ "is h",
1439
+ "u s",
1440
+ "Ġs un",
1441
+ ", \"",
1442
+ "Ġw hen",
1443
+ "Ġs w",
1444
+ "Ġthe n",
1445
+ "Ġf e",
1446
+ "a s",
1447
+ "pp ed",
1448
+ "um p",
1449
+ "Ġc h",
1450
+ "Ġa b",
1451
+ "an k",
1452
+ "uc y",
1453
+ "ĠM ia",
1454
+ "T im",
1455
+ "T h",
1456
+ "is t",
1457
+ "Ġl ot",
1458
+ "ot h",
1459
+ "Ġt ried",
1460
+ "Ġg et",
1461
+ "Ġg ot",
1462
+ "Ġsay s",
1463
+ "Ġkn ow",
1464
+ "it ed",
1465
+ "a p",
1466
+ "Ġman y",
1467
+ "Ġkn e",
1468
+ "Ġwh o",
1469
+ "u st",
1470
+ "Ġin t",
1471
+ "nd er",
1472
+ "Ġd ec",
1473
+ "L ily",
1474
+ "Ġp ret",
1475
+ "Ġab out",
1476
+ "Ġan y",
1477
+ "Ġ D",
1478
+ "Ġr ed",
1479
+ "ou s",
1480
+ "i ve",
1481
+ "Ġm ore",
1482
+ "Ġkne w",
1483
+ "a u",
1484
+ "a ce",
1485
+ "is e",
1486
+ "all y",
1487
+ "Ġwa ter",
1488
+ "Ġc are",
1489
+ "Ġbec ame",
1490
+ "Ġp ic",
1491
+ "ĠL ucy",
1492
+ "Ġp o",
1493
+ "way s",
1494
+ "au se",
1495
+ "f ter",
1496
+ "Ġh ug",
1497
+ "Ġlearn ed",
1498
+ "Ġal ways",
1499
+ "Ġbe st",
1500
+ "ĠB ob",
1501
+ "Ġg re",
1502
+ "q u",
1503
+ "Ġlaug hed",
1504
+ "Ġu n",
1505
+ "Ġdec id",
1506
+ "ur p",
1507
+ "Ġro om",
1508
+ "Ġo p",
1509
+ "Ġ v",
1510
+ "Ġdecid ed",
1511
+ "Ġh o",
1512
+ "Ġbec ause",
1513
+ "ĠS o",
1514
+ "Ġloo k",
1515
+ "Ġp e",
1516
+ "Ġex c",
1517
+ "Ġint o",
1518
+ "Ġout side",
1519
+ "Ġj ump",
1520
+ "f e",
1521
+ "Ġb oth",
1522
+ "Ġsh ow",
1523
+ "an t",
1524
+ "Ġe at",
1525
+ "ĠA nd",
1526
+ "it e",
1527
+ "ĠM om",
1528
+ "er s",
1529
+ "The y",
1530
+ "Ġ ke",
1531
+ "Ġd ad",
1532
+ "Ġon e",
1533
+ "Ġn ice",
1534
+ "u dd",
1535
+ "Ġl ong",
1536
+ "T om",
1537
+ "Ġfa st",
1538
+ "Ġth is",
1539
+ "Y es",
1540
+ "Ġr un",
1541
+ "Ġ E",
1542
+ "Ġfe el",
1543
+ "Ġexc ited",
1544
+ "n n",
1545
+ "Ġt r",
1546
+ "Ġa m",
1547
+ "in y",
1548
+ "Ġto ld",
1549
+ "Ġpret ty",
1550
+ "u ll",
1551
+ "urp r",
1552
+ "Ġs k",
1553
+ "Ġin side",
1554
+ "Ġm o",
1555
+ "Ġs urpr",
1556
+ "Ġs or",
1557
+ "ou r",
1558
+ "in k",
1559
+ "o g",
1560
+ "Ġro ck",
1561
+ "Ġt ake",
1562
+ "ar a",
1563
+ "Ġe ach",
1564
+ "Ġmu ch",
1565
+ "W h",
1566
+ "Ġto w",
1567
+ "Ġs l",
1568
+ "le w",
1569
+ "Ġh ow",
1570
+ "Ġg ra",
1571
+ "Ġg ave",
1572
+ "Ġst r",
1573
+ "im al",
1574
+ "B ut",
1575
+ "Ġan imal",
1576
+ "Ġne ed",
1577
+ "Ġth an",
1578
+ "g ed",
1579
+ "nn a",
1580
+ "et ter",
1581
+ "Ġ C",
1582
+ "v en",
1583
+ "Ġu nder",
1584
+ "Ġsor ry",
1585
+ "es s",
1586
+ "Ġ or",
1587
+ "Ġo ld",
1588
+ "r o",
1589
+ "ur t",
1590
+ "Ġcl o",
1591
+ "is ed",
1592
+ "g e",
1593
+ "Ġf ish",
1594
+ "Ġwal ked",
1595
+ "Ġbe ar",
1596
+ "Ġc le",
1597
+ "f t",
1598
+ "Ġk ind",
1599
+ "ur n",
1600
+ "a st",
1601
+ "a se",
1602
+ "ĠH is",
1603
+ "Ġfl ow",
1604
+ "Ġh ig",
1605
+ "Ġhapp ened",
1606
+ "Ġha nd",
1607
+ "Ġt e",
1608
+ "a nd",
1609
+ "Ġf ood",
1610
+ "v ing",
1611
+ "he re",
1612
+ "Ġwa t",
1613
+ "in e",
1614
+ "ĠW e",
1615
+ "Ġb ug",
1616
+ "Ġl ist",
1617
+ "Ġj ust",
1618
+ "Ġ ide",
1619
+ "Ġs n",
1620
+ "Ġt ry",
1621
+ "Ġanimal s",
1622
+ "Ġne ar",
1623
+ "g ry",
1624
+ "Ġit s",
1625
+ "Ġsk y",
1626
+ "Ġdid n",
1627
+ "Ġide a",
1628
+ "c hed",
1629
+ "p l",
1630
+ "r om",
1631
+ "Ġu s",
1632
+ "Ġf i",
1633
+ "p ec",
1634
+ "Ġb etter",
1635
+ "Ġsh are",
1636
+ "e x",
1637
+ "Ġhe ard",
1638
+ "ab le",
1639
+ "ĠA my",
1640
+ "Ġt w",
1641
+ "Ġ en",
1642
+ "Ġl et",
1643
+ "m ore",
1644
+ "Ġf ly",
1645
+ "Ġany more",
1646
+ "at e",
1647
+ "Th ank",
1648
+ "f f",
1649
+ "Ġf lew",
1650
+ "Ġ if",
1651
+ "Ġcare ful",
1652
+ "M om",
1653
+ "ĠT h",
1654
+ "i al",
1655
+ "Ġc om",
1656
+ "Ġb u",
1657
+ "l f",
1658
+ "Ġst or",
1659
+ "Ġsp ec",
1660
+ "Ġspec ial",
1661
+ "op le",
1662
+ "i on",
1663
+ "Ġb y",
1664
+ "Ġne ver",
1665
+ "Ġlot s",
1666
+ "Ġd an",
1667
+ "Ġlo ve",
1668
+ "re am",
1669
+ "Ġw ind",
1670
+ "or t",
1671
+ "Ġsh iny",
1672
+ "Ġf o",
1673
+ "Ġcle an",
1674
+ "Ġt al",
1675
+ "ĠThe n",
1676
+ "Ġe nd",
1677
+ "Ġm ag",
1678
+ "b b",
1679
+ "Ġd on",
1680
+ "Ġg r",
1681
+ "Ġf ore",
1682
+ "a k",
1683
+ "Ġpe ople",
1684
+ "Ġe ven",
1685
+ "Ġh ard",
1686
+ "r m",
1687
+ "Ġo ver",
1688
+ "Ġhig h",
1689
+ "udd en",
1690
+ "Ġt urn",
1691
+ "Ġsa fe",
1692
+ "pec ted",
1693
+ "Ġb ad",
1694
+ "im b",
1695
+ "Ġc u",
1696
+ "Ġc ake",
1697
+ "Ġco l",
1698
+ "Ġh urt",
1699
+ "Ġcl imb",
1700
+ "Ġa fter",
1701
+ "Ġf am",
1702
+ "Ġl oud",
1703
+ "ex pected",
1704
+ "Ġun expected",
1705
+ "ĠE very",
1706
+ "Ġsurpr ised",
1707
+ "udden ly",
1708
+ "Ġg round",
1709
+ "L et",
1710
+ "Ġb ook",
1711
+ "Ġ im",
1712
+ "b er",
1713
+ "ĠS ara",
1714
+ "Ġpic ked",
1715
+ "a il",
1716
+ "il d",
1717
+ "Ġg l",
1718
+ "Ġdo or",
1719
+ "Ġop ened",
1720
+ "Ġc ome",
1721
+ "Ġpr oud",
1722
+ "A s",
1723
+ "ĠA nna",
1724
+ "ard en",
1725
+ "Ġg arden",
1726
+ "Ġ K",
1727
+ "Ġc he",
1728
+ "ad y",
1729
+ "' m",
1730
+ "Ġthan ked",
1731
+ "Ġg ive",
1732
+ "Ġf ar",
1733
+ "Ġst ill",
1734
+ "N o",
1735
+ "Ġbl ue",
1736
+ "i p",
1737
+ "Ġc all",
1738
+ "Ġwa y",
1739
+ "Ġe ver",
1740
+ "Ġcol or",
1741
+ "ĠF rom",
1742
+ "Ġhug ged",
1743
+ "Ġjump ed",
1744
+ "Ġof f",
1745
+ "ĠH er",
1746
+ "i z",
1747
+ "Ġmag ic",
1748
+ "um my",
1749
+ "itt y",
1750
+ "ĠW hen",
1751
+ "Ġsh ould",
1752
+ "oug h",
1753
+ "Ġpl ace",
1754
+ "ag e",
1755
+ "Ġk id",
1756
+ "oo l",
1757
+ "Ġfam ily",
1758
+ "ĠI n",
1759
+ "Ġp ar",
1760
+ "e m",
1761
+ "Ġsm ile",
1762
+ "k ay",
1763
+ "c t",
1764
+ "Ġgre at",
1765
+ "Ġn ow",
1766
+ "Ġwal k",
1767
+ "Ġu nt",
1768
+ "u ff",
1769
+ "he s",
1770
+ "Ġfore st",
1771
+ "Ġstr ong",
1772
+ "Ġ P",
1773
+ "Ġst ay",
1774
+ "Ġbo at",
1775
+ "an e",
1776
+ "t ure",
1777
+ "Ġ qu",
1778
+ "Ġfr og",
1779
+ "Ġunt il",
1780
+ "Ġst o",
1781
+ "le ase",
1782
+ "Ġd ra",
1783
+ "Ġb ra",
1784
+ "Ġhapp ily",
1785
+ "Ġbr o",
1786
+ "d y",
1787
+ "o on",
1788
+ "le s",
1789
+ "x t",
1790
+ "Ġne xt",
1791
+ "a ut",
1792
+ "Ġa pp",
1793
+ "Ġhelp ed",
1794
+ "Ġim p",
1795
+ "Ġst ick",
1796
+ "re ss",
1797
+ "Ġtow n",
1798
+ "Ġclo s",
1799
+ "n ing",
1800
+ "Ġlist en",
1801
+ "Ġbe aut",
1802
+ "Ġkid s",
1803
+ "a king",
1804
+ "Ġs qu",
1805
+ "Ġbe ing"
1806
+ ]
1807
+ }
1808
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,34 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "<bos>",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "<eos>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "<pad>",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ }
27
+ },
28
+ "bos_token": "<bos>",
29
+ "clean_up_tokenization_spaces": true,
30
+ "eos_token": "<eos>",
31
+ "model_max_length": 1000000000000000019884624838656,
32
+ "pad_token": "<pad>",
33
+ "tokenizer_class": "PreTrainedTokenizerFast"
34
+ }