BlackBeenie committed on
Commit
c3c5faa
1 Parent(s): c731a76

Add new SentenceTransformer model.

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "word_embedding_dimension": 768,
3
+ "pooling_mode_cls_token": true,
4
+ "pooling_mode_mean_tokens": false,
5
+ "pooling_mode_max_tokens": false,
6
+ "pooling_mode_mean_sqrt_len_tokens": false,
7
+ "pooling_mode_weightedmean_tokens": false,
8
+ "pooling_mode_lasttoken": false,
9
+ "include_prompt": true
10
+ }
README.md ADDED
@@ -0,0 +1,875 @@
1
+ ---
2
+ base_model: microsoft/mdeberta-v3-base
3
+ library_name: sentence-transformers
4
+ pipeline_tag: sentence-similarity
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:498970
11
+ - loss:BPRLoss
12
+ widget:
13
+ - source_sentence: meaning of the prefix em
14
+ sentences:
15
+ - Word Origin and History for em- Expand. from French assimilation of en- to following
16
+ labial (see en- (1)). Also a prefix used to form verbs from adjectives and nouns.
17
+ representing Latin ex- assimilated to following -m- (see ex-).
18
+ - Rating Newest Oldest. 1 MO probably has the most insanely complex sales tax in
19
+ the country. Not only is there a state level tax (4.225% for most items and 1.225%
20
+ for grocery foods) but city and county level sales taxes. 2 The sales tax is
21
+ set by county. Go to Missouri Sales Tax website and look up your county.
22
+ - 'Prefixes: Un, Dis, Im, Mis. A prefix is placed at the beginning of a word to
23
+ change its meaning. For example, the suffix re- means either again or back as
24
+ in return, repeat or refurbish. The following 4 prefixes are easy to confuse because
25
+ they all have a negative meaning. un-.'
26
+ - source_sentence: is woolwich london safe
27
+ sentences:
28
+ - SE18 has four train stations Plumstead, Woolwich Arsenal and Woolwich Dockyard.
29
+ Plumstead and Woolwich Arsenal are situated in Zone 4, Woolwich Dockyard in Zone
30
+ 3.Approximately just under 30 minutes to Charing Cross from all Stations. Trains
31
+ are operated buy South-eastern. Train timetables are available at southeasternrailway.co.uk.here
32
+ is no shortage of schools, libraries and colleges in SE18. A short walk from Plumstead
33
+ station is Greenwich Community College offering a wide range of courses from cookery
34
+ to languages. Notable schools include the newly re-built Foxfield Primary, Saint
35
+ Pauls and Plumstead Mannor.
36
+ - "In its heyday Woolwich was known better known as the home of Arsenal Football\
37
+ \ Club, the first McDonalds in the UK and the base for the British Armyâ\x80\x99\
38
+ s artillery. At present, it is safe to say the town would not be found in any\
39
+ \ London travel guide."
40
+ - Income and Qualifications. Car sales consultants often have compensation packages
41
+ that include salary, commissions and bonuses. For example, Ford Motor sales reps
42
+ earned an average base salary of $37,000, according to Glassdoor -- with the rest
43
+ of their $54,600 in earnings comprised of commissions and benefits.
44
+ - source_sentence: who is christopher kyle
45
+ sentences:
46
+ - Kyle Kulinski is an American Political Activist, progressive talk radio host,
47
+ social democratic political commentator, and the co-founder of Justice Democrats.
48
+ He is the host and producer of the YouTube show Secular Talk, an affiliate of
49
+ The Young Turks network.
50
+ - A passport card is valid for travel to and from Canada, Mexico, the Caribbean
51
+ and Bermuda at land border crossings and sea ports-of-entry. It is not valid for
52
+ air travel. It is valid for 10 years for adults and 5 years for minors under 16.
53
+ A first passport book costs $135 for adults and $105 for minors under the age
54
+ of 16. It costs $110 to renew. A first passport card costs $55 for adults and
55
+ $40 for minors under the age of 16. It costs $30 to renew. The cost when applying
56
+ for both is $165 for adults and $120 for minors.
57
+ - Chris Kyle American Sniper. Christopher Scott Kyle was born and raised in Texas
58
+ and was a United States Navy SEAL from 1999 to 2009. He is currently known as
59
+ the most successful sniper in American military history. According to his book
60
+ American Sniper, he had 160 confirmed kills (which was from 255 claimed kills).
61
+ - source_sentence: do potato chips have sugar
62
+ sentences:
63
+ - Glycemic Index. White potatoes, whether you have them mashed, baked, as french
64
+ fries or potato chips, have a high glycemic index, which means that their carbohydrates
65
+ are quickly turned into sugar, which elevates your blood sugar levels after your
66
+ meal.ating sweet potatoes in moderate amounts will help you keep your blood sugar
67
+ levels in the healthy range even if you have diabetes. A medium sweet potato contains
68
+ 26 grams of carbohydrates, of which 3.8 grams are dietary fiber, while a cup of
69
+ mashed sweet potatoes has 58 grams of carbohydrates and 8.2 grams of fiber.
70
+ - So before tying that knot in the morning, consider what personality traits you
71
+ are conveying through the color of your tie. Reds are a power color, symbolizing
72
+ wealth, strength, and passion. Many cultures also find special meaning in the
73
+ color red, such as good luck.
74
+ - "Corn chips have a glycemic index score of 42, which is in the low range and indicates\
75
+ \ they wonâ\x80\x99t spike your blood sugar. Of the total carbohydrates, 1.5 grams\
76
+ \ are dietary fiber, 16 grams are complex carbs in the form of starches and only\
77
+ \ 0.3 grams are sugar."
78
+ - source_sentence: definition of stoop
79
+ sentences:
80
+ - Definition of stoop written for English Language Learners from the Merriam-Webster
81
+ Learner's Dictionary with audio pronunciations, usage examples, and count/noncount
82
+ noun labels. Learner's Dictionary mobile search
83
+ - "Define stoop: to bend the body or a part of the body forward and downward sometimes\
84
+ \ simultaneously bending the knees â\x80\x94 stoop in a sentence to bend the body\
85
+ \ or a part of the body forward and downward sometimes simultaneously bending\
86
+ \ the kneesâ\x80¦ See the full definition"
87
+ - Blood plasma is the yellow liquid in which blood cells float. Plasma is made up
88
+ of nutrients, electrolytes (salts), gases, non-protein hormones, waste, lipids,
89
+ and proteins.These proteins are albumin, antibodies (also called immunoglobulins),
90
+ clotting factors, and protein hormones.lood plasma is the yellow liquid in which
91
+ blood cells float. Plasma is made up of nutrients, electrolytes (salts), gases,
92
+ non-protein hormones, waste, lipids, and proteins.
93
+ ---
94
+
95
+ # SentenceTransformer based on microsoft/mdeberta-v3-base
96
+
97
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
98
+
99
+ ## Model Details
100
+
101
+ ### Model Description
102
+ - **Model Type:** Sentence Transformer
103
+ - **Base model:** [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) <!-- at revision a0484667b22365f84929a935b5e50a51f71f159d -->
104
+ - **Maximum Sequence Length:** 1024 tokens
105
+ - **Output Dimensionality:** 768 dimensions
106
+ - **Similarity Function:** Cosine Similarity
107
+ <!-- - **Training Dataset:** Unknown -->
108
+ <!-- - **Language:** Unknown -->
109
+ <!-- - **License:** Unknown -->
110
+
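+ These properties can also be checked programmatically once the model is loaded. A minimal sketch (the expected outputs in the comments follow the values listed above):
+ 
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ model = SentenceTransformer("BlackBeenie/mdeberta-v3-base-msmarco-v3-bpr")
+ print(model.max_seq_length)                      # 1024
+ print(model.get_sentence_embedding_dimension())  # 768
+ print(model.similarity_fn_name)                  # "cosine"
+ ```
+ 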
111
+ ### Model Sources
112
+
113
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
114
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
115
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
116
+
117
+ ### Full Model Architecture
118
+
119
+ ```
120
+ SentenceTransformer(
121
+ (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: DebertaV2Model
122
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
123
+ )
124
+ ```
125
+
126
+ ## Usage
127
+
128
+ ### Direct Usage (Sentence Transformers)
129
+
130
+ First install the Sentence Transformers library:
131
+
132
+ ```bash
133
+ pip install -U sentence-transformers
134
+ ```
135
+
136
+ Then you can load this model and run inference.
137
+ ```python
138
+ from sentence_transformers import SentenceTransformer
139
+
140
+ # Download from the 🤗 Hub
141
+ model = SentenceTransformer("BlackBeenie/mdeberta-v3-base-msmarco-v3-bpr")
142
+ # Run inference
143
+ sentences = [
144
+ 'definition of stoop',
145
+ 'Define stoop: to bend the body or a part of the body forward and downward sometimes simultaneously bending the knees â\x80\x94 stoop in a sentence to bend the body or a part of the body forward and downward sometimes simultaneously bending the kneesâ\x80¦ See the full definition',
146
+ "Definition of stoop written for English Language Learners from the Merriam-Webster Learner's Dictionary with audio pronunciations, usage examples, and count/noncount noun labels. Learner's Dictionary mobile search",
147
+ ]
148
+ embeddings = model.encode(sentences)
149
+ print(embeddings.shape)
150
+ # [3, 768]
151
+
152
+ # Get the similarity scores for the embeddings
153
+ similarities = model.similarity(embeddings, embeddings)
154
+ print(similarities.shape)
155
+ # [3, 3]
156
+ ```
157
+
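+ You can also use the embeddings for a small semantic search, ranking candidate passages against a query by cosine similarity. The snippet below is a minimal sketch built only on the `encode` and `similarity` methods shown above; the query and passages are illustrative examples.
+ 
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ model = SentenceTransformer("BlackBeenie/mdeberta-v3-base-msmarco-v3-bpr")
+ 
+ query = "is woolwich london safe"
+ corpus = [
+     "SE18 has four train stations Plumstead, Woolwich Arsenal and Woolwich Dockyard.",
+     "Car sales consultants often have compensation packages that include salary, commissions and bonuses.",
+     "A passport card is valid for travel to and from Canada, Mexico, the Caribbean and Bermuda.",
+ ]
+ 
+ # Encode the query and the candidate passages separately
+ query_embedding = model.encode([query])
+ corpus_embeddings = model.encode(corpus)
+ 
+ # Cosine similarity between the query and each passage, shape [1, len(corpus)]
+ scores = model.similarity(query_embedding, corpus_embeddings)
+ best = scores[0].argmax().item()
+ print(corpus[best])
+ ```
+ 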
158
+ <!--
159
+ ### Direct Usage (Transformers)
160
+
161
+ <details><summary>Click to see the direct usage in Transformers</summary>
162
+
163
+ </details>
164
+ -->
165
+
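+ As a reference point, the following is a minimal sketch of roughly equivalent usage with plain 🤗 Transformers, assuming the CLS pooling configured in `1_Pooling/config.json`; it is an illustration, not an official snippet for this model.
+ 
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+ 
+ tokenizer = AutoTokenizer.from_pretrained("BlackBeenie/mdeberta-v3-base-msmarco-v3-bpr")
+ model = AutoModel.from_pretrained("BlackBeenie/mdeberta-v3-base-msmarco-v3-bpr")
+ 
+ sentences = ["definition of stoop", "do potato chips have sugar"]
+ inputs = tokenizer(sentences, padding=True, truncation=True, max_length=1024, return_tensors="pt")
+ 
+ with torch.no_grad():
+     outputs = model(**inputs)
+ 
+ # CLS pooling: take the embedding of the first ([CLS]) token of each sequence
+ embeddings = outputs.last_hidden_state[:, 0]
+ print(embeddings.shape)  # torch.Size([2, 768])
+ ```
+ 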
166
+ <!--
167
+ ### Downstream Usage (Sentence Transformers)
168
+
169
+ You can finetune this model on your own dataset.
170
+
171
+ <details><summary>Click to expand</summary>
172
+
173
+ </details>
174
+ -->
175
+
176
+ <!--
177
+ ### Out-of-Scope Use
178
+
179
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
180
+ -->
181
+
182
+ <!--
183
+ ## Bias, Risks and Limitations
184
+
185
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
186
+ -->
187
+
188
+ <!--
189
+ ### Recommendations
190
+
191
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
192
+ -->
193
+
194
+ ## Training Details
195
+
196
+ ### Training Dataset
197
+
198
+ #### Unnamed Dataset
199
+
200
+
201
+ * Size: 498,970 training samples
202
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
203
+ * Approximate statistics based on the first 1000 samples:
204
+ | | sentence_0 | sentence_1 | sentence_2 |
205
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
206
+ | type | string | string | string |
207
+ | details | <ul><li>min: 4 tokens</li><li>mean: 10.61 tokens</li><li>max: 40 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 96.41 tokens</li><li>max: 259 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 92.21 tokens</li><li>max: 250 tokens</li></ul> |
208
+ * Samples:
209
+ | sentence_0 | sentence_1 | sentence_2 |
210
+ |:-------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
211
+ | <code>how much does it cost to paint a interior house</code> | <code>Interior House Painting Cost Factors. Generally, it will take a minimum of two gallons of paint to cover a room. At the highest end, paint will cost anywhere between $30 and $60 per gallon and come in three different finishes: flat, semi-gloss or high-gloss.Flat finishes are the least shiny and are best suited for areas requiring frequent cleaning.rovide a few details about your project and receive competitive quotes from local pros. The average national cost to paint a home interior is $1,671, with most homeowners spending between $966 and $2,426.</code> | <code>How Much to Charge to Paint the Interior of a House (and how much not to charge) Let me give you an example - stay with me here. Imagine you drop all of your painting estimates by 20% to win more jobs. Maybe you'll close $10,000 in sales instead of $6,000 (because you had a better price - you landed an extra job)...</code> |
212
+ | <code>when is s corp taxes due</code> | <code>If you form a corporate entity for your small business, regardless of whether it's taxed as a C or S corporation, a tax return must be filed with the Internal Revenue Service on its due date each year. Corporate tax returns are always due on the 15th day of the third month following the close of the tax year. The actual day that the tax return filing deadline falls on, however, isn't the same for every corporation.</code> | <code>In Summary. 1 S-corporations are pass-through entities. 2 Form 1120S is the form used for an S-corp’s annual tax return. 3 Shareholders do not have to pay self-employment tax on their share of an S-corp’s profits.</code> |
213
+ | <code>what are disaccharides</code> | <code>Disaccharides are formed when two monosaccharides are joined together and a molecule of water is removed, a process known as dehydration reaction. For example; milk sugar (lactose) is made from glucose and galactose whereas the sugar from sugar cane and sugar beets (sucrose) is made from glucose and fructose.altose, another notable disaccharide, is made up of two glucose molecules. The two monosaccharides are bonded via a dehydration reaction (also called a condensation reaction or dehydration synthesis) that leads to the loss of a molecule of water and formation of a glycosidic bond.</code> | <code>No. Sugars and starches are types of carbohydrates,(ex: monosaccharides, disaccharides) Lipids are much different.o. Sugars and starches are types of carbohydrates,(ex: monosaccharides, disaccharides) Lipids are much different.</code> |
214
+ * Loss: <code>beir.losses.bpr_loss.BPRLoss</code>
215
+
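+ Each row pairs a query (<code>sentence_0</code>) with a relevant passage (<code>sentence_1</code>) and a non-relevant passage (<code>sentence_2</code>). Purely as an illustration of this layout (not the actual data-loading code used for training), such triplets can be held in a 🤗 Datasets table:
+ 
+ ```python
+ from datasets import Dataset
+ 
+ # Illustrative (query, positive passage, negative passage) triplets;
+ # the real training set contains 498,970 rows in this format.
+ train_dataset = Dataset.from_dict({
+     "sentence_0": ["what are disaccharides"],
+     "sentence_1": ["Disaccharides are formed when two monosaccharides are joined together..."],
+     "sentence_2": ["No. Sugars and starches are types of carbohydrates..."],
+ })
+ print(train_dataset)  # Dataset with columns sentence_0, sentence_1, sentence_2
+ ```
+ 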
216
+ ### Training Hyperparameters
217
+ #### Non-Default Hyperparameters
218
+
219
+ - `eval_strategy`: steps
220
+ - `per_device_train_batch_size`: 32
221
+ - `per_device_eval_batch_size`: 32
222
+ - `num_train_epochs`: 15
223
+ - `fp16`: True
224
+ - `multi_dataset_batch_sampler`: round_robin
225
+
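+ The non-default values above map onto `SentenceTransformerTrainingArguments`. Below is a minimal sketch of how they could be expressed; the `output_dir` is a hypothetical placeholder and the remaining values are taken from the list above.
+ 
+ ```python
+ from sentence_transformers import SentenceTransformerTrainingArguments
+ 
+ args = SentenceTransformerTrainingArguments(
+     output_dir="output/mdeberta-v3-base-msmarco-v3-bpr",  # hypothetical output path
+     eval_strategy="steps",
+     per_device_train_batch_size=32,
+     per_device_eval_batch_size=32,
+     num_train_epochs=15,
+     fp16=True,
+     multi_dataset_batch_sampler="round_robin",
+ )
+ ```
+ 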
226
+ #### All Hyperparameters
227
+ <details><summary>Click to expand</summary>
228
+
229
+ - `overwrite_output_dir`: False
230
+ - `do_predict`: False
231
+ - `eval_strategy`: steps
232
+ - `prediction_loss_only`: True
233
+ - `per_device_train_batch_size`: 32
234
+ - `per_device_eval_batch_size`: 32
235
+ - `per_gpu_train_batch_size`: None
236
+ - `per_gpu_eval_batch_size`: None
237
+ - `gradient_accumulation_steps`: 1
238
+ - `eval_accumulation_steps`: None
239
+ - `torch_empty_cache_steps`: None
240
+ - `learning_rate`: 5e-05
241
+ - `weight_decay`: 0.0
242
+ - `adam_beta1`: 0.9
243
+ - `adam_beta2`: 0.999
244
+ - `adam_epsilon`: 1e-08
245
+ - `max_grad_norm`: 1
246
+ - `num_train_epochs`: 15
247
+ - `max_steps`: -1
248
+ - `lr_scheduler_type`: linear
249
+ - `lr_scheduler_kwargs`: {}
250
+ - `warmup_ratio`: 0.0
251
+ - `warmup_steps`: 0
252
+ - `log_level`: passive
253
+ - `log_level_replica`: warning
254
+ - `log_on_each_node`: True
255
+ - `logging_nan_inf_filter`: True
256
+ - `save_safetensors`: True
257
+ - `save_on_each_node`: False
258
+ - `save_only_model`: False
259
+ - `restore_callback_states_from_checkpoint`: False
260
+ - `no_cuda`: False
261
+ - `use_cpu`: False
262
+ - `use_mps_device`: False
263
+ - `seed`: 42
264
+ - `data_seed`: None
265
+ - `jit_mode_eval`: False
266
+ - `use_ipex`: False
267
+ - `bf16`: False
268
+ - `fp16`: True
269
+ - `fp16_opt_level`: O1
270
+ - `half_precision_backend`: auto
271
+ - `bf16_full_eval`: False
272
+ - `fp16_full_eval`: False
273
+ - `tf32`: None
274
+ - `local_rank`: 0
275
+ - `ddp_backend`: None
276
+ - `tpu_num_cores`: None
277
+ - `tpu_metrics_debug`: False
278
+ - `debug`: []
279
+ - `dataloader_drop_last`: False
280
+ - `dataloader_num_workers`: 0
281
+ - `dataloader_prefetch_factor`: None
282
+ - `past_index`: -1
283
+ - `disable_tqdm`: False
284
+ - `remove_unused_columns`: True
285
+ - `label_names`: None
286
+ - `load_best_model_at_end`: False
287
+ - `ignore_data_skip`: False
288
+ - `fsdp`: []
289
+ - `fsdp_min_num_params`: 0
290
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
291
+ - `fsdp_transformer_layer_cls_to_wrap`: None
292
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
293
+ - `deepspeed`: None
294
+ - `label_smoothing_factor`: 0.0
295
+ - `optim`: adamw_torch
296
+ - `optim_args`: None
297
+ - `adafactor`: False
298
+ - `group_by_length`: False
299
+ - `length_column_name`: length
300
+ - `ddp_find_unused_parameters`: None
301
+ - `ddp_bucket_cap_mb`: None
302
+ - `ddp_broadcast_buffers`: False
303
+ - `dataloader_pin_memory`: True
304
+ - `dataloader_persistent_workers`: False
305
+ - `skip_memory_metrics`: True
306
+ - `use_legacy_prediction_loop`: False
307
+ - `push_to_hub`: False
308
+ - `resume_from_checkpoint`: None
309
+ - `hub_model_id`: None
310
+ - `hub_strategy`: every_save
311
+ - `hub_private_repo`: False
312
+ - `hub_always_push`: False
313
+ - `gradient_checkpointing`: False
314
+ - `gradient_checkpointing_kwargs`: None
315
+ - `include_inputs_for_metrics`: False
316
+ - `eval_do_concat_batches`: True
317
+ - `fp16_backend`: auto
318
+ - `push_to_hub_model_id`: None
319
+ - `push_to_hub_organization`: None
320
+ - `mp_parameters`:
321
+ - `auto_find_batch_size`: False
322
+ - `full_determinism`: False
323
+ - `torchdynamo`: None
324
+ - `ray_scope`: last
325
+ - `ddp_timeout`: 1800
326
+ - `torch_compile`: False
327
+ - `torch_compile_backend`: None
328
+ - `torch_compile_mode`: None
329
+ - `dispatch_batches`: None
330
+ - `split_batches`: None
331
+ - `include_tokens_per_second`: False
332
+ - `include_num_input_tokens_seen`: False
333
+ - `neftune_noise_alpha`: None
334
+ - `optim_target_modules`: None
335
+ - `batch_eval_metrics`: False
336
+ - `eval_on_start`: False
337
+ - `eval_use_gather_object`: False
338
+ - `batch_sampler`: batch_sampler
339
+ - `multi_dataset_batch_sampler`: round_robin
340
+
341
+ </details>
342
+
343
+ ### Training Logs
344
+ <details><summary>Click to expand</summary>
345
+
346
+ | Epoch | Step | Training Loss |
347
+ |:-------:|:------:|:-------------:|
348
+ | 0.0321 | 500 | 7.0196 |
349
+ | 0.0641 | 1000 | 2.0193 |
350
+ | 0.0962 | 1500 | 1.4466 |
351
+ | 0.1283 | 2000 | 1.1986 |
352
+ | 0.1603 | 2500 | 1.0912 |
353
+ | 0.1924 | 3000 | 1.0179 |
354
+ | 0.2245 | 3500 | 0.9659 |
355
+ | 0.2565 | 4000 | 0.9229 |
356
+ | 0.2886 | 4500 | 0.9034 |
357
+ | 0.3207 | 5000 | 0.871 |
358
+ | 0.3527 | 5500 | 0.8474 |
359
+ | 0.3848 | 6000 | 0.8247 |
360
+ | 0.4169 | 6500 | 0.8377 |
361
+ | 0.4489 | 7000 | 0.8119 |
362
+ | 0.4810 | 7500 | 0.8042 |
363
+ | 0.5131 | 8000 | 0.7831 |
364
+ | 0.5451 | 8500 | 0.7667 |
365
+ | 0.5772 | 9000 | 0.7653 |
366
+ | 0.6092 | 9500 | 0.7502 |
367
+ | 0.6413 | 10000 | 0.7615 |
368
+ | 0.6734 | 10500 | 0.7435 |
369
+ | 0.7054 | 11000 | 0.7346 |
370
+ | 0.7375 | 11500 | 0.718 |
371
+ | 0.7696 | 12000 | 0.711 |
372
+ | 0.8016 | 12500 | 0.6963 |
373
+ | 0.8337 | 13000 | 0.6969 |
374
+ | 0.8658 | 13500 | 0.6937 |
375
+ | 0.8978 | 14000 | 0.6721 |
376
+ | 0.9299 | 14500 | 0.6902 |
377
+ | 0.9620 | 15000 | 0.6783 |
378
+ | 0.9940 | 15500 | 0.6669 |
379
+ | 1.0 | 15593 | - |
380
+ | 1.0261 | 16000 | 0.689 |
381
+ | 1.0582 | 16500 | 0.6549 |
382
+ | 1.0902 | 17000 | 0.6354 |
383
+ | 1.1223 | 17500 | 0.6013 |
384
+ | 1.1544 | 18000 | 0.6091 |
385
+ | 1.1864 | 18500 | 0.5907 |
386
+ | 1.2185 | 19000 | 0.5979 |
387
+ | 1.2506 | 19500 | 0.5724 |
388
+ | 1.2826 | 20000 | 0.5718 |
389
+ | 1.3147 | 20500 | 0.5851 |
390
+ | 1.3468 | 21000 | 0.5716 |
391
+ | 1.3788 | 21500 | 0.5568 |
392
+ | 1.4109 | 22000 | 0.5502 |
393
+ | 1.4430 | 22500 | 0.5591 |
394
+ | 1.4750 | 23000 | 0.5688 |
395
+ | 1.5071 | 23500 | 0.5484 |
396
+ | 1.5392 | 24000 | 0.531 |
397
+ | 1.5712 | 24500 | 0.5445 |
398
+ | 1.6033 | 25000 | 0.5269 |
399
+ | 1.6353 | 25500 | 0.55 |
400
+ | 1.6674 | 26000 | 0.537 |
401
+ | 1.6995 | 26500 | 0.5259 |
402
+ | 1.7315 | 27000 | 0.5153 |
403
+ | 1.7636 | 27500 | 0.5184 |
404
+ | 1.7957 | 28000 | 0.5154 |
405
+ | 1.8277 | 28500 | 0.5279 |
406
+ | 1.8598 | 29000 | 0.5267 |
407
+ | 1.8919 | 29500 | 0.4938 |
408
+ | 1.9239 | 30000 | 0.5088 |
409
+ | 1.9560 | 30500 | 0.516 |
410
+ | 1.9881 | 31000 | 0.4998 |
411
+ | 2.0 | 31186 | - |
412
+ | 2.0201 | 31500 | 0.5252 |
413
+ | 2.0522 | 32000 | 0.4998 |
414
+ | 2.0843 | 32500 | 0.484 |
415
+ | 2.1163 | 33000 | 0.4612 |
416
+ | 2.1484 | 33500 | 0.4617 |
417
+ | 2.1805 | 34000 | 0.4441 |
418
+ | 2.2125 | 34500 | 0.4653 |
419
+ | 2.2446 | 35000 | 0.4592 |
420
+ | 2.2767 | 35500 | 0.4347 |
421
+ | 2.3087 | 36000 | 0.4557 |
422
+ | 2.3408 | 36500 | 0.4401 |
423
+ | 2.3729 | 37000 | 0.436 |
424
+ | 2.4049 | 37500 | 0.4315 |
425
+ | 2.4370 | 38000 | 0.4447 |
426
+ | 2.4691 | 38500 | 0.4258 |
427
+ | 2.5011 | 39000 | 0.4275 |
428
+ | 2.5332 | 39500 | 0.4142 |
429
+ | 2.5653 | 40000 | 0.434 |
430
+ | 2.5973 | 40500 | 0.4222 |
431
+ | 2.6294 | 41000 | 0.4284 |
432
+ | 2.6615 | 41500 | 0.4187 |
433
+ | 2.6935 | 42000 | 0.4156 |
434
+ | 2.7256 | 42500 | 0.4054 |
435
+ | 2.7576 | 43000 | 0.4182 |
436
+ | 2.7897 | 43500 | 0.4142 |
437
+ | 2.8218 | 44000 | 0.4152 |
438
+ | 2.8538 | 44500 | 0.421 |
439
+ | 2.8859 | 45000 | 0.403 |
440
+ | 2.9180 | 45500 | 0.4003 |
441
+ | 2.9500 | 46000 | 0.4032 |
442
+ | 2.9821 | 46500 | 0.4072 |
443
+ | 3.0 | 46779 | - |
444
+ | 3.0142 | 47000 | 0.4137 |
445
+ | 3.0462 | 47500 | 0.4151 |
446
+ | 3.0783 | 48000 | 0.3959 |
447
+ | 3.1104 | 48500 | 0.3808 |
448
+ | 3.1424 | 49000 | 0.3701 |
449
+ | 3.1745 | 49500 | 0.3716 |
450
+ | 3.2066 | 50000 | 0.387 |
451
+ | 3.2386 | 50500 | 0.3747 |
452
+ | 3.2707 | 51000 | 0.3488 |
453
+ | 3.3028 | 51500 | 0.3795 |
454
+ | 3.3348 | 52000 | 0.3511 |
455
+ | 3.3669 | 52500 | 0.3469 |
456
+ | 3.3990 | 53000 | 0.3475 |
457
+ | 3.4310 | 53500 | 0.3669 |
458
+ | 3.4631 | 54000 | 0.3428 |
459
+ | 3.4952 | 54500 | 0.3597 |
460
+ | 3.5272 | 55000 | 0.3525 |
461
+ | 3.5593 | 55500 | 0.3502 |
462
+ | 3.5914 | 56000 | 0.3446 |
463
+ | 3.6234 | 56500 | 0.3563 |
464
+ | 3.6555 | 57000 | 0.34 |
465
+ | 3.6876 | 57500 | 0.3385 |
466
+ | 3.7196 | 58000 | 0.335 |
467
+ | 3.7517 | 58500 | 0.3344 |
468
+ | 3.7837 | 59000 | 0.3361 |
469
+ | 3.8158 | 59500 | 0.3285 |
470
+ | 3.8479 | 60000 | 0.3429 |
471
+ | 3.8799 | 60500 | 0.3162 |
472
+ | 3.9120 | 61000 | 0.3279 |
473
+ | 3.9441 | 61500 | 0.3448 |
474
+ | 3.9761 | 62000 | 0.322 |
475
+ | 4.0 | 62372 | - |
476
+ | 4.0082 | 62500 | 0.3356 |
477
+ | 4.0403 | 63000 | 0.3416 |
478
+ | 4.0723 | 63500 | 0.3195 |
479
+ | 4.1044 | 64000 | 0.3033 |
480
+ | 4.1365 | 64500 | 0.2957 |
481
+ | 4.1685 | 65000 | 0.312 |
482
+ | 4.2006 | 65500 | 0.3135 |
483
+ | 4.2327 | 66000 | 0.3193 |
484
+ | 4.2647 | 66500 | 0.2919 |
485
+ | 4.2968 | 67000 | 0.3078 |
486
+ | 4.3289 | 67500 | 0.302 |
487
+ | 4.3609 | 68000 | 0.2973 |
488
+ | 4.3930 | 68500 | 0.2725 |
489
+ | 4.4251 | 69000 | 0.3013 |
490
+ | 4.4571 | 69500 | 0.2936 |
491
+ | 4.4892 | 70000 | 0.3009 |
492
+ | 4.5213 | 70500 | 0.2941 |
493
+ | 4.5533 | 71000 | 0.2957 |
494
+ | 4.5854 | 71500 | 0.288 |
495
+ | 4.6175 | 72000 | 0.3032 |
496
+ | 4.6495 | 72500 | 0.2919 |
497
+ | 4.6816 | 73000 | 0.2843 |
498
+ | 4.7137 | 73500 | 0.2862 |
499
+ | 4.7457 | 74000 | 0.2789 |
500
+ | 4.7778 | 74500 | 0.2843 |
501
+ | 4.8099 | 75000 | 0.2816 |
502
+ | 4.8419 | 75500 | 0.2813 |
503
+ | 4.8740 | 76000 | 0.2839 |
504
+ | 4.9060 | 76500 | 0.2619 |
505
+ | 4.9381 | 77000 | 0.2877 |
506
+ | 4.9702 | 77500 | 0.2693 |
507
+ | 5.0 | 77965 | - |
508
+ | 5.0022 | 78000 | 0.2738 |
509
+ | 5.0343 | 78500 | 0.286 |
510
+ | 5.0664 | 79000 | 0.2754 |
511
+ | 5.0984 | 79500 | 0.2561 |
512
+ | 5.1305 | 80000 | 0.2498 |
513
+ | 5.1626 | 80500 | 0.2563 |
514
+ | 5.1946 | 81000 | 0.2618 |
515
+ | 5.2267 | 81500 | 0.265 |
516
+ | 5.2588 | 82000 | 0.245 |
517
+ | 5.2908 | 82500 | 0.2551 |
518
+ | 5.3229 | 83000 | 0.2653 |
519
+ | 5.3550 | 83500 | 0.2453 |
520
+ | 5.3870 | 84000 | 0.24 |
521
+ | 5.4191 | 84500 | 0.2478 |
522
+ | 5.4512 | 85000 | 0.2444 |
523
+ | 5.4832 | 85500 | 0.2464 |
524
+ | 5.5153 | 86000 | 0.2327 |
525
+ | 5.5474 | 86500 | 0.2376 |
526
+ | 5.5794 | 87000 | 0.2469 |
527
+ | 5.6115 | 87500 | 0.2488 |
528
+ | 5.6436 | 88000 | 0.2467 |
529
+ | 5.6756 | 88500 | 0.2409 |
530
+ | 5.7077 | 89000 | 0.2287 |
531
+ | 5.7398 | 89500 | 0.2288 |
532
+ | 5.7718 | 90000 | 0.2399 |
533
+ | 5.8039 | 90500 | 0.2341 |
534
+ | 5.8360 | 91000 | 0.2352 |
535
+ | 5.8680 | 91500 | 0.2196 |
536
+ | 5.9001 | 92000 | 0.2196 |
537
+ | 5.9321 | 92500 | 0.2246 |
538
+ | 5.9642 | 93000 | 0.2411 |
539
+ | 5.9963 | 93500 | 0.2279 |
540
+ | 6.0 | 93558 | - |
541
+ | 6.0283 | 94000 | 0.2489 |
542
+ | 6.0604 | 94500 | 0.2339 |
543
+ | 6.0925 | 95000 | 0.224 |
544
+ | 6.1245 | 95500 | 0.209 |
545
+ | 6.1566 | 96000 | 0.2262 |
546
+ | 6.1887 | 96500 | 0.2221 |
547
+ | 6.2207 | 97000 | 0.214 |
548
+ | 6.2528 | 97500 | 0.21 |
549
+ | 6.2849 | 98000 | 0.2072 |
550
+ | 6.3169 | 98500 | 0.2204 |
551
+ | 6.3490 | 99000 | 0.2041 |
552
+ | 6.3811 | 99500 | 0.2067 |
553
+ | 6.4131 | 100000 | 0.2102 |
554
+ | 6.4452 | 100500 | 0.2031 |
555
+ | 6.4773 | 101000 | 0.2107 |
556
+ | 6.5093 | 101500 | 0.2009 |
557
+ | 6.5414 | 102000 | 0.2057 |
558
+ | 6.5735 | 102500 | 0.1979 |
559
+ | 6.6055 | 103000 | 0.1994 |
560
+ | 6.6376 | 103500 | 0.2065 |
561
+ | 6.6697 | 104000 | 0.1958 |
562
+ | 6.7017 | 104500 | 0.2074 |
563
+ | 6.7338 | 105000 | 0.1941 |
564
+ | 6.7659 | 105500 | 0.2035 |
565
+ | 6.7979 | 106000 | 0.2003 |
566
+ | 6.8300 | 106500 | 0.2083 |
567
+ | 6.8621 | 107000 | 0.1921 |
568
+ | 6.8941 | 107500 | 0.1893 |
569
+ | 6.9262 | 108000 | 0.2014 |
570
+ | 6.9583 | 108500 | 0.192 |
571
+ | 6.9903 | 109000 | 0.1921 |
572
+ | 7.0 | 109151 | - |
573
+ | 7.0224 | 109500 | 0.2141 |
574
+ | 7.0544 | 110000 | 0.1868 |
575
+ | 7.0865 | 110500 | 0.1815 |
576
+ | 7.1186 | 111000 | 0.1793 |
577
+ | 7.1506 | 111500 | 0.1812 |
578
+ | 7.1827 | 112000 | 0.1853 |
579
+ | 7.2148 | 112500 | 0.1922 |
580
+ | 7.2468 | 113000 | 0.179 |
581
+ | 7.2789 | 113500 | 0.1707 |
582
+ | 7.3110 | 114000 | 0.1829 |
583
+ | 7.3430 | 114500 | 0.1743 |
584
+ | 7.3751 | 115000 | 0.1787 |
585
+ | 7.4072 | 115500 | 0.1815 |
586
+ | 7.4392 | 116000 | 0.1776 |
587
+ | 7.4713 | 116500 | 0.1773 |
588
+ | 7.5034 | 117000 | 0.1753 |
589
+ | 7.5354 | 117500 | 0.1816 |
590
+ | 7.5675 | 118000 | 0.1795 |
591
+ | 7.5996 | 118500 | 0.178 |
592
+ | 7.6316 | 119000 | 0.177 |
593
+ | 7.6637 | 119500 | 0.175 |
594
+ | 7.6958 | 120000 | 0.1701 |
595
+ | 7.7278 | 120500 | 0.1686 |
596
+ | 7.7599 | 121000 | 0.1727 |
597
+ | 7.7920 | 121500 | 0.1733 |
598
+ | 7.8240 | 122000 | 0.1707 |
599
+ | 7.8561 | 122500 | 0.1729 |
600
+ | 7.8882 | 123000 | 0.1569 |
601
+ | 7.9202 | 123500 | 0.1657 |
602
+ | 7.9523 | 124000 | 0.1773 |
603
+ | 7.9844 | 124500 | 0.1625 |
604
+ | 8.0 | 124744 | - |
605
+ | 8.0164 | 125000 | 0.1824 |
606
+ | 8.0485 | 125500 | 0.1852 |
607
+ | 8.0805 | 126000 | 0.1701 |
608
+ | 8.1126 | 126500 | 0.1573 |
609
+ | 8.1447 | 127000 | 0.1614 |
610
+ | 8.1767 | 127500 | 0.1624 |
611
+ | 8.2088 | 128000 | 0.1575 |
612
+ | 8.2409 | 128500 | 0.1481 |
613
+ | 8.2729 | 129000 | 0.1537 |
614
+ | 8.3050 | 129500 | 0.1616 |
615
+ | 8.3371 | 130000 | 0.1544 |
616
+ | 8.3691 | 130500 | 0.1511 |
617
+ | 8.4012 | 131000 | 0.1569 |
618
+ | 8.4333 | 131500 | 0.1535 |
619
+ | 8.4653 | 132000 | 0.1489 |
620
+ | 8.4974 | 132500 | 0.1593 |
621
+ | 8.5295 | 133000 | 0.1552 |
622
+ | 8.5615 | 133500 | 0.1578 |
623
+ | 8.5936 | 134000 | 0.1501 |
624
+ | 8.6257 | 134500 | 0.156 |
625
+ | 8.6577 | 135000 | 0.1455 |
626
+ | 8.6898 | 135500 | 0.1524 |
627
+ | 8.7219 | 136000 | 0.1344 |
628
+ | 8.7539 | 136500 | 0.1513 |
629
+ | 8.7860 | 137000 | 0.141 |
630
+ | 8.8181 | 137500 | 0.1518 |
631
+ | 8.8501 | 138000 | 0.1468 |
632
+ | 8.8822 | 138500 | 0.1416 |
633
+ | 8.9143 | 139000 | 0.1434 |
634
+ | 8.9463 | 139500 | 0.1495 |
635
+ | 8.9784 | 140000 | 0.1364 |
636
+ | 9.0 | 140337 | - |
637
+ | 9.0105 | 140500 | 0.1507 |
638
+ | 9.0425 | 141000 | 0.1496 |
639
+ | 9.0746 | 141500 | 0.1475 |
640
+ | 9.1067 | 142000 | 0.1348 |
641
+ | 9.1387 | 142500 | 0.1282 |
642
+ | 9.1708 | 143000 | 0.1362 |
643
+ | 9.2028 | 143500 | 0.1364 |
644
+ | 9.2349 | 144000 | 0.1385 |
645
+ | 9.2670 | 144500 | 0.1309 |
646
+ | 9.2990 | 145000 | 0.1324 |
647
+ | 9.3311 | 145500 | 0.1354 |
648
+ | 9.3632 | 146000 | 0.1283 |
649
+ | 9.3952 | 146500 | 0.1239 |
650
+ | 9.4273 | 147000 | 0.126 |
651
+ | 9.4594 | 147500 | 0.1232 |
652
+ | 9.4914 | 148000 | 0.1269 |
653
+ | 9.5235 | 148500 | 0.1269 |
654
+ | 9.5556 | 149000 | 0.1299 |
655
+ | 9.5876 | 149500 | 0.1367 |
656
+ | 9.6197 | 150000 | 0.1354 |
657
+ | 9.6518 | 150500 | 0.1239 |
658
+ | 9.6838 | 151000 | 0.1311 |
659
+ | 9.7159 | 151500 | 0.1235 |
660
+ | 9.7480 | 152000 | 0.129 |
661
+ | 9.7800 | 152500 | 0.1244 |
662
+ | 9.8121 | 153000 | 0.1201 |
663
+ | 9.8442 | 153500 | 0.1332 |
664
+ | 9.8762 | 154000 | 0.1189 |
665
+ | 9.9083 | 154500 | 0.1221 |
666
+ | 9.9404 | 155000 | 0.1228 |
667
+ | 9.9724 | 155500 | 0.1173 |
668
+ | 10.0 | 155930 | - |
669
+ | 10.0045 | 156000 | 0.1347 |
670
+ | 10.0366 | 156500 | 0.1384 |
671
+ | 10.0686 | 157000 | 0.1402 |
672
+ | 10.1007 | 157500 | 0.1161 |
673
+ | 10.1328 | 158000 | 0.1141 |
674
+ | 10.1648 | 158500 | 0.1199 |
675
+ | 10.1969 | 159000 | 0.1328 |
676
+ | 10.2289 | 159500 | 0.1263 |
677
+ | 10.2610 | 160000 | 0.1143 |
678
+ | 10.2931 | 160500 | 0.1207 |
679
+ | 10.3251 | 161000 | 0.1119 |
680
+ | 10.3572 | 161500 | 0.114 |
681
+ | 10.3893 | 162000 | 0.114 |
682
+ | 10.4213 | 162500 | 0.1118 |
683
+ | 10.4534 | 163000 | 0.1228 |
684
+ | 10.4855 | 163500 | 0.1209 |
685
+ | 10.5175 | 164000 | 0.1153 |
686
+ | 10.5496 | 164500 | 0.118 |
687
+ | 10.5817 | 165000 | 0.1118 |
688
+ | 10.6137 | 165500 | 0.1206 |
689
+ | 10.6458 | 166000 | 0.1108 |
690
+ | 10.6779 | 166500 | 0.1084 |
691
+ | 10.7099 | 167000 | 0.1127 |
692
+ | 10.7420 | 167500 | 0.1001 |
693
+ | 10.7741 | 168000 | 0.1073 |
694
+ | 10.8061 | 168500 | 0.1174 |
695
+ | 10.8382 | 169000 | 0.1143 |
696
+ | 10.8703 | 169500 | 0.1158 |
697
+ | 10.9023 | 170000 | 0.1099 |
698
+ | 10.9344 | 170500 | 0.0998 |
699
+ | 10.9665 | 171000 | 0.1009 |
700
+ | 10.9985 | 171500 | 0.1167 |
701
+ | 11.0 | 171523 | - |
702
+ | 11.0306 | 172000 | 0.1161 |
703
+ | 11.0627 | 172500 | 0.1126 |
704
+ | 11.0947 | 173000 | 0.1046 |
705
+ | 11.1268 | 173500 | 0.1054 |
706
+ | 11.1589 | 174000 | 0.1063 |
707
+ | 11.1909 | 174500 | 0.1136 |
708
+ | 11.2230 | 175000 | 0.108 |
709
+ | 11.2551 | 175500 | 0.1014 |
710
+ | 11.2871 | 176000 | 0.1036 |
711
+ | 11.3192 | 176500 | 0.1043 |
712
+ | 11.3512 | 177000 | 0.0973 |
713
+ | 11.3833 | 177500 | 0.0934 |
714
+ | 11.4154 | 178000 | 0.095 |
715
+ | 11.4474 | 178500 | 0.1032 |
716
+ | 11.4795 | 179000 | 0.1089 |
717
+ | 11.5116 | 179500 | 0.098 |
718
+ | 11.5436 | 180000 | 0.099 |
719
+ | 11.5757 | 180500 | 0.1007 |
720
+ | 11.6078 | 181000 | 0.096 |
721
+ | 11.6398 | 181500 | 0.0986 |
722
+ | 11.6719 | 182000 | 0.1033 |
723
+ | 11.7040 | 182500 | 0.0899 |
724
+ | 11.7360 | 183000 | 0.0946 |
725
+ | 11.7681 | 183500 | 0.0943 |
726
+ | 11.8002 | 184000 | 0.0954 |
727
+ | 11.8322 | 184500 | 0.0955 |
728
+ | 11.8643 | 185000 | 0.0924 |
729
+ | 11.8964 | 185500 | 0.0847 |
730
+ | 11.9284 | 186000 | 0.0914 |
731
+ | 11.9605 | 186500 | 0.0918 |
732
+ | 11.9926 | 187000 | 0.099 |
733
+ | 12.0 | 187116 | - |
734
+ | 12.0246 | 187500 | 0.1029 |
735
+ | 12.0567 | 188000 | 0.1032 |
736
+ | 12.0888 | 188500 | 0.0864 |
737
+ | 12.1208 | 189000 | 0.0921 |
738
+ | 12.1529 | 189500 | 0.0959 |
739
+ | 12.1850 | 190000 | 0.0846 |
740
+ | 12.2170 | 190500 | 0.0924 |
741
+ | 12.2491 | 191000 | 0.0897 |
742
+ | 12.2812 | 191500 | 0.0858 |
743
+ | 12.3132 | 192000 | 0.0851 |
744
+ | 12.3453 | 192500 | 0.0925 |
745
+ | 12.3773 | 193000 | 0.0963 |
746
+ | 12.4094 | 193500 | 0.0867 |
747
+ | 12.4415 | 194000 | 0.0929 |
748
+ | 12.4735 | 194500 | 0.0904 |
749
+ | 12.5056 | 195000 | 0.0854 |
750
+ | 12.5377 | 195500 | 0.0876 |
751
+ | 12.5697 | 196000 | 0.0899 |
752
+ | 12.6018 | 196500 | 0.09 |
753
+ | 12.6339 | 197000 | 0.0921 |
754
+ | 12.6659 | 197500 | 0.0829 |
755
+ | 12.6980 | 198000 | 0.0952 |
756
+ | 12.7301 | 198500 | 0.087 |
757
+ | 12.7621 | 199000 | 0.086 |
758
+ | 12.7942 | 199500 | 0.0836 |
759
+ | 12.8263 | 200000 | 0.0845 |
760
+ | 12.8583 | 200500 | 0.0808 |
761
+ | 12.8904 | 201000 | 0.0771 |
762
+ | 12.9225 | 201500 | 0.0815 |
763
+ | 12.9545 | 202000 | 0.0901 |
764
+ | 12.9866 | 202500 | 0.0871 |
765
+ | 13.0 | 202709 | - |
766
+ | 13.0187 | 203000 | 0.088 |
767
+ | 13.0507 | 203500 | 0.089 |
768
+ | 13.0828 | 204000 | 0.081 |
769
+ | 13.1149 | 204500 | 0.0739 |
770
+ | 13.1469 | 205000 | 0.0825 |
771
+ | 13.1790 | 205500 | 0.0855 |
772
+ | 13.2111 | 206000 | 0.0788 |
773
+ | 13.2431 | 206500 | 0.0769 |
774
+ | 13.2752 | 207000 | 0.0706 |
775
+ | 13.3073 | 207500 | 0.0821 |
776
+ | 13.3393 | 208000 | 0.0752 |
777
+ | 13.3714 | 208500 | 0.0746 |
778
+ | 13.4035 | 209000 | 0.066 |
779
+ | 13.4355 | 209500 | 0.0779 |
780
+ | 13.4676 | 210000 | 0.0755 |
781
+ | 13.4996 | 210500 | 0.0829 |
782
+ | 13.5317 | 211000 | 0.0731 |
783
+ | 13.5638 | 211500 | 0.086 |
784
+ | 13.5958 | 212000 | 0.078 |
785
+ | 13.6279 | 212500 | 0.0724 |
786
+ | 13.6600 | 213000 | 0.0696 |
787
+ | 13.6920 | 213500 | 0.0789 |
788
+ | 13.7241 | 214000 | 0.0657 |
789
+ | 13.7562 | 214500 | 0.0767 |
790
+ | 13.7882 | 215000 | 0.0728 |
791
+ | 13.8203 | 215500 | 0.071 |
792
+ | 13.8524 | 216000 | 0.0733 |
793
+ | 13.8844 | 216500 | 0.0621 |
794
+ | 13.9165 | 217000 | 0.0677 |
795
+ | 13.9486 | 217500 | 0.0761 |
796
+ | 13.9806 | 218000 | 0.0669 |
797
+ | 14.0 | 218302 | - |
798
+ | 14.0127 | 218500 | 0.0848 |
799
+ | 14.0448 | 219000 | 0.0647 |
800
+ | 14.0768 | 219500 | 0.0717 |
801
+ | 14.1089 | 220000 | 0.0653 |
802
+ | 14.1410 | 220500 | 0.0615 |
803
+ | 14.1730 | 221000 | 0.0711 |
804
+ | 14.2051 | 221500 | 0.0674 |
805
+ | 14.2372 | 222000 | 0.0674 |
806
+ | 14.2692 | 222500 | 0.0657 |
807
+ | 14.3013 | 223000 | 0.0727 |
808
+ | 14.3334 | 223500 | 0.0709 |
809
+ | 14.3654 | 224000 | 0.061 |
810
+ | 14.3975 | 224500 | 0.0638 |
811
+ | 14.4296 | 225000 | 0.0704 |
812
+ | 14.4616 | 225500 | 0.0623 |
813
+ | 14.4937 | 226000 | 0.065 |
814
+ | 14.5257 | 226500 | 0.0657 |
815
+ | 14.5578 | 227000 | 0.0634 |
816
+ | 14.5899 | 227500 | 0.0555 |
817
+ | 14.6219 | 228000 | 0.0647 |
818
+ | 14.6540 | 228500 | 0.0616 |
819
+ | 14.6861 | 229000 | 0.0645 |
820
+ | 14.7181 | 229500 | 0.0649 |
821
+ | 14.7502 | 230000 | 0.0612 |
822
+ | 14.7823 | 230500 | 0.0646 |
823
+ | 14.8143 | 231000 | 0.0571 |
824
+ | 14.8464 | 231500 | 0.0561 |
825
+ | 14.8785 | 232000 | 0.0598 |
826
+ | 14.9105 | 232500 | 0.0634 |
827
+ | 14.9426 | 233000 | 0.0657 |
828
+ | 14.9747 | 233500 | 0.0644 |
829
+ | 15.0 | 233895 | - |
830
+
831
+ </details>
832
+
833
+ ### Framework Versions
834
+ - Python: 3.10.12
835
+ - Sentence Transformers: 3.1.1
836
+ - Transformers: 4.44.2
837
+ - PyTorch: 2.4.1+cu121
838
+ - Accelerate: 0.34.2
839
+ - Datasets: 3.0.0
840
+ - Tokenizers: 0.19.1
841
+
842
+ ## Citation
843
+
844
+ ### BibTeX
845
+
846
+ #### Sentence Transformers
847
+ ```bibtex
848
+ @inproceedings{reimers-2019-sentence-bert,
849
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
850
+ author = "Reimers, Nils and Gurevych, Iryna",
851
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
852
+ month = "11",
853
+ year = "2019",
854
+ publisher = "Association for Computational Linguistics",
855
+ url = "https://arxiv.org/abs/1908.10084",
856
+ }
857
+ ```
858
+
859
+ <!--
860
+ ## Glossary
861
+
862
+ *Clearly define terms in order to be accessible across audiences.*
863
+ -->
864
+
865
+ <!--
866
+ ## Model Card Authors
867
+
868
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
869
+ -->
870
+
871
+ <!--
872
+ ## Model Card Contact
873
+
874
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
875
+ -->
added_tokens.json ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "[MASK]": 250101
3
+ }
config.json ADDED
@@ -0,0 +1,35 @@
1
+ {
2
+ "_name_or_path": "microsoft/mdeberta-v3-base",
3
+ "architectures": [
4
+ "DebertaV2Model"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "hidden_act": "gelu",
8
+ "hidden_dropout_prob": 0.1,
9
+ "hidden_size": 768,
10
+ "initializer_range": 0.02,
11
+ "intermediate_size": 3072,
12
+ "layer_norm_eps": 1e-07,
13
+ "max_position_embeddings": 512,
14
+ "max_relative_positions": -1,
15
+ "model_type": "deberta-v2",
16
+ "norm_rel_ebd": "layer_norm",
17
+ "num_attention_heads": 12,
18
+ "num_hidden_layers": 12,
19
+ "pad_token_id": 0,
20
+ "pooler_dropout": 0,
21
+ "pooler_hidden_act": "gelu",
22
+ "pooler_hidden_size": 768,
23
+ "pos_att_type": [
24
+ "p2c",
25
+ "c2p"
26
+ ],
27
+ "position_biased_input": false,
28
+ "position_buckets": 256,
29
+ "relative_attention": true,
30
+ "share_att_key": true,
31
+ "torch_dtype": "float32",
32
+ "transformers_version": "4.44.2",
33
+ "type_vocab_size": 0,
34
+ "vocab_size": 251000
35
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.1.1",
4
+ "transformers": "4.44.2",
5
+ "pytorch": "2.4.1+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": null
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:15368399cea2b697ce413b9aba01fb25300a494eec78176a481c2f9cba390336
3
+ size 1112897768
modules.json ADDED
@@ -0,0 +1,14 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 1024,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
1
+ {
2
+ "bos_token": "[CLS]",
3
+ "cls_token": "[CLS]",
4
+ "eos_token": "[SEP]",
5
+ "mask_token": "[MASK]",
6
+ "pad_token": "[PAD]",
7
+ "sep_token": "[SEP]",
8
+ "unk_token": {
9
+ "content": "[UNK]",
10
+ "lstrip": false,
11
+ "normalized": true,
12
+ "rstrip": false,
13
+ "single_word": false
14
+ }
15
+ }
spm.model ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:13c8d666d62a7bc4ac8f040aab68e942c861f93303156cc28f5c7e885d86d6e3
3
+ size 4305025
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2826b21e853d3a67db3218270390f0e6af4b2cf7a953c4a8fe72ab347a7bfa62
3
+ size 16351018
tokenizer_config.json ADDED
@@ -0,0 +1,858 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "[CLS]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "[SEP]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "3": {
28
+ "content": "[UNK]",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "250001": {
36
+ "content": "▁<extra_id_99>",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "250002": {
44
+ "content": "▁<extra_id_98>",
45
+ "lstrip": false,
46
+ "normalized": false,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "250003": {
52
+ "content": "▁<extra_id_97>",
53
+ "lstrip": false,
54
+ "normalized": false,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "250004": {
60
+ "content": "▁<extra_id_96>",
61
+ "lstrip": false,
62
+ "normalized": false,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "250005": {
68
+ "content": "▁<extra_id_95>",
69
+ "lstrip": false,
70
+ "normalized": false,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "250006": {
76
+ "content": "▁<extra_id_94>",
77
+ "lstrip": false,
78
+ "normalized": false,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "250007": {
84
+ "content": "▁<extra_id_93>",
85
+ "lstrip": false,
86
+ "normalized": false,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "250008": {
92
+ "content": "▁<extra_id_92>",
93
+ "lstrip": false,
94
+ "normalized": false,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "250009": {
100
+ "content": "▁<extra_id_91>",
101
+ "lstrip": false,
102
+ "normalized": false,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "250010": {
108
+ "content": "▁<extra_id_90>",
109
+ "lstrip": false,
110
+ "normalized": false,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "250011": {
116
+ "content": "▁<extra_id_89>",
117
+ "lstrip": false,
118
+ "normalized": false,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "250012": {
124
+ "content": "▁<extra_id_88>",
125
+ "lstrip": false,
126
+ "normalized": false,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "250013": {
132
+ "content": "▁<extra_id_87>",
133
+ "lstrip": false,
134
+ "normalized": false,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "250014": {
140
+ "content": "▁<extra_id_86>",
141
+ "lstrip": false,
142
+ "normalized": false,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "250015": {
148
+ "content": "▁<extra_id_85>",
149
+ "lstrip": false,
150
+ "normalized": false,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "250016": {
156
+ "content": "▁<extra_id_84>",
157
+ "lstrip": false,
158
+ "normalized": false,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "250017": {
164
+ "content": "▁<extra_id_83>",
165
+ "lstrip": false,
166
+ "normalized": false,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "250018": {
172
+ "content": "▁<extra_id_82>",
173
+ "lstrip": false,
174
+ "normalized": false,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "250019": {
180
+ "content": "▁<extra_id_81>",
181
+ "lstrip": false,
182
+ "normalized": false,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "250020": {
188
+ "content": "▁<extra_id_80>",
189
+ "lstrip": false,
190
+ "normalized": false,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "250021": {
196
+ "content": "▁<extra_id_79>",
197
+ "lstrip": false,
198
+ "normalized": false,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "250022": {
204
+ "content": "▁<extra_id_78>",
205
+ "lstrip": false,
206
+ "normalized": false,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "250023": {
212
+ "content": "▁<extra_id_77>",
213
+ "lstrip": false,
214
+ "normalized": false,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "250024": {
220
+ "content": "▁<extra_id_76>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": false
226
+ },
227
+ "250025": {
228
+ "content": "▁<extra_id_75>",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": false
234
+ },
235
+ "250026": {
236
+ "content": "▁<extra_id_74>",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": false
242
+ },
243
+ "250027": {
244
+ "content": "▁<extra_id_73>",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": false
250
+ },
251
+ "250028": {
252
+ "content": "▁<extra_id_72>",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": false
258
+ },
259
+ "250029": {
260
+ "content": "▁<extra_id_71>",
261
+ "lstrip": false,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": false
266
+ },
267
+ "250030": {
268
+ "content": "▁<extra_id_70>",
269
+ "lstrip": false,
270
+ "normalized": false,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "250031": {
276
+ "content": "▁<extra_id_69>",
277
+ "lstrip": false,
278
+ "normalized": false,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "250032": {
284
+ "content": "▁<extra_id_68>",
285
+ "lstrip": false,
286
+ "normalized": false,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "250033": {
292
+ "content": "▁<extra_id_67>",
293
+ "lstrip": false,
294
+ "normalized": false,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "250034": {
300
+ "content": "▁<extra_id_66>",
301
+ "lstrip": false,
302
+ "normalized": false,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "250035": {
308
+ "content": "▁<extra_id_65>",
309
+ "lstrip": false,
310
+ "normalized": false,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "250036": {
316
+ "content": "▁<extra_id_64>",
317
+ "lstrip": false,
318
+ "normalized": false,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "250037": {
324
+ "content": "▁<extra_id_63>",
325
+ "lstrip": false,
326
+ "normalized": false,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "250038": {
332
+ "content": "▁<extra_id_62>",
333
+ "lstrip": false,
334
+ "normalized": false,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "250039": {
340
+ "content": "▁<extra_id_61>",
341
+ "lstrip": false,
342
+ "normalized": false,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "250040": {
348
+ "content": "▁<extra_id_60>",
349
+ "lstrip": false,
350
+ "normalized": false,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "250041": {
356
+ "content": "▁<extra_id_59>",
357
+ "lstrip": false,
358
+ "normalized": false,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "250042": {
364
+ "content": "▁<extra_id_58>",
365
+ "lstrip": false,
366
+ "normalized": false,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "250043": {
372
+ "content": "▁<extra_id_57>",
373
+ "lstrip": false,
374
+ "normalized": false,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "250044": {
380
+ "content": "▁<extra_id_56>",
381
+ "lstrip": false,
382
+ "normalized": false,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "250045": {
388
+ "content": "▁<extra_id_55>",
389
+ "lstrip": false,
390
+ "normalized": false,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "250046": {
396
+ "content": "▁<extra_id_54>",
397
+ "lstrip": false,
398
+ "normalized": false,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "250047": {
404
+ "content": "▁<extra_id_53>",
405
+ "lstrip": false,
406
+ "normalized": false,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "250048": {
412
+ "content": "▁<extra_id_52>",
413
+ "lstrip": false,
414
+ "normalized": false,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "250049": {
420
+ "content": "▁<extra_id_51>",
421
+ "lstrip": false,
422
+ "normalized": false,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "250050": {
428
+ "content": "▁<extra_id_50>",
429
+ "lstrip": false,
430
+ "normalized": false,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "250051": {
436
+ "content": "▁<extra_id_49>",
437
+ "lstrip": false,
438
+ "normalized": false,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "250052": {
444
+ "content": "▁<extra_id_48>",
445
+ "lstrip": false,
446
+ "normalized": false,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "250053": {
452
+ "content": "▁<extra_id_47>",
453
+ "lstrip": false,
454
+ "normalized": false,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "250054": {
460
+ "content": "▁<extra_id_46>",
461
+ "lstrip": false,
462
+ "normalized": false,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "250055": {
468
+ "content": "▁<extra_id_45>",
469
+ "lstrip": false,
470
+ "normalized": false,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "250056": {
476
+ "content": "▁<extra_id_44>",
477
+ "lstrip": false,
478
+ "normalized": false,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "250057": {
484
+ "content": "▁<extra_id_43>",
485
+ "lstrip": false,
486
+ "normalized": false,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "250058": {
492
+ "content": "▁<extra_id_42>",
493
+ "lstrip": false,
494
+ "normalized": false,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "250059": {
500
+ "content": "▁<extra_id_41>",
501
+ "lstrip": false,
502
+ "normalized": false,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "250060": {
508
+ "content": "▁<extra_id_40>",
509
+ "lstrip": false,
510
+ "normalized": false,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "250061": {
516
+ "content": "▁<extra_id_39>",
517
+ "lstrip": false,
518
+ "normalized": false,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "250062": {
524
+ "content": "▁<extra_id_38>",
525
+ "lstrip": false,
526
+ "normalized": false,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "250063": {
532
+ "content": "▁<extra_id_37>",
533
+ "lstrip": false,
534
+ "normalized": false,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "250064": {
540
+ "content": "▁<extra_id_36>",
541
+ "lstrip": false,
542
+ "normalized": false,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "250065": {
548
+ "content": "▁<extra_id_35>",
549
+ "lstrip": false,
550
+ "normalized": false,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "250066": {
556
+ "content": "▁<extra_id_34>",
557
+ "lstrip": false,
558
+ "normalized": false,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "250067": {
564
+ "content": "▁<extra_id_33>",
565
+ "lstrip": false,
566
+ "normalized": false,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "250068": {
572
+ "content": "▁<extra_id_32>",
573
+ "lstrip": false,
574
+ "normalized": false,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "250069": {
580
+ "content": "▁<extra_id_31>",
581
+ "lstrip": false,
582
+ "normalized": false,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "250070": {
588
+ "content": "▁<extra_id_30>",
589
+ "lstrip": false,
590
+ "normalized": false,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "250071": {
596
+ "content": "▁<extra_id_29>",
597
+ "lstrip": false,
598
+ "normalized": false,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "250072": {
604
+ "content": "▁<extra_id_28>",
605
+ "lstrip": false,
606
+ "normalized": false,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "250073": {
612
+ "content": "▁<extra_id_27>",
613
+ "lstrip": false,
614
+ "normalized": false,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "250074": {
620
+ "content": "▁<extra_id_26>",
621
+ "lstrip": false,
622
+ "normalized": false,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "250075": {
628
+ "content": "▁<extra_id_25>",
629
+ "lstrip": false,
630
+ "normalized": false,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "250076": {
636
+ "content": "▁<extra_id_24>",
637
+ "lstrip": false,
638
+ "normalized": false,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "250077": {
644
+ "content": "▁<extra_id_23>",
645
+ "lstrip": false,
646
+ "normalized": false,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "250078": {
652
+ "content": "▁<extra_id_22>",
653
+ "lstrip": false,
654
+ "normalized": false,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "250079": {
660
+ "content": "▁<extra_id_21>",
661
+ "lstrip": false,
662
+ "normalized": false,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "250080": {
668
+ "content": "▁<extra_id_20>",
669
+ "lstrip": false,
670
+ "normalized": false,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "250081": {
676
+ "content": "▁<extra_id_19>",
677
+ "lstrip": false,
678
+ "normalized": false,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "250082": {
684
+ "content": "▁<extra_id_18>",
685
+ "lstrip": false,
686
+ "normalized": false,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "250083": {
692
+ "content": "▁<extra_id_17>",
693
+ "lstrip": false,
694
+ "normalized": false,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "250084": {
700
+ "content": "▁<extra_id_16>",
701
+ "lstrip": false,
702
+ "normalized": false,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "250085": {
708
+ "content": "▁<extra_id_15>",
709
+ "lstrip": false,
710
+ "normalized": false,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "250086": {
716
+ "content": "▁<extra_id_14>",
717
+ "lstrip": false,
718
+ "normalized": false,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "250087": {
724
+ "content": "▁<extra_id_13>",
725
+ "lstrip": false,
726
+ "normalized": false,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "250088": {
732
+ "content": "▁<extra_id_12>",
733
+ "lstrip": false,
734
+ "normalized": false,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "250089": {
740
+ "content": "▁<extra_id_11>",
741
+ "lstrip": false,
742
+ "normalized": false,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "250090": {
748
+ "content": "▁<extra_id_10>",
749
+ "lstrip": false,
750
+ "normalized": false,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "250091": {
756
+ "content": "▁<extra_id_9>",
757
+ "lstrip": false,
758
+ "normalized": false,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "250092": {
764
+ "content": "▁<extra_id_8>",
765
+ "lstrip": false,
766
+ "normalized": false,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "250093": {
772
+ "content": "▁<extra_id_7>",
773
+ "lstrip": false,
774
+ "normalized": false,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "250094": {
780
+ "content": "▁<extra_id_6>",
781
+ "lstrip": false,
782
+ "normalized": false,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "250095": {
788
+ "content": "▁<extra_id_5>",
789
+ "lstrip": false,
790
+ "normalized": false,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "250096": {
796
+ "content": "▁<extra_id_4>",
797
+ "lstrip": false,
798
+ "normalized": false,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "250097": {
804
+ "content": "▁<extra_id_3>",
805
+ "lstrip": false,
806
+ "normalized": false,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "250098": {
812
+ "content": "▁<extra_id_2>",
813
+ "lstrip": false,
814
+ "normalized": false,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "250099": {
820
+ "content": "▁<extra_id_1>",
821
+ "lstrip": false,
822
+ "normalized": false,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "250100": {
828
+ "content": "▁<extra_id_0>",
829
+ "lstrip": false,
830
+ "normalized": false,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "250101": {
836
+ "content": "[MASK]",
837
+ "lstrip": false,
838
+ "normalized": false,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": true
842
+ }
843
+ },
844
+ "bos_token": "[CLS]",
845
+ "clean_up_tokenization_spaces": true,
846
+ "cls_token": "[CLS]",
847
+ "do_lower_case": false,
848
+ "eos_token": "[SEP]",
849
+ "mask_token": "[MASK]",
850
+ "model_max_length": 1024,
851
+ "pad_token": "[PAD]",
852
+ "sep_token": "[SEP]",
853
+ "sp_model_kwargs": {},
854
+ "split_by_punct": false,
855
+ "tokenizer_class": "DebertaV2Tokenizer",
856
+ "unk_token": "[UNK]",
857
+ "vocab_type": "spm"
858
+ }
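
The closing block of `tokenizer_config.json` pins down the special tokens ([CLS], [SEP], [PAD], [UNK], [MASK]), a `model_max_length` of 1024, and the SentencePiece-based `DebertaV2Tokenizer`, while the `▁<extra_id_*>` entries are registered as ordinary (non-special) added tokens. A minimal sketch of how these settings surface at load time; the repository id below is a placeholder, not the actual model path:

```python
from transformers import AutoTokenizer

# Placeholder repo id for illustration only; substitute the real model path.
tokenizer = AutoTokenizer.from_pretrained("your-username/your-mdeberta-model")

# Special tokens declared in the config above.
assert tokenizer.cls_token == "[CLS]" and tokenizer.sep_token == "[SEP]"
assert tokenizer.mask_token == "[MASK]" and tokenizer.pad_token == "[PAD]"

# model_max_length comes straight from the config (1024 here).
print(tokenizer.model_max_length)

# The ▁<extra_id_*> entries are plain added tokens ("special": false),
# so they resolve to vocabulary ids like any other piece.
print(tokenizer.convert_tokens_to_ids("▁<extra_id_0>"))
```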