Marco127 committed
Commit c3e1918 · verified · 1 Parent(s): 20f5a2e

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,376 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:72
+ - loss:ContrastiveLoss
+ base_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
+ widget:
+ - source_sentence: What was the original purpose of the Basilica di San Lorenzo's design by Filippo Brunelleschi in 1419?
+   sentences:
+   - ' It is one of several churches that claim to be the oldest in Florence, having been consecrated in 393 AD, at which time it stood outside the city walls.'
+   - The Palazzo Pitti, in English sometimes called the Pitti Palace, is a vast, mainly Renaissance, palace in Florence, Italy. It is situated on the south side of the River Arno in Pitti Square, a short distance from the Ponte Vecchio.
+   - ' The architects were Mariano Falcini, Professor Vincenzo Micheli, and Marco Treves, who was Jewish. '
+ - source_sentence: What is the name of the architect who expanded the façade and the rear section of the Palazzo Pitti in 1549?
+   sentences:
+   - ' The palace was left incomplete by Simone del Pollaiolo (il Cronaca), who was in charge of the construction of the palace until 1504. '
+   - The Palazzo Pitti, in English sometimes called the Pitti Palace, is a vast, mainly Renaissance, palace in Florence, Italy. It is situated on the south side of the River Arno in Pitti Square, a short distance from the Ponte Vecchio.
+   - ' In 1939, these were joined by the Palestrina Pietà, discovered in the Barberini chapel in Palestrina, though experts now consider its attribution to Michelangelo to be dubious. '
+ - source_sentence: When did the Uffizi Gallery officially open to the public?
+   sentences:
+   - ' The project was intended to display prime artworks of the Medici collections on the piano nobile; the plan was carried out by his son, Grand Duke Francesco I.'
+   - ' The gallery had been open to visitors by request since the sixteenth century, and in 1769 it was officially opened to the public, formally becoming a museum in 1865.'
+   - ' In 1939, these were joined by the Palestrina Pietà, discovered in the Barberini chapel in Palestrina, though experts now consider its attribution to Michelangelo to be dubious. '
+ - source_sentence: When was the first church on the site of the current Santa Felicita church in Florence probably built?
+   sentences:
+   - ' The project was intended to display prime artworks of the Medici collections on the piano nobile; the plan was carried out by his son, Grand Duke Francesco I.'
+   - ' It was employed as a prison; executions took place in the Bargello''s yard until they were abolished by Grand Duke Peter Leopold in 1786, but it remained the headquarters of the Florentine police until 1859.'
+   - 'Santa Felicita (Church of St Felicity) is a Roman Catholic church in Florence, region of Tuscany, Italy, probably the oldest in the city after San Lorenzo. '
+ - source_sentence: What was the original purpose of the building in 1255?
+   sentences:
+   - ' The palace was built to house first the Capitano del Popolo and later, in 1261, the ''podestà'', the highest magistrate of the Florence City Council.'
+   - The Ponte Vecchio is a medieval stone closed-spandrel segmental arch bridge over the Arno, in Florence, Italy. It is the only bridge in Florence spared from destruction during World War II and is noted for the shops built along it, a practice that was once common on bridges. Initially, these shops were occupied by butchers, tanners, and farmers, but today they are home to jewellers, art dealers, and souvenir sellers.
+   - ' The door retains its original massive, iron-clad doors. '
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) <!-- at revision 4633e80e17ea975bc090c97b049da26062b054d3 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Dot Product
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
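+
+ For illustration, a minimal sketch of what this stack computes, assuming only `transformers` and `torch`: the Transformer module runs MPNet, and the Pooling module (CLS-token mode, per the config above) takes the hidden state of the first token as the sentence embedding. Loading the repo this way is an assumption based on the config files in this commit, not a documented path.
+
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+ repo = "Marco127/D1_finetuned_2_test_1"
+ tokenizer = AutoTokenizer.from_pretrained(repo)
+ model = AutoModel.from_pretrained(repo)  # MPNetModel per config.json
+
+ inputs = tokenizer("The Ponte Vecchio is a bridge in Florence.",
+                    return_tensors="pt", truncation=True, max_length=512)
+ with torch.no_grad():
+     hidden = model(**inputs).last_hidden_state  # [1, seq_len, 768]
+ embedding = hidden[:, 0]  # CLS-token pooling -> [1, 768]
+ ```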
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference:
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Marco127/D1_finetuned_2_test_1")
+ # Run inference
+ sentences = [
+     'What was the original purpose of the building in 1255?',
+     " The palace was built to house first the Capitano del Popolo and later, in 1261, the 'podestà', the highest magistrate of the Florence City Council.",
+     ' The door retains its original massive, iron-clad doors. ',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
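+
+ Because the configured similarity function is the dot product, these scores are unnormalized and are best used for ranking. As a small illustrative retrieval sketch (the query string is made up), reusing `sentences` and `model` from above:
+
+ ```python
+ query_emb = model.encode(["Who was the highest magistrate of the Florence City Council?"])
+ corpus_emb = model.encode(sentences)
+
+ # Dot-product scores; higher means more relevant
+ scores = model.similarity(query_emb, corpus_emb)  # shape [1, 3]
+ best = scores.argmax().item()
+ print(sentences[best])
+ ```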
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 72 training samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+ * Approximate statistics based on the first 72 samples:
+   |         | sentence1 | sentence2 | label |
+   |:--------|:----------|:----------|:------|
+   | type    | string    | string    | int   |
+   | details | <ul><li>min: 14 tokens</li><li>mean: 24.06 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 36.08 tokens</li><li>max: 98 tokens</li></ul> | <ul><li>0: ~50.00%</li><li>1: ~50.00%</li></ul> |
+ * Samples:
+   | sentence1 | sentence2 | label |
+   |:----------|:----------|:------|
+   | <code>What was the name of the first owner of the Palazzo Pitti, and in which year did he die?</code> | <code>The Palazzo Pitti, in English sometimes called the Pitti Palace, is a vast, mainly Renaissance, palace in Florence, Italy. It is situated on the south side of the River Arno in Pitti Square, a short distance from the Ponte Vecchio.</code> | <code>1</code> |
+   | <code>What is the name of the architect who expanded the façade and the rear section of the Palazzo Pitti in 1549?</code> | <code><br>The palace became a great treasure house as generations of the Medici and subsequent dynasties amassed paintings, plates, jewelry, and luxurious possessions. Today, the Palazzo Pitti is the largest museum complex in Florence, divided into several principal galleries or museums.</code> | <code>1</code> |
+   | <code>What was the name of the first owner of the Palazzo Pitti, and in which year did he die?</code> | <code><br>The palace became a great treasure house as generations of the Medici and subsequent dynasties amassed paintings, plates, jewelry, and luxurious possessions. Today, the Palazzo Pitti is the largest museum complex in Florence, divided into several principal galleries or museums.</code> | <code>0</code> |
+ * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
+   ```json
+   {
+       "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
+       "margin": 0.5,
+       "size_average": true
+   }
+   ```
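+
+ For context, a minimal sketch of how a comparable fine-tuning run could be assembled with the Sentence Transformers v3 trainer API. The dataset row is one of the samples above; the output directory name is hypothetical, and the actual training script is not part of this commit.
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import (
+     SentenceTransformer,
+     SentenceTransformerTrainer,
+     SentenceTransformerTrainingArguments,
+     losses,
+ )
+
+ model = SentenceTransformer("sentence-transformers/multi-qa-mpnet-base-dot-v1")
+
+ # Pairs labeled 1 (relevant) or 0 (irrelevant), matching the schema above
+ train_dataset = Dataset.from_dict({
+     "sentence1": ["When did the Uffizi Gallery officially open to the public?"],
+     "sentence2": [" The gallery had been open to visitors by request since the sixteenth century, and in 1769 it was officially opened to the public, formally becoming a museum in 1865."],
+     "label": [1],
+ })
+
+ # ContrastiveLoss defaults to COSINE_DISTANCE; margin matches the card
+ loss = losses.ContrastiveLoss(model=model, margin=0.5)
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="outputs",  # hypothetical
+     num_train_epochs=3,
+     per_device_train_batch_size=8,
+     learning_rate=5e-5,
+ )
+ SentenceTransformerTrainer(
+     model=model, args=args, train_dataset=train_dataset, loss=loss
+ ).train()
+ ```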
+
+ ### Training Hyperparameters
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 8
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 3.0
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.47.1
+ - PyTorch: 2.5.1+cu121
+ - Accelerate: 1.2.1
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### ContrastiveLoss
+ ```bibtex
+ @inproceedings{hadsell2006dimensionality,
+     author={Hadsell, R. and Chopra, S. and LeCun, Y.},
+     booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
+     title={Dimensionality Reduction by Learning an Invariant Mapping},
+     year={2006},
+     volume={2},
+     number={},
+     pages={1735-1742},
+     doi={10.1109/CVPR.2006.100}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.47.1",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.47.1",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "dot"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f3e256718b7937562206039c332d8a7faa73f8ea14a56d134c7f60f6bf63d688
+ size 437967672
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,73 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "max_length": 250,
+   "model_max_length": 512,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff