Stern5497 committed on
Commit
fa09db7
1 Parent(s): f096515

Add new SentenceTransformer model.

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
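The pooling config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token vectors, with padding positions excluded. A minimal pure-Python sketch of that idea follows; the helper name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions, not code from this repository or from the library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an input that is all padding
    return [value / count for value in summed]


# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

With the real model the same averaging runs over 768-dimensional token vectors, which is why the output dimensionality equals `word_embedding_dimension`.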
README.md ADDED
@@ -0,0 +1,368 @@
+ ---
+ language: []
+ library_name: sentence-transformers
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dataset_size:100K<n<1M
+ - loss:MultipleNegativesRankingLoss
+ base_model: FacebookAI/xlm-roberta-base
+ widget:
+ - source_sentence: who did ezra play for in the nfl
+   sentences:
+   - how many all nba first teams does kobe have
+   - who does the voice of the little mermaid
+   - dont come around here no more video director
+ - source_sentence: who led the elves at helm s deep
+   sentences:
+   - who was the captain of the flying dutchman
+   - what are the 2 seasons in the philippines
+   - when can you get a tattoo in georgia
+ - source_sentence: who plays red on once upon a time
+   sentences:
+   - who plays the new receptionist on the office
+   - who wrote the magic school bus theme song
+   - when did south africa declare war on germany
+ - source_sentence: who plays the dark elf in thor 2
+   sentences:
+   - who plays mantis in guardian of the galaxy 2
+   - where in los angeles do the chargers play
+   - when did alaska become part of the us
+ - source_sentence: who plays oz in the wizard of oz
+   sentences:
+   - where did the wizard of oz come from
+   - when did brazil win the soccer world cup
+   - when did the ar 15 first go on sale
+ pipeline_tag: sentence-similarity
+ ---
+
+ # SentenceTransformer based on FacebookAI/xlm-roberta-base
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base) <!-- at revision e73636d4f797dec63c3081bb6ed5c7b0bb3f2089 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Stern5497/nir-2024-xlm-roberta-base")
+ # Run inference
+ sentences = [
+     'who plays oz in the wizard of oz',
+     'where did the wizard of oz come from',
+     'when did brazil win the soccer world cup',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 164,848 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | sentence_2 |
+ |:--------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | <ul><li>min: 10 tokens</li><li>mean: 13.41 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 136 tokens</li><li>mean: 164.07 tokens</li><li>max: 239 tokens</li></ul> | <ul><li>min: 133 tokens</li><li>mean: 165.13 tokens</li><li>max: 256 tokens</li></ul> |
+ * Samples:
+ | sentence_0 | sentence_1 | sentence_2 |
+ |:-------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>who wrote treat you better by shawn mendes</code> | <code>{'title': '', 'text': 'Treat You Better "Treat You Better" is a song recorded by Canadian singer and songwriter Shawn Mendes. It was co-written by Mendes with Teddy Geiger, and Scott Harris. It was released on June 3, 2016 through Island Records as the lead single from his second studio album, "Illuminate" (2016). The music video was released on July 12, 2016 and features a storyline about an abusive relationship. The song peaked at number six on the US "Billboard" Hot 100, making it Mendes\' second top 10 single. In Canada, the song has peaked at number seven on the Canadian Hot 100. The'}</code> | <code>{'title': '', 'text': 'Scott Harris (songwriter) Scott Harris Friedman is an American multi-platinum, Grammy nominated songwriter, producer, and musician best known for his work with Shawn Mendes and co-writing Grammy winning song, "Don\'t Let Me Down" by The Chainsmokers featuring Daya, which reached #1 on the US Mainstream Top 40 chart in 2016. Harris has most recently written 13 songs on the self-titled third album Shawn Mendes (album), which debuted at #1 on the Billboard 200 chart, in addition to 10 songs on Shawn Mendes\' sophomore album "Illuminate" including the lead single "Treat You Better" which reached the top 3 at the US'}</code> |
+ | <code>where is the tanami desert located in australia</code> | <code>{'title': '', 'text': 'zone. Tanami Desert The Tanami Desert is a desert in northern Australia situated in the Northern Territory and Western Australia. It has a rocky terrain with small hills. The Tanami was the Northern Territory\'s final frontier and was not fully explored by Australians of European descent until well into the twentieth century. It is traversed by the Tanami Track. The name "Tanami" is thought to be a corruption of the Walpiri name for the area, "Chanamee", meaning "never die". This referred to certain rock holes in the desert which were said never to run dry. Under the name "Tanami", the'}</code> | <code>{'title': '', 'text': '("glomerata") is from the Latin "glomeratus", meaning "heaped" or "form into a ball". Desert tea-tree occurs in the arid parts of Australia including the far north west of New South Wales, South Australia including the Flinders Ranges, the Northern Territory and Western Australia. In the latter state it has been recorded from the Carnarvon, Central Kimberley, Central Ranges, Dampierland, Gascoyne, Gibson Desert, Great Sandy Desert, Great Victoria Desert, Little Sandy Desert, Murchison, Ord Victoria Plain, Pilbara and Tanami biogeographic areas. It grows in red sand, clay and sandy loam in rocky river beds, shallow depressions and sandy flats. "Melaleuca globifera"'}</code> |
+ | <code>who won the us open men s and women s singles in 2017</code> | <code>{'title': '', 'text': "that ended his season, while Kerber lost in the first round to Naomi Osaka. The men's singles tournament concluded with Rafael Nadal defeating Kevin Anderson in the final, while the women's singles tournament concluded with Sloane Stephens defeating Madison Keys in the final. The 2017 US Open was the 137th edition of the tournament and took place at the USTA Billie Jean King National Tennis Center in Flushing Meadows–Corona Park of Queens in New York City, New York, United States. The tournament was held on 14 DecoTurf hard courts. The tournament was an event run by the International Tennis Federation"}</code> | <code>{'title': '', 'text': "2017 US Open – Women's Singles Angelique Kerber was the defending champion, but was defeated in the first round by Naomi Osaka. Kerber became the second US Open defending champion to lose in the first round after Svetlana Kuznetsova in 2005. Sloane Stephens won her first Grand Slam title, defeating Madison Keys in the final, 6–3, 6–0. It was the first all-American women's final at the US Open since 2002, and the second time in three years that the final featured two first-time Grand Slam singles finalists from the same country. Stephens became the second unseeded woman in the Open"}</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+   "scale": 20.0,
+   "similarity_fct": "cos_sim"
+ }
+ ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 1
+ - `fp16`: True
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss |
+ |:------:|:-----:|:-------------:|
+ | 0.0485 | 500 | 1.6163 |
+ | 0.0971 | 1000 | 0.8086 |
+ | 0.1456 | 1500 | 0.6766 |
+ | 0.1941 | 2000 | 0.6124 |
+ | 0.2426 | 2500 | 0.5374 |
+ | 0.2912 | 3000 | 0.5115 |
+ | 0.3397 | 3500 | 0.4823 |
+ | 0.3882 | 4000 | 0.4268 |
+ | 0.4368 | 4500 | 0.422 |
+ | 0.4853 | 5000 | 0.4014 |
+ | 0.5338 | 5500 | 0.3765 |
+ | 0.5824 | 6000 | 0.3689 |
+ | 0.6309 | 6500 | 0.3551 |
+ | 0.6794 | 7000 | 0.3359 |
+ | 0.7279 | 7500 | 0.326 |
+ | 0.7765 | 8000 | 0.3158 |
+ | 0.8250 | 8500 | 0.2945 |
+ | 0.8735 | 9000 | 0.2836 |
+ | 0.9221 | 9500 | 0.3043 |
+ | 0.9706 | 10000 | 0.2761 |
+ | 1.0 | 10303 | - |
+
+
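The step counts and epoch fractions in the training log can be cross-checked against the dataset size and batch size reported in the README. The sketch below assumes a single device and one optimizer step per batch (the card lists `gradient_accumulation_steps: 1`); the function names are illustrative, and the linear decay is a plain restatement of `lr_scheduler_type: linear` with `warmup_steps: 0`, not the trainer's own code.

```python
import math

TRAIN_SAMPLES = 164_848   # "Size: 164,848 training samples"
BATCH_SIZE = 16           # per_device_train_batch_size
BASE_LR = 5e-5            # learning_rate

# Number of optimizer steps in one epoch (assumes one device, no accumulation).
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
print(steps_per_epoch)  # 10303


def epoch_at(step: int) -> float:
    """Fraction of the epoch completed after `step` optimizer steps."""
    return step / steps_per_epoch


def linear_lr(step: int, total_steps: int = steps_per_epoch, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


print(round(epoch_at(500), 4))   # 0.0485
print(round(epoch_at(1000), 4))  # 0.0971
```

The computed 10303 steps and the 0.0485 and 0.0971 epoch fractions agree with the corresponding rows of the log table.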
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.0
+ - Transformers: 4.40.2
+ - PyTorch: 2.3.0+cu118
+ - Accelerate: 0.29.3
+ - Datasets: 2.19.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
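The README names `MultipleNegativesRankingLoss` with `scale` 20.0 and `cos_sim` as the similarity function. As a rough illustration of what that objective computes, here is a dependency-free sketch: each query is scored against every candidate passage in the batch, the scaled cosine similarities are treated as logits, and cross-entropy pushes the query toward its own positive. The function names and the tiny 2-dimensional vectors are assumptions made for illustration; the library's implementation operates on batched tensors.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(queries: List[Sequence[float]],
             candidates: List[Sequence[float]],
             scale: float = 20.0) -> float:
    """Mean cross-entropy where candidates[i] is the positive for queries[i].

    Every other candidate (other positives and any hard negatives appended
    after them) acts as a negative for that query.
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, cand) for cand in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each query matches its own positive
misaligned = [[0.0, 1.0], [1.0, 0.0]]   # positives swapped between queries
print(mnr_loss(queries, aligned) < mnr_loss(queries, misaligned))  # True
```

With three-column training rows like the samples shown in the card, the third column would plausibly be appended to `candidates` as extra hard negatives, which this sketch supports because `candidates` may be longer than `queries`.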
config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "_name_or_path": "xlm-roberta-base",
+   "architectures": [
+     "XLMRobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "xlm-roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "output_past": true,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.2",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 250002
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.0",
+     "transformers": "4.40.2",
+     "pytorch": "2.3.0+cu118"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b4cbff62dd5aec015c5cb75772c27e60bb561892d2ba5897fc2e338870bcaeff
+ size 1112197096
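The LFS pointer size can be sanity-checked against `config.json`. The sketch below estimates the parameter count from the config values, assuming the standard BERT-style layout (biased linear layers, a LayerNorm after the embeddings and after each sub-block, and a pooler head), then multiplies by 4 bytes for `float32`. The layout is an assumption, not something this commit states; the few kilobytes left over are presumably the safetensors header.

```python
# Values copied from config.json in this commit.
VOCAB_SIZE = 250_002
HIDDEN = 768
INTERMEDIATE = 3_072
NUM_LAYERS = 12
MAX_POSITIONS = 514
TYPE_VOCAB = 1
FILE_SIZE = 1_112_197_096  # "size" field of the model.safetensors LFS pointer


def linear(n_in: int, n_out: int) -> int:
    """Parameter count of a dense layer with bias."""
    return n_in * n_out + n_out


layer_norm = 2 * HIDDEN  # weight and bias
embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + layer_norm
attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm  # query, key, value, output
feed_forward = linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm
pooler = linear(HIDDEN, HIDDEN)

total_params = embeddings + NUM_LAYERS * (attention + feed_forward) + pooler
weight_bytes = total_params * 4  # torch_dtype is float32

print(total_params)              # 278043648
print(weight_bytes)              # 1112174592
print(FILE_SIZE - weight_bytes)  # 22504
```

Roughly 69% of the parameters sit in the 250,002-row token embedding table, which is typical for a multilingual vocabulary of this size.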
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
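`modules.json` lists the pipeline stages in order: the Transformer stored at the repository root, then the Pooling module stored in `1_Pooling`. The following sketch shows one way such a manifest could be turned into a load plan; `plan_modules` and the `model_dir` root are hypothetical names for illustration and do not reproduce the library's loader.

```python
import json
import posixpath
from typing import List, Tuple

# Content of modules.json as added in this commit.
MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"}
]
"""


def plan_modules(modules_json: str, model_root: str) -> List[Tuple[str, str]]:
    """Return (class path, directory) pairs in pipeline order."""
    entries = sorted(json.loads(modules_json), key=lambda entry: entry["idx"])
    plan = []
    for entry in entries:
        # An empty "path" means the module's files live at the repository root.
        directory = posixpath.join(model_root, entry["path"]) if entry["path"] else model_root
        plan.append((entry["type"], directory))
    return plan


for class_path, directory in plan_modules(MODULES_JSON, "model_dir"):
    print(class_path, "<-", directory)
```

The two directories match the file layout of the commit: `config.json` and `model.safetensors` at the root, and `1_Pooling/config.json` for the pooling stage.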
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:883b037111086fd4dfebbbc9b7cee11e1517b5e0c0514879478661440f137085
+ size 17082987
tokenizer_config.json ADDED
@@ -0,0 +1,54 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "250001": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
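The tokenizer caps inputs at `model_max_length` 512 while `config.json` declares `max_position_embeddings` 514. The gap follows from the RoBERTa convention of numbering positions from `pad_token_id + 1`, with padding tokens keeping the padding index itself. The sketch below re-implements that numbering in plain Python; the function name and the filler token id 5 are illustrative assumptions.

```python
from typing import List

PAD_TOKEN_ID = 1  # "<pad>" in tokenizer_config.json, pad_token_id in config.json
BOS_TOKEN_ID = 0  # "<s>"
EOS_TOKEN_ID = 2  # "</s>"


def roberta_position_ids(input_ids: List[int], padding_idx: int = PAD_TOKEN_ID) -> List[int]:
    """Number non-padding tokens from padding_idx + 1; padding keeps padding_idx."""
    position_ids = []
    seen = 0
    for token_id in input_ids:
        if token_id == padding_idx:
            position_ids.append(padding_idx)
        else:
            seen += 1
            position_ids.append(seen + padding_idx)
    return position_ids


# A maximum-length input: <s>, 510 ordinary tokens (id 5 is a stand-in), </s>.
full_length_input = [BOS_TOKEN_ID] + [5] * 510 + [EOS_TOKEN_ID]
print(len(full_length_input))                           # 512
print(max(roberta_position_ids(full_length_input)))     # 513
```

The largest position id for a 512-token input is 513, so the position embedding table needs 514 rows (indices 0 through 513).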