jameswright committed
Commit d10f6ba
1 Parent(s): 260ee80

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,620 @@
---
base_model: google-bert/bert-base-uncased
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:132712
- loss:DenoisingAutoEncoderLoss
widget:
- source_sentence: it got and spoke was amazing and me sequi a month . I the been
    seen a in, I then started, im last one 4 ive my anxiety has horrible I have had
    a read of different and part hightened, didnt stright away the only just taised
    head the last . to it all than was prior to but im just for similar experiences
    me I want or? taking time read.
  sentences:
  - 'In anycase it got too much and I spoke to my gp who was amazing and started me
    on evorel sequi for a 3 month trial. So I would say the first 4 patches have been
    good, I definitely seen a change in myself, I then started the conti patches,
    im currently on the last one of those 4, ive been brilliant until 3 days ago my
    anxiety has come back and its horrible. SO I have had a read of some different
    forums and it seems that the conti part can cause hightened anxiety. now, this
    didnt happen stright away and the anxiety has only just taised it head again in
    the last 3 days. So I dont want to complain too much as it is all better than
    it was prior to HRT but I guess im just looking for people with similar experiences
    to me, and I want to know if it got better or what? Thanks for taking the time
    to read. '
  - Just out of interest too... In an untreated Type 1 experiencing symptoms such
    as hunger, what happens to blood sugar levels. Do they just rise and rise? For
    example, given a typical day of carb laden breakfast, lunch and dinner and snacks,
    would you expect the BG levels to just go out of control over a short time (mmol/l
    into the teens and beyond? )....
  - Anyone else have post natal anxiety or depression? I’m on the wait list for counselling
    and my gp has prescribed tablets (although I’m not going to take them). Just wondering
    if anyone else is in the same boat and how you are coping? I’m so up and down
    x 😢
- source_sentence: I still a work I am live normally . manage symptoms in daily in
    there are days it like such a Do you?
  sentences:
  - 'I am certainly still a work in progress. But most days I am able to live my life
    normally. And manage the symptoms I feel. I still have to put in the daily work
    and in all honesty there are days it feels like such a pain lol. Do you know where
    your anxiety began? '
  - 'I''ve had the rabies anxiety too, then I met this dude in the crisis center who
    told me "do you know how rare rabies is? I''ve literally wrestled hundreds of
    wild animals as an exterminator and never once got rabies, and in all my years
    I''ve only seen one animal with rabies, and you can easily tell" So if the odds
    are low seeing an animal with rabies then the odds of rabies infected saliva in
    90 degree weather is absolutely miniscule. '
  - Morning all, Im currently in the 3rd round of my 8 cycles of chemo for breast
    cancer. Had my last round on 29th June. I have tested positive for covid this
    morning after waking up with a sore throat. Spoke to the chemo line, they said
    dont worry, just get in touch if I get a temperature or feel really unwell. Has
    anyone else had covid while under going chemo? Im naturally feeling a littlw anxious
    and would love to hear from anyone who's been there. Thanks in advance xx
- source_sentence: yourself out speculating your problem might be, most so overwhelmed
    right that it sense note of what problems are consider possibilities .'d until
    GP of It GP For, there a time have to more Do you notice problems you certain?
  sentences:
  - Don't freak yourself out by speculating about what your problem might be. At the
    same time, most GPs are so overwhelmed right now that it makes sense to keep note
    of what your problems are and to consider possibilities. I'd make a diary entry
    every day until you see the GP about the details of your symptoms. It'll help
    the GP. For example, is there a particular time of day that you have to urinate
    more often? Do you notice certain problems after you eat certain things?
  - Hiya! I was just wondering how long does it take for an appointment to come for
    an urgent scan. I had my blood test result which showed ca125 of 47 and seen the
    GP last Friday the 30th and it's the waiting that's driving me crazy. Is it or
    is it not ovarian cancer??? I have no appetite anymore and I'm thinking if this
    is due to anxiety or progression of whatever I have? I also have this dry mouth
    which I am not sure if from anxiety or due to the blood pressure tablet I am taking
    which has been increased recently.
  - yma123 · Yesterday 12:08 No I didn't have any symptoms at all! CIN2 means moderate
    cell changes. I know it all seems very scary but just try and think that things
    like this is exactly what this process is trying to pick up so they can catch
    things before it gets to the stage that it's anything serious you're welcome to
    message me if you have any more questions or if you just want to talk about things
    x Thanks for elaborating on what CIN2 means I had no idea. So is that kind of
    like cell changes that happen before it develops fully into cancer? So kind of
    you to offer for me to PM you, thanks so much I will do that. It’s nice to be
    able to chat to people that have been through similar as it’s such an anxious
    time 💐
- source_sentence: what proposing is of to lowest Good, school have the wo be, having
    any, doctors That the outcome you think the way through is completely / mixed
    up what is proposing it ’ s something teaching how relate world are ” “ french
    lessons
  sentences:
  - Surely you must know that that's BS? Or am I wrong, does the fact ICE cars have
    oil running through their veins equal the possession of a soul? Does that Black
    Edition Golf doing the rounds have soul? I'd look at this and surmise that it
    is big, cosseting, luxurious, quiet, effortlessly fast etc etc, and for me that
    sounds like pretty compelling transport. Can't you lot just accept that cars,
    and more specifically the enjoyment of cars, means different things to different
    people, and that you can be just as enthusiastic about electric propulsion as
    you can be about burning fossil fuels?
  - Phoebo · Today 03:35 So what you're proposing is a dumbing down of society to
    eductae the lowest common denominator? Good parents, parents but if school have
    to do the parenting then there won't be time for actual schooling, so we'll end
    up not having any engineers, doctors, architects etc? That's the outcome if you
    think it all the way through. The OP is completely unsure / confused / mixed up
    about what she is proposing. when asked to be clear… it’s something about teaching
    about “feelings” how we “relate to the world” “fizzy drinks are bad” and “drop
    french for lessons about domestic abuse” 😂
  - 'Can you speak to a counsellor about it or ask for mental health support at your
    gp? It might help you build up more coping skills around this. What do your parents
    say about it? Do they intervene? Is she abusive to your parents or anyone else? '
- source_sentence: You have been sick so If questions it just - ’ being sorted, thanks
    Then assertiveness course and a . overtime isn selfish it just doing ’ s right
    for Why you?
  sentences:
  - You have been off sick so it makes sense. If anyone questions it just grey rock
    - it’s being sorted, thanks. Then do an online assertiveness course and ask your
    GP for a CBT referral. Not doing overtime isn’t selfish - it’s just you doing
    what’s right for you. Why would you do anything else?
  - 'Science works by the accumulation of evidence.  Independent groups work on projects
    and publish results.  Those results are examined and tested and examined again
    and tested again, such that they''re either confirmed or discarded and further
    work continues accordingly.  If a scientist or doctor disagrees with the ''official
    line'' they''re asked to present the data, methods and conclusions that have led
    to that disagreement so that it can by examined by the broader scientific and
    medical community.    And yes, someone who goes on YouTube or wherever - whether
    doctor, scientist or layperson - and tells viewers that a vaccine alters DNA structure
    and destroys the immune system is either a grifter or a fruitcake. 1 minute ago,
    FIRETHORN1 said: ...Can you not accept that some people can hold an opposite view
    quite genuinely? To me, a "conspiracy theorist" is someone who believes what they
    are told, without any evidence to back it up. I can absolutely accept that someone
    can genuinely believe something without having any evidence at all to support
    that view.  I''m believing that right now, in fact. 1 minute ago, FIRETHORN1 said:
    There is no evidence whatsoever that the vaccine works That is categorically,
    absolutely and undeniably false, as the most cursory of research will tell you. 
    But then you don''t actually want  to believe me, do you?'
  - 'n111ck said: Thats two completely different things and lets not forget every
    time the government tries to, for example, crack down on benefit fraud its slammed
    as ''unfair and cruel'' in the lefty media. Where anything is provided ''free''
    by the government it will always be abused and subject to fraud - covid has proven
    that quite clearly. The cost of preventing fraud has to be balanced against the
    cost of the actual fraud unfortunately. Mainly because they way they do it is
    cruel and unfair and usually hits those that need the help by starting off with
    the objective to actively refuse and punish not to actually assess and help? Also
    possibly because benefit fraud is tiny compared to mistakes (usually in the governments
    favour) in decisions and the amount not claimed that people are eligible for?'
---

# SentenceTransformer based on google-bert/bert-base-uncased

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) <!-- at revision 86b5e0934494bd15c9632b12f734a8a67f723594 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
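
The Pooling module is configured for mean pooling (`pooling_mode_mean_tokens: true` in `1_Pooling/config.json`), so a sentence embedding is the attention-masked average of BERT's token embeddings. A minimal sketch of that operation (the function name is ours, for illustration only, and not part of the library API):

```python
import torch

def masked_mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over the sequence axis, ignoring padding.

    token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len) of 0/1.
    """
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)                   # (batch, 768)
    counts = mask.sum(dim=1).clamp(min=1e-9)                        # avoid division by zero
    return summed / counts
```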

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jameswright/ws-wr-questions-bert-TSDAE-v1")
# Run inference
sentences = [
    'You have been sick so If questions it just - ’ being sorted, thanks Then assertiveness course and a . overtime isn selfish it just doing ’ s right for Why you?',
    'You have been off sick so it makes sense. If anyone questions it just grey rock - it’s being sorted, thanks. Then do an online assertiveness course and ask your GP for a CBT referral. Not doing overtime isn’t selfish - it’s just you doing what’s right for you. Why would you do anything else?',
    'Science works by the accumulation of evidence.\xa0 Independent groups work on projects and publish results.\xa0 Those results are examined and tested and examined again and tested again, such that they\'re either confirmed or discarded and further work continues accordingly.\xa0 If a scientist or doctor disagrees with the \'official line\' they\'re asked to present the data, methods and conclusions that have led to that disagreement so that it can by examined by the broader scientific and medical community.\xa0 \xa0 And yes, someone who goes on YouTube or wherever - whether doctor, scientist or layperson - and tells viewers that a vaccine alters DNA structure and destroys the immune system is either a grifter or a fruitcake. 1 minute ago, FIRETHORN1 said: ...Can you not accept that some people can hold an opposite view quite genuinely? To me, a "conspiracy theorist" is someone who believes what they are told, without any evidence to back it up. I can absolutely accept that someone can genuinely believe something without having any evidence at all to support that view.\xa0 I\'m believing that right now, in fact. 1 minute ago, FIRETHORN1 said: There is no evidence whatsoever that the vaccine works That is categorically, absolutely and undeniably false, as the most cursory of research will tell you.\xa0 But then you don\'t actually want\xa0 to believe me, do you?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
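
If you would rather not depend on `sentence-transformers`, the same embeddings can be reproduced with plain `transformers` plus the mean pooling configured above. A sketch, reusing `sentences` from the snippet above (the pooling is written out by hand here rather than taken from any library helper):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jameswright/ws-wr-questions-bert-TSDAE-v1")
bert = AutoModel.from_pretrained("jameswright/ws-wr-questions-bert-TSDAE-v1")

encoded = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**encoded).last_hidden_state  # (batch, seq_len, 768)

# Masked mean over the sequence axis, matching the Pooling module's settings.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embeddings.shape)  # torch.Size([3, 768])
```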

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 132,712 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0                                                                          | sentence_1                                                                            |
  |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                                |
  | details | <ul><li>min: 4 tokens</li><li>mean: 47.75 tokens</li><li>max: 460 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 114.94 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>’ Can really go to the doctors I ’ bored of the ” . Feels more like than a doctor, does sound depression, so seeing GP a first</code> | <code>PetersRabbitt I don’t know? Can I really go to the doctors and say “hey, yes my problem is I’m bored all of the time”. Feels more like a me problem than one a doctor can help with. Yes, absolutely. It does sound like it could be depression, so seeing your GP is a good first step.</code> |
  | <code>Ursuladevine Between 11 16, if hasn t, what has been been providing education have LakieLady Yesterday 15:34 My that dismissed offhand son the school have up for assessment . Within years referred diagnosed with PTSD,,, social anxiety and, decided his be.</code> | <code>Ursuladevine · Yesterday 15:42 Between 11 and 16, if he hasn’t been attending school, what has he been doing? Has the LA been providing any education or have you been HE? LakieLady · Yesterday 15:34 My friend tried that, and the GP dismissed it offhand, saying that if her son was neurodivergent, the school would have picked up on it and referred him for assessment. Her DS was eight at the time. Within the next 2-3 years, he got much worse, was referred to CAMHS, diagnosed with significant MH problems (PTSD, GAD, depression, social anxiety disorder) and after a couple of years, CAMHS decided his mother might not be talking bollocks and that he might have ASD.</code> |
  | <code>It sounds you were a child then came along realised here was he - and this to it . young, I'd imagine</code> | <code>It sounds like you were hurt by one man when you were a child, then another came along and realised here was someone damaged he could dominate - and added his own abuse. They can sniff this out and are attracted to it. How old were you when he arrived? Very young, I'd imagine. Stepfather?</code> |
* Loss: [<code>DenoisingAutoEncoderLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#denoisingautoencoderloss)

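The `sentence_0` column holds a corrupted copy of the clean text in `sentence_1`: TSDAE deletes a large fraction of the input tokens and trains the encoder, through a tied decoder, to reconstruct the original, which is what `DenoisingAutoEncoderLoss` implements. Below is a sketch of how such a run could be set up with the standard TSDAE recipe from the Sentence Transformers docs (here `train_sentences` is a placeholder for the 132,712 raw texts; the actual run used the `generated_from_trainer` pipeline, so details may differ):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models
from sentence_transformers.datasets import DenoisingAutoEncoderDataset
from sentence_transformers.losses import DenoisingAutoEncoderLoss

# Encoder: bert-base-uncased with mean pooling, matching this model's architecture.
word_embedding_model = models.Transformer("google-bert/bert-base-uncased", max_seq_length=512)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), "mean")
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# The dataset wrapper adds deletion noise on the fly, yielding (noisy, clean) pairs.
train_sentences = ["..."]  # placeholder: list of raw training texts
train_dataset = DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)

# Decoder weights are tied to the encoder; the loss is the reconstruction cross-entropy.
train_loss = DenoisingAutoEncoderLoss(
    model, decoder_name_or_path="google-bert/bert-base-uncased", tie_encoder_decoder=True
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=5, show_progress_bar=True)
```
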
### Training Hyperparameters
#### Non-Default Hyperparameters

- `num_train_epochs`: 5
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch  | Step  | Training Loss |
|:------:|:-----:|:-------------:|
| 0.0301 | 500   | 4.7687        |
| 0.0603 | 1000  | 4.2523        |
| 0.0904 | 1500  | 4.1156        |
| 0.1206 | 2000  | 4.0278        |
| 0.1507 | 2500  | 3.9652        |
| 0.1808 | 3000  | 3.919         |
| 0.2110 | 3500  | 3.8629        |
| 0.2411 | 4000  | 3.7985        |
| 0.2713 | 4500  | 3.7625        |
| 0.3014 | 5000  | 3.7523        |
| 0.3315 | 5500  | 3.7316        |
| 0.3617 | 6000  | 3.6837        |
| 0.3918 | 6500  | 3.669         |
| 0.4220 | 7000  | 3.6394        |
| 0.4521 | 7500  | 3.6017        |
| 0.4822 | 8000  | 3.5693        |
| 0.5124 | 8500  | 3.5821        |
| 0.5425 | 9000  | 3.5488        |
| 0.5727 | 9500  | 3.5139        |
| 0.6028 | 10000 | 3.5119        |
| 0.6329 | 10500 | 3.4988        |
| 0.6631 | 11000 | 3.4741        |
| 0.6932 | 11500 | 3.4719        |
| 0.7234 | 12000 | 3.4501        |
| 0.7535 | 12500 | 3.4353        |
| 0.7837 | 13000 | 3.4107        |
| 0.8138 | 13500 | 3.4023        |
| 0.8439 | 14000 | 3.3902        |
| 0.8741 | 14500 | 3.3697        |
| 0.9042 | 15000 | 3.3731        |
| 0.9344 | 15500 | 3.3603        |
| 0.9645 | 16000 | 3.3284        |
| 0.9946 | 16500 | 3.3339        |
| 1.0248 | 17000 | 3.2793        |
| 1.0549 | 17500 | 3.2098        |
| 1.0851 | 18000 | 3.1994        |
| 1.1152 | 18500 | 3.1801        |
| 1.1453 | 19000 | 3.1634        |
| 1.1755 | 19500 | 3.1566        |
| 1.2056 | 20000 | 3.1205        |
| 1.2358 | 20500 | 3.1064        |
| 1.2659 | 21000 | 3.1028        |
| 1.2960 | 21500 | 3.099         |
| 1.3262 | 22000 | 3.1028        |
| 1.3563 | 22500 | 3.0653        |
| 1.3865 | 23000 | 3.044         |
| 1.4166 | 23500 | 3.0481        |
| 1.4467 | 24000 | 3.0133        |
| 1.4769 | 24500 | 2.9667        |
| 1.5070 | 25000 | 3.0226        |
| 1.5372 | 25500 | 2.991         |
| 1.5673 | 26000 | 2.9593        |
| 1.5974 | 26500 | 2.9598        |
| 1.6276 | 27000 | 2.9572        |
| 1.6577 | 27500 | 2.9579        |
| 1.6879 | 28000 | 2.9303        |
| 1.7180 | 28500 | 2.948         |
| 1.7481 | 29000 | 2.918         |
| 1.7783 | 29500 | 2.9014        |
| 1.8084 | 30000 | 2.8948        |
| 1.8386 | 30500 | 2.8916        |
| 1.8687 | 31000 | 2.8787        |
| 1.8988 | 31500 | 2.8864        |
| 1.9290 | 32000 | 2.8649        |
| 1.9591 | 32500 | 2.8419        |
| 1.9893 | 33000 | 2.8688        |
| 2.0194 | 33500 | 2.8329        |
| 2.0496 | 34000 | 2.7442        |
| 2.0797 | 34500 | 2.7501        |
| 2.1098 | 35000 | 2.7466        |
| 2.1400 | 35500 | 2.7343        |
| 2.1701 | 36000 | 2.7014        |
| 2.2003 | 36500 | 2.6891        |
| 2.2304 | 37000 | 2.6819        |
| 2.2605 | 37500 | 2.6779        |
| 2.2907 | 38000 | 2.6872        |
| 2.3208 | 38500 | 2.6758        |
| 2.3510 | 39000 | 2.6665        |
| 2.3811 | 39500 | 2.6392        |
| 2.4112 | 40000 | 2.6362        |
| 2.4414 | 40500 | 2.6038        |
| 2.4715 | 41000 | 2.5535        |
| 2.5017 | 41500 | 2.6081        |
| 2.5318 | 42000 | 2.6071        |
| 2.5619 | 42500 | 2.5571        |
| 2.5921 | 43000 | 2.5774        |
| 2.6222 | 43500 | 2.5556        |
| 2.6524 | 44000 | 2.5683        |
| 2.6825 | 44500 | 2.5317        |
| 2.7126 | 45000 | 2.5509        |
| 2.7428 | 45500 | 2.5292        |
| 2.7729 | 46000 | 2.52          |
| 2.8031 | 46500 | 2.4818        |
| 2.8332 | 47000 | 2.5258        |
| 2.8633 | 47500 | 2.482         |
| 2.8935 | 48000 | 2.5038        |
| 2.9236 | 48500 | 2.4864        |
| 2.9538 | 49000 | 2.4591        |
| 2.9839 | 49500 | 2.4887        |
| 3.0140 | 50000 | 2.4635        |
| 3.0442 | 50500 | 2.3837        |
| 3.0743 | 51000 | 2.3886        |
| 3.1045 | 51500 | 2.3836        |
| 3.1346 | 52000 | 2.38          |
| 3.1647 | 52500 | 2.3456        |
| 3.1949 | 53000 | 2.3171        |
| 3.2250 | 53500 | 2.3341        |
| 3.2552 | 54000 | 2.3228        |
| 3.2853 | 54500 | 2.3459        |
| 3.3154 | 55000 | 2.3251        |
| 3.3456 | 55500 | 2.3365        |
| 3.3757 | 56000 | 2.2838        |
| 3.4059 | 56500 | 2.3042        |
| 3.4360 | 57000 | 2.2465        |
| 3.4662 | 57500 | 2.2304        |
| 3.4963 | 58000 | 2.251         |
| 3.5264 | 58500 | 2.2727        |
| 3.5566 | 59000 | 2.2324        |
| 3.5867 | 59500 | 2.2325        |
| 3.6169 | 60000 | 2.2246        |
| 3.6470 | 60500 | 2.2287        |
| 3.6771 | 61000 | 2.2067        |
| 3.7073 | 61500 | 2.2206        |
| 3.7374 | 62000 | 2.1882        |
| 3.7676 | 62500 | 2.1889        |
| 3.7977 | 63000 | 2.1559        |
| 3.8278 | 63500 | 2.2021        |
| 3.8580 | 64000 | 2.1643        |
| 3.8881 | 64500 | 2.145         |
| 3.9183 | 65000 | 2.1707        |
| 3.9484 | 65500 | 2.1349        |
| 3.9785 | 66000 | 2.1659        |
| 4.0087 | 66500 | 2.152         |
| 4.0388 | 67000 | 2.0801        |
| 4.0690 | 67500 | 2.0729        |
| 4.0991 | 68000 | 2.0676        |
| 4.1292 | 68500 | 2.0622        |
| 4.1594 | 69000 | 2.0376        |
| 4.1895 | 69500 | 2.027         |
| 4.2197 | 70000 | 2.0227        |
| 4.2498 | 70500 | 2.0146        |
| 4.2799 | 71000 | 2.0334        |
| 4.3101 | 71500 | 2.0428        |
| 4.3402 | 72000 | 2.034         |
| 4.3704 | 72500 | 1.9907        |
| 4.4005 | 73000 | 2.0106        |
| 4.4306 | 73500 | 1.9488        |
| 4.4608 | 74000 | 1.961         |
| 4.4909 | 74500 | 1.9351        |
| 4.5211 | 75000 | 1.9875        |
| 4.5512 | 75500 | 1.9454        |
| 4.5813 | 76000 | 1.9453        |
| 4.6115 | 76500 | 1.9239        |
| 4.6416 | 77000 | 1.9664        |
| 4.6718 | 77500 | 1.906         |
| 4.7019 | 78000 | 1.9256        |
| 4.7321 | 78500 | 1.9071        |
| 4.7622 | 79000 | 1.9117        |
| 4.7923 | 79500 | 1.8817        |
| 4.8225 | 80000 | 1.9101        |
| 4.8526 | 80500 | 1.8872        |
| 4.8828 | 81000 | 1.8634        |
| 4.9129 | 81500 | 1.8791        |
| 4.9430 | 82000 | 1.8801        |
| 4.9732 | 82500 | 1.8586        |

</details>

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.3
- PyTorch: 2.3.0+cu121
- Accelerate: 0.31.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### DenoisingAutoEncoderLoss
```bibtex
@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,26 @@
{
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.42.3",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.42.3",
    "pytorch": "2.3.0+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8732534b8bfca06390adc1d086c92f91ab6075e0ea3a0e6b465769243d8412b8
size 437951328
modules.json ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,55 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff