Commit 30f0258 by FritzStack (verified) · 1 Parent(s): 0237307

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
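
Only `pooling_mode_mean_tokens` is enabled in this config, so the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that operation, assuming padding positions are excluded through an attention mask. The `mean_pool` helper and the toy 3-dimensional vectors are illustrative and not taken from the library; the real model works with 384 dimensions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Three token vectors; the last one is padding and must not affect the mean.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0, 4.0]
```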
README.md ADDED
@@ -0,0 +1,704 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - dense
7
+ - generated_from_trainer
8
+ - dataset_size:710769
9
+ - loss:DenoisingAutoEncoderLoss
10
+ base_model: sentence-transformers/all-MiniLM-L6-v2
11
+ widget:
12
+ - source_sentence: I'll as possible I in massive spiral a back . My father me I back
13
+ through fault own currently in around loans further in with Hire I my father .
14
+ all all (£6000 through girlfriend our own to . She has is 3 have on together due
15
+ June have around disposable eachn't work care andn't be able years (until enough
16
+ credit, all are showing we be . will get child will rise here all the are correct
17
+ should £1200 each month . are looking renting time the being a in rent able and
18
+ get food.
19
+ sentences:
20
+ - I can sometimes see it happening before it's escalated too far and smooth the
21
+ situation over, but it's really hard sometimes while she's blowing up on me and
22
+ it feels very personal. It can take her hours to fully come back from an episode,
23
+ and after it's happened she feels very embarrassed. It really impacts her self
24
+ esteem as it's continued and impacted her relationships over the years. She often
25
+ says "I don't want this, I didn't ask for this" I've seen it happen as frequently
26
+ as weekly, but a couple of times of month is more common. Being in crowds can
27
+ more easily trigger it but not always, and she can be fine in them at other times
28
+ as well. She has a very hard time describing the actual feeling often saying she
29
+ doesn't have the proper words for it, but she describes it as being overwhelmed
30
+ and lost, sometimes a dark void that she's swirling down into, and she can't figure
31
+ out how to get out of. She often doesn't see it coming, and by the time it's happening
32
+ it's often too late or difficult to stop. She says she feels like she missed learning
33
+ how to cope with these everyday normal situations as a kid, and now she can't.
34
+ Sometimes she can see it happening though and can remove herself from the situation.
35
+ She's seen counsellors about it before and they've all basically been telling
36
+ her she needs to take better care of herself. Get a full nights sleep, eat properly
37
+ etc.
38
+ - My Shift Manager is extremely toxic. Every shift i close the store with him he
39
+ takes my mistakes and shoves them down my throat. I pass him the keys but miss
40
+ it's a 12 minute event of making fun of me, i say something off he takes an entire
41
+ hour just poking fun at me. There are a lot of instances but my memory is not
42
+ very good, i can't list them He was talking to a bunch of coworkers in the back
43
+ about how he used to bully this kid with autism that he reported to the point
44
+ where he's afraid he'll shoot up the store. He admitted that he's bullying me
45
+ t o a couple coworkers while i was in the room today and He told me about how
46
+ he makes the environment toxic to push out people he doesn't like out. I'm his
47
+ next victim because of my anxiety disorder and my ADHD makes me not fit in to
48
+ this very famously clique environment (the store is known in the district for
49
+ this and for being the asshole store) and the store manager doesn't really care.
50
+ - Advice on Finance and Renting I'll try keep this as short as possible. I'm in
51
+ a massive rut, I got myself into the loan debt spiral a few years back. My father
52
+ helped me clear the majority and I stupidly got back into debt through no other
53
+ fault than my own. I am currently in around £10,000 debt with Personal loans and
54
+ a further £6000 in debt with Hire Purchase. I owe my father £6000. So all in all
55
+ around £22,000 debt (only £6000 is secured debt through HP). Me and my girlfriend
56
+ desperately need our own place to live. She has a child who is 3 and we have another
57
+ child on the way together (due June). I have around £450/£500 disposable income
58
+ each month. My partner can't work due to child care and won't be able too for
59
+ another few years (properly) until our other child to be is here and old enough.
60
+ We have looked at various Universal credit calculators, all of which are showing
61
+ we should be entitled too £600 per month UC. On top of this we will get £82 a
62
+ month in child maintenance, this will rise when our girl is here. If all the calculators
63
+ are correct we should have an income of £1200 each month. We are looking at renting
64
+ privately due to the waiting time of the council being a joke. We can realistically
65
+ afford £495/500 per month in rent and be able to pay bills and get food etc.
66
+ - source_sentence: life Ok so .'m on mobile 2 have autism formatting sucks so I tired
67
+ . I'm done I had nothing literally my mom in entitledparents . l to buy laptop
68
+ to find my was . I never anything's straw that broke . my a She gas many times
69
+ cant count now the of she the landlord . I'm the . So my her out instead brother
70
+ is . am just the abuse claiming . I'm . goodbye wrote my . I through and have
71
+ had
72
+ sentences:
73
+ - My life is hell Ok so 1. I'm on mobile 2 l have autism. If my formatting sucks
74
+ so be it. I'm tired of living. I'm done. I have had nothing good in my life.
75
+ I literally lost everything exept 1 suitcase. I posted my battle with my mom in
76
+ entitledparents. My birthday is next week and l tried to buy a laptop to find
77
+ my ebay was suspended. I never did anything that's the straw that broke the cameks
78
+ back. Now my mom is a narcissist. She gas abandoned me so many times l cant count.
79
+ I am now on the verge of being homeless because she's scared the landlord is going
80
+ to kick her out. I'm on the lease too. So my brother and her are kicking me out
81
+ instead. My brother is her favorite. I am just tired of the abuse and her claiming
82
+ l am. I'm done. So goodbye life. Oh and l wrote my own obituary. I'm through
83
+ and l have had enough.
84
+ - The joy in me is just no longer there. Whenever I talk about this, no one can
85
+ relate to the unrest I feel about being left behind my by peers and not being
86
+ able to do anything about it. Everyone says that it's a natural thing to be conscripted
87
+ and I should just suck it up. Living a life like this, confined by the law even
88
+ when I did nothing wrong. Getting 2 years of my life stolen like this. Apparently
89
+ it's something that is naturally accepted in my country. I am well into my second
90
+ year with only 5 months left of the clock. People tell me that I should be happy
91
+ because it's ending soon. On one hand I am, but on the other, that just means
92
+ that I have already wasted 1 and a half years of my life. And there is nothing
93
+ I can do to stop them from stealing another 5 more months of my life. I am filled
94
+ with self doubt and self loathing. I hate myself for not being able to do the
95
+ things that actually matter to me. I hate that I didn't start learning to draw
96
+ earlier. If I knew I wanted to walk this path, I would probably have picked a
97
+ different path and would have probably been out there somewhere hard at work using
98
+ every Fibre of my being to work for something that I actually care about. But
99
+ now, its too late. I have lost most of my drive and tolerance for anything.
100
+ - Suicide grants agency Thinking about killing myself a lot lately, and it’s made
101
+ me realize how comforting it is to know I get to choose when to stop it all.
102
+ - source_sentence: So s . great motivated wash brush teeth . Go bed get and over Weekends
103
+ all, hard out bed get shower brush teeth, I stay late sleep I super depressed
104
+ I t of for a of this week and just feel like totally shit . ’ if anyone any or
105
+ has too please comment I m so
106
+ sentences:
107
+ - Suspension of overnight visitation due to lack of bed/room? Attorney required?
108
+ [PA] Hi everyone, recently I posted a thread about my stepdaughter being forced
109
+ to sleep on a futon with her grandmother every night during overnight visitation
110
+ with her father because her father took her bed and room (dad lives with his parents
111
+ and apparently decided he didn't like the basement anymore.) She has not had a
112
+ bedroom or a bed of her own for at least a couple of months and is forced to change
113
+ in the living room in front of everyone - she has no private space. She has repeatedly
114
+ requested a bed and room and has been told no. She has her own cell phone that
115
+ is limited to calls/texts (due to her father's mental issues) and we told her
116
+ she can call us to come get her if she doesn't want to sleep with grandma anymore.
117
+ Apparently, during the second-to-last sleepover, she told her dad she wanted to
118
+ come home and he told her no, and told her his mom (grandma) was buying her a
119
+ bed and it'll be here next week. So, she stayed. ​ Well, next week
120
+ was last night, and surprise - still no bed. She was told that the bed was "supposed
121
+ to be delivered the other day but the truck never showed up." So, she was stuck
122
+ on the futon again with grandma on the porch. We asked her why she didn't call
123
+ us, and she said she did not want to upset her father.
124
+ - Squatting with heel of hand on the bar I find it far more comfortable to squat
125
+ with the heel of my hand on the bar, with basically the outside three fingers
126
+ holding the bar. It puts far less pressure on my wrists when I do this. I just
127
+ want to know - is there anything inherently wrong with doing this? And as an
128
+ addendum, I have heard lots of different advice about how far your elbows travel
129
+ behind your back - if they remain the same for the whole squat does it matter?
130
+ - Weekend depression!? Please read So here’s my issue. I feel great, motivated,
131
+ and happy. I wash my face twice a day and brush my teeth. Go to bed and get good
132
+ sleep and over all am more disciplined. Weekends I lose all motivation to live,
133
+ it’s hard to get out of bed I can’t get myself to shower or brush my teeth face,
134
+ I stay up late get awful sleep. I over all feel super depressed. I usually don’t
135
+ get out of bed for much a lot of times. this will bleed into the week and I just
136
+ feel like totally shit. It’s a viciously loop if anyone has any advice or has
137
+ this too please leave a comment I’m so confused by this.
138
+ - source_sentence: It takes everything've to resist the urge break the and hurt When
139
+ having BPD day'm ready rip into my to my a dish, light and then or claw eyes out
140
+ anyone do banshee would last a minute a breath . Whenever get my core abandon
141
+ all and I've held back years Of I ca I would feel horrible on this impulse 5150
142
+ hold would relish doing whatever I felt like afterward and feel . are intense
143
+ to the least . I side me hear on and another of me that to the when I
144
+ sentences:
145
+ - Doctor is going to call in a prescription for Brand Ritalin for me (dispense as
146
+ written). Is there a chance any pharmacies will have brand in stock? Hello! I've
147
+ been taking the Mallinckrodt generic ritalin for 2 months. My doctor wants me
148
+ to try the brand name. He's going to electronically submit the script w/ "Dispense
149
+ as written". Where is the best place for him to call it in? I asked the Walgreens
150
+ I've been going to and they do not have the brand. Is it likely any other pharmacies
151
+ will have brand in stock? Or if my doctor submits it electronically to Walgreens
152
+ should they have it the next day? I've used a few pharmacies in my area (California)
153
+ and I'll be paying out of pocket (+goodrx) so it wouldn't be a problem to have
154
+ him call it in to CVS, Costco, Rite Aid, or Albertsons. I just don't want to
155
+ run out and if it takes more than a day or two I may. If anyone could suggest
156
+ the best way to do this to assure that I get the brand as efficiently as possible
157
+ I'd very much appreciate it!!! I hate calling to talk about this on the phone
158
+ as I know many pharmacies are weary about talking about stimulants unless you
159
+ come in.
160
+ - I’m exhausted and can barely sleep. I don’t know what to do. I’m supposed to go
161
+ to hospital if I get too bad but I’m scared when forced a meal. My mum presented
162
+ me with celery and carrot recently and I panicked and tried making my own portions
163
+ and ended up throwing it away. At hospital I went for another reason recently
164
+ and ate up the whole meal I was given (first meal since last year). I felt guilty
165
+ and tried restriction the next day. I am on waiting list for some public therapy
166
+ but because I’m in NZ it’s limited for support.. someone told me 3 month wait.
167
+ Can’t afford private. Dr keeps telling me to take ensure but I’m vegan so can’t.
168
+ I bought some of my own vegan powder but refuse to take it because it’s soy based
169
+ but because it’s sweet I’m craving it badly. Help me.. please
170
+ - It takes everything I've got to resist the urge to break everything in the house
171
+ and not hurt people When I'm having a BPD day, I'm ready to rip into whatever
172
+ is in my way to get my stresses out. I want to kill the tv with a baseball bat,
173
+ break every dish,mug, light bulb, and then bite anyone or claw the eyes out of
174
+ anyone who tries to stop me. I want to do a full banshee scream that would
175
+ last a full minute without taking a breath. Whenever I get angry to my core, I
176
+ want to do this all with abandon and unleash all the rage and pissed off energy
177
+ I've held back over the years. Of course, I can't. I would feel horrible if I
178
+ acted on this impulse and could possibly end up in a 5150 hold situation. In the
179
+ moment, I would relish the freedom of doing whatever the hell I felt like, but
180
+ afterward I would be an apologetic wreck and feel like shit. My emotions are
181
+ intense to say the least. I have a side of me that cries when I hear depressing
182
+ stories on the news, and another side of me that wants to rip the world apart
183
+ when I'm angry.
184
+ - source_sentence: I think the . Obviously find the She of.
185
+ sentences:
186
+ - Someone tricked me into buying phones for them So I met this guy on Craigslist
187
+ offering 300$ to people who would go into phone stores buy him the phones and
188
+ he would clear the line so they wouldn’t get charged. He said he was a manager
189
+ for Verizon and has the power to do this. Me being a broke college student in
190
+ need of desperate money for paying tuition bought into it. Today I got an AT&T
191
+ bill saying I have to pay 175 dollars before the 20th which was when I needed
192
+ to pay my bills for next semester. I don’t know how to report this man, or who
193
+ will be able to help me. He blocked my number, but I have a picture of his debit
194
+ card number and the first name on it. He’s also been on surveillance camera at
195
+ target, and I know the exact date and time frame. What I need to know is how
196
+ I can resolve this issue or where to start, I was so stupid.
197
+ - Yeah I think at the time she was 13/14. Obviously didn't find out at the time.
198
+ She was also sexualy assaulted by a member of the Murderdolls as well.
199
+ - Index Fund Investment I am 22, and I have approximately 160k net worth. Currently,
200
+ I have close to 90k in Ally savings earning close to 2.25% interest. I am looking
201
+ to invest in index funds - Schwab 1000 in particular. Is it wise to invest about
202
+ 30k for starters in it? I am a total beginner at investing. Any suggesting on
203
+ how to better distribute the money? Thanks!
204
+ pipeline_tag: sentence-similarity
205
+ library_name: sentence-transformers
206
+ ---
207
+
208
+ # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
209
+
210
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
211
+
212
+ ## Model Details
213
+
214
+ ### Model Description
215
+ - **Model Type:** Sentence Transformer
216
+ - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
217
+ - **Maximum Sequence Length:** 256 tokens
218
+ - **Output Dimensionality:** 384 dimensions
219
+ - **Similarity Function:** Cosine Similarity
220
+ <!-- - **Training Dataset:** Unknown -->
221
+ <!-- - **Language:** Unknown -->
222
+ <!-- - **License:** Unknown -->
223
+
224
+ ### Model Sources
225
+
226
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
227
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
228
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
229
+
230
+ ### Full Model Architecture
231
+
232
+ ```
233
+ SentenceTransformer(
234
+ (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
235
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
236
+ (2): Normalize()
237
+ )
238
+ ```
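
The final `Normalize()` module scales every embedding to unit length, which fits the cosine similarity function listed above: for unit vectors, cosine similarity reduces to a plain dot product. A small sketch with made-up 2-dimensional vectors (the helper names are hypothetical, not library functions):

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector):
    """Scale a vector to unit length; leave a zero vector unchanged."""
    norm = math.sqrt(dot(vector, vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
a_unit, b_unit = l2_normalize(a), l2_normalize(b)

print(round(cosine(a, b), 4))         # 0.96
print(round(dot(a_unit, b_unit), 4))  # 0.96, same value after normalization
```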
239
+
240
+ ## Usage
241
+
242
+ ### Direct Usage (Sentence Transformers)
243
+
244
+ First install the Sentence Transformers library:
245
+
246
+ ```bash
247
+ pip install -U sentence-transformers
248
+ ```
249
+
250
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("FritzStack/tsdae-model")
+ # Run inference
+ sentences = [
+     'I think the . Obviously find the She of.',
+     "Yeah I think at the time she was 13/14. Obviously didn't find out at the time. She was also sexualy assaulted by a member of the Murderdolls as well.",
+     'Someone tricked me into buying phones for them So I met this guy on Craigslist offering 300$ to people who would go into phone stores buy him the phones and he would clear the line so they wouldn’t get charged. He said he was a manager for Verizon and has the power to do this. Me being a broke college student in need of desperate money for paying tuition bought into it. Today I got an AT&amp;T bill saying I have to pay 175 dollars before the 20th which was when I needed to pay my bills for next semester. I don’t know how to report this man, or who will be able to help me. He blocked my number, but I have a picture of his debit card number and the first name on it. He’s also been on surveillance camera at target, and I know the exact date and time frame. What I need to know is how I can resolve this issue or where to start, I was so stupid.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[ 1.0000,  0.5130, -0.0890],
+ #         [ 0.5130,  1.0000,  0.0377],
+ #         [-0.0890,  0.0377,  1.0000]])
+ ```
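
The similarity matrix can be read row by row to find each sentence's closest match. The sketch below copies the scores printed above into plain Python lists so that it runs without downloading the model; the `nearest_neighbors` helper is illustrative and not part of Sentence Transformers.

```python
# Scores copied from the example output above (row i vs. column j).
similarities = [
    [1.0000, 0.5130, -0.0890],
    [0.5130, 1.0000, 0.0377],
    [-0.0890, 0.0377, 1.0000],
]


def nearest_neighbors(matrix):
    """For each row, return (index, score) of the best match other than itself."""
    result = []
    for i, row in enumerate(matrix):
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        score, j = max(candidates)
        result.append((j, score))
    return result


for i, (j, score) in enumerate(nearest_neighbors(similarities)):
    print(f"sentence {i} -> sentence {j} (cosine {score:.4f})")
```

In this example the corrupted sentence (index 0) is closest to its uncorrupted original (index 1), which is consistent with what the denoising objective aims for.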
273
+
274
+ <!--
275
+ ### Direct Usage (Transformers)
276
+
277
+ <details><summary>Click to see the direct usage in Transformers</summary>
278
+
279
+ </details>
280
+ -->
281
+
282
+ <!--
283
+ ### Downstream Usage (Sentence Transformers)
284
+
285
+ You can finetune this model on your own dataset.
286
+
287
+ <details><summary>Click to expand</summary>
288
+
289
+ </details>
290
+ -->
291
+
292
+ <!--
293
+ ### Out-of-Scope Use
294
+
295
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
296
+ -->
297
+
298
+ <!--
299
+ ## Bias, Risks and Limitations
300
+
301
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
302
+ -->
303
+
304
+ <!--
305
+ ### Recommendations
306
+
307
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
308
+ -->
309
+
310
+ ## Training Details
311
+
312
+ ### Training Dataset
313
+
314
+ #### Unnamed Dataset
315
+
316
+ * Size: 710,769 training samples
317
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
318
+ * Approximate statistics based on the first 1000 samples:
+ |         | sentence_0 | sentence_1 |
+ |:--------|:-----------|:-----------|
+ | type    | string | string |
+ | details | <ul><li>min: 3 tokens</li><li>mean: 67.33 tokens</li><li>max: 212 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 151.71 tokens</li><li>max: 256 tokens</li></ul> |
323
+ * Samples:
324
+ | sentence_0 | sentence_1 |
325
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
326
+ | <code>my neighbor his can where If not here! Me and boyfriend] in nicer area city . was unoccupied awhile had a couple next door paper thin a on . We a move week or ago with & lt They've lot of course happens you, the are so The night we a lot of crying and was . hours guy screaming the girl very then stops she leaves screams constantly is and about Tonight yelling and heard very yelp then begging not to hurt . and on their door and one answered I've only ever them while walking my dog seeing the three of going into</code> | <code>Pretty sure my neighbor is hitting his partner. What can I do? (Unsure where this belongs. If not here, please direct me!) Me [29f] and my boyfriend [23m] live in a townhouse in a nicer area of our city. The unit next to us was unoccupied for awhile, and we recently had a couple move in next door. The walls here are paper thin so we can hear a lot that goes on. We had a couple move in a week or so ago with their &lt;4yo daughter. They've been making a lot of noise, but of course that happens when you move and again, the walls are so thin. The other night we could hear a lot of yelling and crying and we assumed it was the tv. After several hours we realized it was the guy screaming at the girl. He has a very distinctive voice. Since then, the yelling only stops when she leaves. He screams constantly about her losing something, asking her where it is, and about her male friends. Tonight he started yelling again and we heard a very clear smack then yelp then bawling and begging him not t...</code> |
327
+ | <code>I'm being excluded friends butn't want believe</code> | <code>Or maybe I'm being excluded by my friends but don't want to believe it.</code> |
328
+ | <code>Someone tricked me buying for them I met this guy Craigslist offering would go into phone phones clear the wouldn ’ t charged Verizon and has to . being college student for into . Today I AT amp; I to pay 20th which I needed I don ’ how man, or be to me He blocked my number of card and ’ s been on surveillance at, the exact What I to know I resolve this or where to start I so.</code> | <code>Someone tricked me into buying phones for them So I met this guy on Craigslist offering 300$ to people who would go into phone stores buy him the phones and he would clear the line so they wouldn’t get charged. He said he was a manager for Verizon and has the power to do this. Me being a broke college student in need of desperate money for paying tuition bought into it. Today I got an AT&amp;T bill saying I have to pay 175 dollars before the 20th which was when I needed to pay my bills for next semester. I don’t know how to report this man, or who will be able to help me. He blocked my number, but I have a picture of his debit card number and the first name on it. He’s also been on surveillance camera at target, and I know the exact date and time frame. What I need to know is how I can resolve this issue or where to start, I was so stupid.</code> |
329
+ * Loss: [<code>DenoisingAutoEncoderLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#denoisingautoencoderloss)
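
Each `sentence_0` sample above is a corrupted copy of its `sentence_1`: tokens are deleted, and the model is trained to reconstruct the original text from the embedding of the corrupted input. The following is a rough sketch of that kind of deletion noise. The whitespace tokenization and the deletion ratio of 0.6 are assumptions made for illustration; this card does not state which tokenizer or ratio was used, and `delete_tokens` is a hypothetical helper.

```python
import random


def delete_tokens(text, del_ratio=0.6, rng=None):
    """Drop each whitespace token with probability del_ratio, keeping order."""
    rng = rng or random.Random()
    tokens = text.split()
    if not tokens:
        return text
    kept = [tok for tok in tokens if rng.random() >= del_ratio]
    # Keep at least one token so the corrupted input is never empty.
    if not kept:
        kept = [rng.choice(tokens)]
    return " ".join(kept)


original = "Or maybe I'm being excluded by my friends but don't want to believe it."
noisy = delete_tokens(original, rng=random.Random(42))
print(noisy)
```

Passing a seeded `random.Random` makes the corruption reproducible, which helps when inspecting individual training pairs.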
330
+
331
+ ### Training Hyperparameters
332
+ #### Non-Default Hyperparameters
333
+
334
+ - `num_train_epochs`: 1
335
+ - `multi_dataset_batch_sampler`: round_robin
336
+
337
+ #### All Hyperparameters
338
+ <details><summary>Click to expand</summary>
339
+
340
+ - `overwrite_output_dir`: False
341
+ - `do_predict`: False
342
+ - `eval_strategy`: no
343
+ - `prediction_loss_only`: True
344
+ - `per_device_train_batch_size`: 8
345
+ - `per_device_eval_batch_size`: 8
346
+ - `per_gpu_train_batch_size`: None
347
+ - `per_gpu_eval_batch_size`: None
348
+ - `gradient_accumulation_steps`: 1
349
+ - `eval_accumulation_steps`: None
350
+ - `torch_empty_cache_steps`: None
351
+ - `learning_rate`: 5e-05
352
+ - `weight_decay`: 0.0
353
+ - `adam_beta1`: 0.9
354
+ - `adam_beta2`: 0.999
355
+ - `adam_epsilon`: 1e-08
356
+ - `max_grad_norm`: 1
357
+ - `num_train_epochs`: 1
358
+ - `max_steps`: -1
359
+ - `lr_scheduler_type`: linear
360
+ - `lr_scheduler_kwargs`: {}
361
+ - `warmup_ratio`: 0.0
362
+ - `warmup_steps`: 0
363
+ - `log_level`: passive
364
+ - `log_level_replica`: warning
365
+ - `log_on_each_node`: True
366
+ - `logging_nan_inf_filter`: True
367
+ - `save_safetensors`: True
368
+ - `save_on_each_node`: False
369
+ - `save_only_model`: False
370
+ - `restore_callback_states_from_checkpoint`: False
371
+ - `no_cuda`: False
372
+ - `use_cpu`: False
373
+ - `use_mps_device`: False
374
+ - `seed`: 42
375
+ - `data_seed`: None
376
+ - `jit_mode_eval`: False
377
+ - `bf16`: False
378
+ - `fp16`: False
379
+ - `fp16_opt_level`: O1
380
+ - `half_precision_backend`: auto
381
+ - `bf16_full_eval`: False
382
+ - `fp16_full_eval`: False
383
+ - `tf32`: None
384
+ - `local_rank`: 0
385
+ - `ddp_backend`: None
386
+ - `tpu_num_cores`: None
387
+ - `tpu_metrics_debug`: False
388
+ - `debug`: []
389
+ - `dataloader_drop_last`: False
390
+ - `dataloader_num_workers`: 0
391
+ - `dataloader_prefetch_factor`: None
392
+ - `past_index`: -1
393
+ - `disable_tqdm`: False
394
+ - `remove_unused_columns`: True
395
+ - `label_names`: None
396
+ - `load_best_model_at_end`: False
397
+ - `ignore_data_skip`: False
398
+ - `fsdp`: []
399
+ - `fsdp_min_num_params`: 0
400
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
401
+ - `fsdp_transformer_layer_cls_to_wrap`: None
402
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
403
+ - `parallelism_config`: None
404
+ - `deepspeed`: None
405
+ - `label_smoothing_factor`: 0.0
406
+ - `optim`: adamw_torch_fused
407
+ - `optim_args`: None
408
+ - `adafactor`: False
409
+ - `group_by_length`: False
410
+ - `length_column_name`: length
411
+ - `project`: huggingface
412
+ - `trackio_space_id`: trackio
413
+ - `ddp_find_unused_parameters`: None
414
+ - `ddp_bucket_cap_mb`: None
415
+ - `ddp_broadcast_buffers`: False
416
+ - `dataloader_pin_memory`: True
417
+ - `dataloader_persistent_workers`: False
418
+ - `skip_memory_metrics`: True
419
+ - `use_legacy_prediction_loop`: False
420
+ - `push_to_hub`: False
421
+ - `resume_from_checkpoint`: None
422
+ - `hub_model_id`: None
423
+ - `hub_strategy`: every_save
424
+ - `hub_private_repo`: None
425
+ - `hub_always_push`: False
426
+ - `hub_revision`: None
427
+ - `gradient_checkpointing`: False
428
+ - `gradient_checkpointing_kwargs`: None
429
+ - `include_inputs_for_metrics`: False
430
+ - `include_for_metrics`: []
431
+ - `eval_do_concat_batches`: True
432
+ - `fp16_backend`: auto
433
+ - `push_to_hub_model_id`: None
434
+ - `push_to_hub_organization`: None
435
+ - `mp_parameters`:
436
+ - `auto_find_batch_size`: False
437
+ - `full_determinism`: False
438
+ - `torchdynamo`: None
439
+ - `ray_scope`: last
440
+ - `ddp_timeout`: 1800
441
+ - `torch_compile`: False
442
+ - `torch_compile_backend`: None
443
+ - `torch_compile_mode`: None
444
+ - `include_tokens_per_second`: False
445
+ - `include_num_input_tokens_seen`: no
446
+ - `neftune_noise_alpha`: None
447
+ - `optim_target_modules`: None
448
+ - `batch_eval_metrics`: False
449
+ - `eval_on_start`: False
450
+ - `use_liger_kernel`: False
451
+ - `liger_kernel_config`: None
452
+ - `eval_use_gather_object`: False
453
+ - `average_tokens_across_devices`: True
454
+ - `prompts`: None
455
+ - `batch_sampler`: batch_sampler
456
+ - `multi_dataset_batch_sampler`: round_robin
457
+ - `router_mapping`: {}
458
+ - `learning_rate_mapping`: {}
459
+
460
+ </details>
461
+
462
+ ### Training Logs
463
+ <details><summary>Click to expand</summary>
464
+
465
+ | Epoch | Step | Training Loss |
466
+ |:------:|:-----:|:-------------:|
467
+ | 0.0056 | 500 | 5.7586 |
468
+ | 0.0113 | 1000 | 4.9642 |
469
+ | 0.0169 | 1500 | 4.7484 |
470
+ | 0.0225 | 2000 | 4.6033 |
471
+ | 0.0281 | 2500 | 4.5275 |
472
+ | 0.0338 | 3000 | 4.4556 |
473
+ | 0.0394 | 3500 | 4.3778 |
474
+ | 0.0450 | 4000 | 4.3618 |
475
+ | 0.0506 | 4500 | 4.3195 |
476
+ | 0.0563 | 5000 | 4.294 |
477
+ | 0.0619 | 5500 | 4.2533 |
478
+ | 0.0675 | 6000 | 4.2338 |
479
+ | 0.0732 | 6500 | 4.2144 |
480
+ | 0.0788 | 7000 | 4.1689 |
481
+ | 0.0844 | 7500 | 4.1655 |
482
+ | 0.0900 | 8000 | 4.1441 |
483
+ | 0.0957 | 8500 | 4.1122 |
484
+ | 0.1013 | 9000 | 4.0953 |
485
+ | 0.1069 | 9500 | 4.0657 |
486
+ | 0.1126 | 10000 | 4.0832 |
487
+ | 0.1182 | 10500 | 4.0554 |
488
+ | 0.1238 | 11000 | 4.0392 |
489
+ | 0.1294 | 11500 | 4.0221 |
490
+ | 0.1351 | 12000 | 4.0012 |
491
+ | 0.1407 | 12500 | 4.0086 |
492
+ | 0.1463 | 13000 | 3.9904 |
493
+ | 0.1519 | 13500 | 3.9914 |
494
+ | 0.1576 | 14000 | 3.9727 |
495
+ | 0.1632 | 14500 | 3.9604 |
496
+ | 0.1688 | 15000 | 3.939 |
497
+ | 0.1745 | 15500 | 3.9351 |
498
+ | 0.1801 | 16000 | 3.9355 |
499
+ | 0.1857 | 16500 | 3.9388 |
500
+ | 0.1913 | 17000 | 3.9078 |
501
+ | 0.1970 | 17500 | 3.9127 |
502
+ | 0.2026 | 18000 | 3.9082 |
503
+ | 0.2082 | 18500 | 3.8932 |
504
+ | 0.2139 | 19000 | 3.9085 |
505
+ | 0.2195 | 19500 | 3.8767 |
506
+ | 0.2251 | 20000 | 3.8721 |
507
+ | 0.2307 | 20500 | 3.8628 |
508
+ | 0.2364 | 21000 | 3.8671 |
509
+ | 0.2420 | 21500 | 3.8577 |
510
+ | 0.2476 | 22000 | 3.8536 |
511
+ | 0.2532 | 22500 | 3.8385 |
512
+ | 0.2589 | 23000 | 3.8394 |
513
+ | 0.2645 | 23500 | 3.8452 |
514
+ | 0.2701 | 24000 | 3.831 |
515
+ | 0.2758 | 24500 | 3.8192 |
516
+ | 0.2814 | 25000 | 3.8087 |
517
+ | 0.2870 | 25500 | 3.8163 |
518
+ | 0.2926 | 26000 | 3.8078 |
519
+ | 0.2983 | 26500 | 3.8067 |
520
+ | 0.3039 | 27000 | 3.7879 |
521
+ | 0.3095 | 27500 | 3.7973 |
522
+ | 0.3151 | 28000 | 3.7763 |
523
+ | 0.3208 | 28500 | 3.777 |
524
+ | 0.3264 | 29000 | 3.7649 |
525
+ | 0.3320 | 29500 | 3.7616 |
526
+ | 0.3377 | 30000 | 3.7713 |
527
+ | 0.3433 | 30500 | 3.7623 |
528
+ | 0.3489 | 31000 | 3.757 |
529
+ | 0.3545 | 31500 | 3.7585 |
530
+ | 0.3602 | 32000 | 3.7501 |
531
+ | 0.3658 | 32500 | 3.7373 |
532
+ | 0.3714 | 33000 | 3.7425 |
533
+ | 0.3771 | 33500 | 3.7402 |
534
+ | 0.3827 | 34000 | 3.7235 |
535
+ | 0.3883 | 34500 | 3.7277 |
536
+ | 0.3939 | 35000 | 3.7434 |
537
+ | 0.3996 | 35500 | 3.7269 |
538
+ | 0.4052 | 36000 | 3.7229 |
539
+ | 0.4108 | 36500 | 3.7098 |
540
+ | 0.4164 | 37000 | 3.7204 |
541
+ | 0.4221 | 37500 | 3.7012 |
542
+ | 0.4277 | 38000 | 3.6933 |
543
+ | 0.4333 | 38500 | 3.6954 |
544
+ | 0.4390 | 39000 | 3.7132 |
545
+ | 0.4446 | 39500 | 3.704 |
546
+ | 0.4502 | 40000 | 3.6975 |
547
+ | 0.4558 | 40500 | 3.6978 |
548
+ | 0.4615 | 41000 | 3.6937 |
549
+ | 0.4671 | 41500 | 3.6813 |
550
+ | 0.4727 | 42000 | 3.6754 |
551
+ | 0.4784 | 42500 | 3.6883 |
552
+ | 0.4840 | 43000 | 3.6704 |
553
+ | 0.4896 | 43500 | 3.6816 |
554
+ | 0.4952 | 44000 | 3.6738 |
555
+ | 0.5009 | 44500 | 3.6783 |
556
+ | 0.5065 | 45000 | 3.6761 |
557
+ | 0.5121 | 45500 | 3.6672 |
558
+ | 0.5177 | 46000 | 3.6688 |
559
+ | 0.5234 | 46500 | 3.6523 |
560
+ | 0.5290 | 47000 | 3.6683 |
561
+ | 0.5346 | 47500 | 3.6563 |
562
+ | 0.5403 | 48000 | 3.6557 |
563
+ | 0.5459 | 48500 | 3.6447 |
564
+ | 0.5515 | 49000 | 3.6393 |
565
+ | 0.5571 | 49500 | 3.6423 |
566
+ | 0.5628 | 50000 | 3.6384 |
567
+ | 0.5684 | 50500 | 3.6424 |
568
+ | 0.5740 | 51000 | 3.6263 |
569
+ | 0.5796 | 51500 | 3.6067 |
570
+ | 0.5853 | 52000 | 3.6177 |
571
+ | 0.5909 | 52500 | 3.6306 |
572
+ | 0.5965 | 53000 | 3.6242 |
573
+ | 0.6022 | 53500 | 3.6087 |
574
+ | 0.6078 | 54000 | 3.6164 |
575
+ | 0.6134 | 54500 | 3.614 |
576
+ | 0.6190 | 55000 | 3.62 |
577
+ | 0.6247 | 55500 | 3.6256 |
578
+ | 0.6303 | 56000 | 3.5988 |
579
+ | 0.6359 | 56500 | 3.6065 |
580
+ | 0.6416 | 57000 | 3.5924 |
581
+ | 0.6472 | 57500 | 3.5968 |
582
+ | 0.6528 | 58000 | 3.6105 |
583
+ | 0.6584 | 58500 | 3.5961 |
584
+ | 0.6641 | 59000 | 3.6007 |
585
+ | 0.6697 | 59500 | 3.5943 |
586
+ | 0.6753 | 60000 | 3.5876 |
587
+ | 0.6809 | 60500 | 3.587 |
588
+ | 0.6866 | 61000 | 3.5938 |
589
+ | 0.6922 | 61500 | 3.5668 |
590
+ | 0.6978 | 62000 | 3.5824 |
591
+ | 0.7035 | 62500 | 3.5778 |
592
+ | 0.7091 | 63000 | 3.5811 |
593
+ | 0.7147 | 63500 | 3.5731 |
594
+ | 0.7203 | 64000 | 3.5757 |
595
+ | 0.7260 | 64500 | 3.5822 |
596
+ | 0.7316 | 65000 | 3.5778 |
597
+ | 0.7372 | 65500 | 3.5745 |
598
+ | 0.7429 | 66000 | 3.5822 |
599
+ | 0.7485 | 66500 | 3.5645 |
600
+ | 0.7541 | 67000 | 3.5589 |
601
+ | 0.7597 | 67500 | 3.5632 |
602
+ | 0.7654 | 68000 | 3.5631 |
603
+ | 0.7710 | 68500 | 3.5657 |
604
+ | 0.7766 | 69000 | 3.5438 |
605
+ | 0.7822 | 69500 | 3.5634 |
606
+ | 0.7879 | 70000 | 3.5421 |
607
+ | 0.7935 | 70500 | 3.5393 |
608
+ | 0.7991 | 71000 | 3.5547 |
609
+ | 0.8048 | 71500 | 3.5449 |
610
+ | 0.8104 | 72000 | 3.5383 |
611
+ | 0.8160 | 72500 | 3.5301 |
612
+ | 0.8216 | 73000 | 3.5402 |
613
+ | 0.8273 | 73500 | 3.5333 |
614
+ | 0.8329 | 74000 | 3.5483 |
615
+ | 0.8385 | 74500 | 3.5274 |
616
+ | 0.8441 | 75000 | 3.5353 |
617
+ | 0.8498 | 75500 | 3.5266 |
618
+ | 0.8554 | 76000 | 3.5152 |
619
+ | 0.8610 | 76500 | 3.5273 |
620
+ | 0.8667 | 77000 | 3.5428 |
621
+ | 0.8723 | 77500 | 3.5295 |
622
+ | 0.8779 | 78000 | 3.5208 |
623
+ | 0.8835 | 78500 | 3.519 |
624
+ | 0.8892 | 79000 | 3.5361 |
625
+ | 0.8948 | 79500 | 3.5256 |
626
+ | 0.9004 | 80000 | 3.5295 |
627
+ | 0.9061 | 80500 | 3.5054 |
628
+ | 0.9117 | 81000 | 3.5179 |
629
+ | 0.9173 | 81500 | 3.5154 |
630
+ | 0.9229 | 82000 | 3.5151 |
631
+ | 0.9286 | 82500 | 3.511 |
632
+ | 0.9342 | 83000 | 3.5063 |
633
+ | 0.9398 | 83500 | 3.5079 |
634
+ | 0.9454 | 84000 | 3.5035 |
635
+ | 0.9511 | 84500 | 3.5088 |
636
+ | 0.9567 | 85000 | 3.4858 |
637
+ | 0.9623 | 85500 | 3.4907 |
638
+ | 0.9680 | 86000 | 3.4935 |
639
+ | 0.9736 | 86500 | 3.4881 |
640
+ | 0.9792 | 87000 | 3.4901 |
641
+ | 0.9848 | 87500 | 3.5017 |
642
+ | 0.9905 | 88000 | 3.4993 |
643
+ | 0.9961 | 88500 | 3.4733 |
644
+
645
+ </details>
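The Epoch and Step columns of the training log also imply roughly how many optimizer steps make up one epoch. The following is a minimal sketch using three rows copied from the table above. The Epoch column is rounded to four decimals, so the result is only an estimate, and the helper name `steps_per_epoch` is illustrative rather than part of any library.

```python
# Rows copied from the training log table: (epoch, step, training loss).
rows = [
    (0.0056, 500, 5.7586),
    (0.5009, 44500, 3.6783),
    (0.9961, 88500, 3.4733),
]


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate steps per epoch from one logged (epoch, step) pair."""
    return step / epoch


# Later rows give tighter estimates because the rounding error in the
# Epoch column is smaller relative to the value itself.
estimates = [steps_per_epoch(epoch, step) for epoch, step, _ in rows]
best_estimate = estimates[-1]

# Absolute loss reduction between the first and last rows used here.
loss_drop = rows[0][2] - rows[-1][2]

print(f"steps per epoch (estimate): {best_estimate:.0f}")
print(f"loss drop over the logged range: {loss_drop:.4f}")
```

Because the estimate comes from rounded values, treat it as a sanity check on the log rather than as the exact dataloader length.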
646
+
647
+ ### Framework Versions
648
+ - Python: 3.12.12
649
+ - Sentence Transformers: 5.1.2
650
+ - Transformers: 4.57.1
651
+ - PyTorch: 2.8.0+cu126
652
+ - Accelerate: 1.11.0
653
+ - Datasets: 4.0.0
654
+ - Tokenizers: 0.22.1
655
+
656
+ ## Citation
657
+
658
+ ### BibTeX
659
+
660
+ #### Sentence Transformers
661
+ ```bibtex
662
+ @inproceedings{reimers-2019-sentence-bert,
663
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
664
+ author = "Reimers, Nils and Gurevych, Iryna",
665
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
666
+ month = "11",
667
+ year = "2019",
668
+ publisher = "Association for Computational Linguistics",
669
+ url = "https://arxiv.org/abs/1908.10084",
670
+ }
671
+ ```
672
+
673
+ #### DenoisingAutoEncoderLoss
674
+ ```bibtex
675
+ @inproceedings{wang-2021-TSDAE,
676
+ title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
677
+ author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
678
+ booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
679
+ month = nov,
680
+ year = "2021",
681
+ address = "Punta Cana, Dominican Republic",
682
+ publisher = "Association for Computational Linguistics",
683
+ pages = "671--688",
684
+ url = "https://arxiv.org/abs/2104.06979",
685
+ }
686
+ ```
687
+
688
+ <!--
689
+ ## Glossary
690
+
691
+ *Clearly define terms in order to be accessible across audiences.*
692
+ -->
693
+
694
+ <!--
695
+ ## Model Card Authors
696
+
697
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
698
+ -->
699
+
700
+ <!--
701
+ ## Model Card Contact
702
+
703
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
704
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "architectures": [
3
+ "BertModel"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "classifier_dropout": null,
7
+ "dtype": "float32",
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 384,
12
+ "initializer_range": 0.02,
13
+ "intermediate_size": 1536,
14
+ "layer_norm_eps": 1e-12,
15
+ "max_position_embeddings": 512,
16
+ "model_type": "bert",
17
+ "num_attention_heads": 12,
18
+ "num_hidden_layers": 6,
19
+ "pad_token_id": 0,
20
+ "position_embedding_type": "absolute",
21
+ "transformers_version": "4.57.1",
22
+ "type_vocab_size": 2,
23
+ "use_cache": true,
24
+ "vocab_size": 30522
25
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "5.1.2",
4
+ "transformers": "4.57.1",
5
+ "pytorch": "2.8.0+cu126"
6
+ },
7
+ "model_type": "SentenceTransformer",
8
+ "prompts": {
9
+ "query": "",
10
+ "document": ""
11
+ },
12
+ "default_prompt_name": null,
13
+ "similarity_fn_name": "cosine"
14
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3e15ffc1ac620c4e19e4594a42710bc72ebbfda6ce7a36b94d5831fe53061506
3
+ size 90864192
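The LFS pointer size can be cross-checked against `config.json`. The sketch below counts parameters for a BERT encoder with the dimensions listed in that file and multiplies by 4 bytes for `float32`. It assumes the standard BERT layout (word, position and token-type embeddings, six encoder layers, and a pooler head); the function names are illustrative and not part of any library.

```python
# Values copied from config.json in this commit.
config = {
    "vocab_size": 30522,
    "hidden_size": 384,
    "intermediate_size": 1536,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
    "num_hidden_layers": 6,
}


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def bert_param_count(cfg: dict, with_pooler: bool = True) -> int:
    """Count parameters assuming the standard BERT layout (an assumption)."""
    h = cfg["hidden_size"]
    inter = cfg["intermediate_size"]
    layer_norm = 2 * h  # scale and bias

    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + layer_norm
    # Query, key, value and output projections, then a LayerNorm.
    attention = 4 * linear(h, h) + layer_norm
    # Two feed-forward projections, then a LayerNorm.
    feed_forward = linear(h, inter) + linear(inter, h) + layer_norm
    encoder = cfg["num_hidden_layers"] * (attention + feed_forward)
    pooler = linear(h, h) if with_pooler else 0
    return embeddings + encoder + pooler


params = bert_param_count(config)
weight_bytes = params * 4  # "dtype": "float32" -> 4 bytes per parameter
file_bytes = 90864192  # size from the Git LFS pointer above

print(params, weight_bytes, file_bytes - weight_bytes)
```

Under these assumptions the count comes to about 22.7 million parameters, and the weights account for all but roughly 11 KB of the file. That remainder would be consistent with the safetensors header, although the diff itself does not confirm this.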
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
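`modules.json` chains three stages: a Transformer that emits per-token embeddings, a Pooling step (mean over tokens, per `1_Pooling/config.json`), and a Normalize step. The sketch below reproduces the last two stages in plain Python on a toy 3-dimensional input, to show what they compute. It is not the library's implementation: the function names are illustrative, and the real model works on 384-dimensional tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [t / count for t in total]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


# Toy input: two real tokens followed by one padding token.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final stage yields unit-length vectors, the cosine similarity between two embeddings reduces to their dot product.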
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 256,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": false,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "extra_special_tokens": {},
49
+ "mask_token": "[MASK]",
50
+ "max_length": 128,
51
+ "model_max_length": 256,
52
+ "never_split": null,
53
+ "pad_to_multiple_of": null,
54
+ "pad_token": "[PAD]",
55
+ "pad_token_type_id": 0,
56
+ "padding_side": "right",
57
+ "sep_token": "[SEP]",
58
+ "stride": 0,
59
+ "strip_accents": null,
60
+ "tokenize_chinese_chars": true,
61
+ "tokenizer_class": "BertTokenizer",
62
+ "truncation_side": "right",
63
+ "truncation_strategy": "longest_first",
64
+ "unk_token": "[UNK]"
65
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff