RikoteMaster committed on
Commit
4683c9d
·
verified ·
1 Parent(s): ec3420d

Model save

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
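
The config above enables only `pooling_mode_mean_tokens`, so the sentence vector is presumably the average of the token embeddings. A minimal pure-Python sketch of that idea follows; the helper name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions, not the library's implementation (the real model uses 384 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of real tokens (mask == 1), skipping padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


# Two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding vector is ignored because its mask entry is 0, which is why the large values in the third row do not affect the result.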
README.md ADDED
@@ -0,0 +1,633 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:57126
+ - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-MiniLM-L6-v2
+ widget:
+ - source_sentence: '##inemia a the the upper limit of normal concentra - b success
+     was defined or at least success menstrual cycles without of cycle. the most from
+     the trial were dizziness, reproduced, comparison bromocriptine the hyperpro n
+     j med ; 904. )'
+   sentences:
+   - the yearly inci - dence of symptomatic gallstones is about 1 %. cardiac effects
+     include sinus bradycardia ( 25 % ) and conduction disturbances ( 10 % ). pain
+     at the site of injection is common, especially with the long - acting octreotide
+     suspension. vitamin b 12 deficiency may occur with long - term use of octreotide.
+     a long - acting formulation of lanreotide, another octapeptide somatostatin analog,
+     was approved by the fda in 2007 for treat - ment of acromegaly. lanreotide appears
+     to have effects comparable to those of octreotide on reducing gh levels and normalizing
+     igf - i concentrations. pegvisomant pegvisomant is a gh receptor antagonist used
+     to treat acromegaly. it is the polyethylene glycol ( peg ) derivative of a mutant
+     gh, b2036. like native gh, pegvisomant has two gh receptor bind - ing sites. however,
+     one of the pegvisomant gh receptor binding sites has increased affinity for the
+     gh receptor, whereas its second gh receptor binding site has reduced affinity.
+     this differential receptor affinity allows the initial step ( gh receptor dime
+   - '##inemia and anovulation. a : the dotted line indicates the upper limit of normal
+     serum prolactin concentra - tions. b : complete success was defined as pregnancy
+     or at least two consecutive menses with evidence of ovulation at least once. partial
+     success was two menstrual cycles without evidence of ovulation or just one ovulatory
+     cycle. the most common reasons for withdrawal from the trial were nausea, headache,
+     dizziness, abdominal pain, and fatigue. ( modified and reproduced, with permission,
+     from webster j et al : a comparison of cabergoline and bromocriptine in the treatment
+     of hyperpro - lactinemic amenorrhea. n engl j med 1994 ; 331 : 904. )'
+   - compounds are shown in figure 38 – 5. the thiocarbamide group is essential for
+     antithyroid activity. pharmacokinetics methimazole is completely absorbed but
+     at variable rates. it is readily accumulated by the thyroid gland and has a volume
+     of distribution similar to that of propylthiouracil. excretion is slower than
+     with propylthiouracil ; 65 – 70 % of a dose is recovered in the urine in 48 hours.
+     in contrast, propylthiouracil is rapidly absorbed, reaching peak serum levels
+     after 1 hour. the bioavailability of 50 – 80 % may be due to incomplete absorption
+     or a large first - pass effect in the liver. the volume of distribution approximates
+     total body water with accumulation in the thyroid gland. most of an ingested dose
+     of propylthiouracil is excreted by the kidney as the inactive glucuronide within
+     24 hours. the short plasma half - life of these agents ( 1. 5 hours for propyl
+     - thiouracil and 6 hours for methimazole ) has little influence on the duration
+     of the antithyroid action or the dosing interval because both agents are accumulated
+     by the thyroid gland. for propyl -
+ - source_sentence: '##sis. serious hypertension, diabetes. disorder surgical irradiation
+     of the pituitary tumor, or tion of both adrenals. patients large of cortisol during
+     and doses to mg soluble hydrocortisone as - ous intravenous day of surgery. the
+     dose reduced replacement levels, reduction in produce symptoms, including fever
+     and if adrenalectomy - term main - tenance that outlined for c. primary chrousos
+     syndrome — this or usually mutations the glucocorticoid - tor attempt to for hypo
+     pituitary ) axis hyperfunctioning'
+   sentences:
+   - '##sis. other serious disturbances include mental disorders, hypertension, and
+     diabetes. this disorder is treated by surgical removal of the tumor produc - ing
+     acth or cortisol, irradiation of the pituitary tumor, or resec - tion of one or
+     both adrenals. these patients must receive large doses of cortisol during and
+     after the surgical procedure. doses of up to 300 mg of soluble hydrocortisone
+     may be given as a continu - ous intravenous infusion on the day of surgery. the
+     dose must be reduced slowly to normal replacement levels, since rapid reduction
+     in dose may produce withdrawal symptoms, including fever and joint pain. if adrenalectomy
+     has been performed, long - term main - tenance is similar to that outlined above
+     for adrenal insufficiency. c. primary generalized glucocorticoid resistance (
+     chrousos ) syndrome — this rare sporadic or familial genetic condition is usually
+     due to inactivating mutations of the glucocorticoid recep - tor gene. in its attempt
+     to compensate for the defect, the hypo - thalamic - pituitary - adrenal ( hpa
+     ) axis is hyperfunctioning'
+   - 'oral : 200, 220, 250, 375, 500 mg tablets ; 375, 550 mg controlled - release
+     tablets ; 375, 500 mg delayed - release tablets ; 125 mg / 5 ml suspension oxaprozin
+     ( generic, daypro ) oral : 600 mg tablets, capsules piroxicam ( generic, feldene
+     ) oral : 10, 20 mg capsules salsalate, salicylsalicylic acid ( generic, disalcid
+     ) oral : 500, 750 mg tablets ; 500 mg capsules sodium salicylate ( generic ) oral
+     : 325, 650 mg enteric - coated tablets sodium thiosalicylate ( generic, rexolate
+     ) parenteral : 50 mg / ml for im injection sulindac ( generic, clinoril ) oral
+     : 150, 200 mg tablets suprofen ( profenal ) topical : 1 % ophthalmic solution
+     tolmetin ( generic, tolectin ) oral : 200, 600 mg tablets ; 400 mg capsules disease
+     - modifying antirheumatic drugs abatacept ( orencia ) parenteral : 250 mg / vial
+     lyophilized, for reconstitution for iv injection adalimumab ( humira ) parenteral
+     : 40 mg /'
+   - a half - life of about 10 hours. lutropin has only been approved for use in combination
+     with follitropin alfa for stimulation of follicular development in infertile women
+     with profound lh deficiency. it has not been approved for use with the other preparations
+     of fsh nor for simulating the endogenous lh surge that is needed to complete follicular
+     development and pre - cipitate ovulation. d. human chorionic gonadotropin hcg
+     is produced by the human placenta and excreted into the urine, whence it can be
+     extracted and purified. it is a glycoprotein consisting of a 92 - amino - acid
+     α chain virtually identical to that of fsh, lh, and tsh, and a β chain of 145
+     amino acids that resembles that of lh except for the presence of a carboxyl terminal
+     sequence of 30 amino acids not present in lh. choriogonadotropin alfa ( rhcg )
+     is a recombinant form of hcg. because of its greater consistency in biologic activity,
+     rhcg is packaged and dosed on the basis of weight rather than units of activity.
+     all of the other gonadotropins
+ - source_sentence: five origin. action is inhibits activation of ( chapter a has an
+     ( ), produced cd28 on cell or on apc, leading - abatacept which contains the -
+     4 ) 86, inhibiting to pharmacokinetics abatacept infusion “ ” ( day 4 monthly
+     infusions. is weight patients weighing less than kg mg, 100 receiving 750 and
+     more than 100 kg 1000 regimens in adult group can if - 16
+   sentences:
+   - '##l 2005 ; 75 : 406. tan kr et al : neural basis for addictive properties of
+     benzodiazepines. nature 2010 ; 463 : 769. c a s e s t u d y a n s w e r the patient
+     was diagnosed with pathologic gambling sec - ondary to the dopamine agonist prescription.
+     compulsive behavior including gambling, binge eating, or hypersexu - ality is
+     observed in about 15 % of patients who receive dopamine agonist treatment. the
+     condition is not related to parkinson ’ s disease, as compulsive behaviors also
+     occur in patients with restless legs syndrome who are treated with the same medication.
+     the incidence with levodopa is lower, and compulsive behavior is sometimes preceded
+     by dose escalation.'
+   - α - blocking agents ( five drugs ). these dmards and biologics are discussed alphabetically,
+     independent of origin. abatacept mechanism of action abatacept is a co - stimulation
+     modulator biologic that inhibits the activation of t cells ( see also chapter
+     55 ). after a t cell has engaged an antigen - presenting cell ( apc ), a second
+     signal is produced by cd28 on the t cell that interacts with cd80 or cd86 on the
+     apc, leading to t - cell activation. abatacept ( which contains the endogenous
+     ligand ctla - 4 ) binds to cd80 and 86, thereby inhibiting the binding to cd28
+     and preventing the activation of t cells. pharmacokinetics abatacept is given
+     as three intravenous infusion “ induction ” doses ( day 0, week 2, and week 4
+     ), followed by monthly infusions. the dose is based on body weight ; patients
+     weighing less than 60 kg receiving 500 mg, those 60 – 100 kg receiving 750 mg,
+     and those more than 100 kg receiving 1000 mg. dosing regimens in any adult group
+     can be increased if needed. the terminal serum half - life is 13 – 16 days. co
+   - chapter 33 agents used in anemias ; hematopoietic growth factors 587 groups. in
+     particular, the depletion of tetrahydrofolate prevents synthesis of adequate supplies
+     of the deoxythymidylate ( dtmp ) and purines required for dna synthesis in rapidly
+     dividing cells, as shown in figure 33 – 3, section 2. the accumulation of folate
+     as n 5 - methyltetrahydrofolate and the associated depletion of tetra - hydrofolate
+     cofactors in vitamin b 12 deficiency have been referred to as the “ methylfolate
+     trap. ” this is the biochemical step whereby vitamin b 12 and folic acid metabolism
+     are linked, and it explains why the megaloblastic anemia of vitamin b 12 deficiency
+     can be partially corrected by ingestion of large amounts of folic acid. folic
+     acid can be reduced to dihydrofolate by the enzyme dihydro - folate reductase
+     ( figure 33 – 3, section 3 ) and thereby serve as a source of the tetrahydrofolate
+     required for synthesis of the purines and dtmp required for dna synthesis. vitamin
+     b 12 deficiency causes the accumulation of homo - cysteine due to reduced formation
+ - source_sentence: 'in after a successful when, definition, no longer dependent. the
+     level : understand the long term changes by of initial molecular cellular targets
+     must be a com approaches animals and functional revealed target this the teg area
+     vta a tiny structure tip brainstem, which amygdala, hippocampus, drugs of abuse
+     christian retired and and parkinson ’ disease age at time, neurologist prescribed
+     restore dopamine levels. symptoms start fluctuate and dopamine role later, he
+     devel oped a interest gambling, buying lottery and visiting casino he'
+   sentences:
+   - 'common in addicts after a successful withdrawal when, by definition, they are
+     no longer dependent. addictive drugs increase the level of dopamine : reinforcement
+     to understand the long - term changes induced by drugs of abuse, their initial
+     molecular and cellular targets must be identified. a com - bination of approaches
+     in animals and humans, including functional imaging, has revealed the mesolimbic
+     dopamine system as the prime target of addictive drugs. this system originates
+     in the ventral teg - mental area ( vta ), a tiny structure at the tip of the brainstem,
+     which projects to the nucleus accumbens, the amygdala, the hippocampus, 32 drugs
+     of abuse christian luscher, md a retired accountant developed a tremor and slowing
+     of movements and was diagnosed with parkinson ’ s disease at age 67. at that time,
+     his neurologist prescribed levodopa to restore dopamine levels. two years later,
+     motor symptoms start to fluctuate and the dopamine receptor agonist ropini - role
+     is added to his treatment. ∗ a few months later, he devel - oped a strong interest
+     in gambling, first buying lottery tickets and then visiting a casino almost every
+     day. he concealed his gambling activity until'
+   - bleeding in over - the - counter use is low but still double that of over - the
+     - counter ibuprofen ( perhaps due to a dose effect ). rare cases of allergic pneumonitis,
+     leukocy - toclastic vasculitis, and pseudoporphyria as well as the common nsaid
+     - associated adverse effects have been noted. oxaprozin oxaprozin is another propionic
+     acid derivative nsaid. as noted in table 36 – 1, its major difference from the
+     other members of this subgroup is a very long half - life ( 50 – 60 hours ), although
+     oxapro - zin does not undergo enterohepatic circulation. it is mildly urico -
+     suric, making it potentially more useful in gout than some other nsaids. otherwise,
+     the drug has the same benefits and risks that are associated with other nsaids.
+     piroxicam piroxicam, an oxicam ( figure 36 – 1 ), is a nonselective cox inhib
+     - itor that at high concentrations also inhibits polymorphonuclear leukocyte migration,
+     decreases oxygen radical production, and inhibits lymphocyte function. its long
+     half - life
+   - 4 ). recombinant human g - csf ( rhug - csf ; filgrastim ) is produced in a bacterial
+     expression system. it is a nonglycosylated peptide of 175 amino acids, with a
+     molecular weight of 18 kda. recombinant human gm - csf ( rhugm - csf ; sargramostim
+     ) is produced in a yeast expression system. it is a partially glycosylated peptide
+     of 127 amino acids, with three molecular species with molecular weights of 15,
+     500 ; 15, 800 ; and 19, 500. these preparations have serum half - lives of 2 –
+     7 hours after intravenous or subcutaneous administration. pegfilgrastim, a covalent
+     conjugation product of filgrastim and a form of polyethylene glycol, has a much
+     longer serum half - life than recombinant g - csf, and it can be injected once
+     per myelo - suppressive chemotherapy cycle instead of daily for several days.
+     lenograstim, used widely in europe, is a glycosylated form of recombinant g -
+     csf. pharmacodynamics the myeloid growth
+ - source_sentence: chapter drugs of 571 genetic by comparing relatively modest for
+     very high of the relative liability of a drug – its heritability, that basis of
+     common all is being genomic indicates that only perhaps even allele in combination
+     phenotype. involved remains elusive. some substance - have identified ( dehydrogenase
+     ), future will also focus on the mechanisms all drugs of abuse some not for substances
+     without reward such the and the dissocia anesthetics ( drugs, primarily
+   sentences:
+   - chapter 33 agents used in anemias ; hematopoietic growth factors 589 catalyzed
+     by the enzyme dihydrofolate reductase ( “ folate reductase ” ), to give dihydrofolic
+     acid ( figure 33 – 3, section 3 ). tetrahydrofolate is subsequently transformed
+     to folate cofactors possessing one - car - bon units attached to the 5 - nitrogen,
+     to the 10 - nitrogen, or to both positions ( figure 33 – 3 ). folate cofactors
+     are interconvertible by various enzymatic reactions and serve the important biochemical
+     function of donating one - carbon units at various levels of oxida - tion. in
+     most of these, tetrahydrofolate is regenerated and becomes available for reutilization.
+     pharmacokinetics the average american diet contains 500 – 700 mcg of folates daily,
+     50 – 200 mcg of which is usually absorbed, depending on metabolic requirements.
+     pregnant women may absorb as much as 300 – 400 mcg of folic acid daily. various
+     forms of folic acid are present in a wide variety of plant and animal tissues
+     ; the richest sources are yeast, liver, kidney, and green vegetables
+   - 602 section vi drugs used to treat diseases of the blood, inflammation, & gout
+     amputation or organ failure. venous clots tend to be more fibrin - rich, contain
+     large numbers of trapped red blood cells, and are recognized pathologically as
+     red thrombi. venous thrombi can cause severe swelling and pain of the affected
+     extremity, but the most feared consequence is pulmonary embolism. this occurs
+     when part or all of the clot breaks off from its location in the deep venous system
+     and travels as an embolus through the right side of the heart and into the pulmonary
+     arterial circulation. sudden occlusion of a large pulmonary artery can cause acute
+     right heart failure and sudden death. in addition lung ischemia or infarction
+     will occur distal to the occluded pulmonary arterial segment. such emboli usually
+     arise from the deep venous system of the proximal lower extremities or pelvis.
+     although all thrombi are mixed, the platelet nidus dominates the arterial thrombus
+     and the fibrin tail dominates the venous thrombus. blood coagulation cascade blood
+     coagulates due to the transformation of soluble fibrinogen into insoluble fi
+   - chapter 32 drugs of abuse 571 of environmental and genetic factors. heritability
+     of addiction, as determined by comparing monozygotic with dizygotic twins, is
+     relatively modest for cannabinoids but very high for cocaine. it is of interest
+     that the relative risk for addiction ( addiction liability ) of a drug ( table
+     32 – 1 ) correlates with its heritability, suggesting that the neurobiologic basis
+     of addiction common to all drugs is what is being inherited. further genomic analysis
+     indicates that only a few alleles ( or perhaps even a single recessive allele
+     ) need to function in combination to produce the phenotype. however, identification
+     of the genes involved remains elusive. although some substance - specific candidate
+     genes have been identified ( eg, alcohol dehydrogenase ), future research will
+     also focus on genes implicated in the neurobiologic mechanisms common to all addictive
+     drugs. nonaddictive drugs of abuse some drugs of abuse do not lead to addiction.
+     this is the case for substances that alter perception without causing sensations
+     of reward and euphoria, such as the hallucinogens and the dissocia - tive anesthetics
+     ( table 32 – 1 ). unlike addictive drugs, which primarily
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
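
The architecture ends with a `Normalize()` module and the card lists cosine similarity as the similarity function. Assuming `Normalize()` scales each embedding to unit L2 norm, the dot product of two outputs equals their cosine similarity. The sketch below checks this on toy 2-dimensional vectors; the helper names `l2_normalize`, `dot`, and `cosine` are hypothetical and not part of the library.

```python
import math
from typing import List


def l2_normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
a_hat, b_hat = l2_normalize(a), l2_normalize(b)

# On unit vectors the plain dot product matches the cosine similarity.
print(round(dot(a_hat, b_hat), 6), round(cosine(a, b), 6))  # 0.96 0.96
```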
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("RikoteMaster/retriever_pdf_and_books")
+ # Run inference
+ sentences = [
+     'chapter drugs of 571 genetic by comparing relatively modest for very high of the relative liability of a drug – its heritability, that basis of common all is being genomic indicates that only perhaps even allele in combination phenotype. involved remains elusive. some substance - have identified ( dehydrogenase ), future will also focus on the mechanisms all drugs of abuse some not for substances without reward such the and the dissocia anesthetics ( drugs, primarily',
+     'chapter 32 drugs of abuse 571 of environmental and genetic factors. heritability of addiction, as determined by comparing monozygotic with dizygotic twins, is relatively modest for cannabinoids but very high for cocaine. it is of interest that the relative risk for addiction ( addiction liability ) of a drug ( table 32 – 1 ) correlates with its heritability, suggesting that the neurobiologic basis of addiction common to all drugs is what is being inherited. further genomic analysis indicates that only a few alleles ( or perhaps even a single recessive allele ) need to function in combination to produce the phenotype. however, identification of the genes involved remains elusive. although some substance - specific candidate genes have been identified ( eg, alcohol dehydrogenase ), future research will also focus on genes implicated in the neurobiologic mechanisms common to all addictive drugs. nonaddictive drugs of abuse some drugs of abuse do not lead to addiction. this is the case for substances that alter perception without causing sensations of reward and euphoria, such as the hallucinogens and the dissocia - tive anesthetics ( table 32 – 1 ). unlike addictive drugs, which primarily',
+     '602 section vi drugs used to treat diseases of the blood, inflammation, & gout amputation or organ failure. venous clots tend to be more fibrin - rich, contain large numbers of trapped red blood cells, and are recognized pathologically as red thrombi. venous thrombi can cause severe swelling and pain of the affected extremity, but the most feared consequence is pulmonary embolism. this occurs when part or all of the clot breaks off from its location in the deep venous system and travels as an embolus through the right side of the heart and into the pulmonary arterial circulation. sudden occlusion of a large pulmonary artery can cause acute right heart failure and sudden death. in addition lung ischemia or infarction will occur distal to the occluded pulmonary arterial segment. such emboli usually arise from the deep venous system of the proximal lower extremities or pelvis. although all thrombi are mixed, the platelet nidus dominates the arterial thrombus and the fibrin tail dominates the venous thrombus. blood coagulation cascade blood coagulates due to the transformation of soluble fibrinogen into insoluble fi',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
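
The card names semantic search as one intended use. As a rough sketch of what that step could look like once embeddings exist, the snippet below ranks passages by cosine similarity to a query. It uses hand-written 2-dimensional toy vectors in place of real `model.encode(...)` output, and the helper `top_k` is a hypothetical name, not a library function.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def top_k(query_vec: List[float], passage_vecs: List[List[float]], k: int = 2) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs for the k most similar passages."""
    scored = [(cosine(query_vec, p), idx) for idx, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy stand-ins for passage and query embeddings (assumed values).
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [0.9, 0.1]
ranking = top_k(query, passages, k=2)
print([idx for idx, _ in ranking])  # [0, 2]
```

For a real corpus the passage embeddings would normally be computed once and cached, with only the query encoded at search time.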
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
340
+ ## Training Details
341
+
342
+ ### Training Dataset
343
+
344
+ #### Unnamed Dataset
345
+
346
+ * Size: 57,126 training samples
347
+ * Columns: <code>anchor</code> and <code>positive</code>
348
+ * Approximate statistics based on the first 1000 samples:
349
+ | | anchor | positive |
350
+ |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
351
+ | type | string | string |
352
+ | details | <ul><li>min: 3 tokens</li><li>mean: 58.68 tokens</li><li>max: 133 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 143.97 tokens</li><li>max: 256 tokens</li></ul> |
353
+ * Samples:
+ | anchor | positive |
+ |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>advanced march lecture : lps weights ola svensson1 this lecture do the : we ( actually hedge method. solve lps. fast very solving lps approximately. version 11 of topics in tcs, ” were simon rodriguez the by that used in the last lecture. recall last the lecture, saw how fairly follow the of recall that game t and n experts was : for : i ∈ n gives up or ) based the expert, up 3. with the expert advice ’ decides the up / down</code> | <code>advanced algorithms march 22, 2022 lecture 9 : solving lps using multiplicative weights notes by ola svensson1 in this lecture we do the following : • we describe the multiplicative weight update ( actually hedge ) method. • we then use this method to solve covering lps. • this is a very fast and simple ( i. e., very attractive ) method for solving these lps approximately. these lecture notes are partly based on an updated version of “ lecture 11 of topics in tcs, 2015 ” that were written by vincent eggerling and simon rodriguez and on the lecture notes by shiva kaul that we used in the last lecture. 1 recall last lecture in the previous lecture, we saw how to use the weighted majority method in order to fairly smartly follow the advice of experts. recall that the general game - setting with t days and n experts was as follows : for t = 1,..., t : 1. each expert i ∈ [ n ] gives some advice : up or down 2. aggregator ( you ) predicts, based on the advice of the expert, up or down. 3. ad...</code> |
+ | <code>or down predicts, up down. adversary, of expert the the / down 4. aggregator the parameterized > 0 “ rate now as : • expert i ( 1. ( experts are in begin - at each t • predict / based weighted vote w = w t w n observing the set ( t i = w ( i · ( 1−ε i ] was ( trustworthiness experts. lecture the when / 2. the sequence of outcomes, t, and expert ∈ [ n ], of wm mistakes ≤2</code> | <code>up or down 2. aggregator ( you ) predicts, based on the advice of the expert, up or down. 3. adversary, with knowledge of the expert advice and the aggregator ’ s decision, decides the up / down outcome. 4. aggregator observes the outcome and [UNK] if his prediction was incorrect. the weighted majority algorithm, parameterized by [UNK] > 0 ( the “ learning rate ” ), now works as follows : • assign each expert i a weight w ( 1 ) i initialized to 1. ( all experts are equally trustworthy in the begin - ning. ) at each time t : • predict up / down based on a weighted majority vote per w ( t ) = ( w ( t ) 1,..., w ( t ) n ). • after observing the cost vector, set w ( t + 1 ) i = w ( t ) i · ( 1−ε ) for every expert i ∈ [ n ] whose prediction was wrong. ( discount the trustworthiness of erroneous experts. ) last lecture we analyzed the case when [UNK] = 1 / 2. the same proof gives the following theorem 1 for any sequence of outcomes, duration t, and expert i ∈ [ n ], # of wm mistakes ≤2</code> |
358
+ | <code>) last lecture analyzed [UNK] = 1 / 2. the following theorem sequence outcomes, duration t, and n wm ≤2 [UNK] ( ’ + o ( ( ). notes as for lecturer. have not inconsistent omit citations 1</code> | <code>##roneous experts. ) last lecture we analyzed the case when [UNK] = 1 / 2. the same proof gives the following theorem 1 for any sequence of outcomes, duration t, and expert i ∈ [ n ], # of wm mistakes ≤2 ( 1 + [UNK] ) · ( # of i ’ s mistakes ) + o ( log ( n ) / [UNK] ). 1disclaimer : these notes were written as notes for the lecturer. they have not been peer - reviewed and may contain inconsistent notation, typos, and omit citations of relevant works. 1</code> |
359
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
360
+ ```json
361
+ {
362
+ "scale": 20.0,
363
+ "similarity_fct": "cos_sim"
364
+ }
365
+ ```
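The loss parameters above can be illustrated with a minimal, standard-library sketch. It assumes the usual in-batch-negatives formulation: every other positive in the batch acts as a negative for a given anchor, cosine similarities are multiplied by `scale`, and cross-entropy is taken with the matching pair as the target. The function names `cos_sim` and `mnr_loss` and the toy 2-dimensional vectors are illustrative assumptions, not the library's implementation.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over the scaled similarity matrix.

    Row i holds the similarities of anchor i to every positive in the
    batch; the correct "class" for row i is column i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy batch of two pairs (hypothetical data)
aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

With `scale` set to 20.0, a perfectly matched batch yields a loss close to zero, while a batch whose positives are swapped yields a loss close to 20, which is why small loss values in the training logs indicate well-separated pairs.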
366
+
367
+ ### Evaluation Dataset
368
+
369
+ #### Unnamed Dataset
370
+
371
+ * Size: 6,348 evaluation samples
372
+ * Columns: <code>anchor</code> and <code>positive</code>
373
+ * Approximate statistics based on the first 1000 samples:
374
+ | | anchor | positive |
375
+ |:--------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
376
+ | type | string | string |
377
+ | details | <ul><li>min: 14 tokens</li><li>mean: 97.31 tokens</li><li>max: 142 tokens</li></ul> | <ul><li>min: 53 tokens</li><li>mean: 238.84 tokens</li><li>max: 256 tokens</li></ul> |
378
+ * Samples:
379
+ | anchor | positive |
380
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
381
+ | <code>more at gesic doses. be predominantly however, it also the μ agonist weak or partial nist it mixed available. it used orally however, its injection miscellaneous tramadol is on blockade has been norepinephrine function. because it partially is μ agonist. the recommended mg orally times daily. association with ; drug contraindicated in history of epilepsy with that lower the serious risk the development sero toni</code> | <code>##phine but appears to produce more sedation at equianal - gesic doses. butorphanol is considered to be predominantly a κ agonist. however, it may also act as a partial agonist or antagonist at the μ receptor. benzomorphans pentazocine is a κ agonist with weak μ - antagonist or partial ago - nist properties. it is the oldest mixed agent available. it may be used orally or parenterally. however, because of its irritant properties, the injection of pentazocine subcutaneously is not recommended. miscellaneous tramadol is a centrally acting analgesic whose mechanism of action is predominantly based on blockade of serotonin reuptake. tramadol has also been found to inhibit norepinephrine transporter function. because it is only partially antagonized by naloxone, it is believed to be only a weak μ - receptor agonist. the recommended dosage is 50 – 100 mg orally four times daily. toxicity includes association with seizures ; the drug is relatively contraindicated in patients with a history of...</code> |
382
+ | <code>##ly four times daily. toxicity includes relatively in a of the serious is the of - inhibitor ( ). typically abate after several days of is no clinically respiration or tem thus far given that action of tramadol largely - serve as an adjunct pure opioid treatment of chronic is newer with modest μ significant norepinephrine - inhibiting models, its effects moderately by naloxone but reduced adrenoceptor antagonist. furthermore, norepinephrine</code> | <code>##ly four times daily. toxicity includes association with seizures ; the drug is relatively contraindicated in patients with a history of epilepsy and for use with other drugs that lower the seizure threshold. another serious risk is the development of sero - tonin syndrome, especially if selective serotonin reuptake inhibitor ( ssri ) antidepressants are being administered ( see chapter 16 ). other side effects include nausea and dizziness, but these symptoms typically abate after several days of therapy. it is surprising that no clinically significant effects on respiration or the cardiovascular sys - tem have thus far been reported. given the fact that the analgesic action of tramadol is largely independent of μ - receptor action, tra - madol may serve as an adjunct with pure opioid agonists in the treatment of chronic neuropathic pain. tapentadol is a newer analgesic with modest μ - opioid receptor affinity and significant norepinephrine reuptake - inhibiting action. in animal mode...</code> |
383
+ | <code>- action. in its analgesic effects were moderately by strongly adrenoceptor antagonist. porter ( 6 was than of its the transporter ( of tapentadol 2008 been shown to as oxycodone the to gastrointesti complaints nausea. carries risk for for is how in cal to tramadol mechanism based opioid antitussives the effective drugs suppression of this is analgesia. in effect</code> | <code>- inhibiting action. in animal models, its analgesic effects were only moderately reduced by naloxone but strongly reduced by an α 2 - adrenoceptor antagonist. furthermore, its binding to the norepinephrine trans - porter ( net, see chapter 6 ) was stronger than that of tramadol, whereas its binding to the serotonin transporter ( sert ) was less than that of tramadol. tapentadol was approved in 2008 and has been shown to be as effective as oxycodone in the treatment of moderate to severe pain but with a reduced profile of gastrointesti - nal complaints such as nausea. tapentadol carries risk for seizures in patients with seizure disorders and for the development of sero - tonin syndrome. it is unknown how tapentadol compares in clini - cal utility to tramadol or other analgesics whose mechanism of action is not based primarily on opioid receptor pharmacology. antitussives the opioid analgesics are among the most effective drugs available for the suppression of cough. this effect is oft...</code> |
384
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
385
+ ```json
386
+ {
387
+ "scale": 20.0,
388
+ "similarity_fct": "cos_sim"
389
+ }
390
+ ```
391
+
392
+ ### Training Hyperparameters
393
+ #### Non-Default Hyperparameters
394
+
395
+ - `eval_strategy`: steps
396
+ - `per_device_train_batch_size`: 128
397
+ - `per_device_eval_batch_size`: 128
398
+ - `learning_rate`: 2e-05
399
+ - `num_train_epochs`: 5
400
+ - `warmup_ratio`: 0.1
401
+ - `fp16`: True
402
+ - `dataloader_drop_last`: True
403
+ - `dataloader_num_workers`: 2
404
+ - `load_best_model_at_end`: True
405
+ - `push_to_hub`: True
406
+ - `hub_model_id`: RikoteMaster/retriever_pdf_and_books
407
+ - `hub_strategy`: end
408
+ - `hub_private_repo`: False
409
+
410
+ #### All Hyperparameters
411
+ <details><summary>Click to expand</summary>
412
+
413
+ - `overwrite_output_dir`: False
414
+ - `do_predict`: False
415
+ - `eval_strategy`: steps
416
+ - `prediction_loss_only`: True
417
+ - `per_device_train_batch_size`: 128
418
+ - `per_device_eval_batch_size`: 128
419
+ - `per_gpu_train_batch_size`: None
420
+ - `per_gpu_eval_batch_size`: None
421
+ - `gradient_accumulation_steps`: 1
422
+ - `eval_accumulation_steps`: None
423
+ - `torch_empty_cache_steps`: None
424
+ - `learning_rate`: 2e-05
425
+ - `weight_decay`: 0.0
426
+ - `adam_beta1`: 0.9
427
+ - `adam_beta2`: 0.999
428
+ - `adam_epsilon`: 1e-08
429
+ - `max_grad_norm`: 1.0
430
+ - `num_train_epochs`: 5
431
+ - `max_steps`: -1
432
+ - `lr_scheduler_type`: linear
433
+ - `lr_scheduler_kwargs`: {}
434
+ - `warmup_ratio`: 0.1
435
+ - `warmup_steps`: 0
436
+ - `log_level`: passive
437
+ - `log_level_replica`: warning
438
+ - `log_on_each_node`: True
439
+ - `logging_nan_inf_filter`: True
440
+ - `save_safetensors`: True
441
+ - `save_on_each_node`: False
442
+ - `save_only_model`: False
443
+ - `restore_callback_states_from_checkpoint`: False
444
+ - `no_cuda`: False
445
+ - `use_cpu`: False
446
+ - `use_mps_device`: False
447
+ - `seed`: 42
448
+ - `data_seed`: None
449
+ - `jit_mode_eval`: False
450
+ - `use_ipex`: False
451
+ - `bf16`: False
452
+ - `fp16`: True
453
+ - `fp16_opt_level`: O1
454
+ - `half_precision_backend`: auto
455
+ - `bf16_full_eval`: False
456
+ - `fp16_full_eval`: False
457
+ - `tf32`: None
458
+ - `local_rank`: 0
459
+ - `ddp_backend`: None
460
+ - `tpu_num_cores`: None
461
+ - `tpu_metrics_debug`: False
462
+ - `debug`: []
463
+ - `dataloader_drop_last`: True
464
+ - `dataloader_num_workers`: 2
465
+ - `dataloader_prefetch_factor`: None
466
+ - `past_index`: -1
467
+ - `disable_tqdm`: False
468
+ - `remove_unused_columns`: True
469
+ - `label_names`: None
470
+ - `load_best_model_at_end`: True
471
+ - `ignore_data_skip`: False
472
+ - `fsdp`: []
473
+ - `fsdp_min_num_params`: 0
474
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
475
+ - `fsdp_transformer_layer_cls_to_wrap`: None
476
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
477
+ - `deepspeed`: None
478
+ - `label_smoothing_factor`: 0.0
479
+ - `optim`: adamw_torch
480
+ - `optim_args`: None
481
+ - `adafactor`: False
482
+ - `group_by_length`: False
483
+ - `length_column_name`: length
484
+ - `ddp_find_unused_parameters`: None
485
+ - `ddp_bucket_cap_mb`: None
486
+ - `ddp_broadcast_buffers`: False
487
+ - `dataloader_pin_memory`: True
488
+ - `dataloader_persistent_workers`: False
489
+ - `skip_memory_metrics`: True
490
+ - `use_legacy_prediction_loop`: False
491
+ - `push_to_hub`: True
492
+ - `resume_from_checkpoint`: None
493
+ - `hub_model_id`: RikoteMaster/retriever_pdf_and_books
494
+ - `hub_strategy`: end
495
+ - `hub_private_repo`: False
496
+ - `hub_always_push`: False
497
+ - `gradient_checkpointing`: False
498
+ - `gradient_checkpointing_kwargs`: None
499
+ - `include_inputs_for_metrics`: False
500
+ - `include_for_metrics`: []
501
+ - `eval_do_concat_batches`: True
502
+ - `fp16_backend`: auto
503
+ - `push_to_hub_model_id`: None
504
+ - `push_to_hub_organization`: None
505
+ - `mp_parameters`:
506
+ - `auto_find_batch_size`: False
507
+ - `full_determinism`: False
508
+ - `torchdynamo`: None
509
+ - `ray_scope`: last
510
+ - `ddp_timeout`: 1800
511
+ - `torch_compile`: False
512
+ - `torch_compile_backend`: None
513
+ - `torch_compile_mode`: None
514
+ - `include_tokens_per_second`: False
515
+ - `include_num_input_tokens_seen`: False
516
+ - `neftune_noise_alpha`: None
517
+ - `optim_target_modules`: None
518
+ - `batch_eval_metrics`: False
519
+ - `eval_on_start`: False
520
+ - `use_liger_kernel`: False
521
+ - `eval_use_gather_object`: False
522
+ - `average_tokens_across_devices`: False
523
+ - `prompts`: None
524
+ - `batch_sampler`: batch_sampler
525
+ - `multi_dataset_batch_sampler`: proportional
526
+
527
+ </details>
528
+
529
+ ### Training Logs
530
+ | Epoch | Step | Training Loss | Validation Loss |
531
+ |:----------:|:--------:|:-------------:|:---------------:|
532
+ | 0.1121 | 50 | 0.0343 | - |
533
+ | 0.2242 | 100 | 0.0199 | - |
534
+ | 0.3363 | 150 | 0.0184 | - |
535
+ | 0.4484 | 200 | 0.0188 | 0.0069 |
536
+ | 0.5605 | 250 | 0.019 | - |
537
+ | 0.6726 | 300 | 0.0155 | - |
538
+ | 0.7848 | 350 | 0.0128 | - |
539
+ | 0.8969 | 400 | 0.0139 | 0.0048 |
540
+ | 1.0090 | 450 | 0.0151 | - |
541
+ | 1.1211 | 500 | 0.012 | - |
542
+ | 1.2332 | 550 | 0.0144 | - |
543
+ | 1.3453 | 600 | 0.0117 | 0.0037 |
544
+ | 1.4574 | 650 | 0.0164 | - |
545
+ | 1.5695 | 700 | 0.0099 | - |
546
+ | 1.6816 | 750 | 0.0128 | - |
547
+ | 1.7937 | 800 | 0.0076 | 0.0035 |
548
+ | 1.9058 | 850 | 0.0098 | - |
549
+ | 2.0179 | 900 | 0.0147 | - |
550
+ | 2.1300 | 950 | 0.0087 | - |
551
+ | 2.2422 | 1000 | 0.012 | 0.0033 |
552
+ | 2.3543 | 1050 | 0.0106 | - |
553
+ | 2.4664 | 1100 | 0.0176 | - |
554
+ | 2.5785 | 1150 | 0.0123 | - |
555
+ | 2.6906 | 1200 | 0.0122 | 0.0032 |
556
+ | 2.8027 | 1250 | 0.0126 | - |
557
+ | 2.9148 | 1300 | 0.013 | - |
558
+ | 3.0269 | 1350 | 0.011 | - |
559
+ | 3.1390 | 1400 | 0.0139 | 0.0031 |
560
+ | 3.2511 | 1450 | 0.01 | - |
561
+ | 3.3632 | 1500 | 0.0122 | - |
562
+ | 3.4753 | 1550 | 0.0094 | - |
563
+ | 3.5874 | 1600 | 0.0122 | 0.0030 |
564
+ | 3.6996 | 1650 | 0.0147 | - |
565
+ | 3.8117 | 1700 | 0.0126 | - |
566
+ | 3.9238 | 1750 | 0.0125 | - |
567
+ | 4.0359 | 1800 | 0.0138 | 0.0030 |
568
+ | 4.1480 | 1850 | 0.0105 | - |
569
+ | 4.2601 | 1900 | 0.0107 | - |
570
+ | 4.3722 | 1950 | 0.0179 | - |
571
+ | 4.4843 | 2000 | 0.011 | 0.0029 |
572
+ | 4.5964 | 2050 | 0.0126 | - |
573
+ | 4.7085 | 2100 | 0.0137 | - |
574
+ | 4.8206 | 2150 | 0.0084 | - |
575
+ | **4.9327** | **2200** | **0.012** | **0.0029** |
576
+
577
+ * The bold row denotes the saved checkpoint.
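The fractional epoch values in the table can be reproduced from the hyperparameters. The sketch below assumes training ran on a single device (so the effective batch size equals `per_device_train_batch_size`) and uses the `dataset_size:57126` tag from the model card metadata; the variable names are illustrative.

```python
dataset_size = 57126   # from the `dataset_size:57126` tag in the model card
batch_size = 128       # per_device_train_batch_size (single device assumed)
num_epochs = 5         # num_train_epochs

# dataloader_drop_last=True discards the final partial batch of each epoch
steps_per_epoch = dataset_size // batch_size
total_steps = steps_per_epoch * num_epochs

# Roughly 10% of steps (warmup_ratio=0.1); the trainer's exact rounding may differ
approx_warmup_steps = total_steps // 10

# Fractional epoch reported in the logs for given steps
epoch_at_step_50 = round(50 / steps_per_epoch, 4)
epoch_at_step_2200 = round(2200 / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, approx_warmup_steps)
print(epoch_at_step_50, epoch_at_step_2200)
```

Under these assumptions the computed epochs match the first and last rows of the table, which suggests the logged steps are optimizer steps with `gradient_accumulation_steps` equal to 1.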
578
+
579
+ ### Framework Versions
580
+ - Python: 3.10.17
581
+ - Sentence Transformers: 4.1.0
582
+ - Transformers: 4.52.3
583
+ - PyTorch: 2.7.0+cu126
584
+ - Accelerate: 1.7.0
585
+ - Datasets: 3.6.0
586
+ - Tokenizers: 0.21.1
587
+
588
+ ## Citation
589
+
590
+ ### BibTeX
591
+
592
+ #### Sentence Transformers
593
+ ```bibtex
594
+ @inproceedings{reimers-2019-sentence-bert,
595
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
596
+ author = "Reimers, Nils and Gurevych, Iryna",
597
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
598
+ month = "11",
599
+ year = "2019",
600
+ publisher = "Association for Computational Linguistics",
601
+ url = "https://arxiv.org/abs/1908.10084",
602
+ }
603
+ ```
604
+
605
+ #### MultipleNegativesRankingLoss
606
+ ```bibtex
607
+ @misc{henderson2017efficient,
608
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
609
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
610
+ year={2017},
611
+ eprint={1705.00652},
612
+ archivePrefix={arXiv},
613
+ primaryClass={cs.CL}
614
+ }
615
+ ```
616
+
617
+ <!--
618
+ ## Glossary
619
+
620
+ *Clearly define terms in order to be accessible across audiences.*
621
+ -->
622
+
623
+ <!--
624
+ ## Model Card Authors
625
+
626
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
627
+ -->
628
+
629
+ <!--
630
+ ## Model Card Contact
631
+
632
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
633
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "architectures": [
3
+ "BertModel"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "classifier_dropout": null,
7
+ "gradient_checkpointing": false,
8
+ "hidden_act": "gelu",
9
+ "hidden_dropout_prob": 0.1,
10
+ "hidden_size": 384,
11
+ "initializer_range": 0.02,
12
+ "intermediate_size": 1536,
13
+ "layer_norm_eps": 1e-12,
14
+ "max_position_embeddings": 512,
15
+ "model_type": "bert",
16
+ "num_attention_heads": 12,
17
+ "num_hidden_layers": 6,
18
+ "pad_token_id": 0,
19
+ "position_embedding_type": "absolute",
20
+ "torch_dtype": "float32",
21
+ "transformers_version": "4.52.3",
22
+ "type_vocab_size": 2,
23
+ "use_cache": true,
24
+ "vocab_size": 30522
25
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "4.1.0",
4
+ "transformers": "4.52.3",
5
+ "pytorch": "2.7.0+cu126"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:688b8d59b5e233387311edabe584f06754c255fd312258aabdeadbb79dc188da
3
+ size 90864192
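As a rough sanity check, the weight file size can be compared with a parameter count derived from the values in `config.json`. The sketch below assumes the standard BERT parameter layout (word, position, and token-type embeddings, six encoder layers, and a pooler head) stored as `float32`; whether the saved checkpoint contains exactly these tensors is an assumption, and the helper `linear` is illustrative.

```python
hidden = 384          # hidden_size
intermediate = 1536   # intermediate_size
layers = 6            # num_hidden_layers
vocab = 30522         # vocab_size
positions = 512       # max_position_embeddings
type_vocab = 2        # type_vocab_size

def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out

layer_norm = 2 * hidden  # scale and shift vectors

embeddings = (vocab + positions + type_vocab) * hidden + layer_norm
per_layer = (
    4 * linear(hidden, hidden)      # query, key, value, attention output
    + layer_norm
    + linear(hidden, intermediate)  # feed-forward up-projection
    + linear(intermediate, hidden)  # feed-forward down-projection
    + layer_norm
)
pooler = linear(hidden, hidden)     # assumed to be present in the checkpoint

total_params = embeddings + layers * per_layer + pooler
weight_bytes = total_params * 4     # float32 uses 4 bytes per parameter

print(total_params, weight_bytes, 90864192 - weight_bytes)
```

Under these assumptions the estimate falls about 11 KB short of the 90,864,192 bytes reported above. That gap would be consistent with the file's header metadata, although this sketch does not verify it.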
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
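The three modules listed above run in sequence: the transformer produces one vector per token, the pooling module averages them (per `pooling_mode_mean_tokens` in `1_Pooling/config.json`), and the final module scales the result to unit length. A minimal standard-library sketch of the last two stages follows; the toy 2-dimensional token vectors stand in for the real 384-dimensional ones, and the function names are illustrative assumptions rather than the library's code.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    dim = len(token_embeddings[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Two real tokens and one padding token (hypothetical values)
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity, which fits the `cosine` similarity function named in `config_sentence_transformers.json`.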
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 256,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": false,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "extra_special_tokens": {},
49
+ "mask_token": "[MASK]",
50
+ "max_length": 128,
51
+ "model_max_length": 256,
52
+ "never_split": null,
53
+ "pad_to_multiple_of": null,
54
+ "pad_token": "[PAD]",
55
+ "pad_token_type_id": 0,
56
+ "padding_side": "right",
57
+ "sep_token": "[SEP]",
58
+ "stride": 0,
59
+ "strip_accents": null,
60
+ "tokenize_chinese_chars": true,
61
+ "tokenizer_class": "BertTokenizer",
62
+ "truncation_side": "right",
63
+ "truncation_strategy": "longest_first",
64
+ "unk_token": "[UNK]"
65
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8511ebcd4819c8c4cbd85612f91ffa256732ae19852f3129fdd07b78e6dd2b6a
3
+ size 6161
vocab.txt ADDED
The diff for this file is too large to render. See raw diff