Rui Melo committed
Commit
835a355
1 Parent(s): 45a8e65

initial commit

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 1024,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
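This pooling config enables plain mean pooling over the encoder's 1024-dimensional token embeddings. As a minimal sketch (assuming `sentence-transformers` is installed; this code is not part of the commit), the module it describes would be constructed like this:

```python
# Sketch only: the Pooling module that 1_Pooling/config.json describes.
from sentence_transformers import models

pooling = models.Pooling(
    word_embedding_dimension=1024,
    pooling_mode_cls_token=False,
    pooling_mode_mean_tokens=True,   # average token embeddings (the one mode enabled here)
    pooling_mode_max_tokens=False,
    pooling_mode_mean_sqrt_len_tokens=False,
)
```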
README.md CHANGED
@@ -1,3 +1,128 @@
  ---
- license: mit
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
  ---
+
+ # {MODEL_NAME}
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ <!--- Describe your model here -->
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('{MODEL_NAME}')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
+
+
+
+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, you pass your input through the transformer model, then you have to apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean pooling - take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
+ model = AutoModel.from_pretrained('{MODEL_NAME}')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
+
+
+
+ ## Evaluation Results
+
+ <!--- Describe how your model was evaluated -->
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+ ## Training
+ The model was trained with the parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 25000 with parameters:
+ ```
+ {'batch_size': 4, 'sampler': 'torch.utils.data.sampler.SequentialSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
+ ```
+ {'scale': 20.0, 'similarity_fct': 'cos_sim'}
+ ```
+
+ Parameters of the fit() method:
+ ```
+ {
+     "epochs": 1,
+     "evaluation_steps": 100,
+     "evaluator": "__main__.LossEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'transformers.optimization.AdamW'>",
+     "optimizer_params": {
+         "lr": 1e-05
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": null,
+     "warmup_steps": 0,
+     "weight_decay": 0.01
+ }
+ ```
+
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+ )
+ ```
+
+ ## Citing & Authors
+
+ <!--- Describe where people can find more information -->
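For readers who want to reproduce the training setup summarized in the README above, here is a hedged sketch of how those fit() parameters map onto a sentence-transformers 2.x training call. The base checkpoint is taken from `_name_or_path` in config.json below; the `InputExample` pair is a placeholder, since the actual training data is not part of this commit:

```python
# Sketch only: wiring the README's listed training parameters into model.fit().
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('neuralmind/bert-large-portuguese-cased')

# Placeholder pair; the real dataset (25000 batches of 4) is not in this repo.
train_examples = [InputExample(texts=["uma pergunta", "a passagem relevante"])]
train_dataloader = DataLoader(train_examples, shuffle=False, batch_size=4)  # SequentialSampler, batch_size 4

# In-batch negatives, cosine similarity scaled by 20, as listed above.
train_loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    scheduler='WarmupLinear',
    warmup_steps=0,
    optimizer_params={'lr': 1e-05},
    weight_decay=0.01,
    max_grad_norm=1,
    evaluation_steps=100,  # the commit's LossEvaluator is not included here
)
```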
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "/home/ruimelo/.cache/torch/sentence_transformers/neuralmind_bert-large-portuguese-cased",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "directionality": "bidi",
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "output_past": true,
+   "pad_token_id": 0,
+   "pooler_fc_size": 768,
+   "pooler_num_attention_heads": 12,
+   "pooler_num_fc_layers": 3,
+   "pooler_size_per_head": 128,
+   "pooler_type": "first_token_transform",
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.20.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 29794
+ }
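The config identifies the base encoder as a BERT-large model (24 layers, hidden size 1024, 16 attention heads) derived from `neuralmind/bert-large-portuguese-cased`. A quick, hedged way to confirm these dimensions programmatically, with `{MODEL_NAME}` again standing in for this repo's id:

```python
# Sketch only: read the architecture hyperparameters back from config.json.
from transformers import AutoConfig

config = AutoConfig.from_pretrained('{MODEL_NAME}')  # {MODEL_NAME}: this repo's id
print(config.model_type)           # 'bert'
print(config.num_hidden_layers)    # 24
print(config.hidden_size)          # 1024
print(config.num_attention_heads)  # 16
```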
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.2.0",
+     "transformers": "4.20.1",
+     "pytorch": "1.10.1+cu111"
+   }
+ }
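These version pins record the environment the model was exported from. To load it in a matching environment (a suggestion, not a hard requirement; newer versions are usually backward compatible):

```
pip install sentence-transformers==2.2.0 transformers==4.20.1
```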
eval/loss_evaluation_dev_results.csv ADDED
@@ -0,0 +1,251 @@
+ epoch,steps,loss
+ 0,100,0.19519549049928608
+ 0,200,0.11560484829516172
+ 0,300,0.08679192887225318
+ 0,400,0.07406212777689816
+ 0,500,0.06499064764375434
+ 0,600,0.05987601100840652
+ 0,700,0.055965147196659834
+ 0,800,0.05774995323606245
+ 0,900,0.05295955123265526
+ 0,1000,0.05274097613280328
+ 0,1100,0.05067385239212754
+ 0,1200,0.05087106744197604
+ 0,1300,0.049993227635551966
+ 0,1400,0.048679257646413274
+ 0,1500,0.04968225817311177
+ 0,1600,0.051728904252822075
+ 0,1700,0.04877414873885723
+ 0,1800,0.05247588143790906
+ 0,1900,0.047604353220216084
+ 0,2000,0.04749206604852059
+ 0,2100,0.04782016522758276
+ 0,2200,0.04651157213780695
+ 0,2300,0.04710628148749038
+ 0,2400,0.046624648567542
+ 0,2500,0.04509524600019381
+ 0,2600,0.04553730478420182
+ 0,2700,0.04443945448109487
+ 0,2800,0.0504640726274293
+ 0,2900,0.04794691643282241
+ 0,3000,0.04631040973041582
+ 0,3100,0.0419986342476275
+ 0,3200,0.04298793683305728
+ 0,3300,0.04493164272913312
+ 0,3400,0.048255282522043044
+ 0,3500,0.04978693392673049
+ 0,3600,0.04584045348395002
+ 0,3700,0.04929085410937766
+ 0,3800,0.048445018135582926
+ 0,3900,0.046708384944145157
+ 0,4000,0.04662567339258236
+ 0,4100,0.0472695262937003
+ 0,4200,0.048288902709505144
+ 0,4300,0.048463549996224084
+ 0,4400,0.043781441836662466
+ 0,4500,0.04312372630505224
+ 0,4600,0.045321417531253336
+ 0,4700,0.04252031567532863
+ 0,4800,0.0530112666398089
+ 0,4900,0.052159558343869504
+ 0,5000,0.052686183118791245
+ 0,5100,0.04998561888884692
+ 0,5200,0.044343194892997005
+ 0,5300,0.0423403514099241
+ 0,5400,0.04481474702306517
+ 0,5500,0.04676144200633235
+ 0,5600,0.04174483070197358
+ 0,5700,0.04355011108918061
+ 0,5800,0.04652475086493452
+ 0,5900,0.045437329526519125
+ 0,6000,0.044627202456709925
+ 0,6100,0.043920307074457196
+ 0,6200,0.042049196839164645
+ 0,6300,0.04682356477219086
+ 0,6400,0.04487424387279889
+ 0,6500,0.041516137345119
+ 0,6600,0.04123407529385229
+ 0,6700,0.03734822506002114
+ 0,6800,0.04004483578493084
+ 0,6900,0.04361605496124544
+ 0,7000,0.044393963018599165
+ 0,7100,0.04498864975572355
+ 0,7200,0.044416080061861235
+ 0,7300,0.04217950248869233
+ 0,7400,0.04202356934366427
+ 0,7500,0.04097753317170045
+ 0,7600,0.03903316448376711
+ 0,7700,0.04317112945482087
+ 0,7800,0.04497662772605678
+ 0,7900,0.04109697778423021
+ 0,8000,0.04386395559431636
+ 0,8100,0.04435155229125319
+ 0,8200,0.040241758321292356
+ 0,8300,0.04920905432964724
+ 0,8400,0.045273166227681634
+ 0,8500,0.045771352062498875
+ 0,8600,0.03970043939072392
+ 0,8700,0.041097908408486525
+ 0,8800,0.04337787134086743
+ 0,8900,0.043671976632325096
+ 0,9000,0.040776167853089046
+ 0,9100,0.04171571797915774
+ 0,9200,0.03746827632520056
+ 0,9300,0.03856413216644577
+ 0,9400,0.041763630464973195
+ 0,9500,0.0395228136582546
+ 0,9600,0.04500009461940554
+ 0,9700,0.04361399264472892
+ 0,9800,0.047162896827277506
+ 0,9900,0.04293111109975825
+ 0,10000,0.04538575671103895
+ 0,10100,0.043648700229026886
+ 0,10200,0.04136474249746654
+ 0,10300,0.04508329086149529
+ 0,10400,0.04102850488844959
+ 0,10500,0.042174578120627075
+ 0,10600,0.045043971799346896
+ 0,10700,0.0436181597908299
+ 0,10800,0.045259078109792475
+ 0,10900,0.04371035268960593
+ 0,11000,0.05035991068870275
+ 0,11100,0.050761380571160454
+ 0,11200,0.04406444633185185
+ 0,11300,0.04401907154579702
+ 0,11400,0.04374491291463001
+ 0,11500,0.041598092203370504
+ 0,11600,0.041415777919197524
+ 0,11700,0.04249067280007211
+ 0,11800,0.03923704199554693
+ 0,11900,0.0363335097560149
+ 0,12000,0.04222154671425733
+ 0,12100,0.03865254473414243
+ 0,12200,0.03969156562322112
+ 0,12300,0.03945732652428465
+ 0,12400,0.041877292867345935
+ 0,12500,0.036688783095289904
+ 0,12600,0.04137931299509875
+ 0,12700,0.037526527193307416
+ 0,12800,0.03955853321622893
+ 0,12900,0.04099604392775696
+ 0,13000,0.038100052026215914
+ 0,13100,0.04037489445645954
+ 0,13200,0.037006299523469385
+ 0,13300,0.042210353803639335
+ 0,13400,0.042162665614587515
+ 0,13500,0.04045078091329652
+ 0,13600,0.04178211537794941
+ 0,13700,0.03652732793331884
+ 0,13800,0.04007450492148122
+ 0,13900,0.040218797176888324
+ 0,14000,0.03825300664909627
+ 0,14100,0.04205769400583465
+ 0,14200,0.04096333694347577
+ 0,14300,0.0389199056238846
+ 0,14400,0.037719650394660416
+ 0,14500,0.04263562075523331
+ 0,14600,0.03808142022118219
+ 0,14700,0.04628894311818186
+ 0,14800,0.039785022687983417
+ 0,14900,0.039248060891297155
+ 0,15000,0.04015960164872535
+ 0,15100,0.04400960119832234
+ 0,15200,0.044337519492261744
+ 0,15300,0.04161765173295095
+ 0,15400,0.04071474287225717
+ 0,15500,0.039765120246020164
+ 0,15600,0.042707479120178665
+ 0,15700,0.04196122203464124
+ 0,15800,0.03900735156519495
+ 0,15900,0.036981938280766895
+ 0,16000,0.03967288962420271
+ 0,16100,0.036723857662762045
+ 0,16200,0.04005734996749844
+ 0,16300,0.04027912320752289
+ 0,16400,0.043616688434242885
+ 0,16500,0.042757092717327604
+ 0,16600,0.040512548224817806
+ 0,16700,0.03594136324969477
+ 0,16800,0.038857869270918104
+ 0,16900,0.04087193688661806
+ 0,17000,0.03912139527871697
+ 0,17100,0.03842234752314098
+ 0,17200,0.03649764288259497
+ 0,17300,0.04245655374152135
+ 0,17400,0.039467562094128494
+ 0,17500,0.03991257693460278
+ 0,17600,0.04171786952817289
+ 0,17700,0.04471105680426285
+ 0,17800,0.0367856082773753
+ 0,17900,0.03679781602542855
+ 0,18000,0.03854221257501377
+ 0,18100,0.040181813599715586
+ 0,18200,0.0407157541238927
+ 0,18300,0.037851696226764577
+ 0,18400,0.03831218913948021
+ 0,18500,0.03791270016791887
+ 0,18600,0.03622766606910176
+ 0,18700,0.03551119881726873
+ 0,18800,0.03778034173768933
+ 0,18900,0.03405767042893223
+ 0,19000,0.03123430945533104
+ 0,19100,0.037109243501212134
+ 0,19200,0.036391455788788406
+ 0,19300,0.032642522298414564
+ 0,19400,0.03444629282929268
+ 0,19500,0.03728879319979016
+ 0,19600,0.03744477383985601
+ 0,19700,0.03397694265227539
+ 0,19800,0.03912842301241188
+ 0,19900,0.03756071515860115
+ 0,20000,0.03825289866256772
+ 0,20100,0.037043497484298006
+ 0,20200,0.03586015019140629
+ 0,20300,0.03841649508690972
+ 0,20400,0.03709434958143799
+ 0,20500,0.03766999650176518
+ 0,20600,0.03719969458871243
+ 0,20700,0.03763643987506886
+ 0,20800,0.03661399590211345
+ 0,20900,0.034543956276607314
+ 0,21000,0.037338983882914366
+ 0,21100,0.038684293762035145
+ 0,21200,0.03122012103122229
+ 0,21300,0.03625594341468651
+ 0,21400,0.03636522202243
+ 0,21500,0.03669486281276811
+ 0,21600,0.03786981438117198
+ 0,21700,0.03672024818368426
+ 0,21800,0.036491299151409376
+ 0,21900,0.033634753646258855
+ 0,22000,0.037865872911989916
+ 0,22100,0.03907738132622352
+ 0,22200,0.034167471399115856
+ 0,22300,0.03912497054712691
+ 0,22400,0.04040111948641333
+ 0,22500,0.04145388534234468
+ 0,22600,0.03720971221760168
+ 0,22700,0.033648781347541845
+ 0,22800,0.03764335221710776
+ 0,22900,0.036039476440455374
+ 0,23000,0.03600912533784493
+ 0,23100,0.03687414772574997
+ 0,23200,0.04035678972016075
+ 0,23300,0.03742495229770756
+ 0,23400,0.0347357924013799
+ 0,23500,0.03706875827863819
+ 0,23600,0.0378347951889791
+ 0,23700,0.03531763351729598
+ 0,23800,0.036277216902136
+ 0,23900,0.03563792866617466
+ 0,24000,0.03703486210005108
+ 0,24100,0.037769587493760956
+ 0,24200,0.03749001277966459
+ 0,24300,0.03960652796490469
+ 0,24400,0.036781374730451545
+ 0,24500,0.03711627634396336
+ 0,24600,0.03975872469308434
+ 0,24700,0.03539313475455226
+ 0,24800,0.03443953339789755
+ 0,24900,0.03367993894758666
+ 0,25000,0.036054142671664645
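The dev loss drops steeply over the first ~2000 steps and then flattens into the 0.03-0.05 range. A minimal sketch for plotting the curve (assumes `pandas` and `matplotlib` are available; not part of the commit):

```python
# Sketch only: visualize the dev-loss log shipped with the model.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("eval/loss_evaluation_dev_results.csv")
plt.plot(df["steps"], df["loss"])
plt.xlabel("training step")
plt.ylabel("dev loss")
plt.title("Dev loss during training (epoch 0)")
plt.show()
```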
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
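modules.json declares the two-stage pipeline that `SentenceTransformer` reassembles at load time: the transformer encoder at the repo root (module 0), followed by the pooling module from `1_Pooling/` (module 1). A hedged sketch of the equivalent manual construction, with `{MODEL_NAME}` as the repo id placeholder:

```python
# Sketch only: build the same Transformer -> Pooling pipeline by hand.
from sentence_transformers import SentenceTransformer, models

word_embedding_model = models.Transformer('{MODEL_NAME}', max_seq_length=512)  # module idx 0
pooling_model = models.Pooling(                                                # module idx 1
    word_embedding_model.get_word_embedding_dimension(),  # 1024
    pooling_mode='mean',
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```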
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0ca163577f3a570f7600781ce1f5ae43dc89e5c0d73c3f8ae80bb706a4a5d372
+ size 1337719025
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "mask_token": "[MASK]",
+   "name_or_path": "/home/ruimelo/.cache/torch/sentence_transformers/neuralmind_bert-large-portuguese-cased",
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "special_tokens_map_file": "/home/ruimelo/.cache/torch/sentence_transformers/neuralmind_bert-large-portuguese-cased/special_tokens_map.json",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff