asahi417 committed on
Commit
393abff
·
1 Parent(s): 1f7bb49
README.md CHANGED
@@ -2,7 +2,7 @@
2
  datasets:
3
  - relbert/semeval2012_relational_similarity
4
  model-index:
5
- - name: relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce
6
  results:
7
  - task:
8
  name: Relation Mapping
@@ -14,7 +14,7 @@ model-index:
14
  metrics:
15
  - name: Accuracy
16
  type: accuracy
17
- value: 0.9407142857142857
18
  - task:
19
  name: Analogy Questions (SAT full)
20
  type: multiple-choice-qa
@@ -25,7 +25,7 @@ model-index:
25
  metrics:
26
  - name: Accuracy
27
  type: accuracy
28
- value: 0.7165775401069518
29
  - task:
30
  name: Analogy Questions (SAT)
31
  type: multiple-choice-qa
@@ -36,7 +36,7 @@ model-index:
36
  metrics:
37
  - name: Accuracy
38
  type: accuracy
39
- value: 0.7151335311572701
40
  - task:
41
  name: Analogy Questions (BATS)
42
  type: multiple-choice-qa
@@ -47,7 +47,7 @@ model-index:
47
  metrics:
48
  - name: Accuracy
49
  type: accuracy
50
- value: 0.8126737076153419
51
  - task:
52
  name: Analogy Questions (Google)
53
  type: multiple-choice-qa
@@ -58,7 +58,7 @@ model-index:
58
  metrics:
59
  - name: Accuracy
60
  type: accuracy
61
- value: 0.964
62
  - task:
63
  name: Analogy Questions (U2)
64
  type: multiple-choice-qa
@@ -69,7 +69,7 @@ model-index:
69
  metrics:
70
  - name: Accuracy
71
  type: accuracy
72
- value: 0.7192982456140351
73
  - task:
74
  name: Analogy Questions (U4)
75
  type: multiple-choice-qa
@@ -80,7 +80,7 @@ model-index:
80
  metrics:
81
  - name: Accuracy
82
  type: accuracy
83
- value: 0.6921296296296297
84
  - task:
85
  name: Lexical Relation Classification (BLESS)
86
  type: classification
@@ -91,10 +91,10 @@ model-index:
91
  metrics:
92
  - name: F1
93
  type: f1
94
- value: 0.9251167696248305
95
  - name: F1 (macro)
96
  type: f1_macro
97
- value: 0.9196346516032302
98
  - task:
99
  name: Lexical Relation Classification (CogALexV)
100
  type: classification
@@ -105,10 +105,10 @@ model-index:
105
  metrics:
106
  - name: F1
107
  type: f1
108
- value: 0.8607981220657277
109
  - name: F1 (macro)
110
  type: f1_macro
111
- value: 0.7029050751301971
112
  - task:
113
  name: Lexical Relation Classification (EVALution)
114
  type: classification
@@ -119,10 +119,10 @@ model-index:
119
  metrics:
120
  - name: F1
121
  type: f1
122
- value: 0.6841820151679306
123
  - name: F1 (macro)
124
  type: f1_macro
125
- value: 0.6749077706444823
126
  - task:
127
  name: Lexical Relation Classification (K&H+N)
128
  type: classification
@@ -133,10 +133,10 @@ model-index:
133
  metrics:
134
  - name: F1
135
  type: f1
136
- value: 0.9595882312026153
137
  - name: F1 (macro)
138
  type: f1_macro
139
- value: 0.8789234569786473
140
  - task:
141
  name: Lexical Relation Classification (ROOT09)
142
  type: classification
@@ -150,33 +150,28 @@ model-index:
150
  value: 0.9103729238483234
151
  - name: F1 (macro)
152
  type: f1_macro
153
- value: 0.9076447549069412
154
 
155
  ---
 
156
 
157
- # Copy of https://huggingface.co/relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce
158
-
159
- # relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce
160
-
161
- RelBERT fine-tuned from [roberta-large](https://huggingface.co/roberta-large) on
162
- [relbert/semeval2012_relational_similarity](https://huggingface.co/datasets/relbert/semeval2012_relational_similarity).
163
- Fine-tuning is done via [RelBERT](https://github.com/asahi417/relbert) library (see the repository for more detail).
164
- It achieves the following results on the relation understanding tasks:
165
- - Analogy Question ([dataset](https://huggingface.co/datasets/relbert/analogy_questions), [full result](https://huggingface.co/relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce/raw/main/analogy.json)):
166
- - Accuracy on SAT (full): 0.7165775401069518
167
- - Accuracy on SAT: 0.7151335311572701
168
- - Accuracy on BATS: 0.8126737076153419
169
- - Accuracy on U2: 0.7192982456140351
170
- - Accuracy on U4: 0.6921296296296297
171
- - Accuracy on Google: 0.964
172
- - Lexical Relation Classification ([dataset](https://huggingface.co/datasets/relbert/lexical_relation_classification), [full result](https://huggingface.co/relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce/raw/main/classification.json)):
173
- - Micro F1 score on BLESS: 0.9251167696248305
174
- - Micro F1 score on CogALexV: 0.8607981220657277
175
- - Micro F1 score on EVALution: 0.6841820151679306
176
- - Micro F1 score on K&H+N: 0.9595882312026153
177
  - Micro F1 score on ROOT09: 0.9103729238483234
178
- - Relation Mapping ([dataset](https://huggingface.co/datasets/relbert/relation_mapping), [full result](https://huggingface.co/relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce/raw/main/relation_mapping.json)):
179
- - Accuracy on Relation Mapping: 0.9407142857142857
180
 
181
 
182
  ### Usage
@@ -187,49 +182,49 @@ pip install relbert
187
  and activate the model as below.
188
  ```python
189
  from relbert import RelBERT
190
- model = RelBERT("relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce")
191
- vector = model.get_embedding(['Tokyo', 'Japan']) # shape of (1024, )
192
  ```
193
 
194
  ### Training hyperparameters
195
 
196
- The following hyperparameters were used during training:
197
  - model: roberta-large
198
  - max_length: 64
199
- - mode: average
200
- - data: relbert/semeval2012_relational_similarity
201
- - template_mode: manual
202
- - template: I wasn’t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>
203
- - loss_function: nce_logout
204
- - temperature_nce_constant: 0.05
205
- - temperature_nce_rank: {'min': 0.01, 'max': 0.05, 'type': 'linear'}
206
- - epoch: 29
207
- - batch: 128
208
- - lr: 5e-06
209
- - lr_decay: False
210
- - lr_warmup: 1
211
- - weight_decay: 0
212
  - random_seed: 0
 
 
 
 
213
  - exclude_relation: None
214
- - n_sample: 640
215
- - gradient_accumulation: 8
 
 
 
216
 
217
- The full configuration can be found at [fine-tuning parameter file](https://huggingface.co/relbert/relbert-roberta-large-semeval2012-average-prompt-d-nce/raw/main/trainer_config.json).
218
 
219
  ### Reference
220
- If you use any resource from RelBERT, please consider to cite our [paper](https://aclanthology.org/2021.eacl-demos.7/).
221
 
222
  ```
223
 
224
- @inproceedings{ushio-etal-2021-distilling-relation-embeddings,
225
- title = "{D}istilling {R}elation {E}mbeddings from {P}re-trained {L}anguage {M}odels",
226
  author = "Ushio, Asahi and
227
- Schockaert, Steven and
228
- Camacho-Collados, Jose",
229
- booktitle = "EMNLP 2021",
 
230
  year = "2021",
231
- address = "Online",
232
  publisher = "Association for Computational Linguistics",
 
 
 
 
233
  }
234
 
235
  ```
 
2
  datasets:
3
  - relbert/semeval2012_relational_similarity
4
  model-index:
5
+ - name: relbert/relbert-roberta-large-nce-d-0
6
  results:
7
  - task:
8
  name: Relation Mapping
 
14
  metrics:
15
  - name: Accuracy
16
  type: accuracy
17
+ value: 0.8022222222222222
18
  - task:
19
  name: Analogy Questions (SAT full)
20
  type: multiple-choice-qa
 
25
  metrics:
26
  - name: Accuracy
27
  type: accuracy
28
+ value: 0.7219251336898396
29
  - task:
30
  name: Analogy Questions (SAT)
31
  type: multiple-choice-qa
 
36
  metrics:
37
  - name: Accuracy
38
  type: accuracy
39
+ value: 0.7270029673590505
40
  - task:
41
  name: Analogy Questions (BATS)
42
  type: multiple-choice-qa
 
47
  metrics:
48
  - name: Accuracy
49
  type: accuracy
50
+ value: 0.7937743190661478
51
  - task:
52
  name: Analogy Questions (Google)
53
  type: multiple-choice-qa
 
58
  metrics:
59
  - name: Accuracy
60
  type: accuracy
61
+ value: 0.942
62
  - task:
63
  name: Analogy Questions (U2)
64
  type: multiple-choice-qa
 
69
  metrics:
70
  - name: Accuracy
71
  type: accuracy
72
+ value: 0.6578947368421053
73
  - task:
74
  name: Analogy Questions (U4)
75
  type: multiple-choice-qa
 
80
  metrics:
81
  - name: Accuracy
82
  type: accuracy
83
+ value: 0.6527777777777778
84
  - task:
85
  name: Lexical Relation Classification (BLESS)
86
  type: classification
 
91
  metrics:
92
  - name: F1
93
  type: f1
94
+ value: 0.9193912912460449
95
  - name: F1 (macro)
96
  type: f1_macro
97
+ value: 0.9154281897413815
98
  - task:
99
  name: Lexical Relation Classification (CogALexV)
100
  type: classification
 
105
  metrics:
106
  - name: F1
107
  type: f1
108
+ value: 0.855868544600939
109
  - name: F1 (macro)
110
  type: f1_macro
111
+ value: 0.6848847419105534
112
  - task:
113
  name: Lexical Relation Classification (EVALution)
114
  type: classification
 
119
  metrics:
120
  - name: F1
121
  type: f1
122
+ value: 0.6890574214517876
123
  - name: F1 (macro)
124
  type: f1_macro
125
+ value: 0.6776463157716582
126
  - task:
127
  name: Lexical Relation Classification (K&H+N)
128
  type: classification
 
133
  metrics:
134
  - name: F1
135
  type: f1
136
+ value: 0.960283786603603
137
  - name: F1 (macro)
138
  type: f1_macro
139
+ value: 0.8788008952006335
140
  - task:
141
  name: Lexical Relation Classification (ROOT09)
142
  type: classification
 
150
  value: 0.9103729238483234
151
  - name: F1 (macro)
152
  type: f1_macro
153
+ value: 0.9087639459806999
154
 
155
  ---
156
+ # relbert/relbert-roberta-large-nce-d-0
157
 
158
+ RelBERT based on [roberta-large](https://huggingface.co/roberta-large), fine-tuned on [relbert/semeval2012_relational_similarity](https://huggingface.co/datasets/relbert/semeval2012_relational_similarity) (see the [`relbert`](https://github.com/asahi417/relbert) repository for details of the fine-tuning).
159
+ This model achieves the following results on the relation understanding tasks:
160
+ - Analogy Question ([dataset](https://huggingface.co/datasets/relbert/analogy_questions), [full result](https://huggingface.co/relbert/relbert-roberta-large-nce-d-0/raw/main/analogy.forward.json)):
161
+ - Accuracy on SAT (full): 0.7219251336898396
162
+ - Accuracy on SAT: 0.7270029673590505
163
+ - Accuracy on BATS: 0.7937743190661478
164
+ - Accuracy on U2: 0.6578947368421053
165
+ - Accuracy on U4: 0.6527777777777778
166
+ - Accuracy on Google: 0.942
167
+ - Lexical Relation Classification ([dataset](https://huggingface.co/datasets/relbert/lexical_relation_classification), [full result](https://huggingface.co/relbert/relbert-roberta-large-nce-d-0/raw/main/classification.json)):
168
+ - Micro F1 score on BLESS: 0.9193912912460449
169
+ - Micro F1 score on CogALexV: 0.855868544600939
170
+ - Micro F1 score on EVALution: 0.6890574214517876
171
+ - Micro F1 score on K&H+N: 0.960283786603603
 
 
 
 
 
 
172
  - Micro F1 score on ROOT09: 0.9103729238483234
173
+ - Relation Mapping ([dataset](https://huggingface.co/datasets/relbert/relation_mapping), [full result](https://huggingface.co/relbert/relbert-roberta-large-nce-d-0/raw/main/relation_mapping.json)):
174
+ - Accuracy on Relation Mapping: 0.8022222222222222
175
 
176
 
177
  ### Usage
 
182
  and activate the model as below.
183
  ```python
184
  from relbert import RelBERT
185
+ model = RelBERT("relbert/relbert-roberta-large-nce-d-0")
186
+ vector = model.get_embedding(['Tokyo', 'Japan']) # shape of (n_dim, )
187
  ```
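
The analogy result files in this commit record `"distance_function": "cosine_similarity"`. The snippet below is a minimal sketch of how an analogy question could be scored with relation embeddings under that assumption. The toy 3-dimensional vectors stand in for the output of `model.get_embedding`, and the helper names `cosine_similarity` and `pick_analogy` are hypothetical, not part of the `relbert` API.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pick_analogy(query_vec, candidate_vecs):
    """Return the index of the candidate most similar to the query, plus all scores."""
    scores = [cosine_similarity(query_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy vectors (assumption): in practice each would be a relation embedding
# of one word pair, e.g. model.get_embedding(['Tokyo', 'Japan']).
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],    # orthogonal to the query
    [2.0, 0.1, 1.9],    # nearly parallel to the query
    [-1.0, 0.0, -1.0],  # opposite direction
]
best, scores = pick_analogy(query, candidates)
print(best, scores)
```

The candidate with the highest cosine similarity to the query pair is taken as the answer, so vector length does not matter, only direction.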
188
 
189
  ### Training hyperparameters
190
 
 
191
  - model: roberta-large
192
  - max_length: 64
193
+ - epoch: 10
194
+ - batch: 32
 
 
 
 
 
 
 
 
 
 
 
195
  - random_seed: 0
196
+ - lr: 5e-06
197
+ - lr_warmup: 10
198
+ - aggregation_mode: average_no_mask
199
+ - data: relbert/semeval2012_relational_similarity
200
  - exclude_relation: None
201
+ - split: train
202
+ - split_valid: validation
203
+ - loss_function: nce
204
+ - classification_loss: False
205
+ - loss_function_config: {'temperature': 0.05, 'gradient_accumulation': 1, 'num_negative': 400, 'num_positive': 10}
206
 
207
+ See the full configuration at [config file](https://huggingface.co/relbert/relbert-roberta-large-nce-d-0/raw/main/finetuning_config.json).
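
The hyperparameters above list `loss_function: nce` with `temperature: 0.05`. As a rough illustration only, the following sketch implements a generic InfoNCE-style contrastive loss for one positive and several negatives. It assumes similarities are already computed as plain floats; the function name `info_nce_loss` is hypothetical, and this is not confirmed to match the exact formulation in the `relbert` library.

```python
import math


def info_nce_loss(pos_sim, neg_sims, temperature=0.05):
    """Generic InfoNCE loss: -log( exp(s+/t) / (exp(s+/t) + sum_j exp(s_j-/t)) )."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# A well-separated positive gives a small loss; a poorly ranked one gives a large loss.
print(info_nce_loss(0.9, [0.1, 0.0]))
print(info_nce_loss(0.1, [0.9]))
```

A small temperature such as 0.05 sharpens the softmax, so even modest gaps between the positive and negative similarities change the loss substantially.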
208
 
209
  ### Reference
210
+ If you use any resource from RelBERT, please consider citing our [paper](https://aclanthology.org/2021.emnlp-main.712/).
211
 
212
  ```
213
 
214
+ @inproceedings{ushio-etal-2021-distilling,
215
+ title = "Distilling Relation Embeddings from Pretrained Language Models",
216
  author = "Ushio, Asahi and
217
+ Camacho-Collados, Jose and
218
+ Schockaert, Steven",
219
+ booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
220
+ month = nov,
221
  year = "2021",
222
+ address = "Online and Punta Cana, Dominican Republic",
223
  publisher = "Association for Computational Linguistics",
224
+ url = "https://aclanthology.org/2021.emnlp-main.712",
225
+ doi = "10.18653/v1/2021.emnlp-main.712",
226
+ pages = "9044--9062",
227
+ abstract = "Pre-trained language models have been found to capture a surprisingly rich amount of lexical knowledge, ranging from commonsense properties of everyday concepts to detailed factual knowledge about named entities. Among others, this makes it possible to distill high-quality word vectors from pre-trained language models. However, it is currently unclear to what extent it is possible to distill relation embeddings, i.e. vectors that characterize the relationship between two words. Such relation embeddings are appealing because they can, in principle, encode relational knowledge in a more fine-grained way than is possible with knowledge graphs. To obtain relation embeddings from a pre-trained language model, we encode word pairs using a (manually or automatically generated) prompt, and we fine-tune the language model such that relationally similar word pairs yield similar output vectors. We find that the resulting relation embeddings are highly competitive on analogy (unsupervised) and relation classification (supervised) benchmarks, even without any task-specific fine-tuning. Source code to reproduce our experimental results and the model checkpoints are available in the following repository: https://github.com/asahi417/relbert",
228
  }
229
 
230
  ```
analogy.bidirection.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"distance_function": "cosine_similarity", "model": "relbert_output/ckpt/nce/template-d.random-0/model", "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>", "aggregation": "average_no_mask", "sat/test": 0.7270029673590505, "u2/test": 0.6578947368421053, "u4/test": 0.6527777777777778, "google/test": 0.942, "bats/test": 0.7937743190661478, "sat_full": 0.7219251336898396, "sat/validation": 0.6756756756756757, "u2/validation": 0.625, "u4/validation": 0.6458333333333334, "google/validation": 0.96, "bats/validation": 0.8341708542713567}
analogy.forward.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"distance_function": "cosine_similarity", "model": "relbert_output/ckpt/nce/template-d.random-0/model", "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>", "aggregation": "average_no_mask", "sat/test": 0.7270029673590505, "u2/test": 0.6578947368421053, "u4/test": 0.6527777777777778, "google/test": 0.942, "bats/test": 0.7937743190661478, "sat_full": 0.7219251336898396, "sat/validation": 0.6756756756756757, "u2/validation": 0.625, "u4/validation": 0.6458333333333334, "google/validation": 0.96, "bats/validation": 0.8341708542713567}
analogy.reverse.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"distance_function": "cosine_similarity", "model": "relbert_output/ckpt/nce/template-d.random-0/model", "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>", "aggregation": "average_no_mask", "sat/test": 0.7270029673590505, "u2/test": 0.6578947368421053, "u4/test": 0.6527777777777778, "google/test": 0.942, "bats/test": 0.7937743190661478, "sat_full": 0.7219251336898396, "sat/validation": 0.6756756756756757, "u2/validation": 0.625, "u4/validation": 0.6458333333333334, "google/validation": 0.96, "bats/validation": 0.8341708542713567}
analogy_relation_dataset.bidirection.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"distance_function": "cosine_similarity", "model": "relbert_output/ckpt/nce/template-d.random-0/model", "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>", "aggregation": "average_no_mask", "relbert/semeval2012_relational_similarity/validation": 0.759493670886076}
analogy_relation_dataset.forward.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"distance_function": "cosine_similarity", "model": "relbert_output/ckpt/nce/template-d.random-0/model", "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>", "aggregation": "average_no_mask", "relbert/semeval2012_relational_similarity/validation": 0.759493670886076}
analogy_relation_dataset.reverse.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"distance_function": "cosine_similarity", "model": "relbert_output/ckpt/nce/template-d.random-0/model", "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>", "aggregation": "average_no_mask", "relbert/semeval2012_relational_similarity/validation": 0.7088607594936709}
classification.json CHANGED
@@ -1 +1 @@
1
- {"lexical_relation_classification/BLESS": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.9251167696248305, "test/f1_macro": 0.9196346516032302, "test/f1_micro": 0.9251167696248305, "test/p_macro": 0.9228914915504977, "test/p_micro": 0.9251167696248305, "test/r_macro": 0.9167390763548414, "test/r_micro": 0.9251167696248305}, "lexical_relation_classification/CogALexV": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.8607981220657277, "test/f1_macro": 0.7029050751301971, "test/f1_micro": 0.8607981220657277, "test/p_macro": 0.7313949055416683, "test/p_micro": 0.8607981220657277, "test/r_macro": 0.6804383005626553, "test/r_micro": 0.8607981220657277}, "lexical_relation_classification/EVALution": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": 
true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.6841820151679306, "test/f1_macro": 0.6749077706444823, "test/f1_micro": 0.6841820151679306, "test/p_macro": 0.6822611969186971, "test/p_micro": 0.6841820151679306, "test/r_macro": 0.6691999606355179, "test/r_micro": 0.6841820151679306}, "lexical_relation_classification/K&H+N": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.9595882312026153, "test/f1_macro": 0.8789234569786473, "test/f1_micro": 0.9595882312026153, "test/p_macro": 0.8849555927607803, "test/p_micro": 0.9595882312026153, "test/r_macro": 0.8732016591900109, "test/r_micro": 0.9595882312026153}, "lexical_relation_classification/ROOT09": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.9103729238483234, "test/f1_macro": 0.9076447549069412, "test/f1_micro": 0.9103729238483234, "test/p_macro": 0.9086898349896138, "test/p_micro": 0.9103729238483234, "test/r_macro": 0.9069026663387998, "test/r_micro": 
0.9103729238483234}}
 
1
+ {"lexical_relation_classification/BLESS": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.9193912912460449, "test/f1_macro": 0.9154281897413815, "test/f1_micro": 0.9193912912460449, "test/p_macro": 0.909030256916922, "test/p_micro": 0.9193912912460449, "test/r_macro": 0.9234515644685003, "test/r_micro": 0.9193912912460449}, "lexical_relation_classification/CogALexV": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.855868544600939, "test/f1_macro": 0.6848847419105534, "test/f1_micro": 0.855868544600939, "test/p_macro": 0.7136216119555865, "test/p_micro": 0.855868544600939, "test/r_macro": 0.6606656363820599, "test/r_micro": 0.855868544600939}, "lexical_relation_classification/EVALution": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, 
"power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.6890574214517876, "test/f1_macro": 0.6776463157716582, "test/f1_micro": 0.6890574214517876, "test/p_macro": 0.6879578603963917, "test/p_micro": 0.6890574214517876, "test/r_macro": 0.6716342581634464, "test/r_micro": 0.6890574214517876}, "lexical_relation_classification/K&H+N": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.960283786603603, "test/f1_macro": 0.8788008952006335, "test/f1_micro": 0.960283786603603, "test/p_macro": 0.9021078679303638, "test/p_micro": 0.960283786603603, "test/r_macro": 0.8590979187227483, "test/r_micro": 0.960283786603603}, "lexical_relation_classification/ROOT09": {"classifier_config": {"activation": "relu", "alpha": 0.0001, "batch_size": "auto", "beta_1": 0.9, "beta_2": 0.999, "early_stopping": false, "epsilon": 1e-08, "hidden_layer_sizes": [100], "learning_rate": "constant", "learning_rate_init": 0.001, "max_fun": 15000, "max_iter": 200, "momentum": 0.9, "n_iter_no_change": 10, "nesterovs_momentum": true, "power_t": 0.5, "random_state": 0, "shuffle": true, "solver": "adam", "tol": 0.0001, "validation_fraction": 0.1, "verbose": false, "warm_start": false}, "test/accuracy": 0.9103729238483234, "test/f1_macro": 0.9087639459806999, "test/f1_micro": 0.9103729238483234, "test/p_macro": 0.9080934147603116, "test/p_micro": 0.9103729238483234, "test/r_macro": 0.9094813347276341, "test/r_micro": 
0.9103729238483234}}
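
In the classification results above, `test/accuracy`, `test/f1_micro`, `test/p_micro`, and `test/r_micro` are identical within each dataset, while the macro scores differ. The sketch below shows why this happens for single-label multiclass predictions. The labels and predictions are made-up toy data, and `micro_macro_f1` is a hypothetical helper, not the evaluation code used to produce these files.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Compute micro- and macro-averaged F1 for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Toy data (assumption): one frequent class and two rare ones.
y_true = ["hyper", "hyper", "hyper", "hyper", "mero", "random"]
y_pred = ["hyper", "hyper", "hyper", "hyper", "random", "random"]
micro, macro = micro_macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(micro, macro, accuracy)
```

Each misclassification counts as exactly one false positive and one false negative, so pooled precision, recall, and F1 all reduce to accuracy. Macro F1 weights rare classes equally, which is consistent with the larger micro/macro gap on the more imbalanced datasets.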
config.json CHANGED
@@ -1,12 +1,12 @@
1
  {
2
- "_name_or_path": "roberta-large",
3
  "architectures": [
4
  "RobertaModel"
5
  ],
6
  "attention_probs_dropout_prob": 0.1,
7
  "bos_token_id": 0,
 
8
  "eos_token_id": 2,
9
- "gradient_checkpointing": false,
10
  "hidden_act": "gelu",
11
  "hidden_dropout_prob": 0.1,
12
  "hidden_size": 1024,
@@ -20,11 +20,11 @@
20
  "pad_token_id": 1,
21
  "position_embedding_type": "absolute",
22
  "relbert_config": {
23
- "aggregation_mode": "average",
24
- "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>",
25
- "template_mode": "manual"
26
  },
27
- "transformers_version": "4.6.1",
 
28
  "type_vocab_size": 1,
29
  "use_cache": true,
30
  "vocab_size": 50265
 
1
  {
2
+ "_name_or_path": "relbert-roberta-large-nce-d-0",
3
  "architectures": [
4
  "RobertaModel"
5
  ],
6
  "attention_probs_dropout_prob": 0.1,
7
  "bos_token_id": 0,
8
+ "classifier_dropout": null,
9
  "eos_token_id": 2,
 
10
  "hidden_act": "gelu",
11
  "hidden_dropout_prob": 0.1,
12
  "hidden_size": 1024,
 
20
  "pad_token_id": 1,
21
  "position_embedding_type": "absolute",
22
  "relbert_config": {
23
+ "aggregation_mode": "average_no_mask",
24
+ "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>"
 
25
  },
26
+ "torch_dtype": "float32",
27
+ "transformers_version": "4.21.2",
28
  "type_vocab_size": 1,
29
  "use_cache": true,
30
  "vocab_size": 50265
finetuning_config.json ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "template": "I wasn\u2019t aware of this relationship, but I just read in the encyclopedia that <subj> is the <mask> of <obj>",
3
+ "model": "roberta-large",
4
+ "max_length": 64,
5
+ "epoch": 10,
6
+ "batch": 32,
7
+ "random_seed": 0,
8
+ "lr": 5e-06,
9
+ "lr_warmup": 10,
10
+ "aggregation_mode": "average_no_mask",
11
+ "data": "relbert/semeval2012_relational_similarity",
12
+ "exclude_relation": null,
13
+ "split": "train",
14
+ "split_valid": "validation",
15
+ "loss_function": "nce",
16
+ "classification_loss": false,
17
+ "loss_function_config": {
18
+ "temperature": 0.05,
19
+ "gradient_accumulation": 1,
20
+ "num_negative": 400,
21
+ "num_positive": 10
22
+ }
23
+ }
loss.json ADDED
@@ -0,0 +1 @@
 
 
1
+ {"loss": 142.95112896549483}
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:dfe324fc457c22a64a75a79aa71b8a559ef77deb2afa0bd10c37ffad4791ac2f
3
- size 1421595889
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:01c29cec100f4c5df9b4f783fd780f19366cadea2b7858e9a7e08f34faa678d0
3
+ size 1421575277
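
The `pytorch_model.bin` entry above is a Git LFS pointer rather than the weights themselves. As a small sketch, the pointer text can be parsed to compare the old and new checkpoints. The function name `parse_lfs_pointer` is hypothetical, and the parser assumes the three-field layout shown in this diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer with `version`, `oid`, and `size` lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


old = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:dfe324fc457c22a64a75a79aa71b8a559ef77deb2afa0bd10c37ffad4791ac2f
size 1421595889
""")
new = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:01c29cec100f4c5df9b4f783fd780f19366cadea2b7858e9a7e08f34faa678d0
size 1421575277
""")
print(old["size"] - new["size"])  # difference in bytes between the two checkpoints
```

The changed digest confirms that the weights were replaced, and the size difference is a few kilobytes.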
relation_mapping.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json CHANGED
@@ -1 +1,15 @@
1
- {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}}
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "bos_token": "<s>",
3
+ "cls_token": "<s>",
4
+ "eos_token": "</s>",
5
+ "mask_token": {
6
+ "content": "<mask>",
7
+ "lstrip": true,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false
11
+ },
12
+ "pad_token": "<pad>",
13
+ "sep_token": "</s>",
14
+ "unk_token": "<unk>"
15
+ }
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -1 +1,16 @@
1
- {"unk_token": "<unk>", "bos_token": "<s>", "eos_token": "</s>", "add_prefix_space": false, "errors": "replace", "sep_token": "</s>", "cls_token": "<s>", "pad_token": "<pad>", "mask_token": "<mask>", "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "roberta-large"}
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "add_prefix_space": false,
3
+ "bos_token": "<s>",
4
+ "cls_token": "<s>",
5
+ "eos_token": "</s>",
6
+ "errors": "replace",
7
+ "mask_token": "<mask>",
8
+ "model_max_length": 512,
9
+ "name_or_path": "relbert-roberta-large-nce-d-0",
10
+ "pad_token": "<pad>",
11
+ "sep_token": "</s>",
12
+ "special_tokens_map_file": null,
13
+ "tokenizer_class": "RobertaTokenizer",
14
+ "trim_offsets": true,
15
+ "unk_token": "<unk>"
16
+ }