voidism committed on
Commit
f6c6052
1 Parent(s): b2dfe46

first commit

config.json ADDED
@@ -0,0 +1,26 @@
+ {
+ "_name_or_path": "voidism/diffcse-roberta-base-sts",
+ "architectures": [
+ "RobertaModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "eos_token_id": 2,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.2.1",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 50265
+ }
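A minimal sketch of what the architecture fields in the config.json diff above imply, using only the Python standard library. The `summarize` helper and its output keys are hypothetical names chosen for illustration. The "usable positions" figure assumes the common RoBERTa convention that position ids start at `pad_token_id + 1`, which is not stated in this commit.

```python
import json

# Subset of the fields from the config.json diff above (values copied verbatim).
CONFIG_JSON = """
{
  "hidden_size": 768,
  "intermediate_size": 3072,
  "max_position_embeddings": 514,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "vocab_size": 50265
}
"""


def summarize(config: dict) -> dict:
    """Derive a few quantities implied by the config (hypothetical helper)."""
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return {
        # Each attention head works on an equal slice of the hidden state.
        "head_dim": hidden // heads,
        # Width of the feed-forward layer relative to the hidden size.
        "ffn_ratio": config["intermediate_size"] / hidden,
        # Assumption: position ids start at pad_token_id + 1, so the first
        # pad_token_id + 1 slots of the position table are not used for text.
        "usable_positions": config["max_position_embeddings"]
        - config["pad_token_id"]
        - 1,
        # Size of the word-embedding matrix alone.
        "word_embedding_params": config["vocab_size"] * hidden,
    }


summary = summarize(json.loads(CONFIG_JSON))
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under that assumption, the 514 position slots correspond to a maximum input length of 512 tokens.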
merges.txt ADDED
The diff for this file is too large to render.
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b792aba0b664d672181d12edb092a4e53501e5197f331ff3c97ceed4fed45320
+ size 1487809454
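The three lines added for pytorch_model.bin are a Git LFS pointer, not the weights themselves. The sketch below shows one way to read such a pointer with the standard library; `parse_lfs_pointer` is a hypothetical helper written for this example, not part of any Git LFS tooling.

```python
# Pointer text copied from the pytorch_model.bin diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:b792aba0b664d672181d12edb092a4e53501e5197f331ff3c97ceed4fed45320
size 1487809454
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its fields (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
size_gib = pointer["size_bytes"] / 2**30
print(f"{pointer['hash_algo']} digest, {len(pointer['digest'])} hex characters")
print(f"{pointer['size_bytes']} bytes = {size_gib:.2f} GiB")
```

The `size` field is the byte count of the real file, so the download size can be checked before fetching the weights.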
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}}
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "<unk>", "bos_token": "<s>", "eos_token": "</s>", "add_prefix_space": false, "errors": "replace", "sep_token": "</s>", "cls_token": "<s>", "pad_token": "<pad>", "mask_token": "<mask>", "special_tokens_map_file": null, "name_or_path": "./roberta-base"}
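The two tokenizer files above declare the same special tokens in slightly different shapes (the mask token is an object in one and a plain string in the other). The following sketch, with hypothetical helper names, checks that they agree and pairs the token strings with the ids from config.json. The pairing of `bos_token` with `bos_token_id` and so on is an assumption based on the field names; the vocabulary file that would confirm it is not shown in this diff.

```python
import json

# JSON copied from the special_tokens_map.json and tokenizer_config.json diffs.
SPECIAL_TOKENS_MAP = json.loads(
    r'''{"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>",
    "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>",
    "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true,
    "rstrip": false, "normalized": false}}'''
)
TOKENIZER_CONFIG = json.loads(
    r'''{"unk_token": "<unk>", "bos_token": "<s>", "eos_token": "</s>",
    "add_prefix_space": false, "errors": "replace", "sep_token": "</s>",
    "cls_token": "<s>", "pad_token": "<pad>", "mask_token": "<mask>",
    "special_tokens_map_file": null, "name_or_path": "./roberta-base"}'''
)
# Ids copied from config.json; keyed by the matching token name (assumed pairing).
MODEL_IDS = {"bos_token": 0, "eos_token": 2, "pad_token": 1}


def token_text(value):
    """Return the token string whether it is stored as a string or an object."""
    return value["content"] if isinstance(value, dict) else value


def find_mismatches(special_map: dict, tok_config: dict) -> dict:
    """List special tokens whose text differs between the two files."""
    mismatches = {}
    for name, value in special_map.items():
        if name in tok_config:
            left, right = token_text(value), token_text(tok_config[name])
            if left != right:
                mismatches[name] = (left, right)
    return mismatches


mismatches = find_mismatches(SPECIAL_TOKENS_MAP, TOKENIZER_CONFIG)
id_by_token = {
    token_text(SPECIAL_TOKENS_MAP[name]): token_id
    for name, token_id in MODEL_IDS.items()
}
print("mismatches:", mismatches)
print("ids:", id_by_token)
```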
trainer_state.json ADDED
@@ -0,0 +1,2465 @@
+ {
+ "best_metric": 0.8442339359569462,
+ "best_model_checkpoint": "result/simcse-celectra-amlp-dmlp-2ep-bs64-lr1e-5-mask0.20-elew0.005-roberta-base",
+ "epoch": 2.0,
+ "global_step": 31252,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "electra_acc": 0.1088,
+ "electra_fix_acc": 0.0079,
+ "electra_rep_acc": 0.9968,
+ "epoch": 0.0,
+ "learning_rate": 9.99968002047869e-06,
+ "loss": 8.4419,
+ "neg_sim": 0.2061,
+ "pos_sim": 0.3449,
+ "step": 1
+ },
+ {
+ "epoch": 0.01,
+ "eval_avg_sts": 0.755574714462574,
+ "eval_sickr_spearman": 0.7098302262826256,
+ "eval_stsb_spearman": 0.8013192026425223,
+ "step": 125
+ },
+ {
+ "epoch": 0.02,
+ "eval_avg_sts": 0.761612717263195,
+ "eval_sickr_spearman": 0.7112582869905835,
+ "eval_stsb_spearman": 0.8119671475358067,
+ "step": 250
+ },
+ {
+ "epoch": 0.02,
+ "eval_avg_sts": 0.766754789307388,
+ "eval_sickr_spearman": 0.7152875680591253,
+ "eval_stsb_spearman": 0.8182220105556509,
+ "step": 375
+ },
+ {
+ "electra_acc": 0.8686,
+ "electra_fix_acc": 0.9713,
+ "electra_rep_acc": 0.0311,
+ "epoch": 0.03,
+ "learning_rate": 9.840010239344683e-06,
+ "loss": 0.1729,
+ "neg_sim": -0.0093,
+ "pos_sim": 0.695,
+ "step": 500
+ },
+ {
+ "epoch": 0.03,
+ "eval_avg_sts": 0.7675239503020774,
+ "eval_sickr_spearman": 0.7166760511394983,
+ "eval_stsb_spearman": 0.8183718494646566,
+ "step": 500
+ },
+ {
+ "epoch": 0.04,
+ "eval_avg_sts": 0.7749845711206813,
+ "eval_sickr_spearman": 0.724027931681123,
+ "eval_stsb_spearman": 0.8259412105602397,
+ "step": 625
+ },
+ {
+ "epoch": 0.05,
+ "eval_avg_sts": 0.7751202672393465,
+ "eval_sickr_spearman": 0.7197742012755609,
+ "eval_stsb_spearman": 0.8304663332031321,
+ "step": 750
+ },
+ {
+ "epoch": 0.06,
+ "eval_avg_sts": 0.7760175671946854,
+ "eval_sickr_spearman": 0.7239823981969596,
+ "eval_stsb_spearman": 0.8280527361924113,
+ "step": 875
+ },
+ {
+ "electra_acc": 0.8964,
+ "electra_fix_acc": 0.9953,
+ "electra_rep_acc": 0.0862,
+ "epoch": 0.06,
+ "learning_rate": 9.680020478689364e-06,
+ "loss": 0.0014,
+ "neg_sim": -0.0136,
+ "pos_sim": 0.736,
+ "step": 1000
+ },
+ {
+ "epoch": 0.06,
+ "eval_avg_sts": 0.7775564387382596,
+ "eval_sickr_spearman": 0.7228041472475814,
+ "eval_stsb_spearman": 0.832308730228938,
+ "step": 1000
+ },
+ {
+ "epoch": 0.07,
+ "eval_avg_sts": 0.7784780411860817,
+ "eval_sickr_spearman": 0.7230463680921337,
+ "eval_stsb_spearman": 0.8339097142800296,
+ "step": 1125
+ },
+ {
+ "epoch": 0.08,
+ "eval_avg_sts": 0.7802639531360682,
+ "eval_sickr_spearman": 0.726064546444302,
+ "eval_stsb_spearman": 0.8344633598278346,
+ "step": 1250
+ },
+ {
+ "epoch": 0.09,
+ "eval_avg_sts": 0.778730071271978,
+ "eval_sickr_spearman": 0.7238748085297384,
+ "eval_stsb_spearman": 0.8335853340142177,
+ "step": 1375
+ },
+ {
+ "electra_acc": 0.8989,
+ "electra_fix_acc": 0.9938,
+ "electra_rep_acc": 0.1172,
+ "epoch": 0.1,
+ "learning_rate": 9.520030718034047e-06,
+ "loss": 0.0013,
+ "neg_sim": -0.0137,
+ "pos_sim": 0.7472,
+ "step": 1500
+ },
+ {
+ "epoch": 0.1,
+ "eval_avg_sts": 0.7716526802556551,
+ "eval_sickr_spearman": 0.7141475498664919,
+ "eval_stsb_spearman": 0.8291578106448184,
+ "step": 1500
+ },
+ {
+ "epoch": 0.1,
+ "eval_avg_sts": 0.7758744711124749,
+ "eval_sickr_spearman": 0.721181992858712,
+ "eval_stsb_spearman": 0.8305669493662378,
+ "step": 1625
+ },
+ {
+ "epoch": 0.11,
+ "eval_avg_sts": 0.7763031700578917,
+ "eval_sickr_spearman": 0.7214235412678442,
+ "eval_stsb_spearman": 0.8311827988479393,
+ "step": 1750
+ },
+ {
+ "epoch": 0.12,
+ "eval_avg_sts": 0.7806434242533656,
+ "eval_sickr_spearman": 0.7289663454376855,
+ "eval_stsb_spearman": 0.8323205030690458,
+ "step": 1875
+ },
+ {
+ "electra_acc": 0.8998,
+ "electra_fix_acc": 0.9931,
+ "electra_rep_acc": 0.1339,
+ "epoch": 0.13,
+ "learning_rate": 9.36004095737873e-06,
+ "loss": 0.0012,
+ "neg_sim": -0.0138,
+ "pos_sim": 0.7538,
+ "step": 2000
+ },
+ {
+ "epoch": 0.13,
+ "eval_avg_sts": 0.7795467000474285,
+ "eval_sickr_spearman": 0.722814521965492,
+ "eval_stsb_spearman": 0.836278878129365,
+ "step": 2000
+ },
+ {
+ "epoch": 0.14,
+ "eval_avg_sts": 0.7831655326956792,
+ "eval_sickr_spearman": 0.7220971294344124,
+ "eval_stsb_spearman": 0.8442339359569462,
+ "step": 2125
+ },
+ {
+ "epoch": 0.14,
+ "eval_avg_sts": 0.7784529836143297,
+ "eval_sickr_spearman": 0.7178746672758864,
+ "eval_stsb_spearman": 0.8390312999527729,
+ "step": 2250
+ },
+ {
+ "epoch": 0.15,
+ "eval_avg_sts": 0.7838110056478796,
+ "eval_sickr_spearman": 0.7265441850232628,
+ "eval_stsb_spearman": 0.8410778262724965,
+ "step": 2375
+ },
+ {
+ "electra_acc": 0.9008,
+ "electra_fix_acc": 0.9925,
+ "electra_rep_acc": 0.1492,
+ "epoch": 0.16,
+ "learning_rate": 9.20005119672341e-06,
+ "loss": 0.0012,
+ "neg_sim": -0.0138,
+ "pos_sim": 0.7534,
+ "step": 2500
+ },
+ {
+ "epoch": 0.16,
+ "eval_avg_sts": 0.781754791702129,
+ "eval_sickr_spearman": 0.723123794227652,
+ "eval_stsb_spearman": 0.8403857891766059,
+ "step": 2500
+ },
+ {
+ "epoch": 0.17,
+ "eval_avg_sts": 0.7798490649002168,
+ "eval_sickr_spearman": 0.7198115694724797,
+ "eval_stsb_spearman": 0.8398865603279537,
+ "step": 2625
+ },
+ {
+ "epoch": 0.18,
+ "eval_avg_sts": 0.7787504502510374,
+ "eval_sickr_spearman": 0.7188935990617945,
+ "eval_stsb_spearman": 0.8386073014402803,
+ "step": 2750
+ },
+ {
+ "epoch": 0.18,
+ "eval_avg_sts": 0.7702513371927302,
+ "eval_sickr_spearman": 0.7122062768396675,
+ "eval_stsb_spearman": 0.828296397545793,
+ "step": 2875
+ },
+ {
+ "electra_acc": 0.9012,
+ "electra_fix_acc": 0.9922,
+ "electra_rep_acc": 0.157,
+ "epoch": 0.19,
+ "learning_rate": 9.040061436068092e-06,
+ "loss": 0.0012,
+ "neg_sim": -0.0139,
+ "pos_sim": 0.7662,
+ "step": 3000
+ },
+ {
+ "epoch": 0.19,
+ "eval_avg_sts": 0.77796510106683,
+ "eval_sickr_spearman": 0.7215734943665341,
+ "eval_stsb_spearman": 0.834356707767126,
+ "step": 3000
+ },
+ {
+ "epoch": 0.2,
+ "eval_avg_sts": 0.7773533173320899,
+ "eval_sickr_spearman": 0.7216049067068746,
+ "eval_stsb_spearman": 0.8331017279573052,
+ "step": 3125
+ },
+ {
+ "epoch": 0.21,
+ "eval_avg_sts": 0.7727609962405579,
+ "eval_sickr_spearman": 0.7192614692677092,
+ "eval_stsb_spearman": 0.8262605232134068,
+ "step": 3250
+ },
+ {
+ "epoch": 0.22,
+ "eval_avg_sts": 0.7705397769228735,
+ "eval_sickr_spearman": 0.7183345170410549,
+ "eval_stsb_spearman": 0.8227450368046919,
+ "step": 3375
+ },
+ {
+ "electra_acc": 0.902,
+ "electra_fix_acc": 0.9917,
+ "electra_rep_acc": 0.1668,
+ "epoch": 0.22,
+ "learning_rate": 8.880071675412775e-06,
+ "loss": 0.0012,
+ "neg_sim": -0.0138,
+ "pos_sim": 0.762,
+ "step": 3500
+ },
+ {
+ "epoch": 0.22,
+ "eval_avg_sts": 0.7669127684681659,
+ "eval_sickr_spearman": 0.7153820452356542,
+ "eval_stsb_spearman": 0.8184434917006776,
+ "step": 3500
+ },
+ {
+ "epoch": 0.23,
+ "eval_avg_sts": 0.762067078104866,
+ "eval_sickr_spearman": 0.708602167081056,
+ "eval_stsb_spearman": 0.815531989128676,
+ "step": 3625
+ },
+ {
+ "epoch": 0.24,
+ "eval_avg_sts": 0.7603436095667805,
+ "eval_sickr_spearman": 0.7039441757894171,
+ "eval_stsb_spearman": 0.8167430433441438,
+ "step": 3750
+ },
+ {
+ "epoch": 0.25,
+ "eval_avg_sts": 0.7572330050419072,
+ "eval_sickr_spearman": 0.7035958853781614,
+ "eval_stsb_spearman": 0.810870124705653,
+ "step": 3875
+ },
+ {
+ "electra_acc": 0.903,
+ "electra_fix_acc": 0.9917,
+ "electra_rep_acc": 0.1735,
+ "epoch": 0.26,
+ "learning_rate": 8.720081914757458e-06,
+ "loss": 0.0013,
+ "neg_sim": -0.0139,
+ "pos_sim": 0.769,
+ "step": 4000
+ },
+ {
+ "epoch": 0.26,
+ "eval_avg_sts": 0.7565075332583117,
+ "eval_sickr_spearman": 0.7049569907307146,
+ "eval_stsb_spearman": 0.8080580757859087,
+ "step": 4000
+ },
+ {
+ "epoch": 0.26,
+ "eval_avg_sts": 0.7571155448037524,
+ "eval_sickr_spearman": 0.7046521373298869,
+ "eval_stsb_spearman": 0.809578952277618,
+ "step": 4125
+ },
+ {
+ "epoch": 0.27,
+ "eval_avg_sts": 0.7558024272832136,
+ "eval_sickr_spearman": 0.7044353249379953,
+ "eval_stsb_spearman": 0.8071695296284319,
+ "step": 4250
+ },
+ {
+ "epoch": 0.28,
+ "eval_avg_sts": 0.7590707229344114,
+ "eval_sickr_spearman": 0.7043194258902251,
+ "eval_stsb_spearman": 0.8138220199785977,
+ "step": 4375
+ },
+ {
+ "electra_acc": 0.9026,
+ "electra_fix_acc": 0.9912,
+ "electra_rep_acc": 0.1795,
+ "epoch": 0.29,
+ "learning_rate": 8.560092154102138e-06,
+ "loss": 0.0012,
+ "neg_sim": -0.0139,
+ "pos_sim": 0.7711,
+ "step": 4500
+ },
+ {
+ "epoch": 0.29,
+ "eval_avg_sts": 0.7618055470788606,
+ "eval_sickr_spearman": 0.7090844954016975,
+ "eval_stsb_spearman": 0.8145265987560237,
+ "step": 4500
+ },
+ {
+ "epoch": 0.3,
+ "eval_avg_sts": 0.7642277977124863,
+ "eval_sickr_spearman": 0.7120559394921663,
+ "eval_stsb_spearman": 0.8163996559328064,
+ "step": 4625
+ },
+ {
+ "epoch": 0.3,
+ "eval_avg_sts": 0.7631334021943239,
+ "eval_sickr_spearman": 0.7097783526930724,
+ "eval_stsb_spearman": 0.8164884516955754,
+ "step": 4750
+ },
+ {
+ "epoch": 0.31,
+ "eval_avg_sts": 0.7663792588185014,
+ "eval_sickr_spearman": 0.7141827086327445,
+ "eval_stsb_spearman": 0.8185758090042584,
+ "step": 4875
+ },
+ {
+ "electra_acc": 0.9031,
+ "electra_fix_acc": 0.9911,
+ "electra_rep_acc": 0.1837,
+ "epoch": 0.32,
+ "learning_rate": 8.400102393446819e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.014,
+ "pos_sim": 0.7781,
+ "step": 5000
+ },
+ {
+ "epoch": 0.32,
+ "eval_avg_sts": 0.7675828244876146,
+ "eval_sickr_spearman": 0.7151267119004091,
+ "eval_stsb_spearman": 0.8200389370748202,
+ "step": 5000
+ },
+ {
+ "epoch": 0.33,
+ "eval_avg_sts": 0.7664182051626903,
+ "eval_sickr_spearman": 0.7138797284448728,
+ "eval_stsb_spearman": 0.8189566818805079,
+ "step": 5125
+ },
+ {
+ "epoch": 0.34,
+ "eval_avg_sts": 0.763298983477001,
+ "eval_sickr_spearman": 0.7031210499093442,
+ "eval_stsb_spearman": 0.8234769170446578,
+ "step": 5250
+ },
+ {
+ "epoch": 0.34,
+ "eval_avg_sts": 0.7609537736402838,
+ "eval_sickr_spearman": 0.7014618323453161,
+ "eval_stsb_spearman": 0.8204457149352516,
+ "step": 5375
+ },
+ {
+ "electra_acc": 0.9041,
+ "electra_fix_acc": 0.9909,
+ "electra_rep_acc": 0.1929,
+ "epoch": 0.35,
+ "learning_rate": 8.240112632791502e-06,
+ "loss": 0.0013,
+ "neg_sim": -0.0139,
+ "pos_sim": 0.7703,
+ "step": 5500
+ },
+ {
+ "epoch": 0.35,
+ "eval_avg_sts": 0.7649139683546902,
+ "eval_sickr_spearman": 0.7016405392386158,
+ "eval_stsb_spearman": 0.8281873974707646,
+ "step": 5500
+ },
+ {
+ "epoch": 0.36,
+ "eval_avg_sts": 0.7660619062755463,
+ "eval_sickr_spearman": 0.7030690322264868,
+ "eval_stsb_spearman": 0.8290547803246057,
+ "step": 5625
+ },
+ {
+ "epoch": 0.37,
+ "eval_avg_sts": 0.7667448016301607,
+ "eval_sickr_spearman": 0.7080724800943964,
+ "eval_stsb_spearman": 0.825417123165925,
+ "step": 5750
+ },
+ {
+ "epoch": 0.38,
+ "eval_avg_sts": 0.7668525663367054,
+ "eval_sickr_spearman": 0.7080771871423374,
+ "eval_stsb_spearman": 0.8256279455310735,
+ "step": 5875
+ },
+ {
+ "electra_acc": 0.9044,
+ "electra_fix_acc": 0.991,
+ "electra_rep_acc": 0.1933,
+ "epoch": 0.38,
+ "learning_rate": 8.080122872136184e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.014,
+ "pos_sim": 0.7798,
+ "step": 6000
+ },
+ {
+ "epoch": 0.38,
+ "eval_avg_sts": 0.7666156811168598,
+ "eval_sickr_spearman": 0.709114370746792,
+ "eval_stsb_spearman": 0.8241169914869277,
+ "step": 6000
+ },
+ {
+ "epoch": 0.39,
+ "eval_avg_sts": 0.7644923521403739,
+ "eval_sickr_spearman": 0.7072277090823029,
+ "eval_stsb_spearman": 0.821756995198445,
+ "step": 6125
+ },
+ {
+ "epoch": 0.4,
+ "eval_avg_sts": 0.7661750461200847,
+ "eval_sickr_spearman": 0.7058679005694886,
+ "eval_stsb_spearman": 0.8264821916706808,
+ "step": 6250
+ },
+ {
+ "epoch": 0.41,
+ "eval_avg_sts": 0.7660352662732546,
+ "eval_sickr_spearman": 0.7060020514358052,
+ "eval_stsb_spearman": 0.8260684811107039,
+ "step": 6375
+ },
+ {
+ "electra_acc": 0.9042,
+ "electra_fix_acc": 0.9906,
+ "electra_rep_acc": 0.2009,
+ "epoch": 0.42,
+ "learning_rate": 7.920133111480865e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.014,
+ "pos_sim": 0.7785,
+ "step": 6500
+ },
+ {
+ "epoch": 0.42,
+ "eval_avg_sts": 0.7705073293876892,
+ "eval_sickr_spearman": 0.7120254877738545,
+ "eval_stsb_spearman": 0.8289891710015237,
+ "step": 6500
+ },
+ {
+ "epoch": 0.42,
+ "eval_avg_sts": 0.7701345372028097,
+ "eval_sickr_spearman": 0.7109871514229654,
+ "eval_stsb_spearman": 0.829281922982654,
+ "step": 6625
+ },
+ {
+ "epoch": 0.43,
+ "eval_avg_sts": 0.7673565342593036,
+ "eval_sickr_spearman": 0.7094820007971993,
+ "eval_stsb_spearman": 0.8252310677214081,
+ "step": 6750
+ },
+ {
+ "epoch": 0.44,
+ "eval_avg_sts": 0.7663843153446501,
+ "eval_sickr_spearman": 0.7069766985461874,
+ "eval_stsb_spearman": 0.8257919321431128,
+ "step": 6875
+ },
+ {
+ "electra_acc": 0.9052,
+ "electra_fix_acc": 0.9907,
+ "electra_rep_acc": 0.2032,
+ "epoch": 0.45,
+ "learning_rate": 7.760143350825547e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.7899,
+ "step": 7000
+ },
+ {
+ "epoch": 0.45,
+ "eval_avg_sts": 0.7671374698909221,
+ "eval_sickr_spearman": 0.7035898814904816,
+ "eval_stsb_spearman": 0.8306850582913625,
+ "step": 7000
+ },
+ {
+ "epoch": 0.46,
+ "eval_avg_sts": 0.7657223525531744,
+ "eval_sickr_spearman": 0.7024468853695589,
+ "eval_stsb_spearman": 0.82899781973679,
+ "step": 7125
+ },
+ {
+ "epoch": 0.46,
+ "eval_avg_sts": 0.7644676880553354,
+ "eval_sickr_spearman": 0.707080829974105,
+ "eval_stsb_spearman": 0.8218545461365657,
+ "step": 7250
+ },
+ {
+ "epoch": 0.47,
+ "eval_avg_sts": 0.7659127144064692,
+ "eval_sickr_spearman": 0.7085741649489175,
+ "eval_stsb_spearman": 0.8232512638640211,
+ "step": 7375
+ },
+ {
+ "electra_acc": 0.9051,
+ "electra_fix_acc": 0.9904,
+ "electra_rep_acc": 0.2082,
+ "epoch": 0.48,
+ "learning_rate": 7.600153590170229e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.7901,
+ "step": 7500
+ },
+ {
+ "epoch": 0.48,
+ "eval_avg_sts": 0.7608800819642288,
+ "eval_sickr_spearman": 0.7044842686303607,
+ "eval_stsb_spearman": 0.817275895298097,
+ "step": 7500
+ },
+ {
+ "epoch": 0.49,
+ "eval_avg_sts": 0.7596082078600037,
+ "eval_sickr_spearman": 0.714175648060833,
+ "eval_stsb_spearman": 0.8050407676591743,
+ "step": 7625
+ },
+ {
+ "epoch": 0.5,
+ "eval_avg_sts": 0.7601954211728121,
+ "eval_sickr_spearman": 0.713427179407123,
+ "eval_stsb_spearman": 0.8069636629385014,
+ "step": 7750
+ },
+ {
+ "epoch": 0.5,
+ "eval_avg_sts": 0.7595124397232403,
+ "eval_sickr_spearman": 0.7107700028133637,
+ "eval_stsb_spearman": 0.808254876633117,
+ "step": 7875
+ },
+ {
+ "electra_acc": 0.9053,
+ "electra_fix_acc": 0.9904,
+ "electra_rep_acc": 0.2114,
+ "epoch": 0.51,
+ "learning_rate": 7.440163829514912e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.014,
+ "pos_sim": 0.784,
+ "step": 8000
+ },
+ {
+ "epoch": 0.51,
+ "eval_avg_sts": 0.7600307296458793,
+ "eval_sickr_spearman": 0.7110970465830557,
+ "eval_stsb_spearman": 0.8089644127087027,
+ "step": 8000
+ },
+ {
+ "epoch": 0.52,
+ "eval_avg_sts": 0.7591940023385597,
+ "eval_sickr_spearman": 0.709230365856765,
+ "eval_stsb_spearman": 0.8091576388203543,
+ "step": 8125
+ },
+ {
+ "epoch": 0.53,
+ "eval_avg_sts": 0.7591314547260735,
+ "eval_sickr_spearman": 0.7092257548710269,
+ "eval_stsb_spearman": 0.8090371545811201,
+ "step": 8250
+ },
+ {
+ "epoch": 0.54,
+ "eval_avg_sts": 0.7574056509175848,
+ "eval_sickr_spearman": 0.7065820269856703,
+ "eval_stsb_spearman": 0.8082292748494992,
+ "step": 8375
+ },
+ {
+ "electra_acc": 0.9051,
+ "electra_fix_acc": 0.9903,
+ "electra_rep_acc": 0.2115,
+ "epoch": 0.54,
+ "learning_rate": 7.280174068859593e-06,
+ "loss": 0.001,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.7991,
+ "step": 8500
+ },
+ {
+ "epoch": 0.54,
+ "eval_avg_sts": 0.7579907562127961,
+ "eval_sickr_spearman": 0.7058790918161235,
+ "eval_stsb_spearman": 0.8101024206094687,
+ "step": 8500
+ },
+ {
+ "epoch": 0.55,
+ "eval_avg_sts": 0.7588600922765898,
+ "eval_sickr_spearman": 0.7066013835195499,
+ "eval_stsb_spearman": 0.8111188010336298,
+ "step": 8625
+ },
+ {
+ "epoch": 0.56,
+ "eval_avg_sts": 0.7579781185788305,
+ "eval_sickr_spearman": 0.7017153716946564,
+ "eval_stsb_spearman": 0.8142408654630046,
+ "step": 8750
+ },
+ {
+ "epoch": 0.57,
+ "eval_avg_sts": 0.7583650057103912,
+ "eval_sickr_spearman": 0.7015226228845852,
+ "eval_stsb_spearman": 0.8152073885361973,
+ "step": 8875
+ },
+ {
+ "electra_acc": 0.9061,
+ "electra_fix_acc": 0.9906,
+ "electra_rep_acc": 0.2147,
+ "epoch": 0.58,
+ "learning_rate": 7.120184308204276e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.0142,
+ "pos_sim": 0.8009,
+ "step": 9000
+ },
+ {
+ "epoch": 0.58,
+ "eval_avg_sts": 0.7584798299886761,
+ "eval_sickr_spearman": 0.7013053301816793,
+ "eval_stsb_spearman": 0.815654329795673,
+ "step": 9000
+ },
+ {
+ "epoch": 0.58,
+ "eval_avg_sts": 0.7568426056984886,
+ "eval_sickr_spearman": 0.6953886669821268,
+ "eval_stsb_spearman": 0.8182965444148504,
+ "step": 9125
+ },
+ {
+ "epoch": 0.59,
+ "eval_avg_sts": 0.7582279310008209,
+ "eval_sickr_spearman": 0.6970180260362119,
+ "eval_stsb_spearman": 0.8194378359654299,
+ "step": 9250
+ },
+ {
+ "epoch": 0.6,
+ "eval_avg_sts": 0.7573982613352882,
+ "eval_sickr_spearman": 0.6927659286880986,
+ "eval_stsb_spearman": 0.8220305939824778,
+ "step": 9375
+ },
+ {
+ "electra_acc": 0.9064,
+ "electra_fix_acc": 0.9904,
+ "electra_rep_acc": 0.2163,
+ "epoch": 0.61,
+ "learning_rate": 6.9601945475489575e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.014,
+ "pos_sim": 0.7876,
+ "step": 9500
+ },
+ {
+ "epoch": 0.61,
+ "eval_avg_sts": 0.7553820601323997,
+ "eval_sickr_spearman": 0.690547148530363,
+ "eval_stsb_spearman": 0.8202169717344365,
+ "step": 9500
+ },
+ {
+ "epoch": 0.62,
+ "eval_avg_sts": 0.7567629669074094,
+ "eval_sickr_spearman": 0.6929571405029238,
+ "eval_stsb_spearman": 0.8205687933118949,
+ "step": 9625
+ },
+ {
+ "epoch": 0.62,
+ "eval_avg_sts": 0.7568625521954009,
+ "eval_sickr_spearman": 0.6927340360367437,
+ "eval_stsb_spearman": 0.8209910683540581,
+ "step": 9750
+ },
+ {
+ "epoch": 0.63,
+ "eval_avg_sts": 0.7524242981805003,
+ "eval_sickr_spearman": 0.6836492413863334,
+ "eval_stsb_spearman": 0.8211993549746672,
+ "step": 9875
+ },
+ {
+ "electra_acc": 0.9058,
+ "electra_fix_acc": 0.9902,
+ "electra_rep_acc": 0.2192,
+ "epoch": 0.64,
+ "learning_rate": 6.800204786893639e-06,
+ "loss": 0.001,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.7942,
+ "step": 10000
+ },
+ {
+ "epoch": 0.64,
+ "eval_avg_sts": 0.7508138636055113,
+ "eval_sickr_spearman": 0.6821391435571191,
+ "eval_stsb_spearman": 0.8194885836539034,
+ "step": 10000
+ },
+ {
+ "epoch": 0.65,
+ "eval_avg_sts": 0.7510417193281674,
+ "eval_sickr_spearman": 0.683803901532964,
+ "eval_stsb_spearman": 0.8182795371233709,
+ "step": 10125
+ },
+ {
+ "epoch": 0.66,
+ "eval_avg_sts": 0.7560510909697331,
+ "eval_sickr_spearman": 0.694022134115111,
+ "eval_stsb_spearman": 0.8180800478243552,
+ "step": 10250
+ },
+ {
+ "epoch": 0.66,
+ "eval_avg_sts": 0.7554822011048945,
+ "eval_sickr_spearman": 0.6934548868071269,
+ "eval_stsb_spearman": 0.8175095154026621,
+ "step": 10375
+ },
+ {
+ "electra_acc": 0.9073,
+ "electra_fix_acc": 0.9904,
+ "electra_rep_acc": 0.2211,
+ "epoch": 0.67,
+ "learning_rate": 6.640215026238322e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.7946,
+ "step": 10500
+ },
+ {
+ "epoch": 0.67,
+ "eval_avg_sts": 0.7614561025331773,
+ "eval_sickr_spearman": 0.7013578281755511,
+ "eval_stsb_spearman": 0.8215543768908035,
+ "step": 10500
+ },
+ {
+ "epoch": 0.68,
+ "eval_avg_sts": 0.7569156258584822,
+ "eval_sickr_spearman": 0.6959233972344373,
+ "eval_stsb_spearman": 0.817907854482527,
+ "step": 10625
+ },
+ {
+ "epoch": 0.69,
+ "eval_avg_sts": 0.7561592629720358,
+ "eval_sickr_spearman": 0.695029538436674,
+ "eval_stsb_spearman": 0.8172889875073976,
+ "step": 10750
+ },
+ {
+ "epoch": 0.7,
+ "eval_avg_sts": 0.7563428139477524,
+ "eval_sickr_spearman": 0.6957568734057513,
+ "eval_stsb_spearman": 0.8169287544897534,
+ "step": 10875
+ },
+ {
+ "electra_acc": 0.9065,
+ "electra_fix_acc": 0.9902,
+ "electra_rep_acc": 0.2217,
+ "epoch": 0.7,
+ "learning_rate": 6.480225265583003e-06,
+ "loss": 0.001,
+ "neg_sim": -0.0142,
+ "pos_sim": 0.8004,
+ "step": 11000
+ },
+ {
+ "epoch": 0.7,
+ "eval_avg_sts": 0.7516024333799518,
+ "eval_sickr_spearman": 0.6886655615894285,
+ "eval_stsb_spearman": 0.8145393051704751,
+ "step": 11000
+ },
+ {
+ "epoch": 0.71,
+ "eval_avg_sts": 0.7513622883897111,
+ "eval_sickr_spearman": 0.6895086034818702,
+ "eval_stsb_spearman": 0.8132159732975519,
+ "step": 11125
+ },
+ {
+ "epoch": 0.72,
+ "eval_avg_sts": 0.7524578381239166,
+ "eval_sickr_spearman": 0.6914134689338043,
+ "eval_stsb_spearman": 0.813502207314029,
+ "step": 11250
+ },
+ {
+ "epoch": 0.73,
+ "eval_avg_sts": 0.7529077801461861,
+ "eval_sickr_spearman": 0.7008521567396107,
+ "eval_stsb_spearman": 0.8049634035527615,
+ "step": 11375
+ },
+ {
+ "electra_acc": 0.9069,
+ "electra_fix_acc": 0.9903,
+ "electra_rep_acc": 0.2207,
+ "epoch": 0.74,
+ "learning_rate": 6.320235504927685e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.8003,
+ "step": 11500
+ },
+ {
+ "epoch": 0.74,
+ "eval_avg_sts": 0.7520009086329529,
+ "eval_sickr_spearman": 0.6998127156733884,
+ "eval_stsb_spearman": 0.8041891015925174,
+ "step": 11500
+ },
+ {
+ "epoch": 0.74,
+ "eval_avg_sts": 0.7528385713572732,
+ "eval_sickr_spearman": 0.6952667640466769,
+ "eval_stsb_spearman": 0.8104103786678696,
+ "step": 11625
+ },
+ {
+ "epoch": 0.75,
+ "eval_avg_sts": 0.7543110510513436,
+ "eval_sickr_spearman": 0.6966553431892527,
+ "eval_stsb_spearman": 0.8119667589134344,
+ "step": 11750
+ },
+ {
+ "epoch": 0.76,
+ "eval_avg_sts": 0.7483201703337362,
+ "eval_sickr_spearman": 0.6935844266877055,
+ "eval_stsb_spearman": 0.8030559139797669,
+ "step": 11875
+ },
+ {
+ "electra_acc": 0.9071,
+ "electra_fix_acc": 0.99,
+ "electra_rep_acc": 0.2271,
+ "epoch": 0.77,
+ "learning_rate": 6.1602457442723675e-06,
+ "loss": 0.0011,
+ "neg_sim": -0.0141,
+ "pos_sim": 0.7871,
+ "step": 12000
+ },
+ {
+ "epoch": 0.77,
+ "eval_avg_sts": 0.751054540493113,
+ "eval_sickr_spearman": 0.692834276945445,
+ "eval_stsb_spearman": 0.809274804040781,
+ "step": 12000
+ },
+ {
+ "epoch": 0.78,
+ "eval_avg_sts": 0.7490596138849832,
+ "eval_sickr_spearman": 0.6888325657291288,
+ "eval_stsb_spearman": 0.8092866620408378,
+ "step": 12125
+ },
+ {
+ "epoch": 0.78,
+ "eval_avg_sts": 0.7507815525034611,
+ "eval_sickr_spearman": 0.6925058883049127,
+ "eval_stsb_spearman": 0.8090572167020096,
+ "step": 12250
+ },
+ {
+ "epoch": 0.79,
+ "eval_avg_sts": 0.7527821368803079,
+ "eval_sickr_spearman": 0.6942120971212987,
+ "eval_stsb_spearman": 0.8113521766393171,
+ "step": 12375
+ },
+ {
+ "electra_acc": 0.9072,
+ "electra_fix_acc": 0.99,
+ "electra_rep_acc": 0.226,
+ "epoch": 0.8,
+ "learning_rate": 6.000255983617049e-06,
+ "loss": 0.0012,
+ "neg_sim": -0.014,
+ "pos_sim": 0.7847,
+ "step": 12500
+ },
+ {
+ "epoch": 0.8,
+ "eval_avg_sts": 0.7522991659637694,
+ "eval_sickr_spearman": 0.6944025884696023,
+ "eval_stsb_spearman": 0.8101957434579367,
+ "step": 12500
+ },
+ {
+ "epoch": 0.81,
+ "eval_avg_sts": 0.7521192684266835,
+ "eval_sickr_spearman": 0.6942240568655569,
+ "eval_stsb_spearman": 0.8100144799878102,
+ "step": 12625
+ },
+ {
+ "epoch": 0.82,
+ "eval_avg_sts": 0.7503450494614133,
+ "eval_sickr_spearman": 0.6922287968807163,
+ "eval_stsb_spearman": 0.8084613020421104,
+ "step": 12750
+ },
+ {
+ "epoch": 0.82,
+ "eval_avg_sts": 0.7517083341962134,
1013
+ "eval_sickr_spearman": 0.6941311647153756,
1014
+ "eval_stsb_spearman": 0.8092855036770511,
1015
+ "step": 12875
1016
+ },
1017
+ {
1018
+ "electra_acc": 0.9071,
1019
+ "electra_fix_acc": 0.9901,
1020
+ "electra_rep_acc": 0.2276,
1021
+ "epoch": 0.83,
1022
+ "learning_rate": 5.840266222961732e-06,
1023
+ "loss": 0.0011,
1024
+ "neg_sim": -0.0141,
1025
+ "pos_sim": 0.798,
1026
+ "step": 13000
1027
+ },
1028
+ {
1029
+ "epoch": 0.83,
1030
+ "eval_avg_sts": 0.7504582415898822,
1031
+ "eval_sickr_spearman": 0.6935004202912902,
1032
+ "eval_stsb_spearman": 0.8074160628884742,
1033
+ "step": 13000
1034
+ },
1035
+ {
1036
+ "epoch": 0.84,
1037
+ "eval_avg_sts": 0.7504838292265648,
1038
+ "eval_sickr_spearman": 0.6931516664637482,
1039
+ "eval_stsb_spearman": 0.8078159919893814,
1040
+ "step": 13125
1041
+ },
1042
+ {
1043
+ "epoch": 0.85,
1044
+ "eval_avg_sts": 0.7511643537875756,
1045
+ "eval_sickr_spearman": 0.6905446509130283,
1046
+ "eval_stsb_spearman": 0.8117840566621228,
1047
+ "step": 13250
1048
+ },
1049
+ {
1050
+ "epoch": 0.86,
1051
+ "eval_avg_sts": 0.7516853154108011,
1052
+ "eval_sickr_spearman": 0.6920386417501228,
1053
+ "eval_stsb_spearman": 0.8113319890714794,
1054
+ "step": 13375
1055
+ },
1056
+ {
1057
+ "electra_acc": 0.9078,
1058
+ "electra_fix_acc": 0.9901,
1059
+ "electra_rep_acc": 0.2307,
1060
+ "epoch": 0.86,
1061
+ "learning_rate": 5.680276462306413e-06,
1062
+ "loss": 0.0011,
1063
+ "neg_sim": -0.0141,
1064
+ "pos_sim": 0.7933,
1065
+ "step": 13500
1066
+ },
1067
+ {
1068
+ "epoch": 0.86,
1069
+ "eval_avg_sts": 0.7513575139583819,
1070
+ "eval_sickr_spearman": 0.6917417615121337,
1071
+ "eval_stsb_spearman": 0.8109732664046301,
1072
+ "step": 13500
1073
+ },
1074
+ {
1075
+ "epoch": 0.87,
1076
+ "eval_avg_sts": 0.751417134937681,
1077
+ "eval_sickr_spearman": 0.6917422418231481,
1078
+ "eval_stsb_spearman": 0.8110920280522138,
1079
+ "step": 13625
1080
+ },
1081
+ {
1082
+ "epoch": 0.88,
1083
+ "eval_avg_sts": 0.7526199054172857,
1084
+ "eval_sickr_spearman": 0.6928212605169554,
1085
+ "eval_stsb_spearman": 0.812418550317616,
1086
+ "step": 13750
1087
+ },
1088
+ {
1089
+ "epoch": 0.89,
1090
+ "eval_avg_sts": 0.7532239951741806,
1091
+ "eval_sickr_spearman": 0.693511563506824,
1092
+ "eval_stsb_spearman": 0.8129364268415373,
1093
+ "step": 13875
1094
+ },
1095
+ {
1096
+ "electra_acc": 0.9078,
1097
+ "electra_fix_acc": 0.9899,
1098
+ "electra_rep_acc": 0.2332,
1099
+ "epoch": 0.9,
1100
+ "learning_rate": 5.520286701651095e-06,
1101
+ "loss": 0.001,
1102
+ "neg_sim": -0.0142,
1103
+ "pos_sim": 0.8032,
1104
+ "step": 14000
1105
+ },
1106
+ {
1107
+ "epoch": 0.9,
1108
+ "eval_avg_sts": 0.7535878459898898,
1109
+ "eval_sickr_spearman": 0.6938975894690821,
1110
+ "eval_stsb_spearman": 0.8132781025106975,
1111
+ "step": 14000
1112
+ },
1113
+ {
1114
+ "epoch": 0.9,
1115
+ "eval_avg_sts": 0.7579602803336107,
1116
+ "eval_sickr_spearman": 0.6938505189896726,
1117
+ "eval_stsb_spearman": 0.8220700416775488,
1118
+ "step": 14125
1119
+ },
1120
+ {
1121
+ "epoch": 0.91,
1122
+ "eval_avg_sts": 0.7582493904128486,
1123
+ "eval_sickr_spearman": 0.6947031190713006,
1124
+ "eval_stsb_spearman": 0.8217956617543966,
1125
+ "step": 14250
1126
+ },
1127
+ {
1128
+ "epoch": 0.92,
1129
+ "eval_avg_sts": 0.7585045342871344,
1130
+ "eval_sickr_spearman": 0.6946038387846281,
1131
+ "eval_stsb_spearman": 0.8224052297896408,
1132
+ "step": 14375
1133
+ },
1134
+ {
1135
+ "electra_acc": 0.9081,
1136
+ "electra_fix_acc": 0.99,
1137
+ "electra_rep_acc": 0.2354,
1138
+ "epoch": 0.93,
1139
+ "learning_rate": 5.3602969409957775e-06,
1140
+ "loss": 0.001,
1141
+ "neg_sim": -0.0142,
1142
+ "pos_sim": 0.8056,
1143
+ "step": 14500
1144
+ },
1145
+ {
1146
+ "epoch": 0.93,
1147
+ "eval_avg_sts": 0.7586329610292095,
1148
+ "eval_sickr_spearman": 0.6949082118744413,
1149
+ "eval_stsb_spearman": 0.8223577101839777,
1150
+ "step": 14500
1151
+ },
1152
+ {
1153
+ "epoch": 0.94,
1154
+ "eval_avg_sts": 0.7580983862167923,
1155
+ "eval_sickr_spearman": 0.6947662799696918,
1156
+ "eval_stsb_spearman": 0.8214304924638928,
1157
+ "step": 14625
1158
+ },
1159
+ {
1160
+ "epoch": 0.94,
1161
+ "eval_avg_sts": 0.7543501652245072,
1162
+ "eval_sickr_spearman": 0.6945084970482733,
1163
+ "eval_stsb_spearman": 0.814191833400741,
1164
+ "step": 14750
1165
+ },
1166
+ {
1167
+ "epoch": 0.95,
1168
+ "eval_avg_sts": 0.753735401541382,
1169
+ "eval_sickr_spearman": 0.6931743297643269,
1170
+ "eval_stsb_spearman": 0.8142964733184371,
1171
+ "step": 14875
1172
+ },
1173
+ {
1174
+ "electra_acc": 0.9081,
1175
+ "electra_fix_acc": 0.9899,
1176
+ "electra_rep_acc": 0.2356,
1177
+ "epoch": 0.96,
1178
+ "learning_rate": 5.200307180340458e-06,
1179
+ "loss": 0.001,
1180
+ "neg_sim": -0.0142,
1181
+ "pos_sim": 0.8068,
1182
+ "step": 15000
1183
+ },
1184
+ {
1185
+ "epoch": 0.96,
1186
+ "eval_avg_sts": 0.7565776728970167,
1187
+ "eval_sickr_spearman": 0.699184805084288,
1188
+ "eval_stsb_spearman": 0.8139705407097454,
1189
+ "step": 15000
1190
+ },
1191
+ {
1192
+ "epoch": 0.97,
1193
+ "eval_avg_sts": 0.7555687559765775,
1194
+ "eval_sickr_spearman": 0.6981326357761847,
1195
+ "eval_stsb_spearman": 0.8130048761769703,
1196
+ "step": 15125
1197
+ },
1198
+ {
1199
+ "epoch": 0.98,
1200
+ "eval_avg_sts": 0.7607557679069236,
1201
+ "eval_sickr_spearman": 0.7007437985747662,
1202
+ "eval_stsb_spearman": 0.8207677372390809,
1203
+ "step": 15250
1204
+ },
1205
+ {
1206
+ "epoch": 0.98,
1207
+ "eval_avg_sts": 0.7560246287263773,
1208
+ "eval_sickr_spearman": 0.6948236771359103,
1209
+ "eval_stsb_spearman": 0.8172255803168443,
1210
+ "step": 15375
1211
+ },
1212
+ {
1213
+ "electra_acc": 0.9077,
1214
+ "electra_fix_acc": 0.9897,
1215
+ "electra_rep_acc": 0.24,
1216
+ "epoch": 0.99,
1217
+ "learning_rate": 5.04031741968514e-06,
1218
+ "loss": 0.0011,
1219
+ "neg_sim": -0.0141,
1220
+ "pos_sim": 0.7969,
1221
+ "step": 15500
1222
+ },
1223
+ {
1224
+ "epoch": 0.99,
1225
+ "eval_avg_sts": 0.7559485354838383,
1226
+ "eval_sickr_spearman": 0.6950054268237521,
1227
+ "eval_stsb_spearman": 0.8168916441439246,
1228
+ "step": 15500
1229
+ },
1230
+ {
1231
+ "epoch": 1.0,
1232
+ "eval_avg_sts": 0.7502675969292933,
1233
+ "eval_sickr_spearman": 0.6866761613989633,
1234
+ "eval_stsb_spearman": 0.8138590324596235,
1235
+ "step": 15625
1236
+ },
1237
+ {
1238
+ "epoch": 1.01,
1239
+ "eval_avg_sts": 0.7497139540559181,
1240
+ "eval_sickr_spearman": 0.6875880798908671,
1241
+ "eval_stsb_spearman": 0.811839828220969,
1242
+ "step": 15750
1243
+ },
1244
+ {
1245
+ "epoch": 1.02,
1246
+ "eval_avg_sts": 0.7507851484505732,
1247
+ "eval_sickr_spearman": 0.6884938984328886,
1248
+ "eval_stsb_spearman": 0.8130763984682579,
1249
+ "step": 15875
1250
+ },
1251
+ {
1252
+ "electra_acc": 0.9081,
1253
+ "electra_fix_acc": 0.9898,
1254
+ "electra_rep_acc": 0.239,
1255
+ "epoch": 1.02,
1256
+ "learning_rate": 4.8803276590298225e-06,
1257
+ "loss": 0.001,
1258
+ "neg_sim": NaN,
1259
+ "pos_sim": 0.7953,
1260
+ "step": 16000
1261
+ },
1262
+ {
1263
+ "epoch": 1.02,
1264
+ "eval_avg_sts": 0.7479678946675965,
1265
+ "eval_sickr_spearman": 0.6853699555953537,
1266
+ "eval_stsb_spearman": 0.8105658337398391,
1267
+ "step": 16000
1268
+ },
1269
+ {
1270
+ "epoch": 1.03,
1271
+ "eval_avg_sts": 0.7479128193954279,
1272
+ "eval_sickr_spearman": 0.6867578623025095,
1273
+ "eval_stsb_spearman": 0.8090677764883464,
1274
+ "step": 16125
1275
+ },
1276
+ {
1277
+ "epoch": 1.04,
1278
+ "eval_avg_sts": 0.7487959084542308,
1279
+ "eval_sickr_spearman": 0.68736439905147,
1280
+ "eval_stsb_spearman": 0.8102274178569916,
1281
+ "step": 16250
1282
+ },
1283
+ {
1284
+ "epoch": 1.05,
1285
+ "eval_avg_sts": 0.7500295795078514,
1286
+ "eval_sickr_spearman": 0.688809030489424,
1287
+ "eval_stsb_spearman": 0.8112501285262789,
1288
+ "step": 16375
1289
+ },
1290
+ {
1291
+ "electra_acc": 0.9078,
1292
+ "electra_fix_acc": 0.9898,
1293
+ "electra_rep_acc": 0.2379,
1294
+ "epoch": 1.06,
1295
+ "learning_rate": 4.720337898374505e-06,
1296
+ "loss": 0.0011,
1297
+ "neg_sim": NaN,
1298
+ "pos_sim": 0.8055,
1299
+ "step": 16500
1300
+ },
1301
+ {
1302
+ "epoch": 1.06,
1303
+ "eval_avg_sts": 0.7515077711907536,
1304
+ "eval_sickr_spearman": 0.690066388662871,
1305
+ "eval_stsb_spearman": 0.8129491537186362,
1306
+ "step": 16500
1307
+ },
1308
+ {
1309
+ "epoch": 1.06,
1310
+ "eval_avg_sts": 0.7490196746582567,
1311
+ "eval_sickr_spearman": 0.6842396396852107,
1312
+ "eval_stsb_spearman": 0.8137997096313027,
1313
+ "step": 16625
1314
+ },
1315
+ {
1316
+ "epoch": 1.07,
1317
+ "eval_avg_sts": 0.7500539670025419,
1318
+ "eval_sickr_spearman": 0.6851718753330229,
1319
+ "eval_stsb_spearman": 0.8149360586720608,
1320
+ "step": 16750
1321
+ },
1322
+ {
1323
+ "epoch": 1.08,
1324
+ "eval_avg_sts": 0.7490200514710061,
1325
+ "eval_sickr_spearman": 0.6843091406889915,
1326
+ "eval_stsb_spearman": 0.8137309622530207,
1327
+ "step": 16875
1328
+ },
1329
+ {
1330
+ "electra_acc": 0.9084,
1331
+ "electra_fix_acc": 0.9895,
1332
+ "electra_rep_acc": 0.2437,
1333
+ "epoch": 1.09,
1334
+ "learning_rate": 4.560348137719187e-06,
1335
+ "loss": 0.001,
1336
+ "neg_sim": NaN,
1337
+ "pos_sim": 0.8007,
1338
+ "step": 17000
1339
+ },
1340
+ {
1341
+ "epoch": 1.09,
1342
+ "eval_avg_sts": 0.7434034453107932,
1343
+ "eval_sickr_spearman": 0.677096502310436,
1344
+ "eval_stsb_spearman": 0.8097103883111502,
1345
+ "step": 17000
1346
+ },
1347
+ {
1348
+ "epoch": 1.1,
1349
+ "eval_avg_sts": 0.7431564820160605,
1350
+ "eval_sickr_spearman": 0.677125561126806,
1351
+ "eval_stsb_spearman": 0.8091874029053152,
1352
+ "step": 17125
1353
+ },
1354
+ {
1355
+ "epoch": 1.1,
1356
+ "eval_avg_sts": 0.7442892283774005,
1357
+ "eval_sickr_spearman": 0.6786750444591996,
1358
+ "eval_stsb_spearman": 0.8099034122956015,
1359
+ "step": 17250
1360
+ },
1361
+ {
1362
+ "epoch": 1.11,
1363
+ "eval_avg_sts": 0.7452452513650849,
1364
+ "eval_sickr_spearman": 0.6798861967130629,
1365
+ "eval_stsb_spearman": 0.8106043060171069,
1366
+ "step": 17375
1367
+ },
1368
+ {
1369
+ "electra_acc": 0.9085,
1370
+ "electra_fix_acc": 0.9896,
1371
+ "electra_rep_acc": 0.2435,
1372
+ "epoch": 1.12,
1373
+ "learning_rate": 4.400358377063868e-06,
1374
+ "loss": 0.001,
1375
+ "neg_sim": NaN,
1376
+ "pos_sim": 0.8058,
1377
+ "step": 17500
1378
+ },
1379
+ {
1380
+ "epoch": 1.12,
1381
+ "eval_avg_sts": 0.7457855940878162,
1382
+ "eval_sickr_spearman": 0.6806044538039685,
1383
+ "eval_stsb_spearman": 0.810966734371664,
1384
+ "step": 17500
1385
+ },
1386
+ {
1387
+ "epoch": 1.13,
1388
+ "eval_avg_sts": 0.7416386948034206,
1389
+ "eval_sickr_spearman": 0.6797112674416254,
1390
+ "eval_stsb_spearman": 0.8035661221652158,
1391
+ "step": 17625
1392
+ },
1393
+ {
1394
+ "epoch": 1.14,
1395
+ "eval_avg_sts": 0.7423254082827856,
1396
+ "eval_sickr_spearman": 0.6808730437232104,
1397
+ "eval_stsb_spearman": 0.8037777728423607,
1398
+ "step": 17750
1399
+ },
1400
+ {
1401
+ "epoch": 1.14,
1402
+ "eval_avg_sts": 0.7413901886271141,
1403
+ "eval_sickr_spearman": 0.681068914554875,
1404
+ "eval_stsb_spearman": 0.8017114626993534,
1405
+ "step": 17875
1406
+ },
1407
+ {
1408
+ "electra_acc": 0.9082,
1409
+ "electra_fix_acc": 0.9892,
1410
+ "electra_rep_acc": 0.247,
1411
+ "epoch": 1.15,
1412
+ "learning_rate": 4.24036861640855e-06,
1413
+ "loss": 0.001,
1414
+ "neg_sim": NaN,
1415
+ "pos_sim": 0.8062,
1416
+ "step": 18000
1417
+ },
1418
+ {
1419
+ "epoch": 1.15,
1420
+ "eval_avg_sts": 0.7429114158358341,
1421
+ "eval_sickr_spearman": 0.6825547086467617,
1422
+ "eval_stsb_spearman": 0.8032681230249066,
1423
+ "step": 18000
1424
+ },
1425
+ {
1426
+ "epoch": 1.16,
1427
+ "eval_avg_sts": 0.7439526113533993,
1428
+ "eval_sickr_spearman": 0.6837044771529871,
1429
+ "eval_stsb_spearman": 0.8042007455538115,
1430
+ "step": 18125
1431
+ },
1432
+ {
1433
+ "epoch": 1.17,
1434
+ "eval_avg_sts": 0.7450607408838441,
1435
+ "eval_sickr_spearman": 0.686960841737187,
1436
+ "eval_stsb_spearman": 0.8031606400305011,
1437
+ "step": 18250
1438
+ },
1439
+ {
1440
+ "epoch": 1.18,
1441
+ "eval_avg_sts": 0.745855277528227,
1442
+ "eval_sickr_spearman": 0.6876279457050609,
1443
+ "eval_stsb_spearman": 0.804082609351393,
1444
+ "step": 18375
1445
+ },
1446
+ {
1447
+ "electra_acc": 0.9078,
1448
+ "electra_fix_acc": 0.9895,
1449
+ "electra_rep_acc": 0.242,
1450
+ "epoch": 1.18,
1451
+ "learning_rate": 4.080378855753232e-06,
1452
+ "loss": 0.0011,
1453
+ "neg_sim": NaN,
1454
+ "pos_sim": 0.8079,
1455
+ "step": 18500
1456
+ },
1457
+ {
1458
+ "epoch": 1.18,
1459
+ "eval_avg_sts": 0.7462797427639389,
1460
+ "eval_sickr_spearman": 0.6877381290517599,
1461
+ "eval_stsb_spearman": 0.804821356476118,
1462
+ "step": 18500
1463
+ },
1464
+ {
1465
+ "epoch": 1.19,
1466
+ "eval_avg_sts": 0.744680795271585,
1467
+ "eval_sickr_spearman": 0.6867351916226306,
1468
+ "eval_stsb_spearman": 0.8026263989205394,
1469
+ "step": 18625
1470
+ },
1471
+ {
1472
+ "epoch": 1.2,
1473
+ "eval_avg_sts": 0.7415540412607211,
1474
+ "eval_sickr_spearman": 0.6830318976395493,
1475
+ "eval_stsb_spearman": 0.800076184881893,
1476
+ "step": 18750
1477
+ },
1478
+ {
1479
+ "epoch": 1.21,
1480
+ "eval_avg_sts": 0.7426288639389378,
1481
+ "eval_sickr_spearman": 0.6841978045958581,
1482
+ "eval_stsb_spearman": 0.8010599232820175,
1483
+ "step": 18875
1484
+ },
1485
+ {
1486
+ "electra_acc": 0.9088,
1487
+ "electra_fix_acc": 0.9896,
1488
+ "electra_rep_acc": 0.2456,
1489
+ "epoch": 1.22,
1490
+ "learning_rate": 3.920389095097914e-06,
1491
+ "loss": 0.001,
1492
+ "neg_sim": NaN,
1493
+ "pos_sim": 0.8107,
1494
+ "step": 19000
1495
+ },
1496
+ {
1497
+ "epoch": 1.22,
1498
+ "eval_avg_sts": 0.7440228503017443,
1499
+ "eval_sickr_spearman": 0.6852098768089162,
1500
+ "eval_stsb_spearman": 0.8028358237945724,
1501
+ "step": 19000
1502
+ },
1503
+ {
1504
+ "epoch": 1.22,
1505
+ "eval_avg_sts": 0.7446803355301828,
1506
+ "eval_sickr_spearman": 0.6853459400446347,
1507
+ "eval_stsb_spearman": 0.8040147310157307,
1508
+ "step": 19125
1509
+ },
1510
+ {
1511
+ "epoch": 1.23,
1512
+ "eval_avg_sts": 0.745178310449073,
1513
+ "eval_sickr_spearman": 0.685846664277127,
1514
+ "eval_stsb_spearman": 0.804509956621019,
1515
+ "step": 19250
1516
+ },
1517
+ {
1518
+ "epoch": 1.24,
1519
+ "eval_avg_sts": 0.7501895198558063,
1520
+ "eval_sickr_spearman": 0.6936651189381215,
1521
+ "eval_stsb_spearman": 0.806713920773491,
1522
+ "step": 19375
1523
+ },
1524
+ {
1525
+ "electra_acc": 0.9087,
1526
+ "electra_fix_acc": 0.9894,
1527
+ "electra_rep_acc": 0.2472,
1528
+ "epoch": 1.25,
1529
+ "learning_rate": 3.760399334442596e-06,
1530
+ "loss": 0.001,
1531
+ "neg_sim": NaN,
1532
+ "pos_sim": 0.815,
1533
+ "step": 19500
1534
+ },
1535
+ {
1536
+ "epoch": 1.25,
1537
+ "eval_avg_sts": 0.7504868457577191,
1538
+ "eval_sickr_spearman": 0.693660123703572,
1539
+ "eval_stsb_spearman": 0.8073135678118661,
1540
+ "step": 19500
1541
+ },
1542
+ {
1543
+ "epoch": 1.26,
1544
+ "eval_avg_sts": 0.7515054770519676,
1545
+ "eval_sickr_spearman": 0.6934854345876417,
1546
+ "eval_stsb_spearman": 0.8095255195162936,
1547
+ "step": 19625
1548
+ },
1549
+ {
1550
+ "epoch": 1.26,
1551
+ "eval_avg_sts": 0.7519980000949484,
1552
+ "eval_sickr_spearman": 0.6937166563099646,
1553
+ "eval_stsb_spearman": 0.8102793438799323,
1554
+ "step": 19750
1555
+ },
1556
+ {
1557
+ "epoch": 1.27,
1558
+ "eval_avg_sts": 0.7521186477822681,
1559
+ "eval_sickr_spearman": 0.693651237949806,
1560
+ "eval_stsb_spearman": 0.8105860576147302,
1561
+ "step": 19875
1562
+ },
1563
+ {
1564
+ "electra_acc": 0.9094,
1565
+ "electra_fix_acc": 0.9895,
1566
+ "electra_rep_acc": 0.2499,
1567
+ "epoch": 1.28,
1568
+ "learning_rate": 3.600409573787278e-06,
1569
+ "loss": 0.001,
1570
+ "neg_sim": NaN,
1571
+ "pos_sim": 0.8155,
1572
+ "step": 20000
1573
+ },
1574
+ {
1575
+ "epoch": 1.28,
1576
+ "eval_avg_sts": 0.752341790734355,
1577
+ "eval_sickr_spearman": 0.6935850991231257,
1578
+ "eval_stsb_spearman": 0.8110984823455842,
1579
+ "step": 20000
1580
+ },
1581
+ {
1582
+ "epoch": 1.29,
1583
+ "eval_avg_sts": 0.7524918475690561,
1584
+ "eval_sickr_spearman": 0.6935691527974482,
1585
+ "eval_stsb_spearman": 0.811414542340664,
1586
+ "step": 20125
1587
+ },
1588
+ {
1589
+ "epoch": 1.3,
1590
+ "eval_avg_sts": 0.7557856359443814,
1591
+ "eval_sickr_spearman": 0.6970248944837176,
1592
+ "eval_stsb_spearman": 0.8145463774050452,
1593
+ "step": 20250
1594
+ },
1595
+ {
1596
+ "epoch": 1.3,
1597
+ "eval_avg_sts": 0.7552826569601827,
1598
+ "eval_sickr_spearman": 0.6959100445882376,
1599
+ "eval_stsb_spearman": 0.814655269332128,
1600
+ "step": 20375
1601
+ },
1602
+ {
1603
+ "electra_acc": 0.9094,
1604
+ "electra_fix_acc": 0.9895,
1605
+ "electra_rep_acc": 0.2515,
1606
+ "epoch": 1.31,
1607
+ "learning_rate": 3.44041981313196e-06,
1608
+ "loss": 0.001,
1609
+ "neg_sim": NaN,
1610
+ "pos_sim": 0.8179,
1611
+ "step": 20500
1612
+ },
1613
+ {
1614
+ "epoch": 1.31,
1615
+ "eval_avg_sts": 0.7551388763512048,
1616
+ "eval_sickr_spearman": 0.6953964480205598,
1617
+ "eval_stsb_spearman": 0.8148813046818498,
1618
+ "step": 20500
1619
+ },
1620
+ {
1621
+ "epoch": 1.32,
1622
+ "eval_avg_sts": 0.7551894637379806,
1623
+ "eval_sickr_spearman": 0.6956948652537948,
1624
+ "eval_stsb_spearman": 0.8146840622221664,
1625
+ "step": 20625
1626
+ },
1627
+ {
1628
+ "epoch": 1.33,
1629
+ "eval_avg_sts": 0.7549148751641654,
1630
+ "eval_sickr_spearman": 0.6951038425505989,
1631
+ "eval_stsb_spearman": 0.8147259077777321,
1632
+ "step": 20750
1633
+ },
1634
+ {
1635
+ "epoch": 1.34,
1636
+ "eval_avg_sts": 0.754268177606626,
1637
+ "eval_sickr_spearman": 0.6945316000080651,
1638
+ "eval_stsb_spearman": 0.814004755205187,
1639
+ "step": 20875
1640
+ },
1641
+ {
1642
+ "electra_acc": 0.9089,
1643
+ "electra_fix_acc": 0.9891,
1644
+ "electra_rep_acc": 0.2537,
1645
+ "epoch": 1.34,
1646
+ "learning_rate": 3.280430052476642e-06,
1647
+ "loss": 0.0009,
1648
+ "neg_sim": NaN,
1649
+ "pos_sim": 0.8193,
1650
+ "step": 21000
1651
+ },
1652
+ {
1653
+ "epoch": 1.34,
1654
+ "eval_avg_sts": 0.7544681011807751,
1655
+ "eval_sickr_spearman": 0.6945047025912598,
1656
+ "eval_stsb_spearman": 0.8144314997702904,
1657
+ "step": 21000
1658
+ },
1659
+ {
1660
+ "epoch": 1.35,
1661
+ "eval_avg_sts": 0.7543169348596496,
1662
+ "eval_sickr_spearman": 0.694174296644467,
1663
+ "eval_stsb_spearman": 0.8144595730748322,
1664
+ "step": 21125
1665
+ },
1666
+ {
1667
+ "epoch": 1.36,
1668
+ "eval_avg_sts": 0.7553283942077516,
1669
+ "eval_sickr_spearman": 0.6958436656060499,
1670
+ "eval_stsb_spearman": 0.8148131228094532,
1671
+ "step": 21250
1672
+ },
1673
+ {
1674
+ "epoch": 1.37,
1675
+ "eval_avg_sts": 0.7557282247685372,
1676
+ "eval_sickr_spearman": 0.6962398741618131,
1677
+ "eval_stsb_spearman": 0.8152165753752613,
1678
+ "step": 21375
1679
+ },
1680
+ {
1681
+ "electra_acc": 0.9093,
1682
+ "electra_fix_acc": 0.9893,
1683
+ "electra_rep_acc": 0.2553,
1684
+ "epoch": 1.38,
1685
+ "learning_rate": 3.1204402918213238e-06,
1686
+ "loss": 0.001,
1687
+ "neg_sim": NaN,
1688
+ "pos_sim": 0.822,
1689
+ "step": 21500
1690
+ },
1691
+ {
1692
+ "epoch": 1.38,
1693
+ "eval_avg_sts": 0.7533899213084664,
1694
+ "eval_sickr_spearman": 0.6928663617212057,
1695
+ "eval_stsb_spearman": 0.813913480895727,
1696
+ "step": 21500
1697
+ },
1698
+ {
1699
+ "epoch": 1.38,
1700
+ "eval_avg_sts": 0.7534343863779988,
1701
+ "eval_sickr_spearman": 0.6929559877564893,
1702
+ "eval_stsb_spearman": 0.8139127849995083,
1703
+ "step": 21625
1704
+ },
1705
+ {
1706
+ "epoch": 1.39,
1707
+ "eval_avg_sts": 0.7539297584804894,
1708
+ "eval_sickr_spearman": 0.6930869205390096,
1709
+ "eval_stsb_spearman": 0.8147725964219692,
1710
+ "step": 21750
1711
+ },
1712
+ {
1713
+ "epoch": 1.4,
1714
+ "eval_avg_sts": 0.7523440794130264,
1715
+ "eval_sickr_spearman": 0.691800551580294,
1716
+ "eval_stsb_spearman": 0.8128876072457588,
1717
+ "step": 21875
1718
+ },
1719
+ {
1720
+ "electra_acc": 0.9091,
1721
+ "electra_fix_acc": 0.9891,
1722
+ "electra_rep_acc": 0.253,
1723
+ "epoch": 1.41,
1724
+ "learning_rate": 2.960450531166006e-06,
1725
+ "loss": 0.001,
1726
+ "neg_sim": NaN,
1727
+ "pos_sim": 0.8192,
1728
+ "step": 22000
1729
+ },
1730
+ {
1731
+ "epoch": 1.41,
1732
+ "eval_avg_sts": 0.7535926716898982,
1733
+ "eval_sickr_spearman": 0.696256204736302,
1734
+ "eval_stsb_spearman": 0.8109291386434943,
1735
+ "step": 22000
1736
+ },
1737
+ {
1738
+ "epoch": 1.42,
1739
+ "eval_avg_sts": 0.7527606084226345,
1740
+ "eval_sickr_spearman": 0.6954414051315059,
1741
+ "eval_stsb_spearman": 0.8100798117137631,
1742
+ "step": 22125
1743
+ },
1744
+ {
1745
+ "epoch": 1.42,
1746
+ "eval_avg_sts": 0.7533114920096069,
1747
+ "eval_sickr_spearman": 0.695455142026517,
1748
+ "eval_stsb_spearman": 0.8111678419926969,
1749
+ "step": 22250
1750
+ },
1751
+ {
1752
+ "epoch": 1.43,
1753
+ "eval_avg_sts": 0.7519280470002075,
1754
+ "eval_sickr_spearman": 0.6946230992563046,
1755
+ "eval_stsb_spearman": 0.8092329947441105,
1756
+ "step": 22375
1757
+ },
1758
+ {
1759
+ "electra_acc": 0.9093,
1760
+ "electra_fix_acc": 0.9893,
1761
+ "electra_rep_acc": 0.2551,
1762
+ "epoch": 1.44,
1763
+ "learning_rate": 2.8004607705106875e-06,
1764
+ "loss": 0.0011,
1765
+ "neg_sim": NaN,
1766
+ "pos_sim": 0.8135,
1767
+ "step": 22500
1768
+ },
1769
+ {
1770
+ "epoch": 1.44,
1771
+ "eval_avg_sts": 0.7522224828721787,
1772
+ "eval_sickr_spearman": 0.694889767931489,
1773
+ "eval_stsb_spearman": 0.8095551978128684,
1774
+ "step": 22500
1775
+ },
1776
+ {
1777
+ "epoch": 1.45,
1778
+ "eval_avg_sts": 0.7509487345671295,
1779
+ "eval_sickr_spearman": 0.6947804011135146,
1780
+ "eval_stsb_spearman": 0.8071170680207445,
1781
+ "step": 22625
1782
+ },
1783
+ {
1784
+ "epoch": 1.46,
1785
+ "eval_avg_sts": 0.7535117851178551,
1786
+ "eval_sickr_spearman": 0.6966824807615652,
1787
+ "eval_stsb_spearman": 0.810341089474145,
1788
+ "step": 22750
1789
+ },
1790
+ {
1791
+ "epoch": 1.46,
1792
+ "eval_avg_sts": 0.7554057278055278,
1793
+ "eval_sickr_spearman": 0.6974278273936819,
1794
+ "eval_stsb_spearman": 0.8133836282173738,
1795
+ "step": 22875
1796
+ },
1797
+ {
1798
+ "electra_acc": 0.9094,
1799
+ "electra_fix_acc": 0.9891,
1800
+ "electra_rep_acc": 0.2557,
1801
+ "epoch": 1.47,
1802
+ "learning_rate": 2.640471009855369e-06,
1803
+ "loss": 0.0012,
1804
+ "neg_sim": NaN,
1805
+ "pos_sim": 0.8108,
1806
+ "step": 23000
1807
+ },
1808
+ {
1809
+ "epoch": 1.47,
1810
+ "eval_avg_sts": 0.7573042151426209,
1811
+ "eval_sickr_spearman": 0.6996995063672988,
1812
+ "eval_stsb_spearman": 0.814908923917943,
1813
+ "step": 23000
1814
+ },
1815
+ {
1816
+ "epoch": 1.48,
1817
+ "eval_avg_sts": 0.7572778466663158,
1818
+ "eval_sickr_spearman": 0.699593405664222,
1819
+ "eval_stsb_spearman": 0.8149622876684096,
1820
+ "step": 23125
1821
+ },
1822
+ {
1823
+ "epoch": 1.49,
1824
+ "eval_avg_sts": 0.7572787295536805,
1825
+ "eval_sickr_spearman": 0.6995496493308119,
1826
+ "eval_stsb_spearman": 0.815007809776549,
1827
+ "step": 23250
1828
+ },
1829
+ {
1830
+ "epoch": 1.5,
1831
+ "eval_avg_sts": 0.7578655158883527,
1832
+ "eval_sickr_spearman": 0.7004442285950968,
1833
+ "eval_stsb_spearman": 0.8152868031816085,
1834
+ "step": 23375
1835
+ },
1836
+ {
1837
+ "electra_acc": 0.9093,
1838
+ "electra_fix_acc": 0.9889,
1839
+ "electra_rep_acc": 0.2565,
1840
+ "epoch": 1.5,
1841
+ "learning_rate": 2.4804812492000513e-06,
1842
+ "loss": 0.001,
1843
+ "neg_sim": NaN,
1844
+ "pos_sim": 0.8119,
1845
+ "step": 23500
1846
+ },
1847
+ {
1848
+ "epoch": 1.5,
1849
+ "eval_avg_sts": 0.7580453272234033,
1850
+ "eval_sickr_spearman": 0.700563201633359,
1851
+ "eval_stsb_spearman": 0.8155274528134476,
1852
+ "step": 23500
1853
+ },
1854
+ {
1855
+ "epoch": 1.51,
1856
+ "eval_avg_sts": 0.7589489492705888,
1857
+ "eval_sickr_spearman": 0.7011428889966156,
1858
+ "eval_stsb_spearman": 0.8167550095445619,
1859
+ "step": 23625
1860
+ },
1861
+ {
1862
+ "epoch": 1.52,
1863
+ "eval_avg_sts": 0.759282090184078,
1864
+ "eval_sickr_spearman": 0.7009553275454997,
1865
+ "eval_stsb_spearman": 0.8176088528226563,
1866
+ "step": 23750
1867
+ },
1868
+ {
1869
+ "epoch": 1.53,
1870
+ "eval_avg_sts": 0.7594355007785425,
1871
+ "eval_sickr_spearman": 0.7010773265431525,
1872
+ "eval_stsb_spearman": 0.8177936750139324,
1873
+ "step": 23875
1874
+ },
1875
+ {
1876
+ "electra_acc": 0.9092,
1877
+ "electra_fix_acc": 0.989,
1878
+ "electra_rep_acc": 0.2555,
1879
+ "epoch": 1.54,
1880
+ "learning_rate": 2.3204914885447333e-06,
1881
+ "loss": 0.001,
1882
+ "neg_sim": NaN,
1883
+ "pos_sim": 0.8149,
1884
+ "step": 24000
1885
+ },
1886
+ {
1887
+ "epoch": 1.54,
1888
+ "eval_avg_sts": 0.7594249307434521,
1889
+ "eval_sickr_spearman": 0.7010194490659197,
1890
+ "eval_stsb_spearman": 0.8178304124209844,
1891
+ "step": 24000
1892
+ },
1893
+ {
1894
+ "epoch": 1.54,
1895
+ "eval_avg_sts": 0.7594551033154371,
1896
+ "eval_sickr_spearman": 0.7012371740487385,
1897
+ "eval_stsb_spearman": 0.8176730325821358,
1898
+ "step": 24125
1899
+ },
1900
+ {
1901
+ "epoch": 1.55,
1902
+ "eval_avg_sts": 0.7589237357383787,
1903
+ "eval_sickr_spearman": 0.7005038351919814,
1904
+ "eval_stsb_spearman": 0.8173436362847759,
1905
+ "step": 24250
1906
+ },
1907
+ {
1908
+ "epoch": 1.56,
1909
+ "eval_avg_sts": 0.7588819689189907,
1910
+ "eval_sickr_spearman": 0.7008921666471086,
1911
+ "eval_stsb_spearman": 0.816871771190873,
1912
+ "step": 24375
1913
+ },
1914
+ {
1915
+ "electra_acc": 0.9094,
1916
+ "electra_fix_acc": 0.989,
1917
+ "electra_rep_acc": 0.2584,
1918
+ "epoch": 1.57,
1919
+ "learning_rate": 2.1605017278894154e-06,
1920
+ "loss": 0.001,
1921
+ "neg_sim": NaN,
1922
+ "pos_sim": 0.8167,
1923
+ "step": 24500
1924
+ },
1925
+ {
1926
+ "epoch": 1.57,
1927
+ "eval_avg_sts": 0.7589559409437466,
1928
+ "eval_sickr_spearman": 0.7010030224292277,
1929
+ "eval_stsb_spearman": 0.8169088594582655,
1930
+ "step": 24500
1931
+ },
1932
+ {
1933
+ "epoch": 1.58,
1934
+ "eval_avg_sts": 0.7589222814938349,
1935
+ "eval_sickr_spearman": 0.7009486992535012,
1936
+ "eval_stsb_spearman": 0.8168958637341686,
1937
+ "step": 24625
1938
+ },
1939
+ {
1940
+ "epoch": 1.58,
1941
+ "eval_avg_sts": 0.7574752410333385,
1942
+ "eval_sickr_spearman": 0.6996315903898653,
1943
+ "eval_stsb_spearman": 0.8153188916768117,
1944
+ "step": 24750
1945
+ },
1946
+ {
1947
+ "epoch": 1.59,
1948
+ "eval_avg_sts": 0.7564910076346855,
1949
+ "eval_sickr_spearman": 0.6992222213123084,
1950
+ "eval_stsb_spearman": 0.8137597939570625,
1951
+ "step": 24875
1952
+ },
1953
+ {
1954
+ "electra_acc": 0.9093,
1955
+ "electra_fix_acc": 0.9891,
1956
+ "electra_rep_acc": 0.2566,
1957
+ "epoch": 1.6,
1958
+ "learning_rate": 2.000511967234097e-06,
1959
+ "loss": 0.0011,
1960
+ "neg_sim": NaN,
1961
+ "pos_sim": 0.8178,
1962
+ "step": 25000
1963
+ },
1964
+ {
1965
+ "epoch": 1.6,
1966
+ "eval_avg_sts": 0.754858122417589,
1967
+ "eval_sickr_spearman": 0.6972444446483912,
1968
+ "eval_stsb_spearman": 0.8124718001867869,
1969
+ "step": 25000
1970
+ },
1971
+ {
1972
+ "epoch": 1.61,
1973
+ "eval_avg_sts": 0.7550451133191622,
1974
+ "eval_sickr_spearman": 0.6974792206722207,
1975
+ "eval_stsb_spearman": 0.8126110059661038,
1976
+ "step": 25125
1977
+ },
1978
+ {
1979
+ "epoch": 1.62,
1980
+ "eval_avg_sts": 0.754828720368155,
1981
+ "eval_sickr_spearman": 0.6962917477513663,
1982
+ "eval_stsb_spearman": 0.8133656929849435,
1983
+ "step": 25250
1984
+ },
1985
+ {
1986
+ "epoch": 1.62,
1987
+ "eval_avg_sts": 0.7548476950408536,
1988
+ "eval_sickr_spearman": 0.6961924674646935,
1989
+ "eval_stsb_spearman": 0.8135029226170137,
1990
+ "step": 25375
1991
+ },
1992
+ {
1993
+ "electra_acc": 0.9096,
1994
+ "electra_fix_acc": 0.9892,
1995
+ "electra_rep_acc": 0.2585,
1996
+ "epoch": 1.63,
1997
+ "learning_rate": 1.8405222065787792e-06,
1998
+ "loss": 0.0011,
1999
+ "neg_sim": NaN,
2000
+ "pos_sim": 0.8169,
2001
+ "step": 25500
2002
+ },
2003
+ {
2004
+ "epoch": 1.63,
2005
+ "eval_avg_sts": 0.7530610693683093,
2006
+ "eval_sickr_spearman": 0.6936729960387574,
2007
+ "eval_stsb_spearman": 0.8124491426978612,
2008
+ "step": 25500
2009
+ },
2010
+ {
2011
+ "epoch": 1.64,
2012
+ "eval_avg_sts": 0.7537427601811764,
2013
+ "eval_sickr_spearman": 0.6928838450421292,
2014
+ "eval_stsb_spearman": 0.8146016753202237,
2015
+ "step": 25625
2016
+ },
2017
+ {
2018
+ "epoch": 1.65,
2019
+ "eval_avg_sts": 0.7538542984125725,
2020
+ "eval_sickr_spearman": 0.6928155928469857,
2021
+ "eval_stsb_spearman": 0.8148930039781592,
2022
+ "step": 25750
2023
+ },
2024
+ {
2025
+ "epoch": 1.66,
2026
+ "eval_avg_sts": 0.7539219581746606,
2027
+ "eval_sickr_spearman": 0.6929616554264589,
2028
+ "eval_stsb_spearman": 0.8148822609228621,
2029
+ "step": 25875
2030
+ },
2031
+ {
2032
+ "electra_acc": 0.9096,
2033
+ "electra_fix_acc": 0.9892,
2034
+ "electra_rep_acc": 0.2568,
2035
+ "epoch": 1.66,
2036
+ "learning_rate": 1.6805324459234608e-06,
2037
+ "loss": 0.001,
2038
+ "neg_sim": NaN,
2039
+ "pos_sim": 0.8191,
2040
+ "step": 26000
2041
+ },
2042
+ {
2043
+ "epoch": 1.66,
2044
+ "eval_avg_sts": 0.7537628256146356,
2045
+ "eval_sickr_spearman": 0.6923114103751897,
2046
+ "eval_stsb_spearman": 0.8152142408540815,
2047
+ "step": 26000
2048
+ },
2049
+ {
2050
+ "epoch": 1.67,
2051
+ "eval_avg_sts": 0.754620615839658,
2052
+ "eval_sickr_spearman": 0.6932064219193875,
2053
+ "eval_stsb_spearman": 0.8160348097599284,
2054
+ "step": 26125
2055
+ },
2056
+ {
2057
+ "epoch": 1.68,
2058
+ "eval_avg_sts": 0.7518830889314513,
2059
+ "eval_sickr_spearman": 0.690710149515446,
2060
+ "eval_stsb_spearman": 0.8130560283474566,
2061
+ "step": 26250
2062
+ },
2063
+ {
2064
+ "epoch": 1.69,
2065
+ "eval_avg_sts": 0.7509225404957667,
2066
+ "eval_sickr_spearman": 0.6898265693733905,
2067
+ "eval_stsb_spearman": 0.8120185116181428,
2068
+ "step": 26375
2069
+ },
2070
+ {
2071
+ "electra_acc": 0.9096,
2072
+ "electra_fix_acc": 0.9892,
2073
+ "electra_rep_acc": 0.256,
2074
+ "epoch": 1.7,
2075
+ "learning_rate": 1.520542685268143e-06,
2076
+ "loss": 0.0011,
2077
+ "neg_sim": NaN,
2078
+ "pos_sim": 0.8194,
2079
+ "step": 26500
2080
+ },
2081
+ {
2082
+ "epoch": 1.7,
2083
+ "eval_avg_sts": 0.7509117147704546,
2084
+ "eval_sickr_spearman": 0.690458994886026,
2085
+ "eval_stsb_spearman": 0.8113644346548832,
2086
+ "step": 26500
2087
+ },
2088
+ {
2089
+ "epoch": 1.7,
2090
+ "eval_avg_sts": 0.751325870403242,
2091
+ "eval_sickr_spearman": 0.6910303728687339,
2092
+ "eval_stsb_spearman": 0.8116213679377499,
2093
+ "step": 26625
2094
+ },
2095
+ {
2096
+ "epoch": 1.71,
2097
+ "eval_avg_sts": 0.7505685715846826,
2098
+ "eval_sickr_spearman": 0.6889766590334431,
2099
+ "eval_stsb_spearman": 0.812160484135922,
2100
+ "step": 26750
2101
+ },
2102
+ {
2103
+ "epoch": 1.72,
2104
+ "eval_avg_sts": 0.7505717215143468,
2105
+ "eval_sickr_spearman": 0.6889912124571789,
2106
+ "eval_stsb_spearman": 0.8121522305715146,
2107
+ "step": 26875
2108
+ },
2109
+ {
2110
+ "electra_acc": 0.9097,
2111
+ "electra_fix_acc": 0.989,
2112
+ "electra_rep_acc": 0.2597,
2113
+ "epoch": 1.73,
2114
+ "learning_rate": 1.3605529246128248e-06,
2115
+ "loss": 0.001,
2116
+ "neg_sim": NaN,
2117
+ "pos_sim": 0.8193,
2118
+ "step": 27000
2119
+ },
2120
+ {
2121
+ "epoch": 1.73,
2122
+ "eval_avg_sts": 0.7506556450635647,
2123
+ "eval_sickr_spearman": 0.6890205594601575,
2124
+ "eval_stsb_spearman": 0.812290730666972,
2125
+ "step": 27000
2126
+ },
2127
+ {
2128
+ "epoch": 1.74,
2129
+ "eval_avg_sts": 0.7509018804381178,
2130
+ "eval_sickr_spearman": 0.6900798373712735,
2131
+ "eval_stsb_spearman": 0.811723923504962,
2132
+ "step": 27125
2133
+ },
2134
+ {
2135
+ "epoch": 1.74,
2136
+ "eval_avg_sts": 0.7500895498195226,
2137
+ "eval_sickr_spearman": 0.6892968343556296,
2138
+ "eval_stsb_spearman": 0.8108822652834156,
2139
+ "step": 27250
2140
+ },
2141
+ {
2142
+ "epoch": 1.75,
2143
+ "eval_avg_sts": 0.7506787202343813,
2144
+ "eval_sickr_spearman": 0.6893182081957694,
2145
+ "eval_stsb_spearman": 0.8120392322729932,
2146
+ "step": 27375
2147
+ },
2148
+ {
2149
+ "electra_acc": 0.9101,
2150
+ "electra_fix_acc": 0.989,
2151
+ "electra_rep_acc": 0.262,
2152
+ "epoch": 1.76,
2153
+ "learning_rate": 1.2005631639575069e-06,
2154
+ "loss": 0.001,
2155
+ "neg_sim": NaN,
2156
+ "pos_sim": 0.8192,
2157
+ "step": 27500
2158
+ },
2159
+ {
2160
+ "epoch": 1.76,
2161
+ "eval_avg_sts": 0.7507236365099843,
2162
+ "eval_sickr_spearman": 0.689373636086829,
2163
+ "eval_stsb_spearman": 0.8120736369331395,
2164
+ "step": 27500
2165
+ },
2166
+ {
2167
+ "epoch": 1.77,
2168
+ "eval_avg_sts": 0.7507778777687245,
2169
+ "eval_sickr_spearman": 0.6893403025024311,
2170
+ "eval_stsb_spearman": 0.812215453035018,
2171
+ "step": 27625
2172
+ },
2173
+ {
2174
+ "epoch": 1.78,
2175
+ "eval_avg_sts": 0.7508593494865333,
2176
+ "eval_sickr_spearman": 0.6894537519640278,
2177
+ "eval_stsb_spearman": 0.8122649470090388,
2178
+ "step": 27750
2179
+ },
2180
+ {
2181
+ "epoch": 1.78,
2182
+ "eval_avg_sts": 0.7509704306691856,
2183
+ "eval_sickr_spearman": 0.6894628778733011,
2184
+ "eval_stsb_spearman": 0.8124779834650699,
2185
+ "step": 27875
2186
+ },
2187
+ {
2188
+ "electra_acc": 0.9101,
2189
+ "electra_fix_acc": 0.9891,
2190
+ "electra_rep_acc": 0.2586,
2191
+ "epoch": 1.79,
2192
+ "learning_rate": 1.0405734033021888e-06,
2193
+ "loss": 0.001,
2194
+ "neg_sim": NaN,
2195
+ "pos_sim": 0.8201,
2196
+ "step": 28000
2197
+ },
2198
+ {
2199
+ "epoch": 1.79,
2200
+ "eval_avg_sts": 0.7509106758642159,
2201
+ "eval_sickr_spearman": 0.6893112436860609,
2202
+ "eval_stsb_spearman": 0.812510108042371,
2203
+ "step": 28000
2204
+ },
2205
+ {
2206
+ "epoch": 1.8,
2207
+ "eval_avg_sts": 0.7508459705581159,
2208
+ "eval_sickr_spearman": 0.6899334385740902,
2209
+ "eval_stsb_spearman": 0.8117585025421415,
2210
+ "step": 28125
2211
+ },
2212
+ {
2213
+ "epoch": 1.81,
2214
+ "eval_avg_sts": 0.7510477486103114,
2215
+ "eval_sickr_spearman": 0.6902538540517839,
2216
+ "eval_stsb_spearman": 0.811841643168839,
2217
+ "step": 28250
2218
+ },
2219
+ {
2220
+ "epoch": 1.82,
2221
+ "eval_avg_sts": 0.7507757787312187,
2222
+ "eval_sickr_spearman": 0.6898934766976937,
2223
+ "eval_stsb_spearman": 0.8116580807647437,
2224
+ "step": 28375
2225
+ },
2226
+ {
2227
+ "electra_acc": 0.9098,
2228
+ "electra_fix_acc": 0.989,
2229
+ "electra_rep_acc": 0.2594,
2230
+ "epoch": 1.82,
2231
+ "learning_rate": 8.805836426468707e-07,
2232
+ "loss": 0.0011,
2233
+ "neg_sim": NaN,
2234
+ "pos_sim": 0.8208,
2235
+ "step": 28500
2236
+ },
2237
+ {
2238
+ "epoch": 1.82,
2239
+ "eval_avg_sts": 0.7507727971903551,
2240
+ "eval_sickr_spearman": 0.6899099993965885,
2241
+ "eval_stsb_spearman": 0.8116355949841215,
2242
+ "step": 28500
2243
+ },
2244
+ {
2245
+ "epoch": 1.83,
2246
+ "eval_avg_sts": 0.7508146549871678,
2247
+ "eval_sickr_spearman": 0.6899898270871786,
2248
+ "eval_stsb_spearman": 0.811639482887157,
2249
+ "step": 28625
2250
+ },
2251
+ {
2252
+ "epoch": 1.84,
2253
+ "eval_avg_sts": 0.7506882515044668,
2254
+ "eval_sickr_spearman": 0.6898797398026825,
2255
+ "eval_stsb_spearman": 0.8114967632062512,
2256
+ "step": 28750
2257
+ },
2258
+ {
2259
+ "epoch": 1.85,
2260
+ "eval_avg_sts": 0.7507604828702781,
2261
+ "eval_sickr_spearman": 0.6899782035606307,
2262
+ "eval_stsb_spearman": 0.8115427621799256,
2263
+ "step": 28875
2264
+ },
2265
+ {
2266
+ "electra_acc": 0.9101,
2267
+ "electra_fix_acc": 0.989,
2268
+ "electra_rep_acc": 0.2588,
2269
+ "epoch": 1.86,
2270
+ "learning_rate": 7.205938819915525e-07,
2271
+ "loss": 0.0009,
2272
+ "neg_sim": NaN,
2273
+ "pos_sim": 0.821,
2274
+ "step": 29000
2275
+ },
2276
+ {
2277
+ "epoch": 1.86,
2278
+ "eval_avg_sts": 0.7507686428892963,
2279
+ "eval_sickr_spearman": 0.6899909798336131,
2280
+ "eval_stsb_spearman": 0.8115463059449796,
2281
+ "step": 29000
2282
+ },
2283
+ {
2284
+ "epoch": 1.86,
2285
+ "eval_avg_sts": 0.7509200158436065,
2286
+ "eval_sickr_spearman": 0.6902724420880404,
2287
+ "eval_stsb_spearman": 0.8115675895991724,
2288
+ "step": 29125
2289
+ },
2290
+ {
2291
+ "epoch": 1.87,
2292
+ "eval_avg_sts": 0.7507926487881307,
2293
+ "eval_sickr_spearman": 0.690199482844956,
2294
+ "eval_stsb_spearman": 0.8113858147313052,
2295
+ "step": 29250
2296
+ },
2297
+ {
2298
+ "epoch": 1.88,
2299
+ "eval_avg_sts": 0.7510272796849722,
2300
+ "eval_sickr_spearman": 0.6903056796102356,
2301
+ "eval_stsb_spearman": 0.8117488797597087,
2302
+ "step": 29375
2303
+ },
2304
+ {
2305
+ "electra_acc": 0.91,
2306
+ "electra_fix_acc": 0.9889,
2307
+ "electra_rep_acc": 0.2628,
2308
+ "epoch": 1.89,
2309
+ "learning_rate": 5.606041213362345e-07,
2310
+ "loss": 0.001,
2311
+ "neg_sim": NaN,
2312
+ "pos_sim": 0.8219,
2313
+ "step": 29500
2314
+ },
2315
+ {
2316
+ "epoch": 1.89,
2317
+ "eval_avg_sts": 0.7516619765140928,
2318
+ "eval_sickr_spearman": 0.6907379114920772,
2319
+ "eval_stsb_spearman": 0.8125860415361085,
2320
+ "step": 29500
2321
+ },
2322
+ {
2323
+ "epoch": 1.9,
2324
+ "eval_avg_sts": 0.7516247721458812,
2325
+ "eval_sickr_spearman": 0.6906019834750073,
2326
+ "eval_stsb_spearman": 0.812647560816755,
2327
+ "step": 29625
2328
+ },
2329
+ {
2330
+ "epoch": 1.9,
2331
+ "eval_avg_sts": 0.7515302759169064,
2332
+ "eval_sickr_spearman": 0.6904020780308219,
2333
+ "eval_stsb_spearman": 0.8126584738029908,
2334
+ "step": 29750
2335
+ },
2336
+ {
2337
+ "epoch": 1.91,
2338
+ "eval_avg_sts": 0.7512737585285767,
2339
+ "eval_sickr_spearman": 0.6900924695509518,
2340
+ "eval_stsb_spearman": 0.8124550475062015,
2341
+ "step": 29875
2342
+ },
2343
+ {
2344
+ "electra_acc": 0.9095,
2345
+ "electra_fix_acc": 0.9889,
2346
+ "electra_rep_acc": 0.2603,
2347
+ "epoch": 1.92,
2348
+ "learning_rate": 4.0061436068091647e-07,
2349
+ "loss": 0.001,
2350
+ "neg_sim": NaN,
2351
+ "pos_sim": 0.8219,
2352
+ "step": 30000
2353
+ },
2354
+ {
2355
+ "epoch": 1.92,
2356
+ "eval_avg_sts": 0.751445438691633,
2357
+ "eval_sickr_spearman": 0.6902850742677187,
2358
+ "eval_stsb_spearman": 0.8126058031155474,
2359
+ "step": 30000
2360
+ },
2361
+ {
2362
+ "epoch": 1.93,
2363
+ "eval_avg_sts": 0.7514690784500874,
2364
+ "eval_sickr_spearman": 0.6903169188879721,
2365
+ "eval_stsb_spearman": 0.8126212380122027,
2366
+ "step": 30125
2367
+ },
2368
+ {
2369
+ "epoch": 1.94,
2370
+ "eval_avg_sts": 0.7513646557354139,
2371
+ "eval_sickr_spearman": 0.6901293094057549,
2372
+ "eval_stsb_spearman": 0.812600002065073,
2373
+ "step": 30250
2374
+ },
2375
+ {
2376
+ "epoch": 1.94,
2377
+ "eval_avg_sts": 0.7513546637483999,
2378
+ "eval_sickr_spearman": 0.6900938624528935,
2379
+ "eval_stsb_spearman": 0.8126154650439061,
2380
+ "step": 30375
2381
+ },
2382
+ {
2383
+ "electra_acc": 0.9098,
2384
+ "electra_fix_acc": 0.989,
2385
+ "electra_rep_acc": 0.2615,
2386
+ "epoch": 1.95,
2387
+ "learning_rate": 2.4062460002559835e-07,
2388
+ "loss": 0.0009,
2389
+ "neg_sim": NaN,
2390
+ "pos_sim": 0.8227,
2391
+ "step": 30500
2392
+ },
2393
+ {
2394
+ "epoch": 1.95,
2395
+ "eval_avg_sts": 0.7513856383156294,
2396
+ "eval_sickr_spearman": 0.6901277243794074,
2397
+ "eval_stsb_spearman": 0.8126435522518516,
2398
+ "step": 30500
2399
+ },
2400
+ {
2401
+ "epoch": 1.96,
2402
+ "eval_avg_sts": 0.7514646471464337,
2403
+ "eval_sickr_spearman": 0.6902287818168333,
2404
+ "eval_stsb_spearman": 0.8127005124760341,
2405
+ "step": 30625
2406
+ },
2407
+ {
2408
+ "epoch": 1.97,
2409
+ "eval_avg_sts": 0.7515241048574239,
2410
+ "eval_sickr_spearman": 0.6903239314287821,
2411
+ "eval_stsb_spearman": 0.8127242782860657,
2412
+ "step": 30750
2413
+ },
2414
+ {
2415
+ "epoch": 1.98,
2416
+ "eval_avg_sts": 0.7517451176255749,
2417
+ "eval_sickr_spearman": 0.6905863253359386,
2418
+ "eval_stsb_spearman": 0.8129039099152112,
2419
+ "step": 30875
2420
+ },
2421
+ {
2422
+ "electra_acc": 0.9096,
2423
+ "electra_fix_acc": 0.989,
2424
+ "electra_rep_acc": 0.2613,
2425
+ "epoch": 1.98,
2426
+ "learning_rate": 8.06348393702803e-08,
2427
+ "loss": 0.001,
2428
+ "neg_sim": NaN,
2429
+ "pos_sim": 0.8228,
2430
+ "step": 31000
2431
+ },
2432
+ {
2433
+ "epoch": 1.98,
2434
+ "eval_avg_sts": 0.7517585752042792,
2435
+ "eval_sickr_spearman": 0.6906067865851512,
2436
+ "eval_stsb_spearman": 0.8129103638234072,
2437
+ "step": 31000
2438
+ },
2439
+ {
2440
+ "epoch": 1.99,
2441
+ "eval_avg_sts": 0.7517603113599198,
2442
+ "eval_sickr_spearman": 0.6906109172598749,
2443
+ "eval_stsb_spearman": 0.8129097054599647,
2444
+ "step": 31125
2445
+ },
2446
+ {
2447
+ "epoch": 2.0,
2448
+ "eval_avg_sts": 0.7517610887734341,
2449
+ "eval_sickr_spearman": 0.6906109172598749,
2450
+ "eval_stsb_spearman": 0.8129112602869931,
2451
+ "step": 31250
2452
+ },
2453
+ {
2454
+ "epoch": 2.0,
2455
+ "step": 31252,
2456
+ "train_runtime": 14642.7016,
2457
+ "train_samples_per_second": 2.134
2458
+ }
2459
+ ],
2460
+ "max_steps": 31252,
2461
+ "num_train_epochs": 2,
2462
+ "total_flos": 571223595241308000,
2463
+ "trial_name": null,
2464
+ "trial_params": null
2465
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff
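A note on reading the `trainer_state.json` log above: the following is a minimal sketch, not part of the commit, of how the `log_history` records might be summarized. It assumes the file has the shape shown in the diff, with evaluation records carrying `eval_avg_sts` and training records carrying `loss`, `pos_sim`, and a bare `NaN` for `neg_sim`. The `SAMPLE` string and the helper `best_eval` are hypothetical names introduced only for illustration; the three records are copied from steps 26125, 31000, and 31250 of the log.

```python
import json

# Hypothetical excerpt in the same shape as the "log_history" entries in the
# diff; values are copied from the records logged at steps 26125, 31000, 31250.
SAMPLE = '''
{"log_history": [
  {"epoch": 1.67, "eval_avg_sts": 0.754620615839658,
   "eval_sickr_spearman": 0.6932064219193875,
   "eval_stsb_spearman": 0.8160348097599284, "step": 26125},
  {"electra_acc": 0.9096, "loss": 0.001, "neg_sim": NaN,
   "pos_sim": 0.8228, "step": 31000},
  {"epoch": 2.0, "eval_avg_sts": 0.7517610887734341,
   "eval_sickr_spearman": 0.6906109172598749,
   "eval_stsb_spearman": 0.8129112602869931, "step": 31250}
]}
'''


def best_eval(state, key="eval_avg_sts"):
    """Return the evaluation record with the highest value for `key`."""
    # Training records (loss, pos_sim, ...) lack the eval key and are skipped.
    evals = [rec for rec in state["log_history"] if key in rec]
    return max(evals, key=lambda rec: rec[key])


# Python's json module accepts the bare NaN literal by default, although
# strict JSON parsers in other languages may reject it.
state = json.loads(SAMPLE)
best = best_eval(state)

# In these records eval_avg_sts equals the mean of the two Spearman scores.
mean = (best["eval_sickr_spearman"] + best["eval_stsb_spearman"]) / 2
print(best["step"], best["eval_avg_sts"], mean)
```

To run this against the real file, replace `json.loads(SAMPLE)` with `json.load(open("trainer_state.json"))`. Among the three sample records, the step 26125 evaluation has the highest `eval_avg_sts`; the full log may contain a higher value outside this excerpt.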