Miaoran committed on
Commit 292581e
1 Parent(s): 01bb636

Upload 2 files

Files changed (2)
  1. eval_results.log +192 -0
  2. mcse.pt +3 -0
eval_results.log ADDED
@@ -0,0 +1,192 @@
+ 2021-10-01 09:01:40,065 : ***** Transfer task : STS12 *****
+
+
+ 2021-10-01 09:01:42,261 : MSRpar : pearson = 0.6331, spearman = 0.6265, align_loss = 0.2011, uniform_loss = -2.5322
+ 2021-10-01 09:01:43,203 : MSRvid : pearson = 0.8752, spearman = 0.8747, align_loss = 0.2308, uniform_loss = -2.3171
+ 2021-10-01 09:01:44,077 : SMTeuroparl : pearson = 0.5281, spearman = 0.6156, align_loss = 0.2567, uniform_loss = -1.7214
+ 2021-10-01 09:01:45,299 : surprise.OnWN : pearson = 0.7492, spearman = 0.7087, align_loss = 0.2956, uniform_loss = -2.4748
+ 2021-10-01 09:01:45,905 : surprise.SMTnews : pearson = 0.7292, spearman = 0.6400, align_loss = 0.2303, uniform_loss = -1.8483
+ 2021-10-01 09:01:45,914 : ALL : Pearson = 0.8025, Spearman = 0.7235, align_loss = 0.2428, uniform_loss = -2.2394
+ 2021-10-01 09:01:45,914 : ALL (weighted average) : Pearson = 0.7164, Spearman = 0.7064, align_loss = 0.2430, uniform_loss = -2.2589
+ 2021-10-01 09:01:45,914 : ALL (average) : Pearson = 0.7030, Spearman = 0.6931, align_loss = 0.2429, uniform_loss = -2.1788
+
+ 2021-10-01 09:01:45,921 : ***** Transfer task : STS13 (-SMT) *****
+
+
+ 2021-10-01 09:01:46,481 : FNWN : pearson = 0.6060, spearman = 0.6148, align_loss = 0.3780, uniform_loss = -2.2212
+ 2021-10-01 09:01:47,511 : headlines : pearson = 0.7903, spearman = 0.7947, align_loss = 0.2445, uniform_loss = -2.4584
+ 2021-10-01 09:01:48,105 : OnWN : pearson = 0.8326, spearman = 0.8225, align_loss = 0.3193, uniform_loss = -2.2413
+ 2021-10-01 09:01:48,108 : ALL : Pearson = 0.8012, Spearman = 0.8073, align_loss = 0.2938, uniform_loss = -2.3384
+ 2021-10-01 09:01:48,108 : ALL (weighted average) : Pearson = 0.7829, Spearman = 0.7824, align_loss = 0.2893, uniform_loss = -2.3473
+ 2021-10-01 09:01:48,108 : ALL (average) : Pearson = 0.7430, Spearman = 0.7440, align_loss = 0.3139, uniform_loss = -2.3070
+
+ 2021-10-01 09:01:48,109 : ***** Transfer task : STS14 *****
+
+
+ 2021-10-01 09:01:48,965 : deft-forum : pearson = 0.5599, spearman = 0.5496, align_loss = 0.3066, uniform_loss = -2.4918
+ 2021-10-01 09:01:49,697 : deft-news : pearson = 0.8154, spearman = 0.7922, align_loss = 0.1642, uniform_loss = -2.2330
+ 2021-10-01 09:01:50,853 : headlines : pearson = 0.7922, spearman = 0.7868, align_loss = 0.2423, uniform_loss = -2.4423
+ 2021-10-01 09:01:51,903 : images : pearson = 0.8722, spearman = 0.8332, align_loss = 0.2738, uniform_loss = -2.6206
+ 2021-10-01 09:01:52,938 : OnWN : pearson = 0.8628, spearman = 0.8518, align_loss = 0.3249, uniform_loss = -2.2993
+ 2021-10-01 09:01:54,453 : tweet-news : pearson = 0.7894, spearman = 0.7150, align_loss = 0.3931, uniform_loss = -2.4055
+ 2021-10-01 09:01:54,459 : ALL : Pearson = 0.7934, Spearman = 0.7567, align_loss = 0.2943, uniform_loss = -2.4281
+ 2021-10-01 09:01:54,459 : ALL (weighted average) : Pearson = 0.7957, Spearman = 0.7667, align_loss = 0.2967, uniform_loss = -2.4312
+ 2021-10-01 09:01:54,459 : ALL (average) : Pearson = 0.7820, Spearman = 0.7548, align_loss = 0.2841, uniform_loss = -2.4154
+
+ 2021-10-01 09:01:54,469 : ***** Transfer task : STS15 *****
+
+
+ 2021-10-01 09:01:55,305 : answers-forums : pearson = 0.7622, spearman = 0.7691, align_loss = 0.4751, uniform_loss = -2.5207
+ 2021-10-01 09:01:56,116 : answers-students : pearson = 0.7274, spearman = 0.7348, align_loss = 0.2984, uniform_loss = -1.6962
+ 2021-10-01 09:01:56,833 : belief : pearson = 0.8280, spearman = 0.8501, align_loss = 0.3987, uniform_loss = -2.4359
+ 2021-10-01 09:01:58,174 : headlines : pearson = 0.8215, spearman = 0.8271, align_loss = 0.2414, uniform_loss = -2.4547
+ 2021-10-01 09:01:59,504 : images : pearson = 0.8820, spearman = 0.8893, align_loss = 0.2480, uniform_loss = -2.2802
+ 2021-10-01 09:01:59,510 : ALL : Pearson = 0.8265, Spearman = 0.8337, align_loss = 0.3062, uniform_loss = -2.2274
+ 2021-10-01 09:01:59,510 : ALL (weighted average) : Pearson = 0.8065, Spearman = 0.8152, align_loss = 0.3062, uniform_loss = -2.2274
+ 2021-10-01 09:01:59,510 : ALL (average) : Pearson = 0.8042, Spearman = 0.8141, align_loss = 0.3323, uniform_loss = -2.2776
+
+ 2021-10-01 09:01:59,515 : ***** Transfer task : STS16 *****
+
+
+ 2021-10-01 09:01:59,998 : answer-answer : pearson = 0.6969, spearman = 0.6955, align_loss = 0.3543, uniform_loss = -2.0589
+ 2021-10-01 09:02:00,259 : headlines : pearson = 0.8023, spearman = 0.8239, align_loss = 0.2242, uniform_loss = -2.4768
+ 2021-10-01 09:02:00,577 : plagiarism : pearson = 0.8537, spearman = 0.8668, align_loss = 0.1743, uniform_loss = -2.0453
+ 2021-10-01 09:02:01,089 : postediting : pearson = 0.8560, spearman = 0.8795, align_loss = 0.1283, uniform_loss = -2.4398
+ 2021-10-01 09:02:01,342 : question-question : pearson = 0.6924, spearman = 0.6868, align_loss = 0.2773, uniform_loss = -2.2263
+ 2021-10-01 09:02:01,345 : ALL : Pearson = 0.7716, Spearman = 0.7809, align_loss = 0.2317, uniform_loss = -2.2494
+ 2021-10-01 09:02:01,345 : ALL (weighted average) : Pearson = 0.7814, Spearman = 0.7920, align_loss = 0.2320, uniform_loss = -2.2519
+ 2021-10-01 09:02:01,345 : ALL (average) : Pearson = 0.7803, Spearman = 0.7905, align_loss = 0.2317, uniform_loss = -2.2494
+
+ 2021-10-01 09:02:01,349 :
+
+ ***** Transfer task : STSBenchmark*****
+
+
+ 2021-10-01 09:02:11,474 : train : pearson = 0.8156, spearman = 0.7934, align_loss = 0.2513, uniform_loss = -2.4757
+ 2021-10-01 09:02:14,290 : dev : pearson = 0.8488, spearman = 0.8472, align_loss = 0.2706, uniform_loss = -2.5047
+ 2021-10-01 09:02:16,780 : test : pearson = 0.7966, spearman = 0.7883, align_loss = 0.2571, uniform_loss = -2.4174
+ 2021-10-01 09:02:16,787 : ALL : Pearson = 0.8198, Spearman = 0.8047, align_loss = 0.2556, uniform_loss = -2.4714
+ 2021-10-01 09:02:16,787 : ALL (weighted average) : Pearson = 0.8183, Spearman = 0.8019, align_loss = 0.2556, uniform_loss = -2.4714
+ 2021-10-01 09:02:16,787 : ALL (average) : Pearson = 0.8203, Spearman = 0.8096, align_loss = 0.2597, uniform_loss = -2.4659
+
+ 2021-10-01 09:02:16,796 :
+
+ ***** Transfer task : SICKRelatedness*****
+
+
+ 2021-10-01 09:02:23,651 : train : pearson = 0.7910, spearman = 0.7024, align_loss = 0.2231, uniform_loss = -2.3434
+ 2021-10-01 09:02:24,561 : dev : pearson = 0.7941, spearman = 0.7294, align_loss = 0.2196, uniform_loss = -2.5349
+ 2021-10-01 09:02:31,960 : test : pearson = 0.7900, spearman = 0.6979, align_loss = 0.2213, uniform_loss = -2.3409
+ 2021-10-01 09:02:31,967 : ALL : Pearson = 0.7907, Spearman = 0.7016, align_loss = 0.2220, uniform_loss = -2.3518
+ 2021-10-01 09:02:31,967 : ALL (weighted average) : Pearson = 0.7906, Spearman = 0.7015, align_loss = 0.2220, uniform_loss = -2.3518
+ 2021-10-01 09:02:31,967 : ALL (average) : Pearson = 0.7917, Spearman = 0.7099, align_loss = 0.2213, uniform_loss = -2.4064
+
+ 2021-10-01 09:02:31,968 : ------ test ------
+ 2021-10-01 09:02:31,969 : +--------+--------+--------+--------+--------+--------------+-----------------+--------+
+ | STS12  | STS13  | STS14  | STS15  | STS16  | STSBenchmark | SICKRelatedness |  Avg.  |
+ +--------+--------+--------+--------+--------+--------------+-----------------+--------+
+ | 72.35  | 80.73  | 75.67  | 83.37  | 78.09  |    78.83     |      69.79      | 76.98  |
+ | 0.243  | 0.294  | 0.294  | 0.306  | 0.232  |    0.257     |      0.221      | 0.264  |
+ | -2.239 | -2.338 | -2.428 | -2.227 | -2.249 |    -2.417    |      -2.341     | -2.320 |
+ +--------+--------+--------+--------+--------+--------------+-----------------+--------+
+ 2021-10-01 09:02:31,971 : +------+------+------+------+------+------+------+------+
+ |  MR  |  CR  | SUBJ | MPQA | SST2 | TREC | MRPC | Avg. |
+ +------+------+------+------+------+------+------+------+
+ | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
+ +------+------+------+------+------+------+------+------+
+ 2021-10-03 08:46:37,510 : ***** Transfer task : STS12 *****
+
+
+ 2021-10-03 08:46:40,901 : MSRpar : pearson = 0.5880, spearman = 0.5984, align_loss = 0.2252, uniform_loss = -2.6787
+ 2021-10-03 08:46:42,191 : MSRvid : pearson = 0.8921, spearman = 0.8908, align_loss = 0.2316, uniform_loss = -2.5007
+ 2021-10-03 08:46:43,343 : SMTeuroparl : pearson = 0.5285, spearman = 0.6106, align_loss = 0.2731, uniform_loss = -1.7537
+ 2021-10-03 08:46:45,415 : surprise.OnWN : pearson = 0.7406, spearman = 0.6921, align_loss = 0.3048, uniform_loss = -2.5574
+ 2021-10-03 08:46:46,571 : surprise.SMTnews : pearson = 0.7277, spearman = 0.6356, align_loss = 0.2380, uniform_loss = -1.9060
+ 2021-10-03 08:46:46,574 : ALL : Pearson = 0.8076, Spearman = 0.7234, align_loss = 0.2544, uniform_loss = -2.3484
+ 2021-10-03 08:46:46,574 : ALL (weighted average) : Pearson = 0.7074, Spearman = 0.6982, align_loss = 0.2547, uniform_loss = -2.3707
+ 2021-10-03 08:46:46,574 : ALL (average) : Pearson = 0.6954, Spearman = 0.6855, align_loss = 0.2546, uniform_loss = -2.2793
+
+ 2021-10-03 08:46:46,578 : ***** Transfer task : STS13 (-SMT) *****
+
+
+ 2021-10-03 08:46:47,638 : FNWN : pearson = 0.6040, spearman = 0.6189, align_loss = 0.4101, uniform_loss = -2.3160
+ 2021-10-03 08:46:49,129 : headlines : pearson = 0.7866, spearman = 0.7911, align_loss = 0.2323, uniform_loss = -2.5041
+ 2021-10-03 08:46:50,342 : OnWN : pearson = 0.8134, spearman = 0.8051, align_loss = 0.3154, uniform_loss = -2.2660
+ 2021-10-03 08:46:50,346 : ALL : Pearson = 0.7865, Spearman = 0.7944, align_loss = 0.2916, uniform_loss = -2.3836
+ 2021-10-03 08:46:50,346 : ALL (weighted average) : Pearson = 0.7736, Spearman = 0.7746, align_loss = 0.2858, uniform_loss = -2.3914
+ 2021-10-03 08:46:50,346 : ALL (average) : Pearson = 0.7347, Spearman = 0.7383, align_loss = 0.3193, uniform_loss = -2.3621
+
+ 2021-10-03 08:46:50,348 : ***** Transfer task : STS14 *****
+
+
+ 2021-10-03 08:46:51,678 : deft-forum : pearson = 0.5178, spearman = 0.5013, align_loss = 0.3319, uniform_loss = -2.5933
+ 2021-10-03 08:46:52,875 : deft-news : pearson = 0.8103, spearman = 0.7737, align_loss = 0.1740, uniform_loss = -2.3745
+ 2021-10-03 08:46:54,579 : headlines : pearson = 0.7742, spearman = 0.7526, align_loss = 0.2295, uniform_loss = -2.4763
+ 2021-10-03 08:46:56,190 : images : pearson = 0.8803, spearman = 0.8357, align_loss = 0.2804, uniform_loss = -2.8383
+ 2021-10-03 08:46:57,857 : OnWN : pearson = 0.8478, spearman = 0.8432, align_loss = 0.3234, uniform_loss = -2.3374
+ 2021-10-03 08:46:59,957 : tweet-news : pearson = 0.7761, spearman = 0.6955, align_loss = 0.4371, uniform_loss = -2.5998
+ 2021-10-03 08:46:59,962 : ALL : Pearson = 0.7708, Spearman = 0.7288, align_loss = 0.3055, uniform_loss = -2.5486
+ 2021-10-03 08:46:59,962 : ALL (weighted average) : Pearson = 0.7826, Spearman = 0.7475, align_loss = 0.3078, uniform_loss = -2.5515
+ 2021-10-03 08:46:59,962 : ALL (average) : Pearson = 0.7677, Spearman = 0.7337, align_loss = 0.2960, uniform_loss = -2.5366
+
+ 2021-10-03 08:46:59,968 : ***** Transfer task : STS15 *****
+
+
+ 2021-10-03 08:47:01,635 : answers-forums : pearson = 0.7260, spearman = 0.7319, align_loss = 0.4913, uniform_loss = -2.6655
+ 2021-10-03 08:47:03,236 : answers-students : pearson = 0.7329, spearman = 0.7356, align_loss = 0.3290, uniform_loss = -1.7544
+ 2021-10-03 08:47:04,764 : belief : pearson = 0.8161, spearman = 0.8396, align_loss = 0.4395, uniform_loss = -2.5462
+ 2021-10-03 08:47:06,566 : headlines : pearson = 0.8060, spearman = 0.8116, align_loss = 0.2341, uniform_loss = -2.4907
+ 2021-10-03 08:47:08,375 : images : pearson = 0.9027, spearman = 0.9108, align_loss = 0.2459, uniform_loss = -2.4940
+ 2021-10-03 08:47:08,380 : ALL : Pearson = 0.8219, Spearman = 0.8295, align_loss = 0.3186, uniform_loss = -2.3362
+ 2021-10-03 08:47:08,380 : ALL (weighted average) : Pearson = 0.8031, Spearman = 0.8109, align_loss = 0.3186, uniform_loss = -2.3362
+ 2021-10-03 08:47:08,380 : ALL (average) : Pearson = 0.7967, Spearman = 0.8059, align_loss = 0.3480, uniform_loss = -2.3902
+
+ 2021-10-03 08:47:08,384 : ***** Transfer task : STS16 *****
+
+
+ 2021-10-03 08:47:09,587 : answer-answer : pearson = 0.7146, spearman = 0.7066, align_loss = 0.3366, uniform_loss = -2.1176
+ 2021-10-03 08:47:10,209 : headlines : pearson = 0.7777, spearman = 0.7936, align_loss = 0.2124, uniform_loss = -2.5313
+ 2021-10-03 08:47:10,964 : plagiarism : pearson = 0.8479, spearman = 0.8575, align_loss = 0.1975, uniform_loss = -2.0827
+ 2021-10-03 08:47:12,216 : postediting : pearson = 0.8545, spearman = 0.8739, align_loss = 0.1343, uniform_loss = -2.5950
+ 2021-10-03 08:47:12,780 : question-question : pearson = 0.7229, spearman = 0.7325, align_loss = 0.2821, uniform_loss = -2.4114
+ 2021-10-03 08:47:12,783 : ALL : Pearson = 0.7810, Spearman = 0.7898, align_loss = 0.2326, uniform_loss = -2.3476
+ 2021-10-03 08:47:12,783 : ALL (weighted average) : Pearson = 0.7839, Spearman = 0.7931, align_loss = 0.2323, uniform_loss = -2.3477
+ 2021-10-03 08:47:12,783 : ALL (average) : Pearson = 0.7835, Spearman = 0.7928, align_loss = 0.2326, uniform_loss = -2.3476
+
+ 2021-10-03 08:47:12,786 :
+
+ ***** Transfer task : STSBenchmark*****
+
+
+ 2021-10-03 08:47:32,536 : train : pearson = 0.8064, spearman = 0.7806, align_loss = 0.2562, uniform_loss = -2.5998
+ 2021-10-03 08:47:38,110 : dev : pearson = 0.8418, spearman = 0.8427, align_loss = 0.2780, uniform_loss = -2.6537
+ 2021-10-03 08:47:42,861 : test : pearson = 0.7969, spearman = 0.7901, align_loss = 0.2528, uniform_loss = -2.5614
+ 2021-10-03 08:47:42,870 : ALL : Pearson = 0.8125, Spearman = 0.7966, align_loss = 0.2595, uniform_loss = -2.6031
+ 2021-10-03 08:47:42,870 : ALL (weighted average) : Pearson = 0.8110, Spearman = 0.7929, align_loss = 0.2594, uniform_loss = -2.6030
+ 2021-10-03 08:47:42,870 : ALL (average) : Pearson = 0.8150, Spearman = 0.8045, align_loss = 0.2624, uniform_loss = -2.6050
+
+ 2021-10-03 08:47:42,878 :
+
+ ***** Transfer task : SICKRelatedness*****
+
+
+ 2021-10-03 08:47:55,348 : train : pearson = 0.8237, spearman = 0.7467, align_loss = 0.2326, uniform_loss = -2.4715
+ 2021-10-03 08:47:56,876 : dev : pearson = 0.8242, spearman = 0.7745, align_loss = 0.2351, uniform_loss = -2.6674
+ 2021-10-03 08:48:10,119 : test : pearson = 0.8150, spearman = 0.7396, align_loss = 0.2319, uniform_loss = -2.4592
+ 2021-10-03 08:48:10,131 : ALL : Pearson = 0.8195, Spearman = 0.7444, align_loss = 0.2324, uniform_loss = -2.4753
+ 2021-10-03 08:48:10,131 : ALL (weighted average) : Pearson = 0.8194, Spearman = 0.7446, align_loss = 0.2324, uniform_loss = -2.4753
+ 2021-10-03 08:48:10,131 : ALL (average) : Pearson = 0.8210, Spearman = 0.7536, align_loss = 0.2332, uniform_loss = -2.5327
+
+ 2021-10-03 08:48:10,132 : ------ test ------
+ 2021-10-03 08:48:10,133 : +--------+--------+--------+--------+--------+--------------+-----------------+--------+
+ | STS12  | STS13  | STS14  | STS15  | STS16  | STSBenchmark | SICKRelatedness |  Avg.  |
+ +--------+--------+--------+--------+--------+--------------+-----------------+--------+
+ | 72.34  | 79.44  | 72.88  | 82.95  | 78.98  |    79.01     |      73.96      | 77.08  |
+ | 0.254  | 0.292  | 0.306  | 0.319  | 0.233  |    0.253     |      0.232      | 0.270  |
+ | -2.348 | -2.384 | -2.549 | -2.336 | -2.348 |    -2.561    |      -2.459     | -2.426 |
+ +--------+--------+--------+--------+--------+--------------+-----------------+--------+
+ 2021-10-03 08:48:10,135 : +------+------+------+------+------+------+------+------+
+ |  MR  |  CR  | SUBJ | MPQA | SST2 | TREC | MRPC | Avg. |
+ +------+------+------+------+------+------+------+------+
+ | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
+ +------+------+------+------+------+------+------+------+
mcse.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dc677e8e7cc6c1faf9f527fa8b5f8358fc7feb9850fab32e315ed1f2af495885
+ size 2887571