gorkaartola committed on
Commit
3333852
1 Parent(s): 25d858e

Upload report-Model-0_Queries-4_Prompt-3_Strategies-argmax-threshold0.05-threshold0.25-threshold0.5-threshold0.75-topk9-topk7-topk5-topk3.csv

Reports/report-Model-0_Queries-4_Prompt-3_Strategies-argmax-threshold0.05-threshold0.25-threshold0.5-threshold0.75-topk9-topk7-topk5-topk3.csv ADDED
@@ -0,0 +1,189 @@
+ argmax_max,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,0,1,0.0,0.0,0.0,0.25
+ 1,1,30,30,1,0,0.03333333333333333,1.0,0.06451612903225806,0.5166666666666667
+ 2,2,288,288,21,24,0.07291666666666667,0.4666666666666667,0.12612612612612614,0.4947916666666667
+ 3,3,87,87,55,40,0.632183908045977,0.5789473684210527,0.6043956043956045,0.5862068965517241
+ 4,4,94,94,3,1,0.031914893617021274,0.75,0.061224489795918366,0.5106382978723404
+ 5,5,62,62,1,0,0.016129032258064516,1.0,0.031746031746031744,0.5080645161290323
+ 6,6,63,63,1,2,0.015873015873015872,0.3333333333333333,0.030303030303030304,0.49206349206349204
+ 7,7,17,17,0,0,0.0,0.0,0.0,0.5
+ 8,8,65,65,0,0,0.0,0.0,0.0,0.5
+ 9,9,31,31,0,0,0.0,0.0,0.0,0.5
+ 10,10,57,57,2,3,0.03508771929824561,0.4,0.06451612903225806,0.49122807017543857
+ 11,11,48,48,0,0,0.0,0.0,0.0,0.5
+ 12,12,36,36,1,0,0.027777777777777776,1.0,0.05405405405405406,0.5138888888888888
+ 13,13,17,17,0,0,0.0,0.0,0.0,0.5
+ 14,14,77,77,0,0,0.0,0.0,0.0,0.5
+ 15,15,40,40,2,2,0.05,0.5,0.09090909090909091,0.5
+ 16,16,29,29,3,2,0.10344827586206896,0.6,0.17647058823529413,0.5172413793103449
+ 17,total,1043,1043,90,75,,,,
+ 18,,,,,Micro avg.,0.0862895493767977,0.5454545454545454,0.14900662251655628,0.5071907957813998
+ 19,,,,,Macro avg.,0.05992144839601005,0.38993808049535605,0.07672125138998036,0.49298763966615267
+ threshold-0.05,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,2,2,1.0,0.5,0.6666666666666666,0.5
+ 1,1,30,30,30,29,1.0,0.5084745762711864,0.6741573033707865,0.5166666666666667
+ 2,2,288,288,287,288,0.9965277777777778,0.4991304347826087,0.6651216685979142,0.4982638888888889
+ 3,3,87,87,86,86,0.9885057471264368,0.5,0.6640926640926641,0.5
+ 4,4,94,94,93,93,0.9893617021276596,0.5,0.6642857142857143,0.5
+ 5,5,62,62,62,62,1.0,0.5,0.6666666666666666,0.5
+ 6,6,63,63,62,63,0.9841269841269841,0.496,0.6595744680851063,0.49206349206349204
+ 7,7,17,17,17,17,1.0,0.5,0.6666666666666666,0.5
+ 8,8,65,65,64,63,0.9846153846153847,0.5039370078740157,0.6666666666666666,0.5076923076923077
+ 9,9,31,31,30,31,0.967741935483871,0.4918032786885246,0.6521739130434782,0.4838709677419355
+ 10,10,57,57,57,56,1.0,0.504424778761062,0.6705882352941176,0.5087719298245614
+ 11,11,48,48,47,48,0.9791666666666666,0.49473684210526314,0.6573426573426573,0.4895833333333333
+ 12,12,36,36,36,36,1.0,0.5,0.6666666666666666,0.5
+ 13,13,17,17,17,16,1.0,0.5151515151515151,0.6799999999999999,0.5294117647058824
+ 14,14,77,77,77,77,1.0,0.5,0.6666666666666666,0.5
+ 15,15,40,40,39,39,0.975,0.5,0.6610169491525423,0.5
+ 16,16,29,29,29,29,1.0,0.5,0.6666666666666666,0.5
+ 17,total,1043,1043,1035,1035,,,,
+ 18,,,,,Micro avg.,0.9923298178331735,0.5,0.6649534211371668,0.5
+ 19,,,,,Macro avg.,0.9920615410543988,0.5008034372725986,0.6655894258783321,0.5015484912304158
+ threshold-0.25,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,2,2,1.0,0.5,0.6666666666666666,0.5
+ 1,1,30,30,28,20,0.9333333333333333,0.5833333333333334,0.7179487179487181,0.6333333333333333
+ 2,2,288,288,227,243,0.7881944444444444,0.4829787234042553,0.5989445910290238,0.4722222222222222
+ 3,3,87,87,80,75,0.9195402298850575,0.5161290322580645,0.6611570247933883,0.5287356321839081
+ 4,4,94,94,73,75,0.776595744680851,0.49324324324324326,0.6033057851239669,0.48936170212765956
+ 5,5,62,62,58,60,0.9354838709677419,0.4915254237288136,0.6444444444444444,0.4838709677419355
+ 6,6,63,63,61,56,0.9682539682539683,0.5213675213675214,0.6777777777777778,0.5396825396825397
+ 7,7,17,17,15,17,0.8823529411764706,0.46875,0.6122448979591837,0.4411764705882353
+ 8,8,65,65,61,58,0.9384615384615385,0.5126050420168067,0.6630434782608695,0.5230769230769231
+ 9,9,31,31,22,25,0.7096774193548387,0.46808510638297873,0.5641025641025641,0.45161290322580644
+ 10,10,57,57,55,54,0.9649122807017544,0.5045871559633027,0.6626506024096386,0.5087719298245614
+ 11,11,48,48,45,46,0.9375,0.4945054945054945,0.6474820143884892,0.4895833333333333
+ 12,12,36,36,35,36,0.9722222222222222,0.49295774647887325,0.6542056074766356,0.4861111111111111
+ 13,13,17,17,15,15,0.8823529411764706,0.5,0.6382978723404256,0.5
+ 14,14,77,77,75,72,0.974025974025974,0.5102040816326531,0.6696428571428571,0.5194805194805194
+ 15,15,40,40,28,28,0.7,0.5,0.5833333333333334,0.5
+ 16,16,29,29,29,28,1.0,0.5087719298245614,0.6744186046511628,0.5172413793103449
+ 17,total,1043,1043,909,910,,,,
+ 18,,,,,Micro avg.,0.8715244487056567,0.4997251236943375,0.6352201257861635,0.49952061361457334
+ 19,,,,,Macro avg.,0.8989945240402744,0.5028849314199942,0.6435098141087733,0.504956527484849
+ threshold-0.5,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,1,0,0.5,1.0,0.6666666666666666,0.75
+ 1,1,30,30,13,6,0.43333333333333335,0.6842105263157895,0.5306122448979592,0.6166666666666667
+ 2,2,288,288,57,61,0.19791666666666666,0.4830508474576271,0.28078817733990147,0.4930555555555556
+ 3,3,87,87,41,23,0.47126436781609193,0.640625,0.5430463576158939,0.603448275862069
+ 4,4,94,94,21,22,0.22340425531914893,0.4883720930232558,0.30656934306569344,0.4946808510638298
+ 5,5,62,62,31,36,0.5,0.4626865671641791,0.4806201550387597,0.4596774193548387
+ 6,6,63,63,30,28,0.47619047619047616,0.5172413793103449,0.49586776859504134,0.5158730158730159
+ 7,7,17,17,6,6,0.35294117647058826,0.5,0.41379310344827586,0.5
+ 8,8,65,65,35,18,0.5384615384615384,0.660377358490566,0.5932203389830508,0.6307692307692307
+ 9,9,31,31,7,10,0.22580645161290322,0.4117647058823529,0.29166666666666663,0.45161290322580644
+ 10,10,57,57,29,25,0.5087719298245614,0.5370370370370371,0.5225225225225226,0.5350877192982456
+ 11,11,48,48,28,22,0.5833333333333334,0.56,0.5714285714285714,0.5625
+ 12,12,36,36,19,13,0.5277777777777778,0.59375,0.5588235294117648,0.5833333333333334
+ 13,13,17,17,5,5,0.29411764705882354,0.5,0.37037037037037035,0.5
+ 14,14,77,77,27,31,0.35064935064935066,0.46551724137931033,0.4,0.474025974025974
+ 15,15,40,40,7,8,0.175,0.4666666666666667,0.2545454545454546,0.4875
+ 16,16,29,29,18,18,0.6206896551724138,0.5,0.5538461538461539,0.5
+ 17,total,1043,1043,375,332,,,,
+ 18,,,,,Micro avg.,0.3595397890699904,0.5304101838755304,0.42857142857142855,0.5206136145733461
+ 19,,,,,Macro avg.,0.4105681152757063,0.5571352601604194,0.46084631908486745,0.5387194673546215
+ threshold-0.75,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,0,0,0.0,0.0,0.0,0.5
+ 1,1,30,30,1,0,0.03333333333333333,1.0,0.06451612903225806,0.5166666666666667
+ 2,2,288,288,5,5,0.017361111111111112,0.5,0.03355704697986577,0.5
+ 3,3,87,87,2,3,0.022988505747126436,0.4,0.043478260869565216,0.4942528735632184
+ 4,4,94,94,2,2,0.02127659574468085,0.5,0.04081632653061224,0.5
+ 5,5,62,62,1,0,0.016129032258064516,1.0,0.031746031746031744,0.5080645161290323
+ 6,6,63,63,2,0,0.031746031746031744,1.0,0.06153846153846154,0.5158730158730159
+ 7,7,17,17,2,1,0.11764705882352941,0.6666666666666666,0.2,0.5294117647058824
+ 8,8,65,65,5,1,0.07692307692307693,0.8333333333333334,0.14084507042253522,0.5307692307692308
+ 9,9,31,31,0,2,0.0,0.0,0.0,0.46774193548387094
+ 10,10,57,57,5,1,0.08771929824561403,0.8333333333333334,0.15873015873015872,0.5350877192982456
+ 11,11,48,48,5,2,0.10416666666666667,0.7142857142857143,0.18181818181818182,0.53125
+ 12,12,36,36,4,0,0.1111111111111111,1.0,0.19999999999999998,0.5555555555555556
+ 13,13,17,17,0,0,0.0,0.0,0.0,0.5
+ 14,14,77,77,4,3,0.05194805194805195,0.5714285714285714,0.09523809523809525,0.5064935064935064
+ 15,15,40,40,0,0,0.0,0.0,0.0,0.5
+ 16,16,29,29,4,3,0.13793103448275862,0.5714285714285714,0.2222222222222222,0.5172413793103449
+ 17,total,1043,1043,42,23,,,,
+ 18,,,,,Micro avg.,0.040268456375838924,0.6461538461538462,0.07581227436823104,0.5091083413231065
+ 19,,,,,Macro avg.,0.04884005342006805,0.5641456582633053,0.08673564618399929,0.5122593037557983
+ topk-9,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,1,2,0.5,0.3333333333333333,0.4,0.25
+ 1,1,30,30,9,8,0.3,0.5294117647058824,0.3829787234042553,0.5166666666666667
+ 2,2,288,288,264,270,0.9166666666666666,0.4943820224719101,0.6423357664233577,0.4895833333333333
+ 3,3,87,87,86,82,0.9885057471264368,0.5119047619047619,0.6745098039215687,0.5229885057471264
+ 4,4,94,94,75,74,0.7978723404255319,0.5033557046979866,0.617283950617284,0.5053191489361702
+ 5,5,62,62,53,53,0.8548387096774194,0.5,0.6309523809523809,0.5
+ 6,6,63,63,53,52,0.8412698412698413,0.5047619047619047,0.6309523809523809,0.5079365079365079
+ 7,7,17,17,11,8,0.6470588235294118,0.5789473684210527,0.6111111111111113,0.5882352941176471
+ 8,8,65,65,3,4,0.046153846153846156,0.42857142857142855,0.08333333333333333,0.49230769230769234
+ 9,9,31,31,10,7,0.3225806451612903,0.5882352941176471,0.41666666666666663,0.5483870967741935
+ 10,10,57,57,48,45,0.8421052631578947,0.5161290322580645,0.6399999999999999,0.5263157894736842
+ 11,11,48,48,2,5,0.041666666666666664,0.2857142857142857,0.07272727272727272,0.46875
+ 12,12,36,36,13,15,0.3611111111111111,0.4642857142857143,0.40625000000000006,0.4722222222222222
+ 13,13,17,17,7,6,0.4117647058823529,0.5384615384615384,0.4666666666666667,0.5294117647058824
+ 14,14,77,77,4,4,0.05194805194805195,0.5,0.09411764705882353,0.5
+ 15,15,40,40,15,20,0.375,0.42857142857142855,0.39999999999999997,0.4375
+ 16,16,29,29,16,15,0.5517241379310345,0.5161290322580645,0.5333333333333333,0.5172413793103449
+ 17,total,1043,1043,670,670,,,,
+ 18,,,,,Micro avg.,0.6423777564717162,0.5,0.5623164078892153,0.5
+ 19,,,,,Macro avg.,0.5206039151004443,0.4836585067373531,0.4531305315981432,0.49252149420773367
+ topk-7,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,1,2,0.5,0.3333333333333333,0.4,0.25
+ 1,1,30,30,6,5,0.2,0.5454545454545454,0.29268292682926833,0.5166666666666667
+ 2,2,288,288,247,259,0.8576388888888888,0.4881422924901186,0.6221662468513853,0.4791666666666667
+ 3,3,87,87,82,80,0.9425287356321839,0.5061728395061729,0.6586345381526104,0.5114942528735632
+ 4,4,94,94,62,63,0.6595744680851063,0.496,0.5662100456621004,0.4946808510638298
+ 5,5,62,62,48,45,0.7741935483870968,0.5161290322580645,0.6193548387096774,0.5241935483870968
+ 6,6,63,63,42,40,0.6666666666666666,0.5121951219512195,0.5793103448275863,0.5158730158730159
+ 7,7,17,17,6,4,0.35294117647058826,0.6,0.4444444444444445,0.5588235294117647
+ 8,8,65,65,1,3,0.015384615384615385,0.25,0.028985507246376812,0.4846153846153846
+ 9,9,31,31,5,2,0.16129032258064516,0.7142857142857143,0.2631578947368421,0.5483870967741935
+ 10,10,57,57,39,36,0.6842105263157895,0.52,0.5909090909090909,0.5263157894736842
+ 11,11,48,48,1,3,0.020833333333333332,0.25,0.038461538461538464,0.4791666666666667
+ 12,12,36,36,7,8,0.19444444444444445,0.4666666666666667,0.27450980392156865,0.4861111111111111
+ 13,13,17,17,3,1,0.17647058823529413,0.75,0.2857142857142857,0.5588235294117647
+ 14,14,77,77,2,3,0.025974025974025976,0.4,0.04878048780487805,0.4935064935064935
+ 15,15,40,40,15,14,0.375,0.5172413793103449,0.4347826086956522,0.5125
+ 16,16,29,29,15,8,0.5172413793103449,0.6521739130434783,0.576923076923077,0.6206896551724138
+ 17,total,1043,1043,582,576,,,,
+ 18,,,,,Micro avg.,0.5580057526366251,0.5025906735751295,0.5288505224897774,0.50287631831256
+ 19,,,,,Macro avg.,0.4190819246887661,0.5010467551940976,0.3955898635229637,0.5035890739808421
+ topk-5,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,1,2,0.5,0.3333333333333333,0.4,0.25
+ 1,1,30,30,3,4,0.1,0.42857142857142855,0.16216216216216217,0.48333333333333334
+ 2,2,288,288,215,235,0.7465277777777778,0.4777777777777778,0.5826558265582656,0.4652777777777778
+ 3,3,87,87,77,78,0.8850574712643678,0.4967741935483871,0.6363636363636364,0.4942528735632184
+ 4,4,94,94,45,49,0.4787234042553192,0.4787234042553192,0.47872340425531923,0.4787234042553192
+ 5,5,62,62,36,28,0.5806451612903226,0.5625,0.5714285714285715,0.5645161290322581
+ 6,6,63,63,30,24,0.47619047619047616,0.5555555555555556,0.5128205128205129,0.5476190476190477
+ 7,7,17,17,2,4,0.11764705882352941,0.3333333333333333,0.1739130434782609,0.4411764705882353
+ 8,8,65,65,1,3,0.015384615384615385,0.25,0.028985507246376812,0.4846153846153846
+ 9,9,31,31,4,1,0.12903225806451613,0.8,0.2222222222222222,0.5483870967741935
+ 10,10,57,57,20,23,0.3508771929824561,0.46511627906976744,0.4,0.47368421052631576
+ 11,11,48,48,1,2,0.020833333333333332,0.3333333333333333,0.0392156862745098,0.4895833333333333
+ 12,12,36,36,5,5,0.1388888888888889,0.5,0.2173913043478261,0.5
+ 13,13,17,17,3,1,0.17647058823529413,0.75,0.2857142857142857,0.5588235294117647
+ 14,14,77,77,2,1,0.025974025974025976,0.6666666666666666,0.05,0.5064935064935064
+ 15,15,40,40,9,6,0.225,0.6,0.3272727272727273,0.5375
+ 16,16,29,29,13,6,0.4482758620689655,0.6842105263157895,0.5416666666666666,0.6206896551724138
+ 17,total,1043,1043,467,472,,,,
+ 18,,,,,Micro avg.,0.4477468839884947,0.49733759318423854,0.4712411705348133,0.49760306807286675
+ 19,,,,,Macro avg.,0.31856047732552284,0.5126997548094525,0.331207973930079,0.49674563249977066
+ topk-3,class,N# of True samples,N# of False samples,True Positives,False Positives,r,p,f1,acc
+ 0,0,2,2,1,1,0.5,0.5,0.5,0.5
+ 1,1,30,30,3,1,0.1,0.75,0.17647058823529416,0.5333333333333333
+ 2,2,288,288,166,187,0.5763888888888888,0.4702549575070821,0.5179407176287051,0.4635416666666667
+ 3,3,87,87,70,66,0.8045977011494253,0.5147058823529411,0.6278026905829596,0.5229885057471264
+ 4,4,94,94,20,21,0.2127659574468085,0.4878048780487805,0.29629629629629634,0.4946808510638298
+ 5,5,62,62,9,5,0.14516129032258066,0.6428571428571429,0.2368421052631579,0.532258064516129
+ 6,6,63,63,16,11,0.25396825396825395,0.5925925925925926,0.3555555555555555,0.5396825396825397
+ 7,7,17,17,1,1,0.058823529411764705,0.5,0.10526315789473684,0.5
+ 8,8,65,65,0,1,0.0,0.0,0.0,0.49230769230769234
+ 9,9,31,31,0,0,0.0,0.0,0.0,0.5
+ 10,10,57,57,10,10,0.17543859649122806,0.5,0.2597402597402597,0.5
+ 11,11,48,48,0,1,0.0,0.0,0.0,0.4895833333333333
+ 12,12,36,36,2,2,0.05555555555555555,0.5,0.09999999999999999,0.5
+ 13,13,17,17,2,0,0.11764705882352941,1.0,0.21052631578947367,0.5588235294117647
+ 14,14,77,77,1,0,0.012987012987012988,1.0,0.025641025641025647,0.5064935064935064
+ 15,15,40,40,6,3,0.15,0.6666666666666666,0.24489795918367346,0.5375
+ 16,16,29,29,9,4,0.3103448275862069,0.6923076923076923,0.4285714285714286,0.5862068965517241
+ 17,total,1043,1043,316,314,,,,
+ 18,,,,,Micro avg.,0.3029721955896453,0.5015873015873016,0.3777644949193067,0.5009587727708533
+ 19,,,,,Macro avg.,0.2043340395665444,0.5186582242548763,0.24032635884603332,0.5151411717122144
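
Note on the metric columns: the `r`, `p`, `f1` and `acc` values in each row appear to follow from that row's four count columns. The sketch below is a minimal reconstruction, assuming recall = TP / true samples, precision = TP / (TP + FP), and accuracy = (TP + TN) / all samples, with undefined ratios reported as 0.0. The function name `binary_metrics` is hypothetical and is not taken from the repository. Under the same assumptions, the "Micro avg." rows correspond to applying the formulas to the `total` counts.

```python
def binary_metrics(tp, fp, n_true, n_false):
    """Recall, precision, F1 and accuracy from the counts in one report row.

    Assumption: ratios with a zero denominator are reported as 0.0,
    which matches rows such as class 7 in the argmax_max block.
    """
    tn = n_false - fp  # false samples that were correctly not selected
    recall = tp / n_true if n_true else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (n_true + n_false)
    return recall, precision, f1, accuracy


# "total" row of the argmax_max block: 1043 true, 1043 false, 90 TP, 75 FP
r, p, f1, acc = binary_metrics(90, 75, 1043, 1043)
print(f"r={r:.4f} p={p:.4f} f1={f1:.4f} acc={acc:.4f}")
```

Feeding in the `total` counts of any other strategy block gives a quick consistency check against that block's "Micro avg." row.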