[1] loss: 3.922, train acc: 9.710 test acc: 16.210 19.120 s
[2] loss: 3.306, train acc: 19.734 test acc: 24.700 19.272 s
[3] loss: 2.890, train acc: 27.350 test acc: 30.530 19.156 s
[4] loss: 2.572, train acc: 33.976 test acc: 34.630 18.599 s
[5] loss: 2.351, train acc: 38.444 test acc: 39.340 19.358 s
[6] loss: 2.182, train acc: 42.156 test acc: 41.620 19.322 s
[7] loss: 2.060, train acc: 44.980 test acc: 43.830 18.912 s
[8] loss: 1.946, train acc: 47.628 test acc: 45.620 19.201 s
[9] loss: 1.861, train acc: 49.356 test acc: 47.350 18.739 s
[10] loss: 1.778, train acc: 51.572 test acc: 47.440 19.149 s
[11] loss: 1.720, train acc: 52.758 test acc: 48.690 18.959 s
[12] loss: 1.665, train acc: 54.102 test acc: 50.070 18.651 s
[13] loss: 1.611, train acc: 55.504 test acc: 51.010 18.681 s
[14] loss: 1.565, train acc: 56.742 test acc: 51.310 18.636 s
[15] loss: 1.523, train acc: 57.590 test acc: 50.750 19.178 s
[16] loss: 1.493, train acc: 58.122 test acc: 52.760 18.726 s
[17] loss: 1.456, train acc: 59.148 test acc: 53.310 19.150 s
[18] loss: 1.425, train acc: 60.064 test acc: 53.020 18.625 s
[19] loss: 1.395, train acc: 60.686 test acc: 53.310 18.945 s
[20] loss: 1.366, train acc: 61.512 test acc: 54.200 20.388 s
[21] loss: 1.337, train acc: 62.098 test acc: 54.400 18.636 s
[22] loss: 1.317, train acc: 62.850 test acc: 54.450 18.698 s
[23] loss: 1.288, train acc: 63.556 test acc: 54.980 24.444 s
[24] loss: 1.270, train acc: 63.970 test acc: 54.640 19.223 s
[25] loss: 1.242, train acc: 64.418 test acc: 55.670 19.068 s
[26] loss: 1.228, train acc: 65.022 test acc: 55.390 18.723 s
[27] loss: 1.212, train acc: 65.308 test acc: 56.070 18.621 s
[28] loss: 1.192, train acc: 65.950 test acc: 55.740 18.721 s
[29] loss: 1.172, train acc: 66.610 test acc: 56.360 18.999 s
[30] loss: 1.162, train acc: 66.744 test acc: 56.040 19.265 s
[31] loss: 1.139, train acc: 67.142 test acc: 56.610 18.620 s
[32] loss: 1.127, train acc: 67.530 test acc: 56.350 18.952 s
[33] loss: 1.113, train acc: 67.938 test acc: 56.930 19.421 s
[34] loss: 1.103, train acc: 68.186 test acc: 56.610 19.007 s
[35] loss: 1.081, train acc: 68.868 test acc: 56.850 19.002 s
[36] loss: 1.077, train acc: 68.798 test acc: 57.090 18.931 s
[37] loss: 1.063, train acc: 69.366 test acc: 57.010 18.142 s
[38] loss: 1.048, train acc: 69.726 test acc: 57.600 18.577 s
[39] loss: 1.034, train acc: 70.048 test acc: 57.630 19.337 s
[40] loss: 1.021, train acc: 70.398 test acc: 58.170 18.606 s
[41] loss: 1.013, train acc: 70.720 test acc: 57.340 19.218 s
[42] loss: 1.001, train acc: 71.000 test acc: 58.030 18.656 s
[43] loss: 0.991, train acc: 71.130 test acc: 58.170 18.731 s
[44] loss: 0.982, train acc: 71.388 test acc: 58.150 18.939 s
[45] loss: 0.972, train acc: 71.786 test acc: 57.920 20.176 s
[46] loss: 0.959, train acc: 72.054 test acc: 58.770 19.481 s
[47] loss: 0.946, train acc: 72.474 test acc: 57.930 19.065 s
[48] loss: 0.935, train acc: 72.638 test acc: 57.890 19.334 s
[49] loss: 0.928, train acc: 72.724 test acc: 58.370 18.734 s
[50] loss: 0.925, train acc: 72.930 test acc: 58.690 18.609 s
[51] loss: 0.911, train acc: 73.478 test acc: 58.120 19.188 s
[52] loss: 0.906, train acc: 73.406 test acc: 57.950 18.921 s
[53] loss: 0.896, train acc: 73.732 test acc: 58.300 18.764 s
[54] loss: 0.891, train acc: 73.804 test acc: 58.070 18.855 s
[55] loss: 0.881, train acc: 74.204 test acc: 57.960 18.914 s
[56] loss: 0.873, train acc: 74.446 test acc: 58.690 18.841 s
[57] loss: 0.865, train acc: 74.332 test acc: 58.390 19.063 s
[58] loss: 0.856, train acc: 74.850 test acc: 58.630 19.052 s
[59] loss: 0.849, train acc: 75.136 test acc: 59.100 18.923 s
[60] loss: 0.851, train acc: 74.982 test acc: 58.100 18.426 s
[61] loss: 0.839, train acc: 75.072 test acc: 57.940 19.223 s
[62] loss: 0.828, train acc: 75.610 test acc: 58.210 19.462 s
[63] loss: 0.821, train acc: 75.916 test acc: 57.980 18.999 s
[64] loss: 0.816, train acc: 75.868 test acc: 59.340 18.477 s
[65] loss: 0.806, train acc: 76.154 test acc: 58.640 19.336 s
[66] loss: 0.802, train acc: 76.380 test acc: 59.180 19.209 s
[67] loss: 0.794, train acc: 76.694 test acc: 59.110 18.478 s
[68] loss: 0.792, train acc: 76.544 test acc: 59.230 18.842 s
[69] loss: 0.781, train acc: 77.010 test acc: 58.640 18.791 s
[70] loss: 0.777, train acc: 77.002 test acc: 59.170 19.276 s
[71] loss: 0.773, train acc: 77.146 test acc: 59.250 19.578 s
[72] loss: 0.767, train acc: 77.232 test acc: 59.000 19.281 s
[73] loss: 0.760, train acc: 77.390 test acc: 59.020 18.526 s
[74] loss: 0.762, train acc: 77.430 test acc: 58.650 18.691 s
[75] loss: 0.755, train acc: 77.836 test acc: 59.310 20.628 s
[76] loss: 0.750, train acc: 77.732 test acc: 59.170 18.904 s
[77] loss: 0.745, train acc: 77.560 test acc: 58.820 19.015 s
[78] loss: 0.738, train acc: 78.148 test acc: 58.990 19.101 s
[79] loss: 0.729, train acc: 78.210 test acc: 58.660 18.940 s
[80] loss: 0.728, train acc: 78.240 test acc: 58.870 18.424 s
[81] loss: 0.723, train acc: 78.442 test acc: 58.510 19.399 s
[82] loss: 0.718, train acc: 78.706 test acc: 58.610 18.937 s
[83] loss: 0.712, train acc: 78.724 test acc: 58.560 19.048 s
[84] loss: 0.705, train acc: 78.776 test acc: 58.810 18.905 s
[85] loss: 0.704, train acc: 78.982 test acc: 58.250 19.172 s
[86] loss: 0.698, train acc: 79.308 test acc: 58.380 19.347 s
[87] loss: 0.693, train acc: 79.318 test acc: 58.450 19.214 s
[88] loss: 0.686, train acc: 79.432 test acc: 59.050 19.092 s
[89] loss: 0.683, train acc: 79.574 test acc: 59.140 18.626 s
[90] loss: 0.679, train acc: 79.708 test acc: 58.440 19.234 s
[91] loss: 0.672, train acc: 79.968 test acc: 58.560 18.429 s
[92] loss: 0.669, train acc: 80.088 test acc: 58.820 18.924 s
[93] loss: 0.660, train acc: 80.174 test acc: 58.480 18.966 s
[94] loss: 0.664, train acc: 80.024 test acc: 58.970 18.989 s
[95] loss: 0.656, train acc: 80.338 test acc: 59.070 18.756 s
[96] loss: 0.654, train acc: 80.278 test acc: 59.270 19.369 s
[97] loss: 0.648, train acc: 80.548 test acc: 59.050 19.416 s
[98] loss: 0.641, train acc: 80.714 test acc: 59.120 18.987 s
[99] loss: 0.646, train acc: 80.624 test acc: 58.520 18.932 s
[100] loss: 0.638, train acc: 80.954 test acc: 59.050 19.094 s

[1] loss: 0.580, train acc: 82.956 test acc: 60.010 18.612 s
[2] loss: 0.557, train acc: 83.868 test acc: 59.950 18.785 s
[3] loss: 0.552, train acc: 83.906 test acc: 60.080 19.294 s
[4] loss: 0.546, train acc: 84.102 test acc: 60.190 19.067 s
[5] loss: 0.539, train acc: 84.412 test acc: 59.960 18.777 s
[6] loss: 0.539, train acc: 84.556 test acc: 60.070 18.761 s
[7] loss: 0.536, train acc: 84.534 test acc: 60.050 18.752 s
[8] loss: 0.530, train acc: 84.778 test acc: 59.820 18.836 s
[9] loss: 0.533, train acc: 84.568 test acc: 60.220 19.284 s
[10] loss: 0.528, train acc: 84.792 test acc: 59.970 18.962 s
[11] loss: 0.528, train acc: 84.710 test acc: 60.090 18.949 s
[12] loss: 0.527, train acc: 84.716 test acc: 60.050 18.657 s
[13] loss: 0.525, train acc: 84.716 test acc: 60.180 18.807 s
[14] loss: 0.521, train acc: 84.866 test acc: 59.980 18.586 s
[15] loss: 0.522, train acc: 84.864 test acc: 60.010 19.012 s
[16] loss: 0.517, train acc: 85.004 test acc: 59.850 19.005 s
[17] loss: 0.520, train acc: 84.860 test acc: 60.080 19.120 s
[18] loss: 0.511, train acc: 85.258 test acc: 60.210 18.975 s
[19] loss: 0.513, train acc: 85.128 test acc: 60.210 19.032 s
[20] loss: 0.507, train acc: 85.348 test acc: 59.940 18.446 s

[1] loss: 0.501, train acc: 85.592 test acc: 60.100 18.988 s
[2] loss: 0.490, train acc: 86.018 test acc: 60.070 18.917 s
[3] loss: 0.488, train acc: 85.992 test acc: 59.990 18.860 s
[4] loss: 0.493, train acc: 86.016 test acc: 59.870 18.987 s
[5] loss: 0.485, train acc: 86.248 test acc: 60.040 18.584 s
[6] loss: 0.487, train acc: 86.264 test acc: 60.130 18.601 s
[7] loss: 0.486, train acc: 86.110 test acc: 60.160 18.754 s
[8] loss: 0.486, train acc: 86.056 test acc: 60.070 18.997 s
[9] loss: 0.485, train acc: 86.114 test acc: 60.190 18.654 s
[10] loss: 0.484, train acc: 86.144 test acc: 60.130 18.356 s
[11] loss: 0.482, train acc: 86.410 test acc: 59.970 18.743 s
[12] loss: 0.484, train acc: 86.180 test acc: 60.030 19.216 s
[13] loss: 0.482, train acc: 86.230 test acc: 60.250 20.355 s
[14] loss: 0.483, train acc: 86.010 test acc: 60.300 19.104 s
[15] loss: 0.482, train acc: 86.146 test acc: 59.910 18.860 s
[16] loss: 0.484, train acc: 86.202 test acc: 60.070 18.826 s
[17] loss: 0.480, train acc: 86.304 test acc: 60.060 18.555 s
[18] loss: 0.482, train acc: 86.260 test acc: 60.280 19.010 s
[19] loss: 0.481, train acc: 86.156 test acc: 60.300 18.804 s
[20] loss: 0.479, train acc: 86.360 test acc: 60.310 18.998 s

[1] loss: 0.479, train acc: 86.142 test acc: 60.280 18.646 s
[2] loss: 0.476, train acc: 86.300 test acc: 60.320 18.658 s
[3] loss: 0.475, train acc: 86.410 test acc: 60.240 19.096 s
[4] loss: 0.475, train acc: 86.532 test acc: 60.260 18.890 s
[5] loss: 0.476, train acc: 86.228 test acc: 60.250 19.536 s
[6] loss: 0.473, train acc: 86.540 test acc: 60.290 18.323 s
[7] loss: 0.476, train acc: 86.352 test acc: 60.230 19.586 s
[8] loss: 0.473, train acc: 86.520 test acc: 60.230 19.256 s
[9] loss: 0.472, train acc: 86.624 test acc: 60.310 18.598 s
[10] loss: 0.475, train acc: 86.556 test acc: 60.350 18.936 s
[11] loss: 0.475, train acc: 86.476 test acc: 60.380 18.681 s
[12] loss: 0.471, train acc: 86.486 test acc: 60.340 20.621 s
[13] loss: 0.474, train acc: 86.558 test acc: 60.310 18.922 s
[14] loss: 0.470, train acc: 86.620 test acc: 60.290 19.109 s
[15] loss: 0.473, train acc: 86.634 test acc: 60.170 19.187 s
[16] loss: 0.474, train acc: 86.436 test acc: 60.270 18.899 s
[17] loss: 0.471, train acc: 86.656 test acc: 60.280 19.279 s
[18] loss: 0.474, train acc: 86.480 test acc: 60.150 19.134 s
[19] loss: 0.471, train acc: 86.580 test acc: 60.200 18.532 s
[20] loss: 0.473, train acc: 86.662 test acc: 60.170 18.995 s

[1] loss: 1.106, train acc: 76.134 test acc: 62.780 38.125 s
[2] loss: 0.874, train acc: 80.666 test acc: 63.290 39.722 s
[3] loss: 0.838, train acc: 80.908 test acc: 63.320 38.934 s
[4] loss: 0.819, train acc: 81.398 test acc: 63.560 38.463 s
[5] loss: 0.810, train acc: 81.292 test acc: 63.210 38.697 s
[6] loss: 0.803, train acc: 81.268 test acc: 63.530 38.476 s
[7] loss: 0.793, train acc: 81.176 test acc: 63.700 38.083 s
[8] loss: 0.790, train acc: 81.434 test acc: 63.320 38.817 s
[9] loss: 0.787, train acc: 81.242 test acc: 63.570 38.433 s
[10] loss: 0.782, train acc: 81.380 test acc: 63.710 38.234 s
[11] loss: 0.778, train acc: 81.572 test acc: 63.640 39.205 s
[12] loss: 0.773, train acc: 81.422 test acc: 63.700 38.101 s
[13] loss: 0.767, train acc: 81.550 test acc: 63.580 38.276 s
[14] loss: 0.762, train acc: 81.648 test acc: 63.680 38.218 s
[15] loss: 0.766, train acc: 81.220 test acc: 63.710 38.191 s
[16] loss: 0.759, train acc: 81.704 test acc: 63.640 37.920 s
[17] loss: 0.756, train acc: 81.480 test acc: 63.790 38.715 s
[18] loss: 0.758, train acc: 81.528 test acc: 63.760 38.157 s
[19] loss: 0.756, train acc: 81.654 test acc: 63.840 38.704 s
[20] loss: 0.756, train acc: 81.532 test acc: 63.800 38.097 s
[21] loss: 0.752, train acc: 81.542 test acc: 63.900 38.504 s
[22] loss: 0.746, train acc: 81.598 test acc: 63.830 38.281 s
[23] loss: 0.747, train acc: 81.616 test acc: 63.760 38.159 s

restarting with half the learning rate, zero optimizer state
[1] loss: 0.742, train acc: 81.706 test acc: 63.920 36.892 s
[2] loss: 0.743, train acc: 81.778 test acc: 63.970 36.748 s
[3] loss: 0.739, train acc: 81.960 test acc: 63.890 36.376 s
[4] loss: 0.737, train acc: 81.954 test acc: 63.770 35.944 s
[5] loss: 0.735, train acc: 81.996 test acc: 64.210 36.866 s
[6] loss: 0.734, train acc: 82.072 test acc: 63.930 36.578 s
[7] loss: 0.734, train acc: 81.916 test acc: 63.930 37.215 s
[8] loss: 0.729, train acc: 81.992 test acc: 63.880 36.817 s
[9] loss: 0.732, train acc: 82.108 test acc: 64.080 36.487 s
[10] loss: 0.728, train acc: 82.142 test acc: 64.070 36.806 s
[11] loss: 0.733, train acc: 81.934 test acc: 63.990 36.853 s

[1] loss: 0.781, train acc: 81.422 test acc: 63.790 37.518 s
[2] loss: 0.821, train acc: 80.904 test acc: 63.350 37.203 s
[3] loss: 0.841, train acc: 80.668 test acc: 63.400 37.730 s
[4] loss: 0.856, train acc: 80.196 test acc: 63.190 37.715 s
[5] loss: 0.866, train acc: 80.016 test acc: 63.070 37.500 s
[6] loss: 0.874, train acc: 79.680 test acc: 63.050 38.076 s
[7] loss: 0.881, train acc: 79.606 test acc: 63.030 37.768 s
[8] loss: 0.882, train acc: 79.624 test acc: 62.860 38.120 s
[9] loss: 0.884, train acc: 79.590 test acc: 62.980 37.331 s

[1] loss: 0.737, train acc: 81.764 test acc: 63.780 39.241 s
[2] loss: 0.706, train acc: 81.852 test acc: 64.160 38.618 s
[3] loss: 0.691, train acc: 82.076 test acc: 64.070 39.309 s
[4] loss: 0.686, train acc: 82.174 test acc: 64.260 38.344 s
[5] loss: 0.674, train acc: 82.528 test acc: 64.100 38.361 s
[6] loss: 0.673, train acc: 82.422 test acc: 64.480 38.350 s
[7] loss: 0.667, train acc: 82.700 test acc: 64.370 38.942 s
[8] loss: 0.665, train acc: 82.792 test acc: 64.400 38.189 s
[9] loss: 0.662, train acc: 82.726 test acc: 64.440 38.667 s
[10] loss: 0.660, train acc: 82.766 test acc: 64.370 39.073 s
[11] loss: 0.660, train acc: 82.808 test acc: 64.400 38.822 s
[12] loss: 0.653, train acc: 83.032 test acc: 64.430 38.702 s

[1] loss: 0.678, train acc: 82.660 test acc: 64.240 37.287 s
[2] loss: 0.688, train acc: 82.854 test acc: 64.310 37.077 s
[3] loss: 0.694, train acc: 82.710 test acc: 64.230 36.969 s
[4] loss: 0.701, train acc: 82.636 test acc: 64.210 36.958 s
[5] loss: 0.702, train acc: 82.640 test acc: 64.300 36.997 s
[6] loss: 0.704, train acc: 82.408 test acc: 64.180 37.049 s
[7] loss: 0.703, train acc: 82.806 test acc: 64.160 37.687 s
[8] loss: 0.710, train acc: 82.334 test acc: 63.980 37.277 s
[9] loss: 0.709, train acc: 82.544 test acc: 64.290 37.380 s
[10] loss: 0.706, train acc: 82.538 test acc: 64.070 37.523 s
[11] loss: 0.712, train acc: 82.400 test acc: 64.020 37.281 s
[12] loss: 0.708, train acc: 82.548 test acc: 63.950 36.890 s
[13] loss: 0.710, train acc: 82.606 test acc: 64.150 36.889 s
[14] loss: 0.709, train acc: 82.514 test acc: 64.210 38.943 s
[15] loss: 0.710, train acc: 82.704 test acc: 64.310 37.126 s
[16] loss: 0.710, train acc: 82.650 test acc: 64.090 36.937 s
[17] loss: 0.712, train acc: 82.526 test acc: 64.180 37.442 s
[18] loss: 0.710, train acc: 82.840 test acc: 64.070 37.089 s
[19] loss: 0.711, train acc: 82.582 test acc: 64.220 37.877 s
[20] loss: 0.710, train acc: 82.668 test acc: 64.150 37.814 s
[21] loss: 0.709, train acc: 82.544 test acc: 64.150 37.165 s
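
The log contains the message "restarting with half the learning rate, zero optimizer state". A minimal sketch of what such a restart might look like follows. The log does not say which framework or optimizer produced it, so `MomentumSGD` and `restart_with_half_lr` are hypothetical names, and momentum SGD is only an assumed stand-in for the real optimizer.

```python
from dataclasses import dataclass, field


@dataclass
class MomentumSGD:
    """Toy SGD-with-momentum optimizer (hypothetical; the log does not name the real one)."""
    lr: float
    momentum: float = 0.9
    velocity: dict = field(default_factory=dict)  # per-parameter optimizer state

    def step(self, params: dict, grads: dict) -> None:
        # Classic momentum update: v <- momentum * v - lr * g ; p <- p + v
        for name, g in grads.items():
            v = self.momentum * self.velocity.get(name, 0.0) - self.lr * g
            self.velocity[name] = v
            params[name] += v


def restart_with_half_lr(opt: MomentumSGD) -> MomentumSGD:
    """Return a fresh optimizer with half the learning rate and empty (zeroed) state."""
    return MomentumSGD(lr=opt.lr / 2, momentum=opt.momentum)


params = {"w": 1.0}
opt = MomentumSGD(lr=0.1)
opt.step(params, {"w": 2.0})     # one update, which builds up velocity for "w"
opt = restart_with_half_lr(opt)  # the weights in `params` are left untouched
```

Only the optimizer is replaced in this sketch; the model weights carry over, which would be consistent with the loss continuing from roughly where it left off after the restart message rather than jumping back to its initial value.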