Epoch... (1/1 | Step: 10 | Loss: 3.152461051940918 | Learning Rate: 2.988296364492271e-05 | Time per step: 15.587736415863038)
Epoch... (1/1 | Step: 20 | Loss: 2.984159469604492 | Learning Rate: 2.9752924092463218e-05 | Time per step: 7.810857331752777)
Epoch... (1/1 | Step: 30 | Loss: 2.966938018798828 | Learning Rate: 2.9622886358993128e-05 | Time per step: 5.218205340703329)
Epoch... (1/1 | Step: 40 | Loss: 2.7915446758270264 | Learning Rate: 2.9492846806533635e-05 | Time per step: 3.923228472471237)
Epoch... (1/1 | Step: 50 | Loss: 2.8086533546447754 | Learning Rate: 2.9362809073063545e-05 | Time per step: 3.1455979251861574)
Epoch... (1/1 | Step: 60 | Loss: 2.7164440155029297 | Learning Rate: 2.923276952060405e-05 | Time per step: 2.6309441407521565)
Epoch... (1/1 | Step: 70 | Loss: 2.620852470397949 | Learning Rate: 2.910272996814456e-05 | Time per step: 2.2604179041726247)
Epoch... (1/1 | Step: 80 | Loss: 2.6376171112060547 | Learning Rate: 2.897269223467447e-05 | Time per step: 1.9821367621421815)
Epoch... (1/1 | Step: 90 | Loss: 2.5884041786193848 | Learning Rate: 2.8842652682214975e-05 | Time per step: 1.7655608786476984)
Epoch... (1/1 | Step: 100 | Loss: 2.605238914489746 | Learning Rate: 2.8712613129755482e-05 | Time per step: 1.592444236278534)
Epoch... (1/1 | Step: 110 | Loss: 2.5857369899749756 | Learning Rate: 2.858257357729599e-05 | Time per step: 1.4507770885120739)
Epoch... (1/1 | Step: 120 | Loss: 2.462893009185791 | Learning Rate: 2.8452534024836496e-05 | Time per step: 1.3327178955078125)
Epoch... (1/1 | Step: 130 | Loss: 2.5480122566223145 | Learning Rate: 2.8322496291366406e-05 | Time per step: 1.2347213745117187)
Epoch... (1/1 | Step: 140 | Loss: 2.5385568141937256 | Learning Rate: 2.8192456738906913e-05 | Time per step: 1.1489436558314732)
Epoch... (1/1 | Step: 150 | Loss: 2.5392045974731445 | Learning Rate: 2.806241718644742e-05 | Time per step: 1.0746502176920574)
Epoch... (1/1 | Step: 160 | Loss: 2.5710103511810303 | Learning Rate: 2.793237945297733e-05 | Time per step: 1.0096801206469537)
Epoch... (1/1 | Step: 170 | Loss: 2.4399685859680176 | Learning Rate: 2.7802339900517836e-05 | Time per step: 0.9523633339825799)
Epoch... (1/1 | Step: 180 | Loss: 2.457218885421753 | Learning Rate: 2.7672302167047746e-05 | Time per step: 0.9013986693488227)
Epoch... (1/1 | Step: 190 | Loss: 2.5539286136627197 | Learning Rate: 2.7542262614588253e-05 | Time per step: 0.8558672904968262)
Epoch... (1/1 | Step: 200 | Loss: 2.4565377235412598 | Learning Rate: 2.741222306212876e-05 | Time per step: 0.8160686993598938)
Epoch... (1/1 | Step: 210 | Loss: 2.4070143699645996 | Learning Rate: 2.728218532865867e-05 | Time per step: 0.7789805684770856)
Epoch... (1/1 | Step: 220 | Loss: 2.3925132751464844 | Learning Rate: 2.7152143957209773e-05 | Time per step: 0.7452002211050553)
Epoch... (1/1 | Step: 230 | Loss: 2.375119209289551 | Learning Rate: 2.7022106223739684e-05 | Time per step: 0.714317157994146)
Epoch... (1/1 | Step: 240 | Loss: 2.4307475090026855 | Learning Rate: 2.689206667128019e-05 | Time per step: 0.6860215455293656)
Epoch... (1/1 | Step: 250 | Loss: 2.3495891094207764 | Learning Rate: 2.6762027118820697e-05 | Time per step: 0.6600858697891235)
Epoch... (1/1 | Step: 260 | Loss: 2.323152780532837 | Learning Rate: 2.6631989385350607e-05 | Time per step: 0.6362197930996235)
Epoch... (1/1 | Step: 270 | Loss: 2.3550381660461426 | Learning Rate: 2.6501949832891114e-05 | Time per step: 0.6149519796724673)
Epoch... (1/1 | Step: 280 | Loss: 2.418388843536377 | Learning Rate: 2.637191028043162e-05 | Time per step: 0.5942821894373213)
Epoch... (1/1 | Step: 290 | Loss: 2.304417133331299 | Learning Rate: 2.624187254696153e-05 | Time per step: 0.5750318412123056)
Epoch... (1/1 | Step: 300 | Loss: 2.292708396911621 | Learning Rate: 2.6111832994502038e-05 | Time per step: 0.5570862245559692)
Epoch... (1/1 | Step: 310 | Loss: 2.2911200523376465 | Learning Rate: 2.5981795261031948e-05 | Time per step: 0.5402729565097439)
Epoch... (1/1 | Step: 320 | Loss: 2.3142271041870117 | Learning Rate: 2.5851755708572455e-05 | Time per step: 0.5245530068874359)
Epoch... (1/1 | Step: 330 | Loss: 2.2752268314361572 | Learning Rate: 2.5721714337123558e-05 | Time per step: 0.5098378614945845)
Epoch... (1/1 | Step: 340 | Loss: 2.3485875129699707 | Learning Rate: 2.5591678422642872e-05 | Time per step: 0.4959293926463408)
Epoch... (1/1 | Step: 350 | Loss: 2.2506933212280273 | Learning Rate: 2.5461637051193975e-05 | Time per step: 0.4828984478541783)
Epoch... (1/1 | Step: 360 | Loss: 2.309922695159912 | Learning Rate: 2.5331599317723885e-05 | Time per step: 0.4705710119671292)
Epoch... (1/1 | Step: 370 | Loss: 2.345146656036377 | Learning Rate: 2.5201559765264392e-05 | Time per step: 0.45886457804087044)
Epoch... (1/1 | Step: 380 | Loss: 2.2607898712158203 | Learning Rate: 2.50715202128049e-05 | Time per step: 0.4477769217993084)
Epoch... (1/1 | Step: 390 | Loss: 2.2573559284210205 | Learning Rate: 2.494148247933481e-05 | Time per step: 0.437354511481065)
Epoch... (1/1 | Step: 400 | Loss: 2.306978464126587 | Learning Rate: 2.4811442926875316e-05 | Time per step: 0.42739131927490237)
Epoch... (1/1 | Step: 410 | Loss: 2.194396495819092 | Learning Rate: 2.4681403374415822e-05 | Time per step: 0.41849243408296166)
Epoch... (1/1 | Step: 420 | Loss: 2.246137857437134 | Learning Rate: 2.4551365640945733e-05 | Time per step: 0.40944125141416277)
Epoch... (1/1 | Step: 430 | Loss: 2.2333340644836426 | Learning Rate: 2.442132608848624e-05 | Time per step: 0.400786896639092)
Epoch... (1/1 | Step: 440 | Loss: 2.240902900695801 | Learning Rate: 2.4291286536026746e-05 | Time per step: 0.39255606640468943)
Epoch... (1/1 | Step: 450 | Loss: 2.208949565887451 | Learning Rate: 2.4161248802556656e-05 | Time per step: 0.3847881078720093)
Epoch... (1/1 | Step: 460 | Loss: 2.1334152221679688 | Learning Rate: 2.403120743110776e-05 | Time per step: 0.37723992026370506)
Epoch... (1/1 | Step: 470 | Loss: 2.126279354095459 | Learning Rate: 2.390116969763767e-05 | Time per step: 0.3700234945784224)
Epoch... (1/1 | Step: 480 | Loss: 2.2395260334014893 | Learning Rate: 2.3771130145178176e-05 | Time per step: 0.3634992683927218)
Epoch... (1/1 | Step: 490 | Loss: 2.204587936401367 | Learning Rate: 2.3641092411708087e-05 | Time per step: 0.3568650634921327)
Epoch... (1/1 | Step: 500 | Loss: 2.204573154449463 | Learning Rate: 2.3511052859248593e-05 | Time per step: 0.35050120973587034)
Epoch... (1/1 | Step: 500 | Eval Loss: 2.153402805328369 | Eval rouge1: 38.644711 | Eval rouge2: 13.35726 | Eval rougeL: 34.996203 | Eval rougeLsum: 35.002078 | Eval gen_len: 12.159915 |)
Epoch... (1/1 | Step: 510 | Loss: 2.2370595932006836 | Learning Rate: 2.33810133067891e-05 | Time per step: 0.34443876883562874)
Epoch... (1/1 | Step: 520 | Loss: 2.1577632427215576 | Learning Rate: 2.325097557331901e-05 | Time per step: 0.3385900424076961)
Epoch... (1/1 | Step: 530 | Loss: 2.2478108406066895 | Learning Rate: 2.3120936020859517e-05 | Time per step: 0.3329278419602592)
Epoch... (1/1 | Step: 540 | Loss: 2.237516164779663 | Learning Rate: 2.2990896468400024e-05 | Time per step: 0.32740558297545824)
Epoch... (1/1 | Step: 550 | Loss: 2.1444873809814453 | Learning Rate: 2.286085691594053e-05 | Time per step: 0.32209538373080165)
Epoch... (1/1 | Step: 560 | Loss: 2.1897761821746826 | Learning Rate: 2.273081918247044e-05 | Time per step: 0.3174173729760306)
Epoch... (1/1 | Step: 570 | Loss: 2.184631824493408 | Learning Rate: 2.2600779630010948e-05 | Time per step: 0.31251981342047974)
Epoch... (1/1 | Step: 580 | Loss: 2.1819958686828613 | Learning Rate: 2.2470741896540858e-05 | Time per step: 0.3078234393021156)
Epoch... (1/1 | Step: 590 | Loss: 2.202099323272705 | Learning Rate: 2.234070052509196e-05 | Time per step: 0.3032167919611527)
Epoch... (1/1 | Step: 600 | Loss: 2.2110190391540527 | Learning Rate: 2.221066279162187e-05 | Time per step: 0.29876339276631675)
Epoch... (1/1 | Step: 610 | Loss: 2.148010730743408 | Learning Rate: 2.2080623239162378e-05 | Time per step: 0.29445704514863064)
Epoch... (1/1 | Step: 620 | Loss: 2.1506552696228027 | Learning Rate: 2.1950585505692288e-05 | Time per step: 0.29026989975283224)
Epoch... (1/1 | Step: 630 | Loss: 2.2254538536071777 | Learning Rate: 2.1820545953232795e-05 | Time per step: 0.28622367003607374)
Epoch... (1/1 | Step: 640 | Loss: 2.130129337310791 | Learning Rate: 2.1690506400773302e-05 | Time per step: 0.28234911486506464)
Epoch... (1/1 | Step: 650 | Loss: 2.234767436981201 | Learning Rate: 2.1560468667303212e-05 | Time per step: 0.2786063777483427)
Epoch... (1/1 | Step: 660 | Loss: 2.1312508583068848 | Learning Rate: 2.1430427295854315e-05 | Time per step: 0.27493493593100343)
Epoch... (1/1 | Step: 670 | Loss: 2.1717793941497803 | Learning Rate: 2.1300389562384225e-05 | Time per step: 0.27138006651579444)
Epoch... (1/1 | Step: 680 | Loss: 2.199887275695801 | Learning Rate: 2.1170350009924732e-05 | Time per step: 0.2679280186400694)
Epoch... (1/1 | Step: 690 | Loss: 2.2302513122558594 | Learning Rate: 2.1040312276454642e-05 | Time per step: 0.26458196294480474)
Epoch... (1/1 | Step: 700 | Loss: 2.2443227767944336 | Learning Rate: 2.091027272399515e-05 | Time per step: 0.2613670206069946)
Epoch... (1/1 | Step: 710 | Loss: 2.1599698066711426 | Learning Rate: 2.0780233171535656e-05 | Time per step: 0.2582725964801412)
Epoch... (1/1 | Step: 720 | Loss: 2.0422964096069336 | Learning Rate: 2.0650193619076163e-05 | Time per step: 0.2552153812514411)
Epoch... (1/1 | Step: 730 | Loss: 2.2652714252471924 | Learning Rate: 2.0520155885606073e-05 | Time per step: 0.2522427163711966)
Epoch... (1/1 | Step: 740 | Loss: 2.116323471069336 | Learning Rate: 2.039011633314658e-05 | Time per step: 0.24935495370143168)
Epoch... (1/1 | Step: 750 | Loss: 2.1595919132232666 | Learning Rate: 2.026007859967649e-05 | Time per step: 0.2465395975112915)
Epoch... (1/1 | Step: 760 | Loss: 2.142918586730957 | Learning Rate: 2.0130039047216997e-05 | Time per step: 0.24380498340255335)
Epoch... (1/1 | Step: 770 | Loss: 2.156254291534424 | Learning Rate: 1.99999976757681e-05 | Time per step: 0.2411581157089828)
Epoch... (1/1 | Step: 780 | Loss: 2.2371087074279785 | Learning Rate: 1.9869961761287414e-05 | Time per step: 0.23857017755508422)
Epoch... (1/1 | Step: 790 | Loss: 2.2326855659484863 | Learning Rate: 1.9739920389838517e-05 | Time per step: 0.2360646033588844)
Epoch... (1/1 | Step: 800 | Loss: 2.1136574745178223 | Learning Rate: 1.9609882656368427e-05 | Time per step: 0.23361746579408646)
Epoch... (1/1 | Step: 810 | Loss: 2.058081865310669 | Learning Rate: 1.9479843103908934e-05 | Time per step: 0.23120082837563974)
Epoch... (1/1 | Step: 820 | Loss: 2.159789562225342 | Learning Rate: 1.9349805370438844e-05 | Time per step: 0.22883684140879934)
Epoch... (1/1 | Step: 830 | Loss: 2.176499843597412 | Learning Rate: 1.921976581797935e-05 | Time per step: 0.226584091531225)
Epoch... (1/1 | Step: 840 | Loss: 2.124746799468994 | Learning Rate: 1.9089726265519857e-05 | Time per step: 0.22435513450985864)
Epoch... (1/1 | Step: 850 | Loss: 2.1009159088134766 | Learning Rate: 1.8959686713060364e-05 | Time per step: 0.22247487545013428)
Epoch... (1/1 | Step: 860 | Loss: 2.0962576866149902 | Learning Rate: 1.8829648979590274e-05 | Time per step: 0.22036200872687406)
Epoch... (1/1 | Step: 870 | Loss: 2.165346145629883 | Learning Rate: 1.869960942713078e-05 | Time per step: 0.21827287427310285)
Epoch... (1/1 | Step: 880 | Loss: 2.1078338623046875 | Learning Rate: 1.8569569874671288e-05 | Time per step: 0.2162384805354205)
Epoch... (1/1 | Step: 890 | Loss: 2.137667179107666 | Learning Rate: 1.8439532141201198e-05 | Time per step: 0.21422737769866257)
Epoch... (1/1 | Step: 900 | Loss: 2.091475486755371 | Learning Rate: 1.83094907697523e-05 | Time per step: 0.21231675068537395)
Epoch... (1/1 | Step: 910 | Loss: 2.1219921112060547 | Learning Rate: 1.8179454855271615e-05 | Time per step: 0.2104336788366129)
Epoch... (1/1 | Step: 920 | Loss: 2.1686043739318848 | Learning Rate: 1.8049413483822718e-05 | Time per step: 0.20856941109118254)
Epoch... (1/1 | Step: 930 | Loss: 2.042168140411377 | Learning Rate: 1.791937575035263e-05 | Time per step: 0.2067646885430941)
Epoch... (1/1 | Step: 940 | Loss: 2.076967239379883 | Learning Rate: 1.7789336197893135e-05 | Time per step: 0.2049876999347768)
Epoch... (1/1 | Step: 950 | Loss: 2.0758371353149414 | Learning Rate: 1.7659296645433642e-05 | Time per step: 0.203275879558764)
Epoch... (1/1 | Step: 960 | Loss: 2.1152734756469727 | Learning Rate: 1.7529258911963552e-05 | Time per step: 0.20160344789425533)
Epoch... (1/1 | Step: 970 | Loss: 2.1582562923431396 | Learning Rate: 1.739921935950406e-05 | Time per step: 0.19997054645695636)
Epoch... (1/1 | Step: 980 | Loss: 2.1486573219299316 | Learning Rate: 1.7269179807044566e-05 | Time per step: 0.19854059097718219)
Epoch... (1/1 | Step: 990 | Loss: 2.1983158588409424 | Learning Rate: 1.7139142073574476e-05 | Time per step: 0.19692125440847993)
Epoch... (1/1 | Step: 1000 | Loss: 2.1434619426727295 | Learning Rate: 1.7009102521114983e-05 | Time per step: 0.195358873128891)
Epoch... (1/1 | Step: 1000 | Eval Loss: 2.071117639541626 | Eval rouge1: 39.652538 | Eval rouge2: 14.177532 | Eval rougeL: 35.850527 | Eval rougeLsum: 35.848975 | Eval gen_len: 12.273961 |)
Epoch... (1/1 | Step: 1010 | Loss: 2.1113176345825195 | Learning Rate: 1.687906296865549e-05 | Time per step: 0.19387355157644443)
Epoch... (1/1 | Step: 1020 | Loss: 2.125403881072998 | Learning Rate: 1.67490252351854e-05 | Time per step: 0.19240601436764587)
Epoch... (1/1 | Step: 1030 | Loss: 2.109292984008789 | Learning Rate: 1.6618983863736503e-05 | Time per step: 0.1909887906417106)
Epoch... (1/1 | Step: 1040 | Loss: 2.0393459796905518 | Learning Rate: 1.6488947949255817e-05 | Time per step: 0.18951574976627644)
Epoch... (1/1 | Step: 1050 | Loss: 2.0851333141326904 | Learning Rate: 1.635890657780692e-05 | Time per step: 0.18808958825610933)
Epoch... (1/1 | Step: 1060 | Loss: 2.032742738723755 | Learning Rate: 1.622886884433683e-05 | Time per step: 0.1866728407032085)
Epoch... (1/1 | Step: 1070 | Loss: 2.090559959411621 | Learning Rate: 1.6098829291877337e-05 | Time per step: 0.185282122754605)
Epoch... (1/1 | Step: 1080 | Loss: 2.0871057510375977 | Learning Rate: 1.5968789739417844e-05 | Time per step: 0.18392574345624005)
Epoch... (1/1 | Step: 1090 | Loss: 2.0560994148254395 | Learning Rate: 1.5838752005947754e-05 | Time per step: 0.18262962201319705)
Epoch... (1/1 | Step: 1100 | Loss: 2.1433119773864746 | Learning Rate: 1.570871245348826e-05 | Time per step: 0.18134411139921708)
Epoch... (1/1 | Step: 1110 | Loss: 2.1094539165496826 | Learning Rate: 1.5578672901028767e-05 | Time per step: 0.18005623065673554)
Epoch... (1/1 | Step: 1120 | Loss: 2.0397915840148926 | Learning Rate: 1.5448633348569274e-05 | Time per step: 0.17881171469177518)
Epoch... (1/1 | Step: 1130 | Loss: 2.0928614139556885 | Learning Rate: 1.5318595615099184e-05 | Time per step: 0.1775794653765923)
Epoch... (1/1 | Step: 1140 | Loss: 2.071333408355713 | Learning Rate: 1.518855515314499e-05 | Time per step: 0.17638430909106606)
Epoch... (1/1 | Step: 1150 | Loss: 2.141636371612549 | Learning Rate: 1.5058518329169601e-05 | Time per step: 0.17520383834838868)
Epoch... (1/1 | Step: 1160 | Loss: 1.9951622486114502 | Learning Rate: 1.4928477867215406e-05 | Time per step: 0.17405129814970083)
Epoch... (1/1 | Step: 1170 | Loss: 2.0520851612091064 | Learning Rate: 1.4798439224250615e-05 | Time per step: 0.17289256825406327)
Epoch... (1/1 | Step: 1180 | Loss: 2.053399085998535 | Learning Rate: 1.4668399671791121e-05 | Time per step: 0.17176148750014225)
Epoch... (1/1 | Step: 1190 | Loss: 2.0979127883911133 | Learning Rate: 1.453836102882633e-05 | Time per step: 0.17067777028604716)
Epoch... (1/1 | Step: 1200 | Loss: 2.037087917327881 | Learning Rate: 1.4408322385861538e-05 | Time per step: 0.1697588900725047)
Epoch... (1/1 | Step: 1210 | Loss: 2.225379467010498 | Learning Rate: 1.4278283742896747e-05 | Time per step: 0.16866758992849304)
Epoch... (1/1 | Step: 1220 | Loss: 2.072798728942871 | Learning Rate: 1.4148244190437254e-05 | Time per step: 0.16764525331434657)
Epoch... (1/1 | Step: 1230 | Loss: 2.0676560401916504 | Learning Rate: 1.4018205547472462e-05 | Time per step: 0.1666269135668995)
Epoch... (1/1 | Step: 1240 | Loss: 2.0550100803375244 | Learning Rate: 1.388816690450767e-05 | Time per step: 0.16559767165491657)
Epoch... (1/1 | Step: 1250 | Loss: 2.090104579925537 | Learning Rate: 1.3758128261542879e-05 | Time per step: 0.16458576316833495)
Epoch... (1/1 | Step: 1260 | Loss: 2.037855863571167 | Learning Rate: 1.3628086890093982e-05 | Time per step: 0.16380649502315217)
Epoch... (1/1 | Step: 1270 | Loss: 2.069274663925171 | Learning Rate: 1.349804824712919e-05 | Time per step: 0.16282440039116566)
Epoch... (1/1 | Step: 1280 | Loss: 2.139634370803833 | Learning Rate: 1.33680096041644e-05 | Time per step: 0.16187303308397533)
Epoch... (1/1 | Step: 1290 | Loss: 2.0635054111480713 | Learning Rate: 1.3237970961199608e-05 | Time per step: 0.16094879715941673)
Epoch... (1/1 | Step: 1300 | Loss: 2.115933418273926 | Learning Rate: 1.3107932318234816e-05 | Time per step: 0.16003306462214542)
Epoch... (1/1 | Step: 1310 | Loss: 2.1170766353607178 | Learning Rate: 1.2977892765775323e-05 | Time per step: 0.15911450968443894)
Epoch... (1/1 | Step: 1320 | Loss: 2.124302864074707 | Learning Rate: 1.2847854122810531e-05 | Time per step: 0.15840172821825202)
Epoch... (1/1 | Step: 1330 | Loss: 2.0505247116088867 | Learning Rate: 1.271781547984574e-05 | Time per step: 0.15752380396190444)
Epoch... (1/1 | Step: 1340 | Loss: 2.0901174545288086 | Learning Rate: 1.2587776836880948e-05 | Time per step: 0.15665231978715355)
Epoch... (1/1 | Step: 1350 | Loss: 2.0725221633911133 | Learning Rate: 1.2457737284421455e-05 | Time per step: 0.15581043596620914)
Epoch... (1/1 | Step: 1360 | Loss: 2.173769950866699 | Learning Rate: 1.2327698641456664e-05 | Time per step: 0.15497289966134464)
Epoch... (1/1 | Step: 1370 | Loss: 1.9631630182266235 | Learning Rate: 1.2197659998491872e-05 | Time per step: 0.15411251478821691)
Epoch... (1/1 | Step: 1380 | Loss: 2.0254154205322266 | Learning Rate: 1.2067619536537677e-05 | Time per step: 0.15345382293065388)
Epoch... (1/1 | Step: 1390 | Loss: 2.0926127433776855 | Learning Rate: 1.1937579984078184e-05 | Time per step: 0.15263587742400683)
Epoch... (1/1 | Step: 1400 | Loss: 2.1013355255126953 | Learning Rate: 1.1807541341113392e-05 | Time per step: 0.15183171289307731)
Epoch... (1/1 | Step: 1410 | Loss: 2.127017021179199 | Learning Rate: 1.16775026981486e-05 | Time per step: 0.15108886035621588)
Epoch... (1/1 | Step: 1420 | Loss: 2.107186794281006 | Learning Rate: 1.154746405518381e-05 | Time per step: 0.1503054445898029)
Epoch... (1/1 | Step: 1430 | Loss: 2.0433754920959473 | Learning Rate: 1.1417425412219018e-05 | Time per step: 0.14954344509364842)
Epoch... (1/1 | Step: 1440 | Loss: 2.196624279022217 | Learning Rate: 1.1287385859759524e-05 | Time per step: 0.1489259401957194)
Epoch... (1/1 | Step: 1450 | Loss: 1.997576117515564 | Learning Rate: 1.1157347216794733e-05 | Time per step: 0.14815782185258536)
Epoch... (1/1 | Step: 1460 | Loss: 2.0687851905822754 | Learning Rate: 1.1027308573829941e-05 | Time per step: 0.14740191845044698)
Epoch... (1/1 | Step: 1470 | Loss: 2.0500588417053223 | Learning Rate: 1.089726993086515e-05 | Time per step: 0.14666985038186417)
Epoch... (1/1 | Step: 1480 | Loss: 2.0253396034240723 | Learning Rate: 1.0767230378405657e-05 | Time per step: 0.14596247012550767)
Epoch... (1/1 | Step: 1490 | Loss: 2.109578847885132 | Learning Rate: 1.0637191735440865e-05 | Time per step: 0.14524603322048316)
Epoch... (1/1 | Step: 1500 | Loss: 1.9973983764648438 | Learning Rate: 1.050715127348667e-05 | Time per step: 0.14455435609817505)
Epoch... (1/1 | Step: 1500 | Eval Loss: 2.0347542762756348 | Eval rouge1: 39.720462 | Eval rouge2: 14.404286 | Eval rougeL: 35.987809 | Eval rougeLsum: 35.987896 | Eval gen_len: 12.283868 |)
Epoch... (1/1 | Step: 1510 | Loss: 2.17781400680542 | Learning Rate: 1.0377112630521879e-05 | Time per step: 0.14389628906123686)
Epoch... (1/1 | Step: 1520 | Loss: 2.156275987625122 | Learning Rate: 1.0247073078062385e-05 | Time per step: 0.1432325072978672)
Epoch... (1/1 | Step: 1530 | Loss: 1.998096227645874 | Learning Rate: 1.0117034435097594e-05 | Time per step: 0.1425560083264619)
Epoch... (1/1 | Step: 1540 | Loss: 1.9965381622314453 | Learning Rate: 9.986995792132802e-06 | Time per step: 0.14192123614348373)
Epoch... (1/1 | Step: 1550 | Loss: 2.104810953140259 | Learning Rate: 9.85695714916801e-06 | Time per step: 0.14126326391773839)
Epoch... (1/1 | Step: 1560 | Loss: 2.025113344192505 | Learning Rate: 9.72691850620322e-06 | Time per step: 0.1406068211946732)
Epoch... (1/1 | Step: 1570 | Loss: 2.1913187503814697 | Learning Rate: 9.596878953743726e-06 | Time per step: 0.1399633161581246)
Epoch... (1/1 | Step: 1580 | Loss: 2.0058059692382812 | Learning Rate: 9.466840310778935e-06 | Time per step: 0.13934056834329533)
Epoch... (1/1 | Step: 1590 | Loss: 2.0531082153320312 | Learning Rate: 9.336801667814143e-06 | Time per step: 0.1387292352112584)
Epoch... (1/1 | Step: 1600 | Loss: 2.1364736557006836 | Learning Rate: 9.206763024849351e-06 | Time per step: 0.1381152728199959)
Epoch... (1/1 | Step: 1610 | Loss: 2.1336679458618164 | Learning Rate: 9.076721653400455e-06 | Time per step: 0.1375143773807502)
Epoch... (1/1 | Step: 1620 | Loss: 2.076136589050293 | Learning Rate: 8.946683010435663e-06 | Time per step: 0.13704783474957502)
Epoch... (1/1 | Step: 1630 | Loss: 2.0940370559692383 | Learning Rate: 8.816644367470872e-06 | Time per step: 0.13645588005978637)
Epoch... (1/1 | Step: 1640 | Loss: 2.052903652191162 | Learning Rate: 8.68660572450608e-06 | Time per step: 0.1358662321800139)
Epoch... (1/1 | Step: 1650 | Loss: 2.0199592113494873 | Learning Rate: 8.556566172046587e-06 | Time per step: 0.13528442931897713)
Epoch... (1/1 | Step: 1660 | Loss: 2.0727717876434326 | Learning Rate: 8.426527529081795e-06 | Time per step: 0.1346998906997313)
Epoch... (1/1 | Step: 1670 | Loss: 2.0786218643188477 | Learning Rate: 8.296488886117004e-06 | Time per step: 0.13415220112143877)
Epoch... (1/1 | Step: 1680 | Loss: 1.9562125205993652 | Learning Rate: 8.166450243152212e-06 | Time per step: 0.13373985659508478)
Epoch... (1/1 | Step: 1690 | Loss: 2.0937490463256836 | Learning Rate: 8.03641160018742e-06 | Time per step: 0.1331933374235616)
Epoch... (1/1 | Step: 1700 | Loss: 2.078627586364746 | Learning Rate: 7.906372047727928e-06 | Time per step: 0.13263628118178425)
Epoch... (1/1 | Step: 1710 | Loss: 2.112215518951416 | Learning Rate: 7.776333404763136e-06 | Time per step: 0.13209281176851506)
Epoch... (1/1 | Step: 1720 | Loss: 1.9597207307815552 | Learning Rate: 7.646294761798345e-06 | Time per step: 0.13154547491738963)
Epoch... (1/1 | Step: 1730 | Loss: 2.0493929386138916 | Learning Rate: 7.516253845096799e-06 | Time per step: 0.131024552356301)
Epoch... (1/1 | Step: 1740 | Loss: 2.0585577487945557 | Learning Rate: 7.386215202132007e-06 | Time per step: 0.1304960524898836)
Epoch... (1/1 | Step: 1750 | Loss: 1.970508098602295 | Learning Rate: 7.256176104419865e-06 | Time per step: 0.12997342341286797)
Epoch... (1/1 | Step: 1760 | Loss: 2.0194671154022217 | Learning Rate: 7.126137461455073e-06 | Time per step: 0.12945761531591415)
Epoch... (1/1 | Step: 1770 | Loss: 2.0107901096343994 | Learning Rate: 6.996098363742931e-06 | Time per step: 0.1289362097864097)
Epoch... (1/1 | Step: 1780 | Loss: 2.120974063873291 | Learning Rate: 6.866059720778139e-06 | Time per step: 0.12843746643387868)
Epoch... (1/1 | Step: 1790 | Loss: 2.1194677352905273 | Learning Rate: 6.736021077813348e-06 | Time per step: 0.12795682739279124)
Epoch... (1/1 | Step: 1800 | Loss: 2.0729482173919678 | Learning Rate: 6.605981980101205e-06 | Time per step: 0.1274824545118544)
Epoch... (1/1 | Step: 1810 | Loss: 2.0226759910583496 | Learning Rate: 6.475943337136414e-06 | Time per step: 0.12700373612714735)
Epoch... (1/1 | Step: 1820 | Loss: 2.0213565826416016 | Learning Rate: 6.3459042394242715e-06 | Time per step: 0.12653065959175863)
Epoch... (1/1 | Step: 1830 | Loss: 2.0867295265197754 | Learning Rate: 6.21586559645948e-06 | Time per step: 0.1260442264744493)
Epoch... (1/1 | Step: 1840 | Loss: 2.053117275238037 | Learning Rate: 6.085824679757934e-06 | Time per step: 0.12557031317897466)
Epoch... (1/1 | Step: 1850 | Loss: 2.0554451942443848 | Learning Rate: 5.9557860367931426e-06 | Time per step: 0.12511137382404225)
Epoch... (1/1 | Step: 1860 | Loss: 2.066519260406494 | Learning Rate: 5.825746939081e-06 | Time per step: 0.12466220983894923)
Epoch... (1/1 | Step: 1870 | Loss: 1.9540824890136719 | Learning Rate: 5.695708296116209e-06 | Time per step: 0.12421020441514286)
Epoch... (1/1 | Step: 1880 | Loss: 2.027324676513672 | Learning Rate: 5.565669198404066e-06 | Time per step: 0.12376356746288056)
Epoch... (1/1 | Step: 1890 | Loss: 2.067963123321533 | Learning Rate: 5.435630555439275e-06 | Time per step: 0.12333493194882832)
Epoch... (1/1 | Step: 1900 | Loss: 2.079224109649658 | Learning Rate: 5.305591912474483e-06 | Time per step: 0.12303855419158935)
Epoch... (1/1 | Step: 1910 | Loss: 1.979988932609558 | Learning Rate: 5.175552814762341e-06 | Time per step: 0.12260543825738718)
Epoch... (1/1 | Step: 1920 | Loss: 2.013458013534546 | Learning Rate: 5.045514171797549e-06 | Time per step: 0.12217357369760672)
Epoch... (1/1 | Step: 1930 | Loss: 1.9864708185195923 | Learning Rate: 4.915475074085407e-06 | Time per step: 0.12175397489972682)
Epoch... (1/1 | Step: 1940 | Loss: 2.100825548171997 | Learning Rate: 4.7854364311206155e-06 | Time per step: 0.12133047445533202)
Epoch... (1/1 | Step: 1950 | Loss: 2.047811508178711 | Learning Rate: 4.655397333408473e-06 | Time per step: 0.12102636288373898)
Epoch... (1/1 | Step: 1960 | Loss: 2.0417799949645996 | Learning Rate: 4.525356871454278e-06 | Time per step: 0.12062084309908809)
Epoch... (1/1 | Step: 1970 | Loss: 2.0019946098327637 | Learning Rate: 4.395317773742136e-06 | Time per step: 0.12022043254774839)
Epoch... (1/1 | Step: 1980 | Loss: 1.9874119758605957 | Learning Rate: 4.265279130777344e-06 | Time per step: 0.11983201431505608)
Epoch... (1/1 | Step: 1990 | Loss: 2.0997118949890137 | Learning Rate: 4.135240033065202e-06 | Time per step: 0.11945154547092303)
Epoch... (1/1 | Step: 2000 | Loss: 2.0525479316711426 | Learning Rate: 4.00520139010041e-06 | Time per step: 0.11918327701091766)
Epoch... (1/1 | Step: 2000 | Eval Loss: 2.0153088569641113 | Eval rouge1: 40.086076 | Eval rouge2: 14.637774 | Eval rougeL: 36.357252 | Eval rougeLsum: 36.360759 | Eval gen_len: 12.234496 |)
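
Each entry above follows the same `Epoch... (epoch/total | key: value | ...)` layout, so the log can be turned into records mechanically. The following is a minimal sketch, not part of the original run: the names `ENTRY_RE` and `parse_log` are hypothetical, and it assumes every value after a colon is a plain number, as it is throughout this log.

```python
import re

# Matches one entry such as
# "Epoch... (1/1 | Step: 10 | Loss: 3.15 | Learning Rate: 2.9e-05 | Time per step: 15.5)".
# The optional "|" before ")" covers the evaluation entries, which end in " |)".
ENTRY_RE = re.compile(r"Epoch\.\.\. \((\d+)/(\d+) \| (.*?)\s*\|?\)")


def parse_log(text):
    """Return one dict per log entry found in `text` (hypothetical helper)."""
    records = []
    for match in ENTRY_RE.finditer(text):
        record = {"epoch": int(match.group(1)), "num_epochs": int(match.group(2))}
        for field in match.group(3).split("|"):
            field = field.strip()
            if not field:
                continue
            key, _, value = field.partition(":")
            key = key.strip()
            # "Step" is an integer counter; every other field is a float.
            record[key] = int(value) if key == "Step" else float(value)
        records.append(record)
    return records


if __name__ == "__main__":
    sample = (
        "Epoch... (1/1 | Step: 500 | Loss: 2.204573154449463 | "
        "Learning Rate: 2.3511052859248593e-05 | Time per step: 0.35050120973587034) "
        "Epoch... (1/1 | Step: 500 | Eval Loss: 2.153402805328369 | "
        "Eval rouge1: 38.644711 | Eval rouge2: 13.35726 | Eval rougeL: 34.996203 | "
        "Eval rougeLsum: 35.002078 | Eval gen_len: 12.159915 |)"
    )
    for row in parse_log(sample):
        print(row)
```

Training and evaluation entries can then be separated by checking whether a record contains the `Eval Loss` key, which works whether the entries are one per line or run together.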