content_planner / log.txt
Training:At training steps 100, training MLE loss is 2.1784725362062454, train CRF loss is 12.934807199835777
Training:At training steps 200, training MLE loss is 2.0982215950638055, train CRF loss is 12.300238718390466
Training:At training steps 300, training MLE loss is 2.0392028558254243, train CRF loss is 11.546043652296067
Training:At training steps 400, training MLE loss is 1.9793093436770142, train CRF loss is 10.874783922359347
Training:At training steps 500, training MLE loss is 1.926870346069336, train CRF loss is 10.30856076514721
Validation:At training steps 500, training MLE loss is 1.926870346069336, train CRF loss is 10.30856076514721, validation MLE loss is 1.7011445776412362, validation ppl is 5.48, validation CRF loss is 7.628421112110741, validation BLEU is 2.17
Training:At training steps 600, training MLE loss is 1.6453405766189098, train CRF loss is 7.60670139580965
Training:At training steps 700, training MLE loss is 1.63284099817276, train CRF loss is 7.421994224190712
Training:At training steps 800, training MLE loss is 1.6440962411959965, train CRF loss is 7.273698962877194
Training:At training steps 900, training MLE loss is 1.6543213388882578, train CRF loss is 7.1638311782479285
Training:At training steps 1000, training MLE loss is 1.662141635477543, train CRF loss is 7.0699912850856785
Validation:At training steps 1000, training MLE loss is 1.662141635477543, train CRF loss is 7.0699912850856785, validation MLE loss is 1.6142848832042593, validation ppl is 5.024, validation CRF loss is 6.718102128882157, validation BLEU is 13.77
Training:At training steps 1100, training MLE loss is 1.6860213121399283, train CRF loss is 6.558073410093784
Training:At training steps 1200, training MLE loss is 1.6879597918875515, train CRF loss is 6.476125329062342
Training:At training steps 1300, training MLE loss is 1.6901934181898832, train CRF loss is 6.400528302540382
Training:At training steps 1400, training MLE loss is 1.6874116013851017, train CRF loss is 6.323282389268279
Training:At training steps 1500, training MLE loss is 1.685257480673492, train CRF loss is 6.2425215153992175
Validation:At training steps 1500, training MLE loss is 1.685257480673492, train CRF loss is 6.2425215153992175, validation MLE loss is 1.6475643758711063, validation ppl is 5.194, validation CRF loss is 6.161781003600673, validation BLEU is 36.95
Training:At training steps 1600, training MLE loss is 1.649856363348663, train CRF loss is 5.814491910338401
Training:At training steps 1700, training MLE loss is 1.6471128077618777, train CRF loss is 5.770473006144166
Training:At training steps 1800, training MLE loss is 1.6595591168105601, train CRF loss is 5.715045610864957
Training:At training steps 1900, training MLE loss is 1.6628169665671886, train CRF loss is 5.660444211177528
Training:At training steps 2000, training MLE loss is 1.66742782895267, train CRF loss is 5.606660210967064
Validation:At training steps 2000, training MLE loss is 1.66742782895267, train CRF loss is 5.606660210967064, validation MLE loss is 1.764728539868405, validation ppl is 5.84, validation CRF loss is 5.423004655461562, validation BLEU is 41.7
Training:At training steps 2100, training MLE loss is 1.717916771657765, train CRF loss is 5.346369043886662
Training:At training steps 2200, training MLE loss is 1.6950387885607778, train CRF loss is 5.31025716394186
Training:At training steps 2300, training MLE loss is 1.6802360815927386, train CRF loss is 5.256107118974129
Training:At training steps 2400, training MLE loss is 1.6720353521779179, train CRF loss is 5.206417094878852
Training:At training steps 2500, training MLE loss is 1.671622773014009, train CRF loss is 5.165145996421575
Validation:At training steps 2500, training MLE loss is 1.671622773014009, train CRF loss is 5.165145996421575, validation MLE loss is 1.9742059707641602, validation ppl is 7.201, validation CRF loss is 5.105836190675435, validation BLEU is 43.74
Training:At training steps 2600, training MLE loss is 1.6571612641215325, train CRF loss is 4.902831239402294
Training:At training steps 2700, training MLE loss is 1.6445588554814459, train CRF loss is 4.872643101140857
Training:At training steps 2800, training MLE loss is 1.6433387485146522, train CRF loss is 4.834612847864628
Training:At training steps 2900, training MLE loss is 1.6397519523371011, train CRF loss is 4.795379407741129
Training:At training steps 3000, training MLE loss is 1.6356980685442686, train CRF loss is 4.762988601058722
Validation:At training steps 3000, training MLE loss is 1.6356980685442686, train CRF loss is 4.762988601058722, validation MLE loss is 1.6534317606373836, validation ppl is 5.225, validation CRF loss is 4.8574807079214795, validation BLEU is 43.58
Training:At training steps 3100, training MLE loss is 1.6360853765904904, train CRF loss is 4.485567877143621
Training:At training steps 3200, training MLE loss is 1.6284693580120801, train CRF loss is 4.457970159947872
Training:At training steps 3300, training MLE loss is 1.6227223824088772, train CRF loss is 4.441106905713678
Training:At training steps 3400, training MLE loss is 1.6225302474945784, train CRF loss is 4.403005280159414
Training:At training steps 3500, training MLE loss is 1.6200870532393457, train CRF loss is 4.353320182204246
Validation:At training steps 3500, training MLE loss is 1.6200870532393457, train CRF loss is 4.353320182204246, validation MLE loss is 1.8913114964962006, validation ppl is 6.628, validation CRF loss is 4.564021948136781, validation BLEU is 45.8
Training:At training steps 3600, training MLE loss is 1.6040248465910554, train CRF loss is 4.044802532866597
Training:At training steps 3700, training MLE loss is 1.6018272586539388, train CRF loss is 4.0397713960707184
Training:At training steps 3800, training MLE loss is 1.5963242715224624, train CRF loss is 4.0197391292452815
Training:At training steps 3900, training MLE loss is 1.5921865471638739, train CRF loss is 3.9750617011077702
Training:At training steps 4000, training MLE loss is 1.5853307132795453, train CRF loss is 3.940721734583378
Validation:At training steps 4000, training MLE loss is 1.5853307132795453, train CRF loss is 3.940721734583378, validation MLE loss is 2.1705432879297355, validation ppl is 8.763, validation CRF loss is 4.453233875726399, validation BLEU is 45.63
Training:At training steps 4100, training MLE loss is 1.5391007668152452, train CRF loss is 3.6718646658957006
Training:At training steps 4200, training MLE loss is 1.5435833856463432, train CRF loss is 3.615952889546752
Training:At training steps 4300, training MLE loss is 1.5476558462902903, train CRF loss is 3.591468147709966
Training:At training steps 4400, training MLE loss is 1.5343777189403773, train CRF loss is 3.553918971568346
Training:At training steps 4500, training MLE loss is 1.5221277109012008, train CRF loss is 3.520597518607974
Validation:At training steps 4500, training MLE loss is 1.5221277109012008, train CRF loss is 3.520597518607974, validation MLE loss is 2.176476232315365, validation ppl is 8.815, validation CRF loss is 4.167857879086545, validation BLEU is 48.68
Training:At training steps 4600, training MLE loss is 1.4212393000349401, train CRF loss is 3.2431894658505915
Training:At training steps 4700, training MLE loss is 1.4229580554924905, train CRF loss is 3.1964455591887235
Training:At training steps 4800, training MLE loss is 1.4213595109681287, train CRF loss is 3.161690682694316
Training:At training steps 4900, training MLE loss is 1.4253752730600535, train CRF loss is 3.1181843662355093
Training:At training steps 5000, training MLE loss is 1.4164781229123473, train CRF loss is 3.077226140663028
Validation:At training steps 5000, training MLE loss is 1.4164781229123473, train CRF loss is 3.077226140663028, validation MLE loss is 2.561228219615786, validation ppl is 12.952, validation CRF loss is 4.061976834347374, validation BLEU is 51.0
Training:At training steps 5100, training MLE loss is 1.3682063813507557, train CRF loss is 2.809998641014099
Training:At training steps 5200, training MLE loss is 1.3568593801930546, train CRF loss is 2.797075958047062
Training:At training steps 5300, training MLE loss is 1.3430961011039715, train CRF loss is 2.763538802030186
Training:At training steps 5400, training MLE loss is 1.3248187417769806, train CRF loss is 2.712593943467364
Training:At training steps 5500, training MLE loss is 1.3149318660907447, train CRF loss is 2.680907035868615
Validation:At training steps 5500, training MLE loss is 1.3149318660907447, train CRF loss is 2.680907035868615, validation MLE loss is 2.6075468753513538, validation ppl is 13.566, validation CRF loss is 4.010923859320189, validation BLEU is 51.85
Training:At training steps 5600, training MLE loss is 1.2547741066105664, train CRF loss is 2.462791693750769
Training:At training steps 5700, training MLE loss is 1.2338031995482743, train CRF loss is 2.4133313175290825
Training:At training steps 5800, training MLE loss is 1.2256498014740647, train CRF loss is 2.3840066698255638
Training:At training steps 5900, training MLE loss is 1.2132700395653955, train CRF loss is 2.347328111664392
Training:At training steps 6000, training MLE loss is 1.2074419733490795, train CRF loss is 2.3193258109502493
Validation:At training steps 6000, training MLE loss is 1.2074419733490795, train CRF loss is 2.3193258109502493, validation MLE loss is 3.0970318191929866, validation ppl is 22.132, validation CRF loss is 4.178590947075894, validation BLEU is 52.92
Training:At training steps 6100, training MLE loss is 1.148437958834693, train CRF loss is 2.145907217897475
Training:At training steps 6200, training MLE loss is 1.1234108827030287, train CRF loss is 2.098066014042124
Training:At training steps 6300, training MLE loss is 1.1075852538490047, train CRF loss is 2.0537152569927275
Training:At training steps 6400, training MLE loss is 1.098778738884721, train CRF loss is 2.0261904539656825
Training:At training steps 6500, training MLE loss is 1.0817914879024029, train CRF loss is 1.9971989628262818
Validation:At training steps 6500, training MLE loss is 1.0817914879024029, train CRF loss is 1.9971989628262818, validation MLE loss is 3.1845500437836898, validation ppl is 24.156, validation CRF loss is 4.200787600718047, validation BLEU is 55.56
Training:At training steps 6600, training MLE loss is 1.0046104171779007, train CRF loss is 1.8302255751192569
Training:At training steps 6700, training MLE loss is 0.9805403192806988, train CRF loss is 1.786233987575397
Training:At training steps 6800, training MLE loss is 0.9647217171732336, train CRF loss is 1.7629264024148386
Training:At training steps 6900, training MLE loss is 0.9564048809395171, train CRF loss is 1.7310621694475412
Training:At training steps 7000, training MLE loss is 0.9456058407053352, train CRF loss is 1.7042760421559215
Validation:At training steps 7000, training MLE loss is 0.9456058407053352, train CRF loss is 1.7042760421559215, validation MLE loss is 3.5988758378907253, validation ppl is 36.557, validation CRF loss is 4.369878254438701, validation BLEU is 55.46
Training:At training steps 7100, training MLE loss is 0.8745595926418901, train CRF loss is 1.5353786450391635
Training:At training steps 7200, training MLE loss is 0.8637740414449945, train CRF loss is 1.526656605375465
Training:At training steps 7300, training MLE loss is 0.8642973381187766, train CRF loss is 1.5084321306242297
Training:At training steps 7400, training MLE loss is 0.8620728149765636, train CRF loss is 1.4821778037201148
Training:At training steps 7500, training MLE loss is 0.8497014679973945, train CRF loss is 1.4562445906521753
Validation:At training steps 7500, training MLE loss is 0.8497014679973945, train CRF loss is 1.4562445906521753, validation MLE loss is 3.8774175738033496, validation ppl is 48.299, validation CRF loss is 4.473550234970293, validation BLEU is 56.48
Training:At training steps 7600, training MLE loss is 0.7817190407169983, train CRF loss is 1.2995845413696951
Training:At training steps 7700, training MLE loss is 0.7713232061709278, train CRF loss is 1.283740378561197
Training:At training steps 7800, training MLE loss is 0.7612530946467693, train CRF loss is 1.2598780168849044
Training:At training steps 7900, training MLE loss is 0.7544622603757307, train CRF loss is 1.234763264853682
Training:At training steps 8000, training MLE loss is 0.749508782430552, train CRF loss is 1.213887683926383
Validation:At training steps 8000, training MLE loss is 0.749508782430552, train CRF loss is 1.213887683926383, validation MLE loss is 4.134637098563345, validation ppl is 62.467, validation CRF loss is 4.616520097381191, validation BLEU is 57.42
Training:At training steps 8100, training MLE loss is 0.7026818502834067, train CRF loss is 1.0739362951274962
Training:At training steps 8200, training MLE loss is 0.6899540470377542, train CRF loss is 1.0639628748688847
Training:At training steps 8300, training MLE loss is 0.6796083650008465, train CRF loss is 1.0466389678604902
Training:At training steps 8400, training MLE loss is 0.6697699141688644, train CRF loss is 1.0287308476210455
Training:At training steps 8500, training MLE loss is 0.6597233803421259, train CRF loss is 1.007862571743084
Validation:At training steps 8500, training MLE loss is 0.6597233803421259, train CRF loss is 1.007862571743084, validation MLE loss is 4.2928627760786755, validation ppl is 73.176, validation CRF loss is 4.9051362119222945, validation BLEU is 56.26
Training:At training steps 8600, training MLE loss is 0.6251462790905498, train CRF loss is 0.9242439621849917
Training:At training steps 8700, training MLE loss is 0.6101812116347719, train CRF loss is 0.8881829871854279
Training:At training steps 8800, training MLE loss is 0.6015447341487743, train CRF loss is 0.8656682099937462
Training:At training steps 8900, training MLE loss is 0.5966220863728086, train CRF loss is 0.8501351060831803
Training:At training steps 9000, training MLE loss is 0.5868371502426453, train CRF loss is 0.8315429028719664
Validation:At training steps 9000, training MLE loss is 0.5868371502426453, train CRF loss is 0.8315429028719664, validation MLE loss is 4.624423397214789, validation ppl is 101.944, validation CRF loss is 5.085879824663463, validation BLEU is 56.55
Training:At training steps 9100, training MLE loss is 0.5432243651268073, train CRF loss is 0.7433415025146678
Training:At training steps 9200, training MLE loss is 0.5461192704923451, train CRF loss is 0.7240626674419036
Training:At training steps 9300, training MLE loss is 0.5424998440073493, train CRF loss is 0.7079840987469167
Training:At training steps 9400, training MLE loss is 0.5395023194732493, train CRF loss is 0.6914269380346013
Training:At training steps 9500, training MLE loss is 0.5275486172467936, train CRF loss is 0.6773570795102569
Validation:At training steps 9500, training MLE loss is 0.5275486172467936, train CRF loss is 0.6773570795102569, validation MLE loss is 4.863465105232439, validation ppl is 129.472, validation CRF loss is 5.159960232282939, validation BLEU is 55.36
Training:At training steps 9600, training MLE loss is 0.48099258104339243, train CRF loss is 0.5647649196404382
Training:At training steps 9700, training MLE loss is 0.47229832061188065, train CRF loss is 0.5556708462028473
Training:At training steps 9800, training MLE loss is 0.4702873572286141, train CRF loss is 0.5412080482273207
Training:At training steps 9900, training MLE loss is 0.46143423402289047, train CRF loss is 0.5281206803090754
Training:At training steps 10000, training MLE loss is 0.4555416349087609, train CRF loss is 0.5167036624765023
Validation:At training steps 10000, training MLE loss is 0.4555416349087609, train CRF loss is 0.5167036624765023, validation MLE loss is 5.264748996809909, validation ppl is 193.398, validation CRF loss is 5.550742416005385, validation BLEU is 56.55
Training:At training steps 10100, training MLE loss is 0.4247302026383113, train CRF loss is 0.4552066841124906
Training:At training steps 10200, training MLE loss is 0.4133478711656062, train CRF loss is 0.4482459806257975
Training:At training steps 10300, training MLE loss is 0.4147153708092325, train CRF loss is 0.4382082056369836
Training:At training steps 10400, training MLE loss is 0.4114508805204241, train CRF loss is 0.425686695934146
Training:At training steps 10500, training MLE loss is 0.4075417493577697, train CRF loss is 0.41719312600653213
Validation:At training steps 10500, training MLE loss is 0.4075417493577697, train CRF loss is 0.41719312600653213, validation MLE loss is 5.318483948707581, validation ppl is 204.074, validation CRF loss is 5.541174521571712, validation BLEU is 58.7
Training:At training steps 10600, training MLE loss is 0.3699938302190276, train CRF loss is 0.3690635350090088
Training:At training steps 10700, training MLE loss is 0.3654350729330326, train CRF loss is 0.36312849806008674
Training:At training steps 10800, training MLE loss is 0.3579249342769617, train CRF loss is 0.3588216792953972
Training:At training steps 10900, training MLE loss is 0.3475873048710491, train CRF loss is 0.3457045231673601
Training:At training steps 11000, training MLE loss is 0.34709290405298815, train CRF loss is 0.3383693556598664
Validation:At training steps 11000, training MLE loss is 0.34709290405298815, train CRF loss is 0.3383693556598664, validation MLE loss is 5.622063484631087, validation ppl is 276.459, validation CRF loss is 5.85330164118817, validation BLEU is 58.39
Training:At training steps 11100, training MLE loss is 0.3272928967757616, train CRF loss is 0.2913328892843856
Training:At training steps 11200, training MLE loss is 0.31757975337037353, train CRF loss is 0.2867831326254964
Training:At training steps 11300, training MLE loss is 0.31277060515169675, train CRF loss is 0.2816899385668512
Training:At training steps 11400, training MLE loss is 0.307950277968921, train CRF loss is 0.2758089397009826
Training:At training steps 11500, training MLE loss is 0.30890544422413224, train CRF loss is 0.26996647533129725
Validation:At training steps 11500, training MLE loss is 0.30890544422413224, train CRF loss is 0.26996647533129725, validation MLE loss is 5.841583399396193, validation ppl is 344.324, validation CRF loss is 5.895392342617638, validation BLEU is 56.88
Training:At training steps 11600, training MLE loss is 0.2881745820424112, train CRF loss is 0.24491363806271693
Training:At training steps 11700, training MLE loss is 0.29300937523252285, train CRF loss is 0.2399480794579813
Training:At training steps 11800, training MLE loss is 0.2943728491971221, train CRF loss is 0.24425970093803698
Training:At training steps 11900, training MLE loss is 0.28964038355135924, train CRF loss is 0.23905303974128855
Training:At training steps 12000, training MLE loss is 0.28682241859695934, train CRF loss is 0.23301122970025245
Validation:At training steps 12000, training MLE loss is 0.28682241859695934, train CRF loss is 0.23301122970025245, validation MLE loss is 6.121181839390805, validation ppl is 455.403, validation CRF loss is 6.000786436231513, validation BLEU is 58.83
Training:At training steps 12100, training MLE loss is 0.2773310349567328, train CRF loss is 0.20967423537481408
Training:At training steps 12200, training MLE loss is 0.26523506515848566, train CRF loss is 0.20374638494204647
Training:At training steps 12300, training MLE loss is 0.25673826232910263, train CRF loss is 0.19812067757966056
Training:At training steps 12400, training MLE loss is 0.2577871160298946, train CRF loss is 0.19648294928629412
Training:At training steps 12500, training MLE loss is 0.25290717545010555, train CRF loss is 0.19159533007063329
Validation:At training steps 12500, training MLE loss is 0.25290717545010555, train CRF loss is 0.19159533007063329, validation MLE loss is 6.192658364772797, validation ppl is 489.145, validation CRF loss is 6.376368986932855, validation BLEU is 61.0
Training:At training steps 12600, training MLE loss is 0.23055440933261706, train CRF loss is 0.17231995385714982
Training:At training steps 12700, training MLE loss is 0.2293779178587647, train CRF loss is 0.16654334977430152
Training:At training steps 12800, training MLE loss is 0.21947098807768878, train CRF loss is 0.16401278129318597
Training:At training steps 12900, training MLE loss is 0.21445209587054706, train CRF loss is 0.1608177019982537
Training:At training steps 13000, training MLE loss is 0.21362188525735837, train CRF loss is 0.15905371428897752
Validation:At training steps 13000, training MLE loss is 0.21362188525735837, train CRF loss is 0.15905371428897752, validation MLE loss is 5.995989708524001, validation ppl is 401.814, validation CRF loss is 6.241055705045399, validation BLEU is 59.39
Training:At training steps 13100, training MLE loss is 0.20180555783415913, train CRF loss is 0.14134813913276958
Training:At training steps 13200, training MLE loss is 0.1955210721498679, train CRF loss is 0.1405336221260586
Training:At training steps 13300, training MLE loss is 0.1992277227852719, train CRF loss is 0.14390699789039899
Training:At training steps 13400, training MLE loss is 0.1992623662795131, train CRF loss is 0.14422308118859972
Training:At training steps 13500, training MLE loss is 0.1999259954685367, train CRF loss is 0.1418878367754769
Validation:At training steps 13500, training MLE loss is 0.1999259954685367, train CRF loss is 0.1418878367754769, validation MLE loss is 6.236251953401063, validation ppl is 510.94, validation CRF loss is 6.372128056852441, validation BLEU is 59.39
Training:At training steps 13600, training MLE loss is 0.1883058279097895, train CRF loss is 0.125663155334787
Training:At training steps 13700, training MLE loss is 0.19002788537301513, train CRF loss is 0.1270971595252877
Training:At training steps 13800, training MLE loss is 0.18502907073395364, train CRF loss is 0.12442975914201801
Training:At training steps 13900, training MLE loss is 0.1820061389997113, train CRF loss is 0.12127588374995128
Training:At training steps 14000, training MLE loss is 0.1789755030403303, train CRF loss is 0.12046926028769962
Validation:At training steps 14000, training MLE loss is 0.1789755030403303, train CRF loss is 0.12046926028769962, validation MLE loss is 6.466351640851874, validation ppl is 643.133, validation CRF loss is 6.612944377096076, validation BLEU is 58.5
Training:At training steps 14100, training MLE loss is 0.15747668566562426, train CRF loss is 0.10318211679904607
Training:At training steps 14200, training MLE loss is 0.15883921808467677, train CRF loss is 0.10610216247380436
Training:At training steps 14300, training MLE loss is 0.15630672404250012, train CRF loss is 0.10377752934908206
Training:At training steps 14400, training MLE loss is 0.1559991072965613, train CRF loss is 0.10381915502338472
Training:At training steps 14500, training MLE loss is 0.15679464786147218, train CRF loss is 0.10301892519814737
Validation:At training steps 14500, training MLE loss is 0.15679464786147218, train CRF loss is 0.10301892519814737, validation MLE loss is 6.3078018207299085, validation ppl is 548.837, validation CRF loss is 6.540762302122618, validation BLEU is 60.44
Training:At training steps 14600, training MLE loss is 0.14836147996603358, train CRF loss is 0.09121755739179208
Training:At training steps 14700, training MLE loss is 0.14612986552433996, train CRF loss is 0.09022963507594568
Training:At training steps 14800, training MLE loss is 0.14520430684613833, train CRF loss is 0.0897221003162243
Training:At training steps 14900, training MLE loss is 0.14400671402339354, train CRF loss is 0.08938064130901921
Training:At training steps 15000, training MLE loss is 0.1400690131079841, train CRF loss is 0.08755445748720377
Validation:At training steps 15000, training MLE loss is 0.1400690131079841, train CRF loss is 0.08755445748720377, validation MLE loss is 6.399616479873657, validation ppl is 601.614, validation CRF loss is 6.700088833507738, validation BLEU is 60.09
Training:At training steps 15100, training MLE loss is 0.12360495931567585, train CRF loss is 0.08529563566562047
Training:At training steps 15200, training MLE loss is 0.12195537306163146, train CRF loss is 0.08345298356503861
Training:At training steps 15300, training MLE loss is 0.12113295527466184, train CRF loss is 0.08035242358639834
Training:At training steps 15400, training MLE loss is 0.11986786501527888, train CRF loss is 0.07846641463791741
Training:At training steps 15500, training MLE loss is 0.12104202742373445, train CRF loss is 0.07812822067772374
Validation:At training steps 15500, training MLE loss is 0.12104202742373445, train CRF loss is 0.07812822067772374, validation MLE loss is 6.864885016491539, validation ppl is 958.036, validation CRF loss is 7.037457914728868, validation BLEU is 60.1
Training:At training steps 15600, training MLE loss is 0.11447354335467025, train CRF loss is 0.06221918804419602
Training:At training steps 15700, training MLE loss is 0.11298380131838712, train CRF loss is 0.061468606447449475
Training:At training steps 15800, training MLE loss is 0.11155751374867577, train CRF loss is 0.06231634511456377
Training:At training steps 15900, training MLE loss is 0.10842128410562055, train CRF loss is 0.0619604719164505
Training:At training steps 16000, training MLE loss is 0.10728312725172783, train CRF loss is 0.06118294412442674
Validation:At training steps 16000, training MLE loss is 0.10728312725172783, train CRF loss is 0.06118294412442674, validation MLE loss is 6.857525091422231, validation ppl is 951.01, validation CRF loss is 7.087279881301679, validation BLEU is 58.77
Training:At training steps 16100, training MLE loss is 0.10090909428246278, train CRF loss is 0.06604414603452483
Training:At training steps 16200, training MLE loss is 0.09925672267543974, train CRF loss is 0.062138224358846944
Training:At training steps 16300, training MLE loss is 0.10090418465669813, train CRF loss is 0.06426452727546225
Training:At training steps 16400, training MLE loss is 0.09848108276016092, train CRF loss is 0.06113000371849957
Training:At training steps 16500, training MLE loss is 0.09730565325446697, train CRF loss is 0.06009272731914967
Validation:At training steps 16500, training MLE loss is 0.09730565325446697, train CRF loss is 0.06009272731914967, validation MLE loss is 7.191787688355697, validation ppl is 1328.476, validation CRF loss is 7.3239968826896265, validation BLEU is 59.12
Training:At training steps 16600, training MLE loss is 0.09701449829760236, train CRF loss is 0.055532499634776966
Training:At training steps 16700, training MLE loss is 0.09239072546978434, train CRF loss is 0.05350442541579753
Training:At training steps 16800, training MLE loss is 0.08995584327442885, train CRF loss is 0.05382314547814739
Training:At training steps 16900, training MLE loss is 0.09131666521482032, train CRF loss is 0.055646031491279044
Training:At training steps 17000, training MLE loss is 0.08822444755904076, train CRF loss is 0.05407587670630608
Validation:At training steps 17000, training MLE loss is 0.08822444755904076, train CRF loss is 0.05407587670630608, validation MLE loss is 6.89798106645283, validation ppl is 990.273, validation CRF loss is 7.152861507315385, validation BLEU is 59.75
Training:At training steps 17100, training MLE loss is 0.07665106416889103, train CRF loss is 0.049780751858712335
Training:At training steps 17200, training MLE loss is 0.07914235677835706, train CRF loss is 0.045109050205802956
Training:At training steps 17300, training MLE loss is 0.07879692089515719, train CRF loss is 0.045889903980815445
Training:At training steps 17400, training MLE loss is 0.07926124805453934, train CRF loss is 0.046934111295499924
Training:At training steps 17500, training MLE loss is 0.0766879546859216, train CRF loss is 0.044920390413207674
Validation:At training steps 17500, training MLE loss is 0.0766879546859216, train CRF loss is 0.044920390413207674, validation MLE loss is 7.49906010063071, validation ppl is 1806.344, validation CRF loss is 7.559512593244252, validation BLEU is 59.67
Training:At training steps 17600, training MLE loss is 0.07305128398085145, train CRF loss is 0.045187681112636824
Training:At training steps 17700, training MLE loss is 0.07202580113984858, train CRF loss is 0.04605540263076125
Training:At training steps 17800, training MLE loss is 0.07372386656331993, train CRF loss is 0.0463684781668036
Training:At training steps 17900, training MLE loss is 0.07258614651050213, train CRF loss is 0.045449953905836636
Training:At training steps 18000, training MLE loss is 0.07190056971744227, train CRF loss is 0.044600839302275604
Validation:At training steps 18000, training MLE loss is 0.07190056971744227, train CRF loss is 0.044600839302275604, validation MLE loss is 7.302204273248973, validation ppl is 1483.567, validation CRF loss is 7.3966263030704695, validation BLEU is 60.19
Training:At training steps 18100, training MLE loss is 0.07104298689114216, train CRF loss is 0.043254005014250085
Training:At training steps 18200, training MLE loss is 0.06832526516072715, train CRF loss is 0.03915011136846819
Training:At training steps 18300, training MLE loss is 0.06701130104221142, train CRF loss is 0.03872835903643745
Training:At training steps 18400, training MLE loss is 0.06746382509210312, train CRF loss is 0.03920891508111934
Training:At training steps 18500, training MLE loss is 0.06635269111216435, train CRF loss is 0.038398729862256155
Validation:At training steps 18500, training MLE loss is 0.06635269111216435, train CRF loss is 0.038398729862256155, validation MLE loss is 7.568509923784356, validation ppl is 1936.253, validation CRF loss is 7.812963416701869, validation BLEU is 60.3
Training:At training steps 18600, training MLE loss is 0.05477254355117516, train CRF loss is 0.03199583519917511
Training:At training steps 18700, training MLE loss is 0.05367387758511876, train CRF loss is 0.030250844246016068
Training:At training steps 18800, training MLE loss is 0.05292981302730338, train CRF loss is 0.03072116440443086
Training:At training steps 18900, training MLE loss is 0.054997139789662565, train CRF loss is 0.03145089166077986
Training:At training steps 19000, training MLE loss is 0.055170523317550936, train CRF loss is 0.03233011078386979
Validation:At training steps 19000, training MLE loss is 0.055170523317550936, train CRF loss is 0.03233011078386979, validation MLE loss is 7.649217338938462, validation ppl is 2099.002, validation CRF loss is 7.775305942485207, validation BLEU is 60.42
Training:At training steps 19100, training MLE loss is 0.0475691871517946, train CRF loss is 0.027079495979746328
Training:At training steps 19200, training MLE loss is 0.045850706506555775, train CRF loss is 0.026501422045348805
Training:At training steps 19300, training MLE loss is 0.04543964020217655, train CRF loss is 0.026232702633176147
Training:At training steps 19400, training MLE loss is 0.045147004949221504, train CRF loss is 0.026357885943784094
Training:At training steps 19500, training MLE loss is 0.04522244256720454, train CRF loss is 0.026251692250174073
Validation:At training steps 19500, training MLE loss is 0.04522244256720454, train CRF loss is 0.026251692250174073, validation MLE loss is 7.843798129182113, validation ppl is 2549.871, validation CRF loss is 8.05462728676043, validation BLEU is 59.46
Training:At training steps 19600, training MLE loss is 0.04998258007190994, train CRF loss is 0.032368516072011036
Training:At training steps 19700, training MLE loss is 0.046934845224371594, train CRF loss is 0.029802016789532343
Training:At training steps 19800, training MLE loss is 0.04586911797331892, train CRF loss is 0.027980513513720513
Training:At training steps 19900, training MLE loss is 0.04440008558393292, train CRF loss is 0.027382685870628868
Training:At training steps 20000, training MLE loss is 0.04302087818062409, train CRF loss is 0.027495210384584027
Validation:At training steps 20000, training MLE loss is 0.04302087818062409, train CRF loss is 0.027495210384584027, validation MLE loss is 8.016214201324864, validation ppl is 3029.686, validation CRF loss is 8.206905239506773, validation BLEU is 60.65
Training:At training steps 20100, training MLE loss is 0.0380755607514022, train CRF loss is 0.025368185887666143
Training:At training steps 20200, training MLE loss is 0.03745514133182198, train CRF loss is 0.024685286896606828
Training:At training steps 20300, training MLE loss is 0.036929618129418335, train CRF loss is 0.02403564739660404
Training:At training steps 20400, training MLE loss is 0.0354188182138462, train CRF loss is 0.02288692815242975
Training:At training steps 20500, training MLE loss is 0.03517499287056841, train CRF loss is 0.022216567980479586
Validation:At training steps 20500, training MLE loss is 0.03517499287056841, train CRF loss is 0.022216567980479586, validation MLE loss is 8.00203766320881, validation ppl is 2987.038, validation CRF loss is 8.154047263296027, validation BLEU is 60.84
Training:At training steps 20600, training MLE loss is 0.02925657658560392, train CRF loss is 0.018767972310874157
Training:At training steps 20700, training MLE loss is 0.03146484537655619, train CRF loss is 0.02053292332994104
Training:At training steps 20800, training MLE loss is 0.03144789602711273, train CRF loss is 0.02064126753971745
Training:At training steps 20900, training MLE loss is 0.03011835956081351, train CRF loss is 0.01961544704134187
Training:At training steps 21000, training MLE loss is 0.029521299328669134, train CRF loss is 0.019208216029459064
Validation:At training steps 21000, training MLE loss is 0.029521299328669134, train CRF loss is 0.019208216029459064, validation MLE loss is 8.252566450520566, validation ppl is 3837.462, validation CRF loss is 8.319485639270983, validation BLEU is 59.24
Training:At training steps 21100, training MLE loss is 0.026284580850983282, train CRF loss is 0.016797429682867335
Training:At training steps 21200, training MLE loss is 0.025966182992733032, train CRF loss is 0.016694613206412138
Training:At training steps 21300, training MLE loss is 0.02509393142821284, train CRF loss is 0.015821357016575693
Training:At training steps 21400, training MLE loss is 0.025433296899051897, train CRF loss is 0.016066151486012713
Training:At training steps 21500, training MLE loss is 0.025222979960104436, train CRF loss is 0.016224831299708312
Validation:At training steps 21500, training MLE loss is 0.025222979960104436, train CRF loss is 0.016224831299708312, validation MLE loss is 8.496889142613663, validation ppl is 4899.503, validation CRF loss is 8.709328814556724, validation BLEU is 60.33
Training:At training steps 21600, training MLE loss is 0.020850242441363145, train CRF loss is 0.013305847390869312
Training:At training steps 21700, training MLE loss is 0.021390414299735486, train CRF loss is 0.013433363544656367
Training:At training steps 21800, training MLE loss is 0.020336338820841345, train CRF loss is 0.012535774282661123
Training:At training steps 21900, training MLE loss is 0.019721552840758255, train CRF loss is 0.011984550611433657
Training:At training steps 22000, training MLE loss is 0.019632916727652668, train CRF loss is 0.012070516738564782
Validation:At training steps 22000, training MLE loss is 0.019632916727652668, train CRF loss is 0.012070516738564782, validation MLE loss is 8.543186438711066, validation ppl is 5131.67, validation CRF loss is 8.780698769970945, validation BLEU is 60.77
Training:At training steps 22100, training MLE loss is 0.01708879556636876, train CRF loss is 0.011361660406617301
Training:At training steps 22200, training MLE loss is 0.016529187374230484, train CRF loss is 0.010577494024541596
Training:At training steps 22300, training MLE loss is 0.015766017189489022, train CRF loss is 0.009932244687171824
Training:At training steps 22400, training MLE loss is 0.014558933463291355, train CRF loss is 0.009187226699030661
Training:At training steps 22500, training MLE loss is 0.0145342320271107, train CRF loss is 0.009426976598033758
Validation:At training steps 22500, training MLE loss is 0.0145342320271107, train CRF loss is 0.009426976598033758, validation MLE loss is 8.745189004822782, validation ppl is 6280.4, validation CRF loss is 8.893932825640627, validation BLEU is 62.12
Training:At training steps 22600, training MLE loss is 0.015995134230351923, train CRF loss is 0.009668788914679505
Training:At training steps 22700, training MLE loss is 0.014752098881845928, train CRF loss is 0.009257284775200145
Training:At training steps 22800, training MLE loss is 0.015017593309206134, train CRF loss is 0.00897139206669961
Training:At training steps 22900, training MLE loss is 0.013974721534551047, train CRF loss is 0.00869132072634059
Training:At training steps 23000, training MLE loss is 0.013818781181257048, train CRF loss is 0.008666806319037294
Validation:At training steps 23000, training MLE loss is 0.013818781181257048, train CRF loss is 0.008666806319037294, validation MLE loss is 8.83764239988829, validation ppl is 6888.733, validation CRF loss is 9.040750861167908, validation BLEU is 60.91
Training:At training steps 23100, training MLE loss is 0.010893949910771493, train CRF loss is 0.006767193456546475
Training:At training steps 23200, training MLE loss is 0.010344168947062932, train CRF loss is 0.006716342901716028
Training:At training steps 23300, training MLE loss is 0.011225300991815804, train CRF loss is 0.007229011012391566
Training:At training steps 23400, training MLE loss is 0.010930192259508588, train CRF loss is 0.007030810294898444
Training:At training steps 23500, training MLE loss is 0.010444937532260229, train CRF loss is 0.006499980457502648
Validation:At training steps 23500, training MLE loss is 0.010444937532260229, train CRF loss is 0.006499980457502648, validation MLE loss is 9.095906119597586, validation ppl is 8918.706, validation CRF loss is 9.25924493764576, validation BLEU is 61.37
Training:At training steps 23600, training MLE loss is 0.00854926366876226, train CRF loss is 0.005252689561927308
Training:At training steps 23700, training MLE loss is 0.009848597727082591, train CRF loss is 0.006129381914407052
Training:At training steps 23800, training MLE loss is 0.009537629893066876, train CRF loss is 0.006008059269318267
Training:At training steps 23900, training MLE loss is 0.00977002863171048, train CRF loss is 0.0059465492234650925
Training:At training steps 24000, training MLE loss is 0.009341851843956608, train CRF loss is 0.0057164490867902575
Validation:At training steps 24000, training MLE loss is 0.009341851843956608, train CRF loss is 0.0057164490867902575, validation MLE loss is 9.22791019866341, validation ppl is 10177.251, validation CRF loss is 9.330364371600904, validation BLEU is 60.76
Training:At training steps 24100, training MLE loss is 0.0070669642506022865, train CRF loss is 0.004761354542571756
Training:At training steps 24200, training MLE loss is 0.006731424178663055, train CRF loss is 0.004393131263731968
Training:At training steps 24300, training MLE loss is 0.007176450324751449, train CRF loss is 0.004387882051342456
Training:At training steps 24400, training MLE loss is 0.006822508174459342, train CRF loss is 0.004343009586493624
Training:At training steps 24500, training MLE loss is 0.006879484821118949, train CRF loss is 0.004352027078910941
Validation:At training steps 24500, training MLE loss is 0.006879484821118949, train CRF loss is 0.004352027078910941, validation MLE loss is 9.181007159383674, validation ppl is 9710.928, validation CRF loss is 9.4167528340691, validation BLEU is 60.93
Training:At training steps 24600, training MLE loss is 0.006067505877877339, train CRF loss is 0.0034238513446854133
Training:At training steps 24700, training MLE loss is 0.005961891537091258, train CRF loss is 0.003641504988970803
Training:At training steps 24800, training MLE loss is 0.005845536018323509, train CRF loss is 0.003280170556028733
Training:At training steps 24900, training MLE loss is 0.006264998472165391, train CRF loss is 0.0034415040783312366
Training:At training steps 25000, training MLE loss is 0.006382254267350268, train CRF loss is 0.003588787785934205
Validation:At training steps 25000, training MLE loss is 0.006382254267350268, train CRF loss is 0.003588787785934205, validation MLE loss is 9.296922539409838, validation ppl is 10904.41, validation CRF loss is 9.471397061096994, validation BLEU is 61.53
Training:At training steps 25100, training MLE loss is 0.006390881068127352, train CRF loss is 0.0043867170969898025
Training:At training steps 25200, training MLE loss is 0.005759002213721522, train CRF loss is 0.004048751630091307
Training:At training steps 25300, training MLE loss is 0.005213465684461215, train CRF loss is 0.0037105780512518144
Training:At training steps 25400, training MLE loss is 0.005071283572279083, train CRF loss is 0.003449735242757648
Training:At training steps 25500, training MLE loss is 0.004869693233557966, train CRF loss is 0.003250637154131857
Validation:At training steps 25500, training MLE loss is 0.004869693233557966, train CRF loss is 0.003250637154131857, validation MLE loss is 9.323140031413027, validation ppl is 11194.076, validation CRF loss is 9.546792431881553, validation BLEU is 61.18
Training:At training steps 25600, training MLE loss is 0.006097755166905916, train CRF loss is 0.0034005597275779034
Training:At training steps 25700, training MLE loss is 0.00544383708806347, train CRF loss is 0.0032628180245032378
Training:At training steps 25800, training MLE loss is 0.004789618071103132, train CRF loss is 0.0029963603431185323
Training:At training steps 25900, training MLE loss is 0.0043610505598147145, train CRF loss is 0.002722119748955417
Training:At training steps 26000, training MLE loss is 0.004147443499419626, train CRF loss is 0.0026305440272084936
Validation:At training steps 26000, training MLE loss is 0.004147443499419626, train CRF loss is 0.0026305440272084936, validation MLE loss is 9.695976006357293, validation ppl is 16252.077, validation CRF loss is 9.820894071930333, validation BLEU is 61.11
Training:At training steps 26100, training MLE loss is 0.004089213970077287, train CRF loss is 0.0022049149694302896
Training:At training steps 26200, training MLE loss is 0.004218602038502442, train CRF loss is 0.0023022754695598024
Training:At training steps 26300, training MLE loss is 0.004078990325424419, train CRF loss is 0.0023782401809648752
Training:At training steps 26400, training MLE loss is 0.003797782350861504, train CRF loss is 0.002407004669573295
Training:At training steps 26500, training MLE loss is 0.0034111233692437565, train CRF loss is 0.002144823188078777
Validation:At training steps 26500, training MLE loss is 0.0034111233692437565, train CRF loss is 0.002144823188078777, validation MLE loss is 9.777539510475961, validation ppl is 17633.213, validation CRF loss is 9.864710682316831, validation BLEU is 60.33
Training:At training steps 26600, training MLE loss is 0.0025737685462320246, train CRF loss is 0.0015534839076083396
Training:At training steps 26700, training MLE loss is 0.003205704940532994, train CRF loss is 0.002018898155101667
Training:At training steps 26800, training MLE loss is 0.0030357713276044198, train CRF loss is 0.0019466419320310843
Training:At training steps 26900, training MLE loss is 0.0030505345841814846, train CRF loss is 0.0020245876066625457
Training:At training steps 27000, training MLE loss is 0.002895322879319602, train CRF loss is 0.0019930873298347647
Validation:At training steps 27000, training MLE loss is 0.002895322879319602, train CRF loss is 0.0019930873298347647, validation MLE loss is 9.86074472101111, validation ppl is 19163.155, validation CRF loss is 9.964679473324827, validation BLEU is 60.59
Training:At training steps 27100, training MLE loss is 0.003471623843662576, train CRF loss is 0.0021299945885364657
Training:At training steps 27200, training MLE loss is 0.0028345625170627525, train CRF loss is 0.0016867184574495808
Training:At training steps 27300, training MLE loss is 0.002797800329679133, train CRF loss is 0.0015972316200198538
Training:At training steps 27400, training MLE loss is 0.0030614189446456362, train CRF loss is 0.0018242224681735942
Training:At training steps 27500, training MLE loss is 0.0028990908890238207, train CRF loss is 0.0017641736082247067
Validation:At training steps 27500, training MLE loss is 0.0028990908890238207, train CRF loss is 0.0017641736082247067, validation MLE loss is 9.736431241035461, validation ppl is 16923.039, validation CRF loss is 9.967844417220668, validation BLEU is 60.77
Training:At training steps 27600, training MLE loss is 0.0022520684002095316, train CRF loss is 0.001219500389917978
Training:At training steps 27700, training MLE loss is 0.0020582025998751143, train CRF loss is 0.0012399876222221851
Training:At training steps 27800, training MLE loss is 0.0018988354374955977, train CRF loss is 0.0012656622689387195
Training:At training steps 27900, training MLE loss is 0.0020499655078100305, train CRF loss is 0.0013761961409575153
Training:At training steps 28000, training MLE loss is 0.0021000170492742358, train CRF loss is 0.0014408990253076005
Validation:At training steps 28000, training MLE loss is 0.0021000170492742358, train CRF loss is 0.0014408990253076005, validation MLE loss is 10.05707279631966, validation ppl is 23320.144, validation CRF loss is 10.216041163394326, validation BLEU is 60.87
Training:At training steps 28100, training MLE loss is 0.0030015912698652812, train CRF loss is 0.0015753465051522575
Training:At training steps 28200, training MLE loss is 0.0021553401237937035, train CRF loss is 0.0013385489776878501
Training:At training steps 28300, training MLE loss is 0.0023101436348520908, train CRF loss is 0.0014178949422297604
Training:At training steps 28400, training MLE loss is 0.0021658204908319347, train CRF loss is 0.0013128533926620566
Training:At training steps 28500, training MLE loss is 0.0020536117997030337, train CRF loss is 0.0012449698579116762
Validation:At training steps 28500, training MLE loss is 0.0020536117997030337, train CRF loss is 0.0012449698579116762, validation MLE loss is 10.128378541846024, validation ppl is 25043.724, validation CRF loss is 10.258555976968063, validation BLEU is 61.15
Training:At training steps 28600, training MLE loss is 0.0024479896528925723, train CRF loss is 0.0017714147669991266
Training:At training steps 28700, training MLE loss is 0.002281338145934391, train CRF loss is 0.001563111144292233
Training:At training steps 28800, training MLE loss is 0.0020702301666679886, train CRF loss is 0.0014385161749518598
Training:At training steps 28900, training MLE loss is 0.002008903879271698, train CRF loss is 0.001372129466636055
Training:At training steps 29000, training MLE loss is 0.0019216985737273184, train CRF loss is 0.001337039505882072
Validation:At training steps 29000, training MLE loss is 0.0019216985737273184, train CRF loss is 0.001337039505882072, validation MLE loss is 10.165614906110262, validation ppl is 25993.841, validation CRF loss is 10.272295782440587, validation BLEU is 61.57
Training:At training steps 29100, training MLE loss is 0.0021004996795419132, train CRF loss is 0.001512581290372621
Training:At training steps 29200, training MLE loss is 0.001769671635924029, train CRF loss is 0.00131976620627537
Training:At training steps 29300, training MLE loss is 0.0017995481166565741, train CRF loss is 0.0011791506613501271
Training:At training steps 29400, training MLE loss is 0.0018358306837782357, train CRF loss is 0.0012368177915022293
Training:At training steps 29500, training MLE loss is 0.0017529008837771855, train CRF loss is 0.0012212560362883673
Validation:At training steps 29500, training MLE loss is 0.0017529008837771855, train CRF loss is 0.0012212560362883673, validation MLE loss is 10.237551042908116, validation ppl is 27932.636, validation CRF loss is 10.402361838441147, validation BLEU is 61.1
Training:At training steps 29600, training MLE loss is 0.001221103362411673, train CRF loss is 0.0010092335449366496
Training:At training steps 29700, training MLE loss is 0.0010336451689000447, train CRF loss is 0.0009480215501744627
Training:At training steps 29800, training MLE loss is 0.0012007679662480045, train CRF loss is 0.0008372467470420685
Training:At training steps 29900, training MLE loss is 0.0014192314204120474, train CRF loss is 0.0009195845096815358
Training:At training steps 30000, training MLE loss is 0.0014702532679053875, train CRF loss is 0.000938574109893505
Validation:At training steps 30000, training MLE loss is 0.0014702532679053875, train CRF loss is 0.000938574109893505, validation MLE loss is 10.249113741673922, validation ppl is 28257.487, validation CRF loss is 10.354402673871894, validation BLEU is 60.47
Training:At training steps 30100, training MLE loss is 0.0010683440049524743, train CRF loss is 0.0004603646420962848
Training:At training steps 30200, training MLE loss is 0.0009929577583443002, train CRF loss is 0.0005059560729364354
Training:At training steps 30300, training MLE loss is 0.0010612938884031105, train CRF loss is 0.0006914804397244471
Training:At training steps 30400, training MLE loss is 0.0010913732225761327, train CRF loss is 0.0007712223775558436
Training:At training steps 30500, training MLE loss is 0.001146334385863262, train CRF loss is 0.000809068615499827
Validation:At training steps 30500, training MLE loss is 0.001146334385863262, train CRF loss is 0.000809068615499827, validation MLE loss is 10.296753607298198, validation ppl is 29636.252, validation CRF loss is 10.407992237492612, validation BLEU is 61.66
Training:At training steps 30600, training MLE loss is 0.0012141171151793236, train CRF loss is 0.0005131602229355581
Training:At training steps 30700, training MLE loss is 0.001347488975399609, train CRF loss is 0.0006856354009069387
Training:At training steps 30800, training MLE loss is 0.0012347654854029692, train CRF loss is 0.0007486440666295883
Training:At training steps 30900, training MLE loss is 0.0012704704979926958, train CRF loss is 0.000782448985703611
Training:At training steps 31000, training MLE loss is 0.0012557242631412112, train CRF loss is 0.0008162036603596085
Validation:At training steps 31000, training MLE loss is 0.0012557242631412112, train CRF loss is 0.0008162036603596085, validation MLE loss is 10.327420316244426, validation ppl is 30559.177, validation CRF loss is 10.501715741659465, validation BLEU is 61.48
Training:At training steps 31100, training MLE loss is 0.0012381376905844265, train CRF loss is 0.0010613700668697446
Training:At training steps 31200, training MLE loss is 0.0013575888809920721, train CRF loss is 0.0009497780362935849
Training:At training steps 31300, training MLE loss is 0.0014416057644118753, train CRF loss is 0.0009399143159723477
Training:At training steps 31400, training MLE loss is 0.0012270713445975902, train CRF loss is 0.0008238808439036471
Training:At training steps 31500, training MLE loss is 0.0012134538603784763, train CRF loss is 0.0007785439245142962
Validation:At training steps 31500, training MLE loss is 0.0012134538603784763, train CRF loss is 0.0007785439245142962, validation MLE loss is 10.45793444859354, validation ppl is 34819.556, validation CRF loss is 10.583544248028806, validation BLEU is 60.83
Training:At training steps 31600, training MLE loss is 0.0019435942114647189, train CRF loss is 0.001385372487654233
Training:At training steps 31700, training MLE loss is 0.001587936618189051, train CRF loss is 0.0010103360994991296
Training:At training steps 31800, training MLE loss is 0.001580630807659642, train CRF loss is 0.0010822541530072286
Training:At training steps 31900, training MLE loss is 0.0015256975561359393, train CRF loss is 0.0009780940454692112
Training:At training steps 32000, training MLE loss is 0.0014858926603979101, train CRF loss is 0.0009503163854977332
Validation:At training steps 32000, training MLE loss is 0.0014858926603979101, train CRF loss is 0.0009503163854977332, validation MLE loss is 10.214229263757405, validation ppl is 27288.735, validation CRF loss is 10.346967534015054, validation BLEU is 61.1
Training:At training steps 32100, training MLE loss is 0.0009594346613645296, train CRF loss is 0.0008121515492382647
Training:At training steps 32200, training MLE loss is 0.0008866923681047571, train CRF loss is 0.0006580232175866052
Training:At training steps 32300, training MLE loss is 0.0010013789902749003, train CRF loss is 0.0006493612621822717
Training:At training steps 32400, training MLE loss is 0.0009270331079260985, train CRF loss is 0.0006038494669043714
Training:At training steps 32500, training MLE loss is 0.0009404302349078937, train CRF loss is 0.000582582718504745
Validation:At training steps 32500, training MLE loss is 0.0009404302349078937, train CRF loss is 0.000582582718504745, validation MLE loss is 10.32527554662604, validation ppl is 30493.705, validation CRF loss is 10.481086329409951, validation BLEU is 61.0
Training:At training steps 32600, training MLE loss is 0.0006843339490139118, train CRF loss is 0.0005805088098744537
Training:At training steps 32700, training MLE loss is 0.0009006773443623784, train CRF loss is 0.0005286427181350728
Training:At training steps 32800, training MLE loss is 0.0009620828586010747, train CRF loss is 0.0005757921176861588
Training:At training steps 32900, training MLE loss is 0.0008745517744167128, train CRF loss is 0.0005068571203108718
Training:At training steps 33000, training MLE loss is 0.0007652360343484515, train CRF loss is 0.0004514577508804347
Validation:At training steps 33000, training MLE loss is 0.0007652360343484515, train CRF loss is 0.0004514577508804347, validation MLE loss is 10.160235605741802, validation ppl is 25854.388, validation CRF loss is 10.350101728188363, validation BLEU is 61.39
Training:At training steps 33100, training MLE loss is 0.0004951131574512869, train CRF loss is 0.0003280192510444113
Training:At training steps 33200, training MLE loss is 0.0005746110610972981, train CRF loss is 0.0004136154058865671
Training:At training steps 33300, training MLE loss is 0.0005972413244040838, train CRF loss is 0.00041764271308401363
Training:At training steps 33400, training MLE loss is 0.0005467381848096739, train CRF loss is 0.00034375473353755794
Training:At training steps 33500, training MLE loss is 0.0006404933656252765, train CRF loss is 0.00035115311487577826
Validation:At training steps 33500, training MLE loss is 0.0006404933656252765, train CRF loss is 0.00035115311487577826, validation MLE loss is 10.376764504533066, validation ppl is 32104.918, validation CRF loss is 10.546200576581453, validation BLEU is 61.28
Training:At training steps 33600, training MLE loss is 0.0003336527621318504, train CRF loss is 0.00028805335895167874
Training:At training steps 33700, training MLE loss is 0.0006406832208406914, train CRF loss is 0.0004008439461712965
Training:At training steps 33800, training MLE loss is 0.0006892377188250071, train CRF loss is 0.0004663916823390283
Training:At training steps 33900, training MLE loss is 0.0006914420112294242, train CRF loss is 0.0004362154118048078
Training:At training steps 34000, training MLE loss is 0.000665798256021813, train CRF loss is 0.000427728479259156
Validation:At training steps 34000, training MLE loss is 0.000665798256021813, train CRF loss is 0.000427728479259156, validation MLE loss is 10.381663429109674, validation ppl is 32262.583, validation CRF loss is 10.53336181138691, validation BLEU is 61.18
Training:At training steps 34100, training MLE loss is 0.0007476670282768818, train CRF loss is 0.00047058710356248754
Training:At training steps 34200, training MLE loss is 0.00047604926121749824, train CRF loss is 0.000293347556846677
Training:At training steps 34300, training MLE loss is 0.0005142965599058465, train CRF loss is 0.00033110406122135674
Training:At training steps 34400, training MLE loss is 0.0005042159683452877, train CRF loss is 0.0003126211475192553
Training:At training steps 34500, training MLE loss is 0.000559373581031626, train CRF loss is 0.00032471816586299607
Validation:At training steps 34500, training MLE loss is 0.000559373581031626, train CRF loss is 0.00032471816586299607, validation MLE loss is 10.497041570512872, validation ppl is 36208.225, validation CRF loss is 10.653609470317239, validation BLEU is 61.17
Training:At training steps 34600, training MLE loss is 0.0007020744674679349, train CRF loss is 0.000386726713762231
Training:At training steps 34700, training MLE loss is 0.0006465682019583706, train CRF loss is 0.00035439893129221025
Training:At training steps 34800, training MLE loss is 0.0005804774317089441, train CRF loss is 0.0003428007267852366
Training:At training steps 34900, training MLE loss is 0.0005483068213814661, train CRF loss is 0.00031828239730037454
Training:At training steps 35000, training MLE loss is 0.000523968372537583, train CRF loss is 0.00031291436005815676
Validation:At training steps 35000, training MLE loss is 0.000523968372537583, train CRF loss is 0.00031291436005815676, validation MLE loss is 10.464112815104032, validation ppl is 35035.349, validation CRF loss is 10.617857017015156, validation BLEU is 60.94
Training:At training steps 35100, training MLE loss is 0.0004374585101303647, train CRF loss is 0.00019844847269725907
Training:At training steps 35200, training MLE loss is 0.0005195529329054298, train CRF loss is 0.0002970732559103162
Training:At training steps 35300, training MLE loss is 0.0004531196436313076, train CRF loss is 0.00028284150284443014
Training:At training steps 35400, training MLE loss is 0.00041869693346293995, train CRF loss is 0.0002921552559068885
Training:At training steps 35500, training MLE loss is 0.0003513890073936278, train CRF loss is 0.00025479151705684356
Validation:At training steps 35500, training MLE loss is 0.0003513890073936278, train CRF loss is 0.00025479151705684356, validation MLE loss is 10.424302421118083, validation ppl is 33667.977, validation CRF loss is 10.60277481455552, validation BLEU is 61.75
Training:At training steps 35600, training MLE loss is 0.00010917394967663897, train CRF loss is 9.631098986642605e-05
Training:At training steps 35700, training MLE loss is 0.0004001802556816281, train CRF loss is 0.00014389165113262603
Training:At training steps 35800, training MLE loss is 0.00044105041714968715, train CRF loss is 0.0002491539129414866
Training:At training steps 35900, training MLE loss is 0.00040956877087343005, train CRF loss is 0.00023482881154008318
Training:At training steps 36000, training MLE loss is 0.00037723928021020934, train CRF loss is 0.00021931962241638115
Validation:At training steps 36000, training MLE loss is 0.00037723928021020934, train CRF loss is 0.00021931962241638115, validation MLE loss is 10.46721223153566, validation ppl is 35144.107, validation CRF loss is 10.652010516116494, validation BLEU is 60.81
Training:At training steps 36100, training MLE loss is 0.00023385882508546827, train CRF loss is 0.0002121308577734915
Training:At training steps 36200, training MLE loss is 0.0002849562236281854, train CRF loss is 0.0002475564964435195
Training:At training steps 36300, training MLE loss is 0.0002719033861800748, train CRF loss is 0.0001877095197993217
Training:At training steps 36400, training MLE loss is 0.00030668883926663444, train CRF loss is 0.00021689228186178356
Training:At training steps 36500, training MLE loss is 0.0003032129816013761, train CRF loss is 0.00021090761785577073
Validation:At training steps 36500, training MLE loss is 0.0003032129816013761, train CRF loss is 0.00021090761785577073, validation MLE loss is 10.431996853728043, validation ppl is 33928.032, validation CRF loss is 10.648132989281102, validation BLEU is 61.65
Training:At training steps 36600, training MLE loss is 0.0003202365969073301, train CRF loss is 0.0001841494292267809
Training:At training steps 36700, training MLE loss is 0.0005660416504842554, train CRF loss is 0.000334394457185625
Training:At training steps 36800, training MLE loss is 0.00047791387620135323, train CRF loss is 0.0002938281493715363
Training:At training steps 36900, training MLE loss is 0.000412010529142706, train CRF loss is 0.00026809153523981724
Training:At training steps 37000, training MLE loss is 0.0004400115734673943, train CRF loss is 0.00029728693774078163
Validation:At training steps 37000, training MLE loss is 0.0004400115734673943, train CRF loss is 0.00029728693774078163, validation MLE loss is 10.475453928897256, validation ppl is 35434.951, validation CRF loss is 10.644454290992336, validation BLEU is 61.47
Training:At training steps 37100, training MLE loss is 0.0002564120675499242, train CRF loss is 7.890545272571714e-05
Training:At training steps 37200, training MLE loss is 0.00034700474682673244, train CRF loss is 0.0001744734052685759
Training:At training steps 37300, training MLE loss is 0.0002792704583223041, train CRF loss is 0.0001334023774398115
Training:At training steps 37400, training MLE loss is 0.00026664697044105775, train CRF loss is 0.00013039709743654538
Training:At training steps 37500, training MLE loss is 0.00027018196315115, train CRF loss is 0.00013008480328863926
Validation:At training steps 37500, training MLE loss is 0.00027018196315115, train CRF loss is 0.00013008480328863926, validation MLE loss is 10.436691328098899, validation ppl is 34087.681, validation CRF loss is 10.593820873059725, validation BLEU is 60.91
Training:At training steps 37600, training MLE loss is 0.00046610050702748196, train CRF loss is 0.0002395525646397223
Training:At training steps 37700, training MLE loss is 0.00037808071560469013, train CRF loss is 0.00020786307528599223
Training:At training steps 37800, training MLE loss is 0.0003354244545598042, train CRF loss is 0.00019520557817120625
Training:At training steps 37900, training MLE loss is 0.00030825755816788923, train CRF loss is 0.00019238618173055165
Training:At training steps 38000, training MLE loss is 0.00029721190532471626, train CRF loss is 0.00018046758518377003
Validation:At training steps 38000, training MLE loss is 0.00029721190532471626, train CRF loss is 0.00018046758518377003, validation MLE loss is 10.466337994525308, validation ppl is 35113.396, validation CRF loss is 10.634784083617362, validation BLEU is 61.56
Training:At training steps 38100, training MLE loss is 4.107880875894386e-05, train CRF loss is 7.62385002362942e-05
Training:At training steps 38200, training MLE loss is 0.00013529233656286085, train CRF loss is 8.776119391726623e-05
Training:At training steps 38300, training MLE loss is 0.00018297991218177935, train CRF loss is 0.00012430561338416506
Training:At training steps 38400, training MLE loss is 0.0001550121932831603, train CRF loss is 0.00010504843900549288
Training:At training steps 38500, training MLE loss is 0.00017431867965484547, train CRF loss is 0.00011097325093365917
Validation:At training steps 38500, training MLE loss is 0.00017431867965484547, train CRF loss is 0.00011097325093365917, validation MLE loss is 10.474471161240025, validation ppl is 35400.144, validation CRF loss is 10.626767340459322, validation BLEU is 62.33
Training:At training steps 38600, training MLE loss is 0.00027576457968991414, train CRF loss is 9.575708831839335e-05
Training:At training steps 38700, training MLE loss is 0.0003051884739380745, train CRF loss is 0.00013856402404776702
Training:At training steps 38800, training MLE loss is 0.0002676623144180142, train CRF loss is 0.00010214313321335044
Training:At training steps 38900, training MLE loss is 0.00023456877374571455, train CRF loss is 8.364348769686725e-05
Training:At training steps 39000, training MLE loss is 0.0002007715116392144, train CRF loss is 8.697337575754816e-05
Validation:At training steps 39000, training MLE loss is 0.0002007715116392144, train CRF loss is 8.697337575754816e-05, validation MLE loss is 10.448917075207358, validation ppl is 34506.986, validation CRF loss is 10.592580023564791, validation BLEU is 61.89
Training:At training steps 39100, training MLE loss is 0.00017273576407736218, train CRF loss is 0.00011082062429894179
Training:At training steps 39200, training MLE loss is 0.00010595403907953133, train CRF loss is 8.258485301590346e-05
Training:At training steps 39300, training MLE loss is 9.603587248385327e-05, train CRF loss is 6.0796454941748714e-05
Training:At training steps 39400, training MLE loss is 8.695566031699215e-05, train CRF loss is 5.393619909866643e-05
Training:At training steps 39500, training MLE loss is 7.988362991510382e-05, train CRF loss is 4.3907779194825915e-05
Validation:At training steps 39500, training MLE loss is 7.988362991510382e-05, train CRF loss is 4.3907779194825915e-05, validation MLE loss is 10.498891516735679, validation ppl is 36275.27, validation CRF loss is 10.635401826155814, validation BLEU is 61.61
Training:At training steps 39600, training MLE loss is 0.0003583588991117656, train CRF loss is 0.00011591148407035678
Training:At training steps 39700, training MLE loss is 0.00027483402137992366, train CRF loss is 0.00010382642701362244
Training:At training steps 39800, training MLE loss is 0.0002330334807677098, train CRF loss is 8.725985465734739e-05
Training:At training steps 39900, training MLE loss is 0.00017818984687585686, train CRF loss is 6.5913905189795e-05
Training:At training steps 40000, training MLE loss is 0.00017715282742954104, train CRF loss is 7.53722471128313e-05
Validation:At training steps 40000, training MLE loss is 0.00017715282742954104, train CRF loss is 7.53722471128313e-05, validation MLE loss is 10.505817902715583, validation ppl is 36527.399, validation CRF loss is 10.669674603562607, validation BLEU is 61.7
Training:At training steps 100, training MLE loss is 2.2811869828402997, train CRF loss is 11.052510795593262
Training:At training steps 200, training MLE loss is 2.255762308463454, train CRF loss is 10.410282488167287
Training:At training steps 300, training MLE loss is 2.212592060615619, train CRF loss is 9.873975217938423
Training:At training steps 400, training MLE loss is 2.188086412567645, train CRF loss is 9.363668749406934
Training:At training steps 500, training MLE loss is 2.16455927285552, train CRF loss is 8.910168413698674
Validation:At training steps 500, training MLE loss is 2.16455927285552, train CRF loss is 8.910168413698674, validation MLE loss is 2.117206424474716, validation ppl is 8.308, validation CRF loss is 6.730422013684323, validation BLEU is 10.54
Training:At training steps 600, training MLE loss is 2.140844774246216, train CRF loss is 6.574071246981621
Training:At training steps 700, training MLE loss is 2.1258326763287188, train CRF loss is 6.418661434501409
Training:At training steps 800, training MLE loss is 2.1275195812185603, train CRF loss is 6.30054344817996
Training:At training steps 900, training MLE loss is 2.13024114029482, train CRF loss is 6.203887984864414
Training:At training steps 1000, training MLE loss is 2.126773683041334, train CRF loss is 6.113428187608719
Validation:At training steps 1000, training MLE loss is 2.126773683041334, train CRF loss is 6.113428187608719, validation MLE loss is 2.1131367448129152, validation ppl is 8.274, validation CRF loss is 5.72259164170215, validation BLEU is 17.11
Training:At training steps 1100, training MLE loss is 2.14180282831192, train CRF loss is 5.646971111148596
Training:At training steps 1200, training MLE loss is 2.137418267093599, train CRF loss is 5.588670820370316
Training:At training steps 1300, training MLE loss is 2.1248096603155138, train CRF loss is 5.526316619316737
Training:At training steps 1400, training MLE loss is 2.121144823115319, train CRF loss is 5.474404931776226
Training:At training steps 1500, training MLE loss is 2.121409834295511, train CRF loss is 5.423800172775984
Validation:At training steps 1500, training MLE loss is 2.121409834295511, train CRF loss is 5.423800172775984, validation MLE loss is 2.2897602084435915, validation ppl is 9.873, validation CRF loss is 5.685534263912, validation BLEU is 25.36
Training:At training steps 1600, training MLE loss is 2.17428931042552, train CRF loss is 5.180105352699757
Training:At training steps 1700, training MLE loss is 2.172066620290279, train CRF loss is 5.150814874470234
Training:At training steps 1800, training MLE loss is 2.177697018956145, train CRF loss is 5.100898307263851
Training:At training steps 1900, training MLE loss is 2.1857949395850302, train CRF loss is 5.0526478920131925
Training:At training steps 2000, training MLE loss is 2.1936051041334865, train CRF loss is 5.00207175296545
Validation:At training steps 2000, training MLE loss is 2.1936051041334865, train CRF loss is 5.00207175296545, validation MLE loss is 2.269936304343374, validation ppl is 9.679, validation CRF loss is 5.002905669965242, validation BLEU is 28.75
Training:At training steps 2100, training MLE loss is 2.221118821427226, train CRF loss is 4.724271337240935
Training:At training steps 2200, training MLE loss is 2.232934748530388, train CRF loss is 4.666415438428521
Training:At training steps 2300, training MLE loss is 2.2502081613987683, train CRF loss is 4.621775572101275
Training:At training steps 2400, training MLE loss is 2.2596865287981927, train CRF loss is 4.56377299990505
Training:At training steps 2500, training MLE loss is 2.274729610517621, train CRF loss is 4.504527596473694
Validation:At training steps 2500, training MLE loss is 2.274729610517621, train CRF loss is 4.504527596473694, validation MLE loss is 2.4099926854434766, validation ppl is 11.134, validation CRF loss is 4.484051898906105, validation BLEU is 29.6
Training:At training steps 2600, training MLE loss is 2.3340825448930262, train CRF loss is 4.220172623097897
Training:At training steps 2700, training MLE loss is 2.3205094004422424, train CRF loss is 4.164380846917629
Training:At training steps 2800, training MLE loss is 2.3437177654355765, train CRF loss is 4.116653722102443
Training:At training steps 2900, training MLE loss is 2.352616595812142, train CRF loss is 4.0727717224135995
Training:At training steps 3000, training MLE loss is 2.358433369085193, train CRF loss is 4.0269134711176156
Validation:At training steps 3000, training MLE loss is 2.358433369085193, train CRF loss is 4.0269134711176156, validation MLE loss is 2.5217071881419733, validation ppl is 12.45, validation CRF loss is 4.04869333380147, validation BLEU is 32.29
Training:At training steps 3100, training MLE loss is 2.3672131111472847, train CRF loss is 3.7194535579532384
Training:At training steps 3200, training MLE loss is 2.3671949372813104, train CRF loss is 3.6606295788288117
Training:At training steps 3300, training MLE loss is 2.373208217372497, train CRF loss is 3.6288332046071687
Training:At training steps 3400, training MLE loss is 2.374438435938209, train CRF loss is 3.5931807081773877
Training:At training steps 3500, training MLE loss is 2.372917861327529, train CRF loss is 3.539596389502287
Validation:At training steps 3500, training MLE loss is 2.372917861327529, train CRF loss is 3.539596389502287, validation MLE loss is 2.9422322856752494, validation ppl is 18.958, validation CRF loss is 3.84994029371362, validation BLEU is 32.76
Training:At training steps 3600, training MLE loss is 2.3928673453629017, train CRF loss is 3.2907084508985283
Training:At training steps 3700, training MLE loss is 2.362893597483635, train CRF loss is 3.2367957358807327
Training:At training steps 3800, training MLE loss is 2.35515991161267, train CRF loss is 3.1844313100725414
Training:At training steps 3900, training MLE loss is 2.345932520609349, train CRF loss is 3.1447568565793333
Training:At training steps 4000, training MLE loss is 2.340043738231063, train CRF loss is 3.0965292286723853
Validation:At training steps 4000, training MLE loss is 2.340043738231063, train CRF loss is 3.0965292286723853, validation MLE loss is 2.9864859486881055, validation ppl is 19.816, validation CRF loss is 3.4626409850622477, validation BLEU is 37.89
Training:At training steps 4100, training MLE loss is 2.290125606060028, train CRF loss is 2.838023669831455
Training:At training steps 4200, training MLE loss is 2.2514440654218197, train CRF loss is 2.795534501671791
Training:At training steps 4300, training MLE loss is 2.241100072885553, train CRF loss is 2.7590368450184664
Training:At training steps 4400, training MLE loss is 2.2177138091623783, train CRF loss is 2.724264712696895
Training:At training steps 4500, training MLE loss is 2.211236559778452, train CRF loss is 2.6797950944304465
Validation:At training steps 4500, training MLE loss is 2.211236559778452, train CRF loss is 2.6797950944304465, validation MLE loss is 2.8018128526838204, validation ppl is 16.474, validation CRF loss is 3.217300534248352, validation BLEU is 46.81
Training:At training steps 4600, training MLE loss is 2.1095707868784666, train CRF loss is 2.4829609475657346
Training:At training steps 4700, training MLE loss is 2.0821543791517616, train CRF loss is 2.431850243490189
Training:At training steps 4800, training MLE loss is 2.058922615920504, train CRF loss is 2.4104921078309416
Training:At training steps 4900, training MLE loss is 2.049734729770571, train CRF loss is 2.3815468026697637
Training:At training steps 5000, training MLE loss is 2.024888768680394, train CRF loss is 2.354393816612661
Validation:At training steps 5000, training MLE loss is 2.024888768680394, train CRF loss is 2.354393816612661, validation MLE loss is 2.8686975099538503, validation ppl is 17.614, validation CRF loss is 3.282036889540522, validation BLEU is 49.64
Training:At training steps 5100, training MLE loss is 1.9288366330787539, train CRF loss is 2.2074592044577
Training:At training steps 5200, training MLE loss is 1.9034514378011227, train CRF loss is 2.1776027478836477
Training:At training steps 5300, training MLE loss is 1.8679992469400168, train CRF loss is 2.1470287619593242
Training:At training steps 5400, training MLE loss is 1.8556466285977513, train CRF loss is 2.114688434484415
Training:At training steps 5500, training MLE loss is 1.825534968689084, train CRF loss is 2.0867674079313874
Validation:At training steps 5500, training MLE loss is 1.825534968689084, train CRF loss is 2.0867674079313874, validation MLE loss is 2.9034431294391028, validation ppl is 18.237, validation CRF loss is 3.138205520416561, validation BLEU is 51.58
Training:At training steps 5600, training MLE loss is 1.6919519149512052, train CRF loss is 1.9416148261912167
Training:At training steps 5700, training MLE loss is 1.6736714518815279, train CRF loss is 1.8996924241725355
Training:At training steps 5800, training MLE loss is 1.6573372503059607, train CRF loss is 1.8875379468065996
Training:At training steps 5900, training MLE loss is 1.6402305752364919, train CRF loss is 1.8548941772896796
Training:At training steps 6000, training MLE loss is 1.6237104040049017, train CRF loss is 1.830622211139649
Validation:At training steps 6000, training MLE loss is 1.6237104040049017, train CRF loss is 1.830622211139649, validation MLE loss is 3.0967876817050732, validation ppl is 22.127, validation CRF loss is 3.2580809420660923, validation BLEU is 52.43
Training:At training steps 6100, training MLE loss is 1.5248064261488616, train CRF loss is 1.6868244295194745
Training:At training steps 6200, training MLE loss is 1.50380803136155, train CRF loss is 1.670412194659002
Training:At training steps 6300, training MLE loss is 1.5028241553654273, train CRF loss is 1.6582558121625335
Training:At training steps 6400, training MLE loss is 1.4921988954814152, train CRF loss is 1.6343346319417469
Training:At training steps 6500, training MLE loss is 1.4679176716171205, train CRF loss is 1.6079994477778674
Validation:At training steps 6500, training MLE loss is 1.4679176716171205, train CRF loss is 1.6079994477778674, validation MLE loss is 3.079996993667201, validation ppl is 21.758, validation CRF loss is 3.249937344538538, validation BLEU is 53.05
Training:At training steps 6600, training MLE loss is 1.3547022628597916, train CRF loss is 1.411781009119004
Training:At training steps 6700, training MLE loss is 1.3558754680212588, train CRF loss is 1.4068054713774472
Training:At training steps 6800, training MLE loss is 1.3432379225889841, train CRF loss is 1.3849434855238845
Training:At training steps 6900, training MLE loss is 1.3258010378177278, train CRF loss is 1.3626220441411716
Training:At training steps 7000, training MLE loss is 1.3137775630224495, train CRF loss is 1.3388814178379254
Validation:At training steps 7000, training MLE loss is 1.3137775630224495, train CRF loss is 1.3388814178379254, validation MLE loss is 3.3773860209866573, validation ppl is 29.294, validation CRF loss is 3.337721956403632, validation BLEU is 55.48
Training:At training steps 7100, training MLE loss is 1.2313520059362053, train CRF loss is 1.2047402019612492
Training:At training steps 7200, training MLE loss is 1.1882557923905552, train CRF loss is 1.1897143433103339
Training:At training steps 7300, training MLE loss is 1.1761395513949295, train CRF loss is 1.1708725441945718
Training:At training steps 7400, training MLE loss is 1.169544610735029, train CRF loss is 1.1563723658735399
Training:At training steps 7500, training MLE loss is 1.1539238697513938, train CRF loss is 1.1324315425599925
Validation:At training steps 7500, training MLE loss is 1.1539238697513938, train CRF loss is 1.1324315425599925, validation MLE loss is 3.629344764508699, validation ppl is 37.688, validation CRF loss is 3.4491390491786755, validation BLEU is 55.48
Training:At training steps 7600, training MLE loss is 1.0777389688789845, train CRF loss is 1.0125737712625413
Training:At training steps 7700, training MLE loss is 1.0518088536872527, train CRF loss is 0.9821781302592717
Training:At training steps 7800, training MLE loss is 1.0298181820086514, train CRF loss is 0.9590522354259156
Training:At training steps 7900, training MLE loss is 1.0172471356438473, train CRF loss is 0.946991485969047
Training:At training steps 8000, training MLE loss is 0.9936261380128563, train CRF loss is 0.930291133416351
Validation:At training steps 8000, training MLE loss is 0.9936261380128563, train CRF loss is 0.930291133416351, validation MLE loss is 3.860939405466381, validation ppl is 47.51, validation CRF loss is 3.6086476743221283, validation BLEU is 56.14
Training:At training steps 8100, training MLE loss is 0.8963493474945426, train CRF loss is 0.8058710320526734
Training:At training steps 8200, training MLE loss is 0.8715645140688867, train CRF loss is 0.8041881971030671
Training:At training steps 8300, training MLE loss is 0.8604826360568404, train CRF loss is 0.798425519313023
Training:At training steps 8400, training MLE loss is 0.8521306762821041, train CRF loss is 0.7882240468500095
Training:At training steps 8500, training MLE loss is 0.8357044133162126, train CRF loss is 0.7731249325173558
Validation:At training steps 8500, training MLE loss is 0.8357044133162126, train CRF loss is 0.7731249325173558, validation MLE loss is 3.801952616164559, validation ppl is 44.789, validation CRF loss is 3.884310449424543, validation BLEU is 56.63
Training:At training steps 8600, training MLE loss is 0.7558777328813449, train CRF loss is 0.725309450016357
Training:At training steps 8700, training MLE loss is 0.7628006850951351, train CRF loss is 0.6964685618388466
Training:At training steps 8800, training MLE loss is 0.7449892454772877, train CRF loss is 0.6780233269153784
Training:At training steps 8900, training MLE loss is 0.7464025482564466, train CRF loss is 0.6600423384364694
Training:At training steps 9000, training MLE loss is 0.7413312384844758, train CRF loss is 0.6411626100423746
Validation:At training steps 9000, training MLE loss is 0.7413312384844758, train CRF loss is 0.6411626100423746, validation MLE loss is 4.232669502496719, validation ppl is 68.901, validation CRF loss is 3.9245673308247015, validation BLEU is 55.88
Training:At training steps 9100, training MLE loss is 0.6618283490836621, train CRF loss is 0.5535442643193528
Training:At training steps 9200, training MLE loss is 0.6597009405540303, train CRF loss is 0.5490144532773411
Training:At training steps 9300, training MLE loss is 0.6456825284147635, train CRF loss is 0.5330001536265869
Training:At training steps 9400, training MLE loss is 0.6363978875288739, train CRF loss is 0.5211823342645948
Training:At training steps 9500, training MLE loss is 0.630161542817019, train CRF loss is 0.5099258829173632
Validation:At training steps 9500, training MLE loss is 0.630161542817019, train CRF loss is 0.5099258829173632, validation MLE loss is 4.423436691886501, validation ppl is 83.382, validation CRF loss is 4.04938826121782, validation BLEU is 57.22
Training:At training steps 9600, training MLE loss is 0.6336405044328421, train CRF loss is 0.46609326694277115
Training:At training steps 9700, training MLE loss is 0.599951407344779, train CRF loss is 0.46085344582621474
Training:At training steps 9800, training MLE loss is 0.5851533394749276, train CRF loss is 0.44976795418876764
Training:At training steps 9900, training MLE loss is 0.567820717553841, train CRF loss is 0.4327547168326419
Training:At training steps 10000, training MLE loss is 0.5585428241614718, train CRF loss is 0.4206836387895164
Validation:At training steps 10000, training MLE loss is 0.5585428241614718, train CRF loss is 0.4206836387895164, validation MLE loss is 4.5354688261684615, validation ppl is 93.267, validation CRF loss is 4.206281133388218, validation BLEU is 58.26
Training:At training steps 10100, training MLE loss is 0.48836598204332404, train CRF loss is 0.37328967611596453
Training:At training steps 10200, training MLE loss is 0.49669513994798764, train CRF loss is 0.35814472166195627
Training:At training steps 10300, training MLE loss is 0.49046305180992933, train CRF loss is 0.3544332517061654
Training:At training steps 10400, training MLE loss is 0.4790708733079373, train CRF loss is 0.3473529315485939
Training:At training steps 10500, training MLE loss is 0.46923753186664546, train CRF loss is 0.34276967538782627
Validation:At training steps 10500, training MLE loss is 0.46923753186664546, train CRF loss is 0.34276967538782627, validation MLE loss is 4.647866409075887, validation ppl is 104.362, validation CRF loss is 4.202584978781249, validation BLEU is 58.52
Training:At training steps 10600, training MLE loss is 0.4269804530404508, train CRF loss is 0.2955153701227391
Training:At training steps 10700, training MLE loss is 0.4149965055700159, train CRF loss is 0.28982853480825727
Training:At training steps 10800, training MLE loss is 0.4045857131498633, train CRF loss is 0.2832144464737697
Training:At training steps 10900, training MLE loss is 0.40497688138493687, train CRF loss is 0.2774399580523641
Training:At training steps 11000, training MLE loss is 0.3980543533951277, train CRF loss is 0.26957699126869555
Validation:At training steps 11000, training MLE loss is 0.3980543533951277, train CRF loss is 0.26957699126869555, validation MLE loss is 4.8565697481757715, validation ppl is 128.582, validation CRF loss is 4.602603118670614, validation BLEU is 59.86
Training:At training steps 11100, training MLE loss is 0.3602533950645011, train CRF loss is 0.2562421132122108
Training:At training steps 11200, training MLE loss is 0.3550568700573058, train CRF loss is 0.23974403999658533
Training:At training steps 11300, training MLE loss is 0.350204929857379, train CRF loss is 0.2290983031821088
Training:At training steps 11400, training MLE loss is 0.3474678182957723, train CRF loss is 0.22362135005010714
Training:At training steps 11500, training MLE loss is 0.34113693255290856, train CRF loss is 0.2200235371870367
Validation:At training steps 11500, training MLE loss is 0.34113693255290856, train CRF loss is 0.2200235371870367, validation MLE loss is 4.932647639199307, validation ppl is 138.746, validation CRF loss is 4.811819038893047, validation BLEU is 59.24
Training:At training steps 11600, training MLE loss is 0.2998198435993982, train CRF loss is 0.2008126459119376
Training:At training steps 11700, training MLE loss is 0.294501487720936, train CRF loss is 0.19990219471412274
Training:At training steps 11800, training MLE loss is 0.2917395775264231, train CRF loss is 0.1986499193487786
Training:At training steps 11900, training MLE loss is 0.2935360346685502, train CRF loss is 0.19319644363205954
Training:At training steps 12000, training MLE loss is 0.2913606093061171, train CRF loss is 0.1917793658425653
Validation:At training steps 12000, training MLE loss is 0.2913606093061171, train CRF loss is 0.1917793658425653, validation MLE loss is 4.792036432968943, validation ppl is 120.547, validation CRF loss is 4.710033018338053, validation BLEU is 58.13
Training:At training steps 12100, training MLE loss is 0.2585144041152671, train CRF loss is 0.16191432874569728
Training:At training steps 12200, training MLE loss is 0.24759942043587216, train CRF loss is 0.16558446447746974
Training:At training steps 12300, training MLE loss is 0.250105771200809, train CRF loss is 0.164787760952507
Training:At training steps 12400, training MLE loss is 0.2509953876025975, train CRF loss is 0.16426988716862298
Training:At training steps 12500, training MLE loss is 0.2491815638476546, train CRF loss is 0.15947725073059155
Validation:At training steps 12500, training MLE loss is 0.2491815638476546, train CRF loss is 0.15947725073059155, validation MLE loss is 5.263330174119849, validation ppl is 193.124, validation CRF loss is 4.806929133440319, validation BLEU is 59.84
Training:At training steps 12600, training MLE loss is 0.24010614263359456, train CRF loss is 0.1384160237386095
Training:At training steps 12700, training MLE loss is 0.23761439642563345, train CRF loss is 0.14474008694542134
Training:At training steps 12800, training MLE loss is 0.2324783375064726, train CRF loss is 0.14610855065340123
Training:At training steps 12900, training MLE loss is 0.23359585191632504, train CRF loss is 0.14132644146000076
Training:At training steps 13000, training MLE loss is 0.2295194005923113, train CRF loss is 0.13909213019669187
Validation:At training steps 13000, training MLE loss is 0.2295194005923113, train CRF loss is 0.13909213019669187, validation MLE loss is 5.228508328136645, validation ppl is 186.514, validation CRF loss is 5.007791613277636, validation BLEU is 59.77
Training:At training steps 13100, training MLE loss is 0.18718456052723922, train CRF loss is 0.10957019461788149
Training:At training steps 13200, training MLE loss is 0.19174567929534533, train CRF loss is 0.11469703308668613
Training:At training steps 13300, training MLE loss is 0.1929784304558901, train CRF loss is 0.11523174274049476
Training:At training steps 13400, training MLE loss is 0.19327225416516738, train CRF loss is 0.11119195395875067
Training:At training steps 13500, training MLE loss is 0.1904881099286431, train CRF loss is 0.10978862972045317
Validation:At training steps 13500, training MLE loss is 0.1904881099286431, train CRF loss is 0.10978862972045317, validation MLE loss is 5.319792985916138, validation ppl is 204.342, validation CRF loss is 5.282060055356276, validation BLEU is 58.13
Training:At training steps 13600, training MLE loss is 0.19638578899310233, train CRF loss is 0.11198088558167
Training:At training steps 13700, training MLE loss is 0.18440076506625702, train CRF loss is 0.10627063387905764
Training:At training steps 13800, training MLE loss is 0.17836110066607944, train CRF loss is 0.10678070669229177
Training:At training steps 13900, training MLE loss is 0.17696019729643012, train CRF loss is 0.10265582479192745
Training:At training steps 14000, training MLE loss is 0.1751572461729811, train CRF loss is 0.10093850105096135
Validation:At training steps 14000, training MLE loss is 0.1751572461729811, train CRF loss is 0.10093850105096135, validation MLE loss is 5.305698441831689, validation ppl is 201.482, validation CRF loss is 5.25853944765894, validation BLEU is 59.44
Training:At training steps 14100, training MLE loss is 0.15141079688284662, train CRF loss is 0.08916159163563861
Training:At training steps 14200, training MLE loss is 0.15520621258790926, train CRF loss is 0.08921928059808124
Training:At training steps 14300, training MLE loss is 0.15797823245782638, train CRF loss is 0.09362032680255652
Training:At training steps 14400, training MLE loss is 0.15273923548782478, train CRF loss is 0.09198110856987568
Training:At training steps 14500, training MLE loss is 0.15031921767504536, train CRF loss is 0.09134074929438066
Validation:At training steps 14500, training MLE loss is 0.15031921767504536, train CRF loss is 0.09134074929438066, validation MLE loss is 5.548354459436316, validation ppl is 256.815, validation CRF loss is 5.363412565306613, validation BLEU is 60.26
Training:At training steps 14600, training MLE loss is 0.1378579749859455, train CRF loss is 0.07417594364573234
Training:At training steps 14700, training MLE loss is 0.13436570693197608, train CRF loss is 0.07453937615099904
Training:At training steps 14800, training MLE loss is 0.13526799299644457, train CRF loss is 0.07731199622492416
Training:At training steps 14900, training MLE loss is 0.13383088278211744, train CRF loss is 0.07790850646176864
Training:At training steps 15000, training MLE loss is 0.1316655975474314, train CRF loss is 0.0762639445425093
Validation:At training steps 15000, training MLE loss is 0.1316655975474314, train CRF loss is 0.0762639445425093, validation MLE loss is 5.658380392350648, validation ppl is 286.684, validation CRF loss is 5.401213297718449, validation BLEU is 60.56
Training:At training steps 15100, training MLE loss is 0.13526401138951769, train CRF loss is 0.0766662037395372
Training:At training steps 15200, training MLE loss is 0.1266317418520066, train CRF loss is 0.07197826116277611
Training:At training steps 15300, training MLE loss is 0.12143856592532756, train CRF loss is 0.07013800872276078
Training:At training steps 15400, training MLE loss is 0.11906006465910196, train CRF loss is 0.06997900382135412
Training:At training steps 15500, training MLE loss is 0.11646938196234259, train CRF loss is 0.06765468779336334
Validation:At training steps 15500, training MLE loss is 0.11646938196234259, train CRF loss is 0.06765468779336334, validation MLE loss is 5.77137700821224, validation ppl is 320.979, validation CRF loss is 5.542571092906751, validation BLEU is 60.31
Training:At training steps 15600, training MLE loss is 0.10330368589598947, train CRF loss is 0.0648820839858081
Training:At training steps 15700, training MLE loss is 0.10171667316042658, train CRF loss is 0.06044404475698002
Training:At training steps 15800, training MLE loss is 0.09808025431027696, train CRF loss is 0.05974487106431601
Training:At training steps 15900, training MLE loss is 0.09786521030469658, train CRF loss is 0.05954914711185665
Training:At training steps 16000, training MLE loss is 0.09669529003268144, train CRF loss is 0.057994166622431294
Validation:At training steps 16000, training MLE loss is 0.09669529003268144, train CRF loss is 0.057994166622431294, validation MLE loss is 6.210713424180684, validation ppl is 498.056, validation CRF loss is 5.682585753892598, validation BLEU is 58.8
Training:At training steps 16100, training MLE loss is 0.09422439041414692, train CRF loss is 0.05304858101871332
Training:At training steps 16200, training MLE loss is 0.09019525674958913, train CRF loss is 0.05150930365611089
Training:At training steps 16300, training MLE loss is 0.09045077007327261, train CRF loss is 0.052972808141828406
Training:At training steps 16400, training MLE loss is 0.0915275577648731, train CRF loss is 0.05352369403773821
Training:At training steps 16500, training MLE loss is 0.09207297397612274, train CRF loss is 0.052992570685206374
Validation:At training steps 16500, training MLE loss is 0.09207297397612274, train CRF loss is 0.052992570685206374, validation MLE loss is 5.873778703965638, validation ppl is 355.59, validation CRF loss is 5.690863113654287, validation BLEU is 59.98
Training:At training steps 16600, training MLE loss is 0.08343264362680201, train CRF loss is 0.04587626010209306
Training:At training steps 16700, training MLE loss is 0.08522337932850405, train CRF loss is 0.04691305059831052
Training:At training steps 16800, training MLE loss is 0.08202326249440375, train CRF loss is 0.04566609790548948
Training:At training steps 16900, training MLE loss is 0.08059203577801555, train CRF loss is 0.04510234564525348
Training:At training steps 17000, training MLE loss is 0.07904138555564531, train CRF loss is 0.0451074054581602
Validation:At training steps 17000, training MLE loss is 0.07904138555564531, train CRF loss is 0.0451074054581602, validation MLE loss is 6.14724892691562, validation ppl is 467.43, validation CRF loss is 6.126191873299448, validation BLEU is 59.19
Training:At training steps 17100, training MLE loss is 0.07017753975966116, train CRF loss is 0.040789195302952524
Training:At training steps 17200, training MLE loss is 0.07507420244157856, train CRF loss is 0.04622530179462956
Training:At training steps 17300, training MLE loss is 0.07180183301516081, train CRF loss is 0.04486849035568146
Training:At training steps 17400, training MLE loss is 0.06972475770909682, train CRF loss is 0.04416452746708131
Training:At training steps 17500, training MLE loss is 0.06734671839510065, train CRF loss is 0.04239770289847331
Validation:At training steps 17500, training MLE loss is 0.06734671839510065, train CRF loss is 0.04239770289847331, validation MLE loss is 6.0744720038614775, validation ppl is 434.62, validation CRF loss is 5.87323199134124, validation BLEU is 59.7
Training:At training steps 17600, training MLE loss is 0.06532064158323238, train CRF loss is 0.036492121594619675
Training:At training steps 17700, training MLE loss is 0.05946300553010701, train CRF loss is 0.03400781054884647
Training:At training steps 17800, training MLE loss is 0.05868224840117288, train CRF loss is 0.03318418687019441
Training:At training steps 17900, training MLE loss is 0.05768524668841508, train CRF loss is 0.033130023793323034
Training:At training steps 18000, training MLE loss is 0.056846171692136065, train CRF loss is 0.0325451339367844
Validation:At training steps 18000, training MLE loss is 0.056846171692136065, train CRF loss is 0.0325451339367844, validation MLE loss is 6.328240880840703, validation ppl is 560.17, validation CRF loss is 6.176153267684736, validation BLEU is 59.78
Training:At training steps 18100, training MLE loss is 0.056898363786835944, train CRF loss is 0.034449511547484234
Training:At training steps 18200, training MLE loss is 0.05756940596865093, train CRF loss is 0.0323313803549695
Training:At training steps 18300, training MLE loss is 0.05619930692499641, train CRF loss is 0.032033675248338035
Training:At training steps 18400, training MLE loss is 0.05596199887797013, train CRF loss is 0.032741378328207275
Training:At training steps 18500, training MLE loss is 0.054450411442550486, train CRF loss is 0.0312743550366327
Validation:At training steps 18500, training MLE loss is 0.054450411442550486, train CRF loss is 0.0312743550366327, validation MLE loss is 6.473591258651332, validation ppl is 647.806, validation CRF loss is 6.16940595915443, validation BLEU is 59.47
Training:At training steps 18600, training MLE loss is 0.055271960822534535, train CRF loss is 0.02691880936198686
Training:At training steps 18700, training MLE loss is 0.04961792570568548, train CRF loss is 0.0274380653111794
Training:At training steps 18800, training MLE loss is 0.04766116454037425, train CRF loss is 0.027338406819301136
Training:At training steps 18900, training MLE loss is 0.046721164390964416, train CRF loss is 0.02581818064269001
Training:At training steps 19000, training MLE loss is 0.046570689210667524, train CRF loss is 0.025904495880074478
Validation:At training steps 19000, training MLE loss is 0.046570689210667524, train CRF loss is 0.025904495880074478, validation MLE loss is 6.511913098787007, validation ppl is 673.113, validation CRF loss is 6.3976614161541585, validation BLEU is 59.17
Training:At training steps 19100, training MLE loss is 0.04300578902012376, train CRF loss is 0.023087793877416517
Training:At training steps 19200, training MLE loss is 0.04242261441807415, train CRF loss is 0.023387366648487243
Training:At training steps 19300, training MLE loss is 0.04130474360132837, train CRF loss is 0.023480234236508673
Training:At training steps 19400, training MLE loss is 0.042220038957588885, train CRF loss is 0.024507686617387277
Training:At training steps 19500, training MLE loss is 0.042377946975678926, train CRF loss is 0.024197330786378927
Validation:At training steps 19500, training MLE loss is 0.042377946975678926, train CRF loss is 0.024197330786378927, validation MLE loss is 6.685835587350946, validation ppl is 800.98, validation CRF loss is 6.609461511436262, validation BLEU is 59.19
Training:At training steps 19600, training MLE loss is 0.038817731903933464, train CRF loss is 0.0206898441902311
Training:At training steps 19700, training MLE loss is 0.03769768099455803, train CRF loss is 0.020446177033127524
Training:At training steps 19800, training MLE loss is 0.03666693733038452, train CRF loss is 0.02015159895242789
Training:At training steps 19900, training MLE loss is 0.034997903696919436, train CRF loss is 0.018874185804411282
Training:At training steps 20000, training MLE loss is 0.03394072824058881, train CRF loss is 0.018508796327002586
Validation:At training steps 20000, training MLE loss is 0.03394072824058881, train CRF loss is 0.018508796327002586, validation MLE loss is 6.924228680761237, validation ppl is 1016.61, validation CRF loss is 6.761939102097561, validation BLEU is 58.68
Training:At training steps 20100, training MLE loss is 0.034179719742722055, train CRF loss is 0.018242859078869175
Training:At training steps 20200, training MLE loss is 0.03076136422019715, train CRF loss is 0.016617623979319093
Training:At training steps 20300, training MLE loss is 0.029725008997006815, train CRF loss is 0.015172143568301901
Training:At training steps 20400, training MLE loss is 0.028330120907487776, train CRF loss is 0.014714200771544625
Training:At training steps 20500, training MLE loss is 0.02761320958571784, train CRF loss is 0.014719666100349997
Validation:At training steps 20500, training MLE loss is 0.02761320958571784, train CRF loss is 0.014719666100349997, validation MLE loss is 7.033572008735256, validation ppl is 1134.074, validation CRF loss is 6.933215552254727, validation BLEU is 59.34
Training:At training steps 20600, training MLE loss is 0.027753565851253655, train CRF loss is 0.016199706540597703
Training:At training steps 20700, training MLE loss is 0.02720805265073366, train CRF loss is 0.015625557934019697
Training:At training steps 20800, training MLE loss is 0.025578110628752027, train CRF loss is 0.014675980123859253
Training:At training steps 20900, training MLE loss is 0.025236001044915232, train CRF loss is 0.014454357534294394
Training:At training steps 21000, training MLE loss is 0.025485616164200864, train CRF loss is 0.014250949477569094
Validation:At training steps 21000, training MLE loss is 0.025485616164200864, train CRF loss is 0.014250949477569094, validation MLE loss is 7.268829408444856, validation ppl is 1434.87, validation CRF loss is 7.058154987661462, validation BLEU is 59.64
Training:At training steps 21100, training MLE loss is 0.024916738513505125, train CRF loss is 0.01374920249219457
Training:At training steps 21200, training MLE loss is 0.024497987271157023, train CRF loss is 0.013124405680012625
Training:At training steps 21300, training MLE loss is 0.023410908691491105, train CRF loss is 0.013100469584839645
Training:At training steps 21400, training MLE loss is 0.02246930603672629, train CRF loss is 0.012323801847347165
Training:At training steps 21500, training MLE loss is 0.021772877473548845, train CRF loss is 0.012306333736046646
Validation:At training steps 21500, training MLE loss is 0.021772877473548845, train CRF loss is 0.012306333736046646, validation MLE loss is 7.15752133883928, validation ppl is 1283.725, validation CRF loss is 7.088707189810903, validation BLEU is 58.85
Training:At training steps 21600, training MLE loss is 0.019091614530832645, train CRF loss is 0.009966123132877697
Training:At training steps 21700, training MLE loss is 0.019054520693142684, train CRF loss is 0.010632636492302864
Training:At training steps 21800, training MLE loss is 0.019121732207420406, train CRF loss is 0.01092351941961109
Training:At training steps 21900, training MLE loss is 0.019243717009217908, train CRF loss is 0.01102716832470791
Training:At training steps 22000, training MLE loss is 0.018271890739608953, train CRF loss is 0.010748618210364207
Validation:At training steps 22000, training MLE loss is 0.018271890739608953, train CRF loss is 0.010748618210364207, validation MLE loss is 7.249385030646073, validation ppl is 1407.239, validation CRF loss is 7.176942841002815, validation BLEU is 59.26
Training:At training steps 22100, training MLE loss is 0.014687963393810968, train CRF loss is 0.007902336400924614
Training:At training steps 22200, training MLE loss is 0.013858927675807528, train CRF loss is 0.008074244485027466
Training:At training steps 22300, training MLE loss is 0.01293655355639359, train CRF loss is 0.007249335395994026
Training:At training steps 22400, training MLE loss is 0.012610156691575804, train CRF loss is 0.007181645659162619
Training:At training steps 22500, training MLE loss is 0.01290042036865146, train CRF loss is 0.0074138508489795425
Validation:At training steps 22500, training MLE loss is 0.01290042036865146, train CRF loss is 0.0074138508489795425, validation MLE loss is 7.475815427930732, validation ppl is 1764.84, validation CRF loss is 7.307647099620418, validation BLEU is 60.18
Training:At training steps 22600, training MLE loss is 0.012220510013753447, train CRF loss is 0.0068758208873778235
Training:At training steps 22700, training MLE loss is 0.011361494749692759, train CRF loss is 0.006258202239471178
Training:At training steps 22800, training MLE loss is 0.01100386449098533, train CRF loss is 0.0060542464382633936
Training:At training steps 22900, training MLE loss is 0.010435125187198704, train CRF loss is 0.005975745639804144
Training:At training steps 23000, training MLE loss is 0.010309619750525521, train CRF loss is 0.005815412769641997
Validation:At training steps 23000, training MLE loss is 0.010309619750525521, train CRF loss is 0.005815412769641997, validation MLE loss is 7.542449436689678, validation ppl is 1886.445, validation CRF loss is 7.40793672360872, validation BLEU is 59.16
Training:At training steps 23100, training MLE loss is 0.009423308456468238, train CRF loss is 0.005032998957953723
Training:At training steps 23200, training MLE loss is 0.008279962627477024, train CRF loss is 0.004651888304401117
Training:At training steps 23300, training MLE loss is 0.007536709351483714, train CRF loss is 0.00422790987926692
Training:At training steps 23400, training MLE loss is 0.007503763066610998, train CRF loss is 0.004299489897667529
Training:At training steps 23500, training MLE loss is 0.00727002786951876, train CRF loss is 0.004258459913794949
Validation:At training steps 23500, training MLE loss is 0.00727002786951876, train CRF loss is 0.004258459913794949, validation MLE loss is 7.821245074272156, validation ppl is 2493.007, validation CRF loss is 7.688841957794993, validation BLEU is 60.2
Training:At training steps 23600, training MLE loss is 0.0072441359691425514, train CRF loss is 0.004754380027230467
Training:At training steps 23700, training MLE loss is 0.006804231651355197, train CRF loss is 0.003976369439843916
Training:At training steps 23800, training MLE loss is 0.006644084288928944, train CRF loss is 0.0038367311476777285
Training:At training steps 23900, training MLE loss is 0.006622902831516013, train CRF loss is 0.003760132558590165
Training:At training steps 24000, training MLE loss is 0.0062835748298318206, train CRF loss is 0.003715791649625193
Validation:At training steps 24000, training MLE loss is 0.0062835748298318206, train CRF loss is 0.003715791649625193, validation MLE loss is 7.895793274829262, validation ppl is 2685.959, validation CRF loss is 7.727125927021629, validation BLEU is 60.41
Training:At training steps 24100, training MLE loss is 0.005967418276311847, train CRF loss is 0.0031636869515859136
Training:At training steps 24200, training MLE loss is 0.005853381759711542, train CRF loss is 0.0033267623241122536
Training:At training steps 24300, training MLE loss is 0.0054662131483700065, train CRF loss is 0.0032255588317048898
Training:At training steps 24400, training MLE loss is 0.005135640199255105, train CRF loss is 0.002962783905714712
Training:At training steps 24500, training MLE loss is 0.005204888826888731, train CRF loss is 0.003009195495609677
Validation:At training steps 24500, training MLE loss is 0.005204888826888731, train CRF loss is 0.003009195495609677, validation MLE loss is 7.949364373558446, validation ppl is 2833.773, validation CRF loss is 7.900111185876947, validation BLEU is 58.89
Training:At training steps 24600, training MLE loss is 0.004182531828997902, train CRF loss is 0.0029700393726977393
Training:At training steps 24700, training MLE loss is 0.0046545039524923315, train CRF loss is 0.0027926614458526912
Training:At training steps 24800, training MLE loss is 0.004298811033019496, train CRF loss is 0.0027715568336695315
Training:At training steps 24900, training MLE loss is 0.003917809342358749, train CRF loss is 0.0024386813674704844
Training:At training steps 25000, training MLE loss is 0.003980168276191658, train CRF loss is 0.002355744393744174
Validation:At training steps 25000, training MLE loss is 0.003980168276191658, train CRF loss is 0.002355744393744174, validation MLE loss is 8.097884033855639, validation ppl is 3287.504, validation CRF loss is 7.990614520876031, validation BLEU is 58.83
Training:At training steps 25100, training MLE loss is 0.004675058121034059, train CRF loss is 0.0029763886162993457
Training:At training steps 25200, training MLE loss is 0.0037784124582712964, train CRF loss is 0.002618874872275363
Training:At training steps 25300, training MLE loss is 0.0035689860196306297, train CRF loss is 0.0024122887793052755
Training:At training steps 25400, training MLE loss is 0.003533787969594784, train CRF loss is 0.0024646436046418586
Training:At training steps 25500, training MLE loss is 0.003405958215272187, train CRF loss is 0.002256317597909084
Validation:At training steps 25500, training MLE loss is 0.003405958215272187, train CRF loss is 0.002256317597909084, validation MLE loss is 8.314539238026267, validation ppl is 4082.804, validation CRF loss is 8.093400760700828, validation BLEU is 59.57
Training:At training steps 25600, training MLE loss is 0.0031286512856040583, train CRF loss is 0.001964592562629486
Training:At training steps 25700, training MLE loss is 0.0033908983375960013, train CRF loss is 0.0020247611133854338
Training:At training steps 25800, training MLE loss is 0.0032873012227462445, train CRF loss is 0.0018712382102583133
Training:At training steps 25900, training MLE loss is 0.003093645832498481, train CRF loss is 0.0017085911335734194
Training:At training steps 26000, training MLE loss is 0.0030412543725846885, train CRF loss is 0.0016765684753425854
Validation:At training steps 26000, training MLE loss is 0.0030412543725846885, train CRF loss is 0.0016765684753425854, validation MLE loss is 8.262835383415222, validation ppl is 3877.072, validation CRF loss is 8.155806961812472, validation BLEU is 60.48
Training:At training steps 26100, training MLE loss is 0.0021395775896999015, train CRF loss is 0.0014979143334066115
Training:At training steps 26200, training MLE loss is 0.002262836381053034, train CRF loss is 0.001310525250652801
Training:At training steps 26300, training MLE loss is 0.0023861020890910394, train CRF loss is 0.0014086391991448308
Training:At training steps 26400, training MLE loss is 0.0023734481333168413, train CRF loss is 0.001558301159644636
Training:At training steps 26500, training MLE loss is 0.0023700270943768256, train CRF loss is 0.0015441141547740882
Validation:At training steps 26500, training MLE loss is 0.0023700270943768256, train CRF loss is 0.0015441141547740882, validation MLE loss is 8.395449964623703, validation ppl is 4426.878, validation CRF loss is 8.339540525486594, validation BLEU is 59.59
Training:At training steps 26600, training MLE loss is 0.0022114759921434946, train CRF loss is 0.001086555000487177
Training:At training steps 26700, training MLE loss is 0.0020203783926931052, train CRF loss is 0.001185757087835786
Training:At training steps 26800, training MLE loss is 0.0020258460239793853, train CRF loss is 0.0011920148134511343
Training:At training steps 26900, training MLE loss is 0.0019596842587629205, train CRF loss is 0.0011215530123395402
Training:At training steps 27000, training MLE loss is 0.0017816255147413466, train CRF loss is 0.0010490569755412977
Validation:At training steps 27000, training MLE loss is 0.0017816255147413466, train CRF loss is 0.0010490569755412977, validation MLE loss is 8.463143731418409, validation ppl is 4736.926, validation CRF loss is 8.284451948968988, validation BLEU is 59.42
Training:At training steps 27100, training MLE loss is 0.0018995248739794679, train CRF loss is 0.0010426832346591387
Training:At training steps 27200, training MLE loss is 0.0018164464626635795, train CRF loss is 0.0009604022647927457
Training:At training steps 27300, training MLE loss is 0.0019590005962557213, train CRF loss is 0.0010996540814910342
Training:At training steps 27400, training MLE loss is 0.0018182856009477306, train CRF loss is 0.0010002197591321416
Training:At training steps 27500, training MLE loss is 0.0018007155359422609, train CRF loss is 0.0010228488134730008
Validation:At training steps 27500, training MLE loss is 0.0018007155359422609, train CRF loss is 0.0010228488134730008, validation MLE loss is 8.477328181266785, validation ppl is 4804.596, validation CRF loss is 8.380302171958121, validation BLEU is 59.23
Training:At training steps 27600, training MLE loss is 0.0013562827784154884, train CRF loss is 0.0006371036545362774
Training:At training steps 27700, training MLE loss is 0.0017814121133099942, train CRF loss is 0.0011542366278810446
Training:At training steps 27800, training MLE loss is 0.001779409699964396, train CRF loss is 0.0011622062066021558
Training:At training steps 27900, training MLE loss is 0.0016273724789891584, train CRF loss is 0.0011243595611049916
Training:At training steps 28000, training MLE loss is 0.00168813160749068, train CRF loss is 0.0010966646002361538
Validation:At training steps 28000, training MLE loss is 0.00168813160749068, train CRF loss is 0.0010966646002361538, validation MLE loss is 8.553827605749431, validation ppl is 5186.569, validation CRF loss is 8.401891250359384, validation BLEU is 60.19
Training:At training steps 28100, training MLE loss is 0.0012945550419425512, train CRF loss is 0.000870232578971466
Training:At training steps 28200, training MLE loss is 0.0014717396849770602, train CRF loss is 0.0008784035750327268
Training:At training steps 28300, training MLE loss is 0.0013146445655037896, train CRF loss is 0.0007264187245200156
Training:At training steps 28400, training MLE loss is 0.001259447312326147, train CRF loss is 0.0007912448736534694
Training:At training steps 28500, training MLE loss is 0.0012849154021447478, train CRF loss is 0.0008437511857845727
Validation:At training steps 28500, training MLE loss is 0.0012849154021447478, train CRF loss is 0.0008437511857845727, validation MLE loss is 8.662770045431037, validation ppl is 5783.533, validation CRF loss is 8.627392706118131, validation BLEU is 59.74
Training:At training steps 28600, training MLE loss is 0.0018953581333481207, train CRF loss is 0.0012409654362767154
Training:At training steps 28700, training MLE loss is 0.0016083421545823456, train CRF loss is 0.0011548982247288552
Training:At training steps 28800, training MLE loss is 0.0014660304510581911, train CRF loss is 0.0011673966528910299
Training:At training steps 28900, training MLE loss is 0.0013992761734401126, train CRF loss is 0.0010861483229429713
Training:At training steps 29000, training MLE loss is 0.001432650916442278, train CRF loss is 0.0010510473843788688
Validation:At training steps 29000, training MLE loss is 0.001432650916442278, train CRF loss is 0.0010510473843788688, validation MLE loss is 8.840582866417733, validation ppl is 6909.019, validation CRF loss is 8.63013928187521, validation BLEU is 60.22
Training:At training steps 29100, training MLE loss is 0.0009220715289494186, train CRF loss is 0.0006711233031484376
Training:At training steps 29200, training MLE loss is 0.0008567686133793684, train CRF loss is 0.0005885428283515059
Training:At training steps 29300, training MLE loss is 0.0009091414823798575, train CRF loss is 0.0006562878172575074
Training:At training steps 29400, training MLE loss is 0.0009310170664010298, train CRF loss is 0.0006184217136195391
Training:At training steps 29500, training MLE loss is 0.0009046576888855071, train CRF loss is 0.000585896157975327
Validation:At training steps 29500, training MLE loss is 0.0009046576888855071, train CRF loss is 0.000585896157975327, validation MLE loss is 8.744511698421679, validation ppl is 6276.148, validation CRF loss is 8.706991201952883, validation BLEU is 60.29
Training:At training steps 29600, training MLE loss is 0.0011371050433373363, train CRF loss is 0.0007281262238895581
Training:At training steps 29700, training MLE loss is 0.001080318215091231, train CRF loss is 0.0007822654111099503
Training:At training steps 29800, training MLE loss is 0.001200096898418644, train CRF loss is 0.0007339235714778968
Training:At training steps 29900, training MLE loss is 0.0010814777367378042, train CRF loss is 0.0006636762417253994
Training:At training steps 30000, training MLE loss is 0.0011514351168421674, train CRF loss is 0.0007268207148861388
Validation:At training steps 30000, training MLE loss is 0.0011514351168421674, train CRF loss is 0.0007268207148861388, validation MLE loss is 8.854189728435717, validation ppl is 7003.671, validation CRF loss is 8.688137261491073, validation BLEU is 60.0
Training:At training steps 30100, training MLE loss is 0.001186228886303009, train CRF loss is 0.000687841667173168
Training:At training steps 30200, training MLE loss is 0.0010967150618253413, train CRF loss is 0.0005131108949634155
Training:At training steps 30300, training MLE loss is 0.0011138486114507865, train CRF loss is 0.0006158997077118325
Training:At training steps 30400, training MLE loss is 0.0010877573708411037, train CRF loss is 0.000565669786685169
Training:At training steps 30500, training MLE loss is 0.0010912412044336796, train CRF loss is 0.0005583293228621562
Validation:At training steps 30500, training MLE loss is 0.0010912412044336796, train CRF loss is 0.0005583293228621562, validation MLE loss is 8.881011523698506, validation ppl is 7194.064, validation CRF loss is 8.762970039719029, validation BLEU is 60.29
Training:At training steps 30600, training MLE loss is 0.0004045546445094916, train CRF loss is 0.0003411416820905133
Training:At training steps 30700, training MLE loss is 0.0007534510256382516, train CRF loss is 0.0004965798127264321
Training:At training steps 30800, training MLE loss is 0.0008564700485156603, train CRF loss is 0.0005012929984587977
Training:At training steps 30900, training MLE loss is 0.0007607673446651083, train CRF loss is 0.0004398360786451383
Training:At training steps 31000, training MLE loss is 0.0008440802786934134, train CRF loss is 0.00046980109716214535
Validation:At training steps 31000, training MLE loss is 0.0008440802786934134, train CRF loss is 0.00046980109716214535, validation MLE loss is 9.028084874153137, validation ppl is 8333.884, validation CRF loss is 8.861317289502997, validation BLEU is 59.77
Training:At training steps 31100, training MLE loss is 0.0009696424411291705, train CRF loss is 0.0004984254527858622
Training:At training steps 31200, training MLE loss is 0.0008259181372910687, train CRF loss is 0.0005040280343023507
Training:At training steps 31300, training MLE loss is 0.0008018411288719487, train CRF loss is 0.0005348674799591076
Training:At training steps 31400, training MLE loss is 0.0008509908050557571, train CRF loss is 0.0005256509370260487
Training:At training steps 31500, training MLE loss is 0.0007660455249306176, train CRF loss is 0.00046316798809333725
Validation:At training steps 31500, training MLE loss is 0.0007660455249306176, train CRF loss is 0.00046316798809333725, validation MLE loss is 8.904196036489386, validation ppl is 7362.803, validation CRF loss is 8.781302257588035, validation BLEU is 59.97
Training:At training steps 31600, training MLE loss is 0.0007304490935485535, train CRF loss is 0.0003681496650749816
Training:At training steps 31700, training MLE loss is 0.0008649525085072958, train CRF loss is 0.0005459122479830581
Training:At training steps 31800, training MLE loss is 0.0008105383666040471, train CRF loss is 0.0005301057144415111
Training:At training steps 31900, training MLE loss is 0.0007445861966630429, train CRF loss is 0.0005226984819618763
Training:At training steps 32000, training MLE loss is 0.0007356768569538682, train CRF loss is 0.0005291805540872465
Validation:At training steps 32000, training MLE loss is 0.0007356768569538682, train CRF loss is 0.0005291805540872465, validation MLE loss is 8.990410578878302, validation ppl is 8025.751, validation CRF loss is 8.880518210561652, validation BLEU is 59.96
Training:At training steps 32100, training MLE loss is 0.0004478577683960691, train CRF loss is 0.00031324981910114236
Training:At training steps 32200, training MLE loss is 0.0004700168558946986, train CRF loss is 0.0003054351896666096
Training:At training steps 32300, training MLE loss is 0.0005296610470812226, train CRF loss is 0.00029738804010168705
Training:At training steps 32400, training MLE loss is 0.0005462243816652642, train CRF loss is 0.0003187657158803292
Training:At training steps 32500, training MLE loss is 0.0006906071732850726, train CRF loss is 0.0004621076460010531
Validation:At training steps 32500, training MLE loss is 0.0006906071732850726, train CRF loss is 0.0004621076460010531, validation MLE loss is 8.974650037916083, validation ppl is 7900.253, validation CRF loss is 8.756417638377139, validation BLEU is 60.64
Training:At training steps 32600, training MLE loss is 0.0005890956462876857, train CRF loss is 0.0002544561433900094
Training:At training steps 32700, training MLE loss is 0.0005853552611404083, train CRF loss is 0.00047003448599192366
Training:At training steps 32800, training MLE loss is 0.0004914368139221268, train CRF loss is 0.00034492062241320287
Training:At training steps 32900, training MLE loss is 0.0005512487212963267, train CRF loss is 0.00039061786220497387
Training:At training steps 33000, training MLE loss is 0.0004912338714206295, train CRF loss is 0.00034251142991968654
Validation:At training steps 33000, training MLE loss is 0.0004912338714206295, train CRF loss is 0.00034251142991968654, validation MLE loss is 8.925241790319744, validation ppl is 7519.401, validation CRF loss is 8.749840309745387, validation BLEU is 59.85
Training:At training steps 33100, training MLE loss is 0.00039915254389900066, train CRF loss is 7.124405309991478e-05
Training:At training steps 33200, training MLE loss is 0.0004722735849560446, train CRF loss is 0.00023244764973786135
Training:At training steps 33300, training MLE loss is 0.00045749221949431686, train CRF loss is 0.00024267658222263845
Training:At training steps 33400, training MLE loss is 0.0004316265042578905, train CRF loss is 0.00021844192129607309
Training:At training steps 33500, training MLE loss is 0.0004295554419822345, train CRF loss is 0.0002567216754777375
Validation:At training steps 33500, training MLE loss is 0.0004295554419822345, train CRF loss is 0.0002567216754777375, validation MLE loss is 8.993609102148758, validation ppl is 8051.463, validation CRF loss is 8.842970728874207, validation BLEU is 59.72
Training:At training steps 33600, training MLE loss is 0.0007053993451963786, train CRF loss is 0.00035589923844975947
Training:At training steps 33700, training MLE loss is 0.0006954000099195296, train CRF loss is 0.000347201473436769
Training:At training steps 33800, training MLE loss is 0.0006648761534148599, train CRF loss is 0.00032175395634863454
Training:At training steps 33900, training MLE loss is 0.0005730871338449138, train CRF loss is 0.0002720436743104371
Training:At training steps 34000, training MLE loss is 0.0005114211395854316, train CRF loss is 0.00028754325364401456
Validation:At training steps 34000, training MLE loss is 0.0005114211395854316, train CRF loss is 0.00028754325364401456, validation MLE loss is 9.036632010811253, validation ppl is 8405.42, validation CRF loss is 8.825097874591226, validation BLEU is 60.6
Training:At training steps 34100, training MLE loss is 0.0002740559379729166, train CRF loss is 0.00022490088754515637
Training:At training steps 34200, training MLE loss is 0.0002162713903647159, train CRF loss is 0.00019838243274461264
Training:At training steps 34300, training MLE loss is 0.00036211834063262414, train CRF loss is 0.00023407684093176506
Training:At training steps 34400, training MLE loss is 0.0003247083450486349, train CRF loss is 0.000201793831672199
Training:At training steps 34500, training MLE loss is 0.0004004831044932674, train CRF loss is 0.000193590903386875
Validation:At training steps 34500, training MLE loss is 0.0004004831044932674, train CRF loss is 0.000193590903386875, validation MLE loss is 9.038910376398187, validation ppl is 8424.592, validation CRF loss is 8.94504143689808, validation BLEU is 60.15
Training:At training steps 34600, training MLE loss is 0.00024994857866278143, train CRF loss is 0.0002613514188641819
Training:At training steps 34700, training MLE loss is 0.0001975930253193018, train CRF loss is 0.0001934916914095708
Training:At training steps 34800, training MLE loss is 0.0002343052079637973, train CRF loss is 0.00019382544665128194
Training:At training steps 34900, training MLE loss is 0.0003966058204722301, train CRF loss is 0.000266108619576404
Training:At training steps 35000, training MLE loss is 0.0004108675170304942, train CRF loss is 0.0002529568079828044
Validation:At training steps 35000, training MLE loss is 0.0004108675170304942, train CRF loss is 0.0002529568079828044, validation MLE loss is 9.057928104149667, validation ppl is 8586.342, validation CRF loss is 8.920903808192202, validation BLEU is 59.31
Training:At training steps 35100, training MLE loss is 0.00019074479472717878, train CRF loss is 0.00011629682660688534
Training:At training steps 35200, training MLE loss is 0.0003425519844878845, train CRF loss is 0.00022462877604209952
Training:At training steps 35300, training MLE loss is 0.00044260266456056565, train CRF loss is 0.0003166175841771738
Training:At training steps 35400, training MLE loss is 0.00037579896951374326, train CRF loss is 0.0002647583070271964
Training:At training steps 35500, training MLE loss is 0.0003835649495078096, train CRF loss is 0.00022419798616004006
Validation:At training steps 35500, training MLE loss is 0.0003835649495078096, train CRF loss is 0.00022419798616004006, validation MLE loss is 9.056727854829086, validation ppl is 8576.043, validation CRF loss is 8.937727256825095, validation BLEU is 59.69
Training:At training steps 35600, training MLE loss is 0.00012148296304757909, train CRF loss is 0.0002315904593824669
Training:At training steps 35700, training MLE loss is 0.00018831837279820052, train CRF loss is 0.0002062504472365445
Training:At training steps 35800, training MLE loss is 0.00032653426204535857, train CRF loss is 0.00020352748863156891
Training:At training steps 35900, training MLE loss is 0.00029024602098153365, train CRF loss is 0.00016480508124656847
Training:At training steps 36000, training MLE loss is 0.00027877620701204206, train CRF loss is 0.00015773958906205364
Validation:At training steps 36000, training MLE loss is 0.00027877620701204206, train CRF loss is 0.00015773958906205364, validation MLE loss is 9.113909834309629, validation ppl is 9080.73, validation CRF loss is 8.962728067448264, validation BLEU is 59.96
Training:At training steps 36100, training MLE loss is 0.000280028254311479, train CRF loss is 0.00014521596272432102
Training:At training steps 36200, training MLE loss is 0.0003164712576156666, train CRF loss is 0.00022270302503920146
Training:At training steps 36300, training MLE loss is 0.0003244936847287547, train CRF loss is 0.00018434164621525953
Training:At training steps 36400, training MLE loss is 0.00029925522333568947, train CRF loss is 0.0001564537540797195
Training:At training steps 36500, training MLE loss is 0.000284612242473509, train CRF loss is 0.0001438977942867732
Validation:At training steps 36500, training MLE loss is 0.000284612242473509, train CRF loss is 0.0001438977942867732, validation MLE loss is 9.066988411702608, validation ppl is 8664.491, validation CRF loss is 8.91640273520821, validation BLEU is 59.38
Training:At training steps 36600, training MLE loss is 0.0004995992879753823, train CRF loss is 0.00042549213358943126
Training:At training steps 36700, training MLE loss is 0.0003232455883826898, train CRF loss is 0.0002375901303116912
Training:At training steps 36800, training MLE loss is 0.0003358177492396806, train CRF loss is 0.00024589916832544006
Training:At training steps 36900, training MLE loss is 0.0002820425705433798, train CRF loss is 0.00021476255214412986
Training:At training steps 37000, training MLE loss is 0.000239256517723117, train CRF loss is 0.00018135221427934044
Validation:At training steps 37000, training MLE loss is 0.000239256517723117, train CRF loss is 0.00018135221427934044, validation MLE loss is 9.219712257385254, validation ppl is 10094.159, validation CRF loss is 8.96977804836474, validation BLEU is 59.93
Training:At training steps 37100, training MLE loss is 0.00013602576058697837, train CRF loss is 8.385465302511719e-06
Training:At training steps 37200, training MLE loss is 0.00010695798574947124, train CRF loss is 5.299721342990615e-06
Training:At training steps 37300, training MLE loss is 0.00014124385251315126, train CRF loss is 9.258377547514278e-05
Training:At training steps 37400, training MLE loss is 0.0001544829980898468, train CRF loss is 7.610027499359795e-05
Training:At training steps 37500, training MLE loss is 0.00013431931185411343, train CRF loss is 6.488067371248008e-05
Validation:At training steps 37500, training MLE loss is 0.00013431931185411343, train CRF loss is 6.488067371248008e-05, validation MLE loss is 9.023310755428515, validation ppl is 8294.192, validation CRF loss is 8.919578715374595, validation BLEU is 60.33
Training:At training steps 37600, training MLE loss is 0.00016346494594637323, train CRF loss is 0.00013435882313134418
Training:At training steps 37700, training MLE loss is 0.00016064229294181502, train CRF loss is 7.727326025014358e-05
Training:At training steps 37800, training MLE loss is 0.0001563141271761218, train CRF loss is 7.318137769600627e-05
Training:At training steps 37900, training MLE loss is 0.0001610221009944883, train CRF loss is 8.748902269076431e-05
Training:At training steps 38000, training MLE loss is 0.00021351990030907863, train CRF loss is 0.0001581466276881418
Validation:At training steps 38000, training MLE loss is 0.00021351990030907863, train CRF loss is 0.0001581466276881418, validation MLE loss is 9.19092501464643, validation ppl is 9807.719, validation CRF loss is 9.028727142434372, validation BLEU is 60.06
Training:At training steps 38100, training MLE loss is 0.0003956249468879318, train CRF loss is 0.00042704776585571034
Training:At training steps 38200, training MLE loss is 0.0003043748875349189, train CRF loss is 0.00024947153648875676
Training:At training steps 38300, training MLE loss is 0.0002572575028410285, train CRF loss is 0.00017559248120033733
Training:At training steps 38400, training MLE loss is 0.00023829890756236593, train CRF loss is 0.000185062424429987
Training:At training steps 38500, training MLE loss is 0.00021541193719488465, train CRF loss is 0.000149773275493553
Validation:At training steps 38500, training MLE loss is 0.00021541193719488465, train CRF loss is 0.000149773275493553, validation MLE loss is 9.165278798655459, validation ppl is 9559.386, validation CRF loss is 9.026203613532218, validation BLEU is 60.42
Training:At training steps 38600, training MLE loss is 0.0003185817070587968, train CRF loss is 3.0723898791777946e-05
Training:At training steps 38700, training MLE loss is 0.00020068882109319523, train CRF loss is 2.757465692608152e-05
Training:At training steps 38800, training MLE loss is 0.00020243533085190975, train CRF loss is 6.387168240262308e-05
Training:At training steps 38900, training MLE loss is 0.00015597127351558407, train CRF loss is 5.0616512948449264e-05
Training:At training steps 39000, training MLE loss is 0.00022218504675946635, train CRF loss is 0.00010166093046882452
Validation:At training steps 39000, training MLE loss is 0.00022218504675946635, train CRF loss is 0.00010166093046882452, validation MLE loss is 9.101078635767886, validation ppl is 8964.957, validation CRF loss is 8.972085651598478, validation BLEU is 60.43
Training:At training steps 39100, training MLE loss is 6.183317741930199e-05, train CRF loss is 0.00017478625442417074
Training:At training steps 39200, training MLE loss is 3.903952098177493e-05, train CRF loss is 9.740624669076326e-05
Training:At training steps 39300, training MLE loss is 6.778269954191888e-05, train CRF loss is 6.95189669079858e-05
Training:At training steps 39400, training MLE loss is 6.009196502892169e-05, train CRF loss is 5.956618709099137e-05
Training:At training steps 39500, training MLE loss is 5.8155107025301634e-05, train CRF loss is 4.907322188369889e-05
Validation:At training steps 39500, training MLE loss is 5.8155107025301634e-05, train CRF loss is 4.907322188369889e-05, validation MLE loss is 9.143634193821958, validation ppl is 9354.7, validation CRF loss is 8.998105871049981, validation BLEU is 60.31
Training:At training steps 39600, training MLE loss is 0.0001575194046859274, train CRF loss is 0.00013918273241603885
Training:At training steps 39700, training MLE loss is 0.00011999715941125581, train CRF loss is 9.730899275701698e-05
Training:At training steps 39800, training MLE loss is 8.757245746023539e-05, train CRF loss is 6.581612688648726e-05
Training:At training steps 39900, training MLE loss is 7.453407464152393e-05, train CRF loss is 5.2497292885662625e-05
Training:At training steps 40000, training MLE loss is 6.523616401395442e-05, train CRF loss is 5.9446820020470526e-05
Validation:At training steps 40000, training MLE loss is 6.523616401395442e-05, train CRF loss is 5.9446820020470526e-05, validation MLE loss is 9.149942329055385, validation ppl is 9413.897, validation CRF loss is 8.991749957988137, validation BLEU is 60.13
Training:At training steps 100, training MLE loss is 2.1717731401324274, train CRF loss is 15.980498926639557
Training:At training steps 200, training MLE loss is 2.1496638102456926, train CRF loss is 15.447494914829731
Training:At training steps 300, training MLE loss is 2.1632305027792853, train CRF loss is 14.659334386587142
Training:At training steps 400, training MLE loss is 2.1785211004130542, train CRF loss is 13.957447227984666
Training:At training steps 500, training MLE loss is 2.1735479488968847, train CRF loss is 13.350011906743049
Validation:At training steps 500, training MLE loss is 2.1735479488968847, train CRF loss is 13.350011906743049, validation MLE loss is 2.171900921746304, validation ppl is 8.775, validation CRF loss is 9.970341167951885, validation BLEU is 0.71
Training:At training steps 600, training MLE loss is 2.1185822080075742, train CRF loss is 10.219210146069527
Training:At training steps 700, training MLE loss is 2.0987272767722605, train CRF loss is 10.038689716011286
Training:At training steps 800, training MLE loss is 2.0897069253772496, train CRF loss is 9.891252293090025
Training:At training steps 900, training MLE loss is 2.0695489360764623, train CRF loss is 9.747049093544483
Training:At training steps 1000, training MLE loss is 2.0561148837208747, train CRF loss is 9.624453610658646
Validation:At training steps 1000, training MLE loss is 2.0561148837208747, train CRF loss is 9.624453610658646, validation MLE loss is 1.9018172888379348, validation ppl is 6.698, validation CRF loss is 8.577417185432033, validation BLEU is 3.77
Training:At training steps 1100, training MLE loss is 1.9731998317688704, train CRF loss is 8.87911224067211
Training:At training steps 1200, training MLE loss is 1.9840785229578615, train CRF loss is 8.767385147362948
Training:At training steps 1300, training MLE loss is 2.0018965027232967, train CRF loss is 8.662650276521841
Training:At training steps 1400, training MLE loss is 2.0158303409069775, train CRF loss is 8.535314926728606
Training:At training steps 1500, training MLE loss is 2.0320747469067575, train CRF loss is 8.402309090077877
Validation:At training steps 1500, training MLE loss is 2.0320747469067575, train CRF loss is 8.402309090077877, validation MLE loss is 2.22620079391881, validation ppl is 9.265, validation CRF loss is 7.564110360647502, validation BLEU is 32.34
Training:At training steps 1600, training MLE loss is 2.150935985594988, train CRF loss is 7.719476763010025
Training:At training steps 1700, training MLE loss is 2.16540182325989, train CRF loss is 7.59501041829586
Training:At training steps 1800, training MLE loss is 2.1788895408560833, train CRF loss is 7.46847324659427
Training:At training steps 1900, training MLE loss is 2.1973780230619013, train CRF loss is 7.364499156028033
Training:At training steps 2000, training MLE loss is 2.214938744068146, train CRF loss is 7.2569782250523565
Validation:At training steps 2000, training MLE loss is 2.214938744068146, train CRF loss is 7.2569782250523565, validation MLE loss is 2.3509802175195595, validation ppl is 10.496, validation CRF loss is 6.739558703020999, validation BLEU is 33.55
Training:At training steps 2100, training MLE loss is 2.313026740178466, train CRF loss is 6.615828494727611
Training:At training steps 2200, training MLE loss is 2.2971064081415533, train CRF loss is 6.522432666271925
Training:At training steps 2300, training MLE loss is 2.2865994846324127, train CRF loss is 6.41377397403121
Training:At training steps 2400, training MLE loss is 2.2960541334934534, train CRF loss is 6.32523199211806
Training:At training steps 2500, training MLE loss is 2.3039891836196182, train CRF loss is 6.213100199609995
Validation:At training steps 2500, training MLE loss is 2.3039891836196182, train CRF loss is 6.213100199609995, validation MLE loss is 2.826370176516081, validation ppl is 16.884, validation CRF loss is 6.260369620825115, validation BLEU is 36.25
Training:At training steps 2600, training MLE loss is 2.3458506274223327, train CRF loss is 5.571729794293642
Training:At training steps 2700, training MLE loss is 2.3590128177031873, train CRF loss is 5.479554129168391
Training:At training steps 2800, training MLE loss is 2.399712371900678, train CRF loss is 5.428224938809872
Training:At training steps 2900, training MLE loss is 2.3985865114815534, train CRF loss is 5.368164353258908
Training:At training steps 3000, training MLE loss is 2.4024889016747473, train CRF loss is 5.291929104119539
Validation:At training steps 3000, training MLE loss is 2.4024889016747473, train CRF loss is 5.291929104119539, validation MLE loss is 2.639981796866969, validation ppl is 14.013, validation CRF loss is 4.81180004697097, validation BLEU is 40.72
Training:At training steps 3100, training MLE loss is 2.4268264627456664, train CRF loss is 4.786335317790508
Training:At training steps 3200, training MLE loss is 2.4432136641815303, train CRF loss is 4.745540032163262
Training:At training steps 3300, training MLE loss is 2.424235284005602, train CRF loss is 4.661576209565004
Training:At training steps 3400, training MLE loss is 2.416814477369189, train CRF loss is 4.576295952163637
Training:At training steps 3500, training MLE loss is 2.407819376602769, train CRF loss is 4.513830145001411
Validation:At training steps 3500, training MLE loss is 2.407819376602769, train CRF loss is 4.513830145001411, validation MLE loss is 2.568184394585459, validation ppl is 13.042, validation CRF loss is 4.938692579143925, validation BLEU is 39.25
Training:At training steps 3600, training MLE loss is 2.462867563068867, train CRF loss is 4.154083133339882
Training:At training steps 3700, training MLE loss is 2.405178325623274, train CRF loss is 4.075151573829353
Training:At training steps 3800, training MLE loss is 2.406274285316467, train CRF loss is 4.0155152530719835
Training:At training steps 3900, training MLE loss is 2.379813734292984, train CRF loss is 3.9725591743551196
Training:At training steps 4000, training MLE loss is 2.3738915291428566, train CRF loss is 3.92047960755229
Validation:At training steps 4000, training MLE loss is 2.3738915291428566, train CRF loss is 3.92047960755229, validation MLE loss is 2.8043621151070846, validation ppl is 16.517, validation CRF loss is 4.042380648223977, validation BLEU is 45.58
Training:At training steps 4100, training MLE loss is 2.2997633124142887, train CRF loss is 3.6572322091460228
Training:At training steps 4200, training MLE loss is 2.269409193303436, train CRF loss is 3.6015417101606726
Training:At training steps 4300, training MLE loss is 2.2373586850240827, train CRF loss is 3.5270946010450523
Training:At training steps 4400, training MLE loss is 2.2176802155748008, train CRF loss is 3.490497032869607
Training:At training steps 4500, training MLE loss is 2.19158246307075, train CRF loss is 3.450654644191265
Validation:At training steps 4500, training MLE loss is 2.19158246307075, train CRF loss is 3.450654644191265, validation MLE loss is 2.643473444800628, validation ppl is 14.062, validation CRF loss is 3.862715821517141, validation BLEU is 49.08
Training:At training steps 4600, training MLE loss is 2.0470344261452555, train CRF loss is 3.287897620499134
Training:At training steps 4700, training MLE loss is 2.0029062732867895, train CRF loss is 3.2161080899462102
Training:At training steps 4800, training MLE loss is 1.9674384306867918, train CRF loss is 3.1713991291075945
Training:At training steps 4900, training MLE loss is 1.941312533505261, train CRF loss is 3.1324400427844377
Training:At training steps 5000, training MLE loss is 1.9192380537465215, train CRF loss is 3.0878282637521623
Validation:At training steps 5000, training MLE loss is 1.9192380537465215, train CRF loss is 3.0878282637521623, validation MLE loss is 2.9410692265159204, validation ppl is 18.936, validation CRF loss is 3.674807824586567, validation BLEU is 52.42
Training:At training steps 5100, training MLE loss is 1.7687736926227808, train CRF loss is 2.859533615782857
Training:At training steps 5200, training MLE loss is 1.7480364665016532, train CRF loss is 2.8070631173625586
Training:At training steps 5300, training MLE loss is 1.7304582074160377, train CRF loss is 2.7696163879334925
Training:At training steps 5400, training MLE loss is 1.7033239054959268, train CRF loss is 2.728145287428051
Training:At training steps 5500, training MLE loss is 1.6838533061891794, train CRF loss is 2.696395574249327
Validation:At training steps 5500, training MLE loss is 1.6838533061891794, train CRF loss is 2.696395574249327, validation MLE loss is 2.8775617135198495, validation ppl is 17.771, validation CRF loss is 3.6494701504707336, validation BLEU is 52.93
Training:At training steps 5600, training MLE loss is 1.5936960318312048, train CRF loss is 2.480217757336795
Training:At training steps 5700, training MLE loss is 1.5558701142296194, train CRF loss is 2.458092798497528
Training:At training steps 5800, training MLE loss is 1.5420357172812025, train CRF loss is 2.427251782802244
Training:At training steps 5900, training MLE loss is 1.5231516008358448, train CRF loss is 2.39422299942933
Training:At training steps 6000, training MLE loss is 1.513269730709493, train CRF loss is 2.362602831333876
Validation:At training steps 6000, training MLE loss is 1.513269730709493, train CRF loss is 2.362602831333876, validation MLE loss is 3.152378886938095, validation ppl is 23.392, validation CRF loss is 3.5132672567116585, validation BLEU is 56.63
Training:At training steps 6100, training MLE loss is 1.4457135154679417, train CRF loss is 2.1454391354881226
Training:At training steps 6200, training MLE loss is 1.4019024896156043, train CRF loss is 2.1048167759086938
Training:At training steps 6300, training MLE loss is 1.3853856692835689, train CRF loss is 2.0899262911702197
Training:At training steps 6400, training MLE loss is 1.366945026377216, train CRF loss is 2.063477150015533
Training:At training steps 6500, training MLE loss is 1.350044223241508, train CRF loss is 2.0294142961017787
Validation:At training steps 6500, training MLE loss is 1.350044223241508, train CRF loss is 2.0294142961017787, validation MLE loss is 3.257627628351513, validation ppl is 25.988, validation CRF loss is 3.653502487822583, validation BLEU is 55.39
Training:At training steps 6600, training MLE loss is 1.2304438047204167, train CRF loss is 1.817593237310648
Training:At training steps 6700, training MLE loss is 1.2195669837063179, train CRF loss is 1.7896968400664628
Training:At training steps 6800, training MLE loss is 1.208649230155473, train CRF loss is 1.7845405550859867
Training:At training steps 6900, training MLE loss is 1.193481498749461, train CRF loss is 1.7551243675593287
Training:At training steps 7000, training MLE loss is 1.1768012634087355, train CRF loss is 1.724524119026959
Validation:At training steps 7000, training MLE loss is 1.1768012634087355, train CRF loss is 1.724524119026959, validation MLE loss is 3.1346082640321633, validation ppl is 22.98, validation CRF loss is 3.7391348324323954, validation BLEU is 56.1
Training:At training steps 7100, training MLE loss is 1.0764824985340238, train CRF loss is 1.5717228436283768
Training:At training steps 7200, training MLE loss is 1.060897398237139, train CRF loss is 1.532814249889925
Training:At training steps 7300, training MLE loss is 1.0517929436266422, train CRF loss is 1.5078141110793999
Training:At training steps 7400, training MLE loss is 1.0219652676256374, train CRF loss is 1.478792689590482
Training:At training steps 7500, training MLE loss is 1.0076446787752211, train CRF loss is 1.4563104643095284
Validation:At training steps 7500, training MLE loss is 1.0076446787752211, train CRF loss is 1.4563104643095284, validation MLE loss is 3.550917785418661, validation ppl is 34.845, validation CRF loss is 3.819064686172887, validation BLEU is 57.76
Training:At training steps 7600, training MLE loss is 0.9110962071735412, train CRF loss is 1.2960485809948294
Training:At training steps 7700, training MLE loss is 0.9166999311093241, train CRF loss is 1.2785402191383763
Training:At training steps 7800, training MLE loss is 0.8989687800997247, train CRF loss is 1.255835036681965
Training:At training steps 7900, training MLE loss is 0.8836804958828725, train CRF loss is 1.2336262367549353
Training:At training steps 8000, training MLE loss is 0.8741700017936528, train CRF loss is 1.2065055419262498
Validation:At training steps 8000, training MLE loss is 0.8741700017936528, train CRF loss is 1.2065055419262498, validation MLE loss is 3.822009968130212, validation ppl is 45.696, validation CRF loss is 4.011594789592843, validation BLEU is 59.12
Training:At training steps 8100, training MLE loss is 0.7966633305791766, train CRF loss is 1.0699411880620755
Training:At training steps 8200, training MLE loss is 0.8103040926647372, train CRF loss is 1.0512338166555855
Training:At training steps 8300, training MLE loss is 0.7985100121640911, train CRF loss is 1.0378625517408364
Training:At training steps 8400, training MLE loss is 0.7930431564035826, train CRF loss is 1.022007031394751
Training:At training steps 8500, training MLE loss is 0.7767127457740717, train CRF loss is 1.0066559033649973
Validation:At training steps 8500, training MLE loss is 0.7767127457740717, train CRF loss is 1.0066559033649973, validation MLE loss is 3.940195952591143, validation ppl is 51.429, validation CRF loss is 4.221783355662697, validation BLEU is 59.31
Training:At training steps 8600, training MLE loss is 0.674311353941448, train CRF loss is 0.8797832804685458
Training:At training steps 8700, training MLE loss is 0.672303274308797, train CRF loss is 0.8660756448027678
Training:At training steps 8800, training MLE loss is 0.6703645603684708, train CRF loss is 0.8579855678929016
Training:At training steps 8900, training MLE loss is 0.6642413517736714, train CRF loss is 0.8395459866734746
Training:At training steps 9000, training MLE loss is 0.6622886396690737, train CRF loss is 0.8271108075884404
Validation:At training steps 9000, training MLE loss is 0.6622886396690737, train CRF loss is 0.8271108075884404, validation MLE loss is 4.223971931557906, validation ppl is 68.304, validation CRF loss is 4.224803999850624, validation BLEU is 58.22
Training:At training steps 9100, training MLE loss is 0.5812943256739527, train CRF loss is 0.730990419111331
Training:At training steps 9200, training MLE loss is 0.5904437133762985, train CRF loss is 0.7165014886789141
Training:At training steps 9300, training MLE loss is 0.581485539705803, train CRF loss is 0.6967615989479237
Training:At training steps 9400, training MLE loss is 0.5696362374455203, train CRF loss is 0.683244161948096
Training:At training steps 9500, training MLE loss is 0.5661348838917911, train CRF loss is 0.6731708326146473
Validation:At training steps 9500, training MLE loss is 0.5661348838917911, train CRF loss is 0.6731708326146473, validation MLE loss is 4.484035322540684, validation ppl is 88.591, validation CRF loss is 4.578259355143497, validation BLEU is 57.97
Training:At training steps 9600, training MLE loss is 0.5361109394370578, train CRF loss is 0.5865520781348459
Training:At training steps 9700, training MLE loss is 0.5165989224845543, train CRF loss is 0.5775485247731558
Training:At training steps 9800, training MLE loss is 0.516597777606609, train CRF loss is 0.5620960137967874
Training:At training steps 9900, training MLE loss is 0.5125068700700649, train CRF loss is 0.5498648116210098
Training:At training steps 10000, training MLE loss is 0.5018749550140928, train CRF loss is 0.5385008114759112
Validation:At training steps 10000, training MLE loss is 0.5018749550140928, train CRF loss is 0.5385008114759112, validation MLE loss is 4.717786977165623, validation ppl is 111.92, validation CRF loss is 4.938375846335762, validation BLEU is 57.58
Training:At training steps 10100, training MLE loss is 0.44007028454681857, train CRF loss is 0.45448239034973087
Training:At training steps 10200, training MLE loss is 0.4291684170841472, train CRF loss is 0.4481701994704781
Training:At training steps 10300, training MLE loss is 0.4392102448132937, train CRF loss is 0.4432119843152274
Training:At training steps 10400, training MLE loss is 0.4402495170674956, train CRF loss is 0.43763818309282215
Training:At training steps 10500, training MLE loss is 0.43869951141480124, train CRF loss is 0.43077229355517194
Validation:At training steps 10500, training MLE loss is 0.43869951141480124, train CRF loss is 0.43077229355517194, validation MLE loss is 5.093419206769843, validation ppl is 162.946, validation CRF loss is 4.98552595941644, validation BLEU is 61.25
Training:At training steps 10600, training MLE loss is 0.39397344129509293, train CRF loss is 0.3879075446477509
Training:At training steps 10700, training MLE loss is 0.38643389533564915, train CRF loss is 0.37667016248058643
Training:At training steps 10800, training MLE loss is 0.39000898980380344, train CRF loss is 0.3700008495719521
Training:At training steps 10900, training MLE loss is 0.3870625981394551, train CRF loss is 0.3612003298172749
Training:At training steps 11000, training MLE loss is 0.3783071803053608, train CRF loss is 0.3538934206333797
Validation:At training steps 11000, training MLE loss is 0.3783071803053608, train CRF loss is 0.3538934206333797, validation MLE loss is 5.198952270181556, validation ppl is 181.082, validation CRF loss is 5.006847089842746, validation BLEU is 60.04
Training:At training steps 11100, training MLE loss is 0.34762958151026396, train CRF loss is 0.2997526730762911
Training:At training steps 11200, training MLE loss is 0.3433303146711842, train CRF loss is 0.2927610991296569
Training:At training steps 11300, training MLE loss is 0.33277445544508133, train CRF loss is 0.2936153232991516
Training:At training steps 11400, training MLE loss is 0.33644791719620115, train CRF loss is 0.29233406664948236
Training:At training steps 11500, training MLE loss is 0.334837684751139, train CRF loss is 0.2901142888465183
Validation:At training steps 11500, training MLE loss is 0.334837684751139, train CRF loss is 0.2901142888465183, validation MLE loss is 5.068856876147421, validation ppl is 158.992, validation CRF loss is 5.290193965560512, validation BLEU is 58.23
Training:At training steps 11600, training MLE loss is 0.33569229183718563, train CRF loss is 0.28681518539990064
Training:At training steps 11700, training MLE loss is 0.32739528450591027, train CRF loss is 0.2709402292886989
Training:At training steps 11800, training MLE loss is 0.32207789571102086, train CRF loss is 0.2599463830921862
Training:At training steps 11900, training MLE loss is 0.3166964212142193, train CRF loss is 0.25192538571500334
Training:At training steps 12000, training MLE loss is 0.30912475423893193, train CRF loss is 0.24719680222710302
Validation:At training steps 12000, training MLE loss is 0.30912475423893193, train CRF loss is 0.24719680222710302, validation MLE loss is 5.513072980077643, validation ppl is 247.912, validation CRF loss is 5.423084048848403, validation BLEU is 59.64
Training:At training steps 12100, training MLE loss is 0.2881133070791111, train CRF loss is 0.2155154546918129
Training:At training steps 12200, training MLE loss is 0.28187688747457285, train CRF loss is 0.22102766311463712
Training:At training steps 12300, training MLE loss is 0.27897889104385587, train CRF loss is 0.216276971641455
Training:At training steps 12400, training MLE loss is 0.2775996107846913, train CRF loss is 0.21178674491449784
Training:At training steps 12500, training MLE loss is 0.271354223270886, train CRF loss is 0.20662983581778827
Validation:At training steps 12500, training MLE loss is 0.271354223270886, train CRF loss is 0.20662983581778827, validation MLE loss is 5.730179927851024, validation ppl is 308.025, validation CRF loss is 5.503149365123949, validation BLEU is 58.83
Training:At training steps 12600, training MLE loss is 0.24817069953191095, train CRF loss is 0.18212594715976593
Training:At training steps 12700, training MLE loss is 0.24500605616725807, train CRF loss is 0.18114617478374384
Training:At training steps 12800, training MLE loss is 0.2417307168825937, train CRF loss is 0.1798768823331314
Training:At training steps 12900, training MLE loss is 0.23427015779063368, train CRF loss is 0.17450832342989087
Training:At training steps 13000, training MLE loss is 0.23154433552038972, train CRF loss is 0.16934881162573037
Validation:At training steps 13000, training MLE loss is 0.23154433552038972, train CRF loss is 0.16934881162573037, validation MLE loss is 5.7649819286246045, validation ppl is 318.933, validation CRF loss is 5.704931180728109, validation BLEU is 58.54
Training:At training steps 13100, training MLE loss is 0.19839937978427769, train CRF loss is 0.14283204253046278
Training:At training steps 13200, training MLE loss is 0.20413189193322978, train CRF loss is 0.14699376671104802
Training:At training steps 13300, training MLE loss is 0.20401014970793768, train CRF loss is 0.14240148000860245
Training:At training steps 13400, training MLE loss is 0.1997622122732446, train CRF loss is 0.14266008714544567
Training:At training steps 13500, training MLE loss is 0.1955870578808317, train CRF loss is 0.1405610753246474
Validation:At training steps 13500, training MLE loss is 0.1955870578808317, train CRF loss is 0.1405610753246474, validation MLE loss is 5.905915655587849, validation ppl is 367.203, validation CRF loss is 5.765597942628358, validation BLEU is 58.93
Training:At training steps 13600, training MLE loss is 0.20070843712066563, train CRF loss is 0.13475577647418505
Training:At training steps 13700, training MLE loss is 0.18734002353508913, train CRF loss is 0.13558580495742262
Training:At training steps 13800, training MLE loss is 0.18391285333957058, train CRF loss is 0.13162873656634777
Training:At training steps 13900, training MLE loss is 0.1828453440946214, train CRF loss is 0.1282918484848051
Training:At training steps 14000, training MLE loss is 0.17936903121689102, train CRF loss is 0.12460989471059838
Validation:At training steps 14000, training MLE loss is 0.17936903121689102, train CRF loss is 0.12460989471059838, validation MLE loss is 5.973139740918812, validation ppl is 392.737, validation CRF loss is 5.903415193683223, validation BLEU is 60.43
Training:At training steps 14100, training MLE loss is 0.175563222306082, train CRF loss is 0.10734408373561109
Training:At training steps 14200, training MLE loss is 0.16835514340553345, train CRF loss is 0.10446635611680222
Training:At training steps 14300, training MLE loss is 0.1656855420774688, train CRF loss is 0.10377215814768079
Training:At training steps 14400, training MLE loss is 0.16705063682726176, train CRF loss is 0.10452911580129978
Training:At training steps 14500, training MLE loss is 0.16305544432481292, train CRF loss is 0.1046986093237506
Validation:At training steps 14500, training MLE loss is 0.16305544432481292, train CRF loss is 0.1046986093237506, validation MLE loss is 6.047951968092668, validation ppl is 423.245, validation CRF loss is 5.983198793310868, validation BLEU is 60.63
Training:At training steps 14600, training MLE loss is 0.14998826491344516, train CRF loss is 0.0970267070019554
Training:At training steps 14700, training MLE loss is 0.1491893406577219, train CRF loss is 0.09931931936156388
Training:At training steps 14800, training MLE loss is 0.14921187724402443, train CRF loss is 0.0993650475714253
Training:At training steps 14900, training MLE loss is 0.14848351304428206, train CRF loss is 0.09734436790712664
Training:At training steps 15000, training MLE loss is 0.15004828367449044, train CRF loss is 0.09635930647057739
Validation:At training steps 15000, training MLE loss is 0.15004828367449044, train CRF loss is 0.09635930647057739, validation MLE loss is 6.119932246835608, validation ppl is 454.834, validation CRF loss is 6.085922184743379, validation BLEU is 60.14
Training:At training steps 15100, training MLE loss is 0.13077036108496032, train CRF loss is 0.0814539032129369
Training:At training steps 15200, training MLE loss is 0.13041718357803803, train CRF loss is 0.08004672673075675
Training:At training steps 15300, training MLE loss is 0.1266527167918654, train CRF loss is 0.08227754971871112
Training:At training steps 15400, training MLE loss is 0.12685611522728324, train CRF loss is 0.08196692770923107
Training:At training steps 15500, training MLE loss is 0.12699953495706176, train CRF loss is 0.08034730372156765
Validation:At training steps 15500, training MLE loss is 0.12699953495706176, train CRF loss is 0.08034730372156765, validation MLE loss is 6.256576745133651, validation ppl is 521.431, validation CRF loss is 6.405696407744759, validation BLEU is 60.47
Training:At training steps 15600, training MLE loss is 0.1290804371439117, train CRF loss is 0.07926899501113212
Training:At training steps 15700, training MLE loss is 0.12983942935418782, train CRF loss is 0.0824866900428134
Training:At training steps 15800, training MLE loss is 0.12841579761914546, train CRF loss is 0.08072292392570148
Training:At training steps 15900, training MLE loss is 0.12358014205566632, train CRF loss is 0.07724990531997804
Training:At training steps 16000, training MLE loss is 0.12198452301473844, train CRF loss is 0.07551724616489879
Validation:At training steps 16000, training MLE loss is 0.12198452301473844, train CRF loss is 0.07551724616489879, validation MLE loss is 6.136161455982609, validation ppl is 462.276, validation CRF loss is 6.205081315417039, validation BLEU is 60.31
Training:At training steps 16100, training MLE loss is 0.10067006707512974, train CRF loss is 0.06363780211802464
Training:At training steps 16200, training MLE loss is 0.09963532596250957, train CRF loss is 0.06305840944508702
Training:At training steps 16300, training MLE loss is 0.100169940381442, train CRF loss is 0.06306331721751993
Training:At training steps 16400, training MLE loss is 0.09947647481142724, train CRF loss is 0.06254639800373553
Training:At training steps 16500, training MLE loss is 0.09843576088388727, train CRF loss is 0.061627624527954825
Validation:At training steps 16500, training MLE loss is 0.09843576088388727, train CRF loss is 0.061627624527954825, validation MLE loss is 6.657123261376431, validation ppl is 778.309, validation CRF loss is 6.571410957135652, validation BLEU is 60.44
Training:At training steps 16600, training MLE loss is 0.09565873884613098, train CRF loss is 0.05980375367116153
Training:At training steps 16700, training MLE loss is 0.09496971733775297, train CRF loss is 0.059869349696551805
Training:At training steps 16800, training MLE loss is 0.09443216108735025, train CRF loss is 0.05928828882029601
Training:At training steps 16900, training MLE loss is 0.09513114719127856, train CRF loss is 0.05939953986642706
Training:At training steps 17000, training MLE loss is 0.09351492858881624, train CRF loss is 0.05762374127838075
Validation:At training steps 17000, training MLE loss is 0.09351492858881624, train CRF loss is 0.05762374127838075, validation MLE loss is 6.676929364078923, validation ppl is 793.878, validation CRF loss is 6.70785098954251, validation BLEU is 59.6
Training:At training steps 17100, training MLE loss is 0.0839634492730812, train CRF loss is 0.05463849070589163
Training:At training steps 17200, training MLE loss is 0.08266888214624174, train CRF loss is 0.054785945820182175
Training:At training steps 17300, training MLE loss is 0.0806015504484149, train CRF loss is 0.05346168268242726
Training:At training steps 17400, training MLE loss is 0.07997528512289463, train CRF loss is 0.0528196995916818
Training:At training steps 17500, training MLE loss is 0.07964534274214645, train CRF loss is 0.05190756082498842
Validation:At training steps 17500, training MLE loss is 0.07964534274214645, train CRF loss is 0.05190756082498842, validation MLE loss is 6.782194357169302, validation ppl is 882.002, validation CRF loss is 6.797831052228024, validation BLEU is 59.83
Training:At training steps 17600, training MLE loss is 0.08349845455010722, train CRF loss is 0.05120474166725671
Training:At training steps 17700, training MLE loss is 0.07884658023674547, train CRF loss is 0.050316845293990865
Training:At training steps 17800, training MLE loss is 0.07893113618246617, train CRF loss is 0.04941190618305759
Training:At training steps 17900, training MLE loss is 0.07600461024236864, train CRF loss is 0.047565382932629775
Training:At training steps 18000, training MLE loss is 0.07425908154555531, train CRF loss is 0.0456146661638482
Validation:At training steps 18000, training MLE loss is 0.07425908154555531, train CRF loss is 0.0456146661638482, validation MLE loss is 6.797506479840529, validation ppl is 895.611, validation CRF loss is 6.881734973505924, validation BLEU is 60.66
Training:At training steps 18100, training MLE loss is 0.07262344812114861, train CRF loss is 0.03925599606525694
Training:At training steps 18200, training MLE loss is 0.06673758022900814, train CRF loss is 0.03835455179508394
Training:At training steps 18300, training MLE loss is 0.06582639840358544, train CRF loss is 0.040590918110766934
Training:At training steps 18400, training MLE loss is 0.0644737996280837, train CRF loss is 0.03942186868672671
Training:At training steps 18500, training MLE loss is 0.0627950158194227, train CRF loss is 0.038207846653740316
Validation:At training steps 18500, training MLE loss is 0.0627950158194227, train CRF loss is 0.038207846653740316, validation MLE loss is 7.060185444982428, validation ppl is 1164.661, validation CRF loss is 7.000014267469707, validation BLEU is 60.21
Training:At training steps 18600, training MLE loss is 0.05516069285360913, train CRF loss is 0.029931640696230062
Training:At training steps 18700, training MLE loss is 0.056814460348339535, train CRF loss is 0.03320457410863476
Training:At training steps 18800, training MLE loss is 0.05614797988292854, train CRF loss is 0.03407946033164445
Training:At training steps 18900, training MLE loss is 0.055496614229970015, train CRF loss is 0.033851847471222135
Training:At training steps 19000, training MLE loss is 0.05601810005526272, train CRF loss is 0.03395858169800792
Validation:At training steps 19000, training MLE loss is 0.05601810005526272, train CRF loss is 0.03395858169800792, validation MLE loss is 7.059527541461744, validation ppl is 1163.895, validation CRF loss is 7.142239925108458, validation BLEU is 60.41
Training:At training steps 19100, training MLE loss is 0.046309531706672825, train CRF loss is 0.028031125452603618
Training:At training steps 19200, training MLE loss is 0.04874246702071417, train CRF loss is 0.028964564599387687
Training:At training steps 19300, training MLE loss is 0.04920159592324457, train CRF loss is 0.02895547699784416
Training:At training steps 19400, training MLE loss is 0.05053058615901108, train CRF loss is 0.02896469634887378
Training:At training steps 19500, training MLE loss is 0.05007314527531938, train CRF loss is 0.02969968796933938
Validation:At training steps 19500, training MLE loss is 0.05007314527531938, train CRF loss is 0.02969968796933938, validation MLE loss is 7.108850212473619, validation ppl is 1222.741, validation CRF loss is 7.137746647784584, validation BLEU is 60.83
Training:At training steps 19600, training MLE loss is 0.045273404680535805, train CRF loss is 0.02658472676064278
Training:At training steps 19700, training MLE loss is 0.04423192378329723, train CRF loss is 0.025553148792518066
Training:At training steps 19800, training MLE loss is 0.043078625478003736, train CRF loss is 0.024793328532852734
Training:At training steps 19900, training MLE loss is 0.04265422873583054, train CRF loss is 0.02417426762656066
Training:At training steps 20000, training MLE loss is 0.04223183189647075, train CRF loss is 0.023708732300586115
Validation:At training steps 20000, training MLE loss is 0.04223183189647075, train CRF loss is 0.023708732300586115, validation MLE loss is 7.4095932527592305, validation ppl is 1651.754, validation CRF loss is 7.4288085134405835, validation BLEU is 59.78
Training:At training steps 20100, training MLE loss is 0.039572962027537247, train CRF loss is 0.020859752461353152
Training:At training steps 20200, training MLE loss is 0.038057362985363025, train CRF loss is 0.021363023391541325
Training:At training steps 20300, training MLE loss is 0.03631223166554796, train CRF loss is 0.02141830156414002
Training:At training steps 20400, training MLE loss is 0.035862812753157905, train CRF loss is 0.021061514723713416
Training:At training steps 20500, training MLE loss is 0.036025846633775976, train CRF loss is 0.02130795253105657
Validation:At training steps 20500, training MLE loss is 0.036025846633775976, train CRF loss is 0.02130795253105657, validation MLE loss is 7.610862565668006, validation ppl is 2020.02, validation CRF loss is 7.522579108413897, validation BLEU is 60.17
Training:At training steps 20600, training MLE loss is 0.035684959553427334, train CRF loss is 0.021830544960980164
Training:At training steps 20700, training MLE loss is 0.033407961451956736, train CRF loss is 0.019008282018664877
Training:At training steps 20800, training MLE loss is 0.0322914581755488, train CRF loss is 0.018589785042764873
Training:At training steps 20900, training MLE loss is 0.03111004438747063, train CRF loss is 0.018985285425386366
Training:At training steps 21000, training MLE loss is 0.030870916181350773, train CRF loss is 0.018939824567242653
Validation:At training steps 21000, training MLE loss is 0.030870916181350773, train CRF loss is 0.018939824567242653, validation MLE loss is 7.733497751386542, validation ppl is 2283.576, validation CRF loss is 7.581159105426387, validation BLEU is 59.6
Training:At training steps 21100, training MLE loss is 0.03136454014078613, train CRF loss is 0.019089281890344312
Training:At training steps 21200, training MLE loss is 0.02741247770129588, train CRF loss is 0.016531603013587453
Training:At training steps 21300, training MLE loss is 0.02604389185905901, train CRF loss is 0.01638649908303097
Training:At training steps 21400, training MLE loss is 0.025502906905643056, train CRF loss is 0.016171279733479742
Training:At training steps 21500, training MLE loss is 0.025089662344703185, train CRF loss is 0.015737646238150625
Validation:At training steps 21500, training MLE loss is 0.025089662344703185, train CRF loss is 0.015737646238150625, validation MLE loss is 7.746689636456339, validation ppl is 2313.9, validation CRF loss is 7.711765047750975, validation BLEU is 59.95
Training:At training steps 21600, training MLE loss is 0.020473437932269432, train CRF loss is 0.012814852464178638
Training:At training steps 21700, training MLE loss is 0.01988071468652092, train CRF loss is 0.012602098214753958
Training:At training steps 21800, training MLE loss is 0.019289680565960265, train CRF loss is 0.012147852797887515
Training:At training steps 21900, training MLE loss is 0.020009223970256205, train CRF loss is 0.012870767126453166
Training:At training steps 22000, training MLE loss is 0.01948030669736085, train CRF loss is 0.012489868526929094
Validation:At training steps 22000, training MLE loss is 0.01948030669736085, train CRF loss is 0.012489868526929094, validation MLE loss is 8.019413116731142, validation ppl is 3039.393, validation CRF loss is 7.955217863384046, validation BLEU is 60.63
Training:At training steps 22100, training MLE loss is 0.02116915347910537, train CRF loss is 0.013076029557757449
Training:At training steps 22200, training MLE loss is 0.019739185121350634, train CRF loss is 0.012144379427563022
Training:At training steps 22300, training MLE loss is 0.01850870699027708, train CRF loss is 0.011797642291906146
Training:At training steps 22400, training MLE loss is 0.01794858765412237, train CRF loss is 0.011371721621012227
Training:At training steps 22500, training MLE loss is 0.016776499808091778, train CRF loss is 0.010570873251497996
Validation:At training steps 22500, training MLE loss is 0.016776499808091778, train CRF loss is 0.010570873251497996, validation MLE loss is 8.125297075823733, validation ppl is 3378.872, validation CRF loss is 8.096541134934677, validation BLEU is 59.82
Training:At training steps 22600, training MLE loss is 0.01223470647067932, train CRF loss is 0.00694564135903196
Training:At training steps 22700, training MLE loss is 0.011570428106290694, train CRF loss is 0.006890862259396741
Training:At training steps 22800, training MLE loss is 0.011509797823188693, train CRF loss is 0.006771619516059613
Training:At training steps 22900, training MLE loss is 0.01168060280869328, train CRF loss is 0.006889529956744642
Training:At training steps 23000, training MLE loss is 0.011272812286167243, train CRF loss is 0.006765381383093118
Validation:At training steps 23000, training MLE loss is 0.011272812286167243, train CRF loss is 0.006765381383093118, validation MLE loss is 8.236892267277366, validation ppl is 3777.782, validation CRF loss is 8.259434420811502, validation BLEU is 60.38
Training:At training steps 23100, training MLE loss is 0.010306856813612092, train CRF loss is 0.005246984092862342
Training:At training steps 23200, training MLE loss is 0.011325509233966358, train CRF loss is 0.0065131467151991985
Training:At training steps 23300, training MLE loss is 0.010790214041382458, train CRF loss is 0.005987795363325669
Training:At training steps 23400, training MLE loss is 0.010161757937046233, train CRF loss is 0.005878848037500999
Training:At training steps 23500, training MLE loss is 0.009689154436918871, train CRF loss is 0.005565631667222091
Validation:At training steps 23500, training MLE loss is 0.009689154436918871, train CRF loss is 0.005565631667222091, validation MLE loss is 8.496901876048037, validation ppl is 4899.566, validation CRF loss is 8.441153432193556, validation BLEU is 60.9
Training:At training steps 23600, training MLE loss is 0.007066319723043422, train CRF loss is 0.0037314332754575163
Training:At training steps 23700, training MLE loss is 0.007858260082173772, train CRF loss is 0.004480224244153543
Training:At training steps 23800, training MLE loss is 0.007812076380715263, train CRF loss is 0.004670692200767981
Training:At training steps 23900, training MLE loss is 0.007844829980840287, train CRF loss is 0.004745924783393277
Training:At training steps 24000, training MLE loss is 0.007597822827986308, train CRF loss is 0.004495660908040965
Validation:At training steps 24000, training MLE loss is 0.007597822827986308, train CRF loss is 0.004495660908040965, validation MLE loss is 8.660868058079167, validation ppl is 5772.543, validation CRF loss is 8.612270587369016, validation BLEU is 60.79
Training:At training steps 24100, training MLE loss is 0.0070173636745281265, train CRF loss is 0.004356562985352328
Training:At training steps 24200, training MLE loss is 0.006859469161618639, train CRF loss is 0.00439284950008815
Training:At training steps 24300, training MLE loss is 0.006802501208639503, train CRF loss is 0.004397574663161563
Training:At training steps 24400, training MLE loss is 0.006449612600239511, train CRF loss is 0.004122235058748638
Training:At training steps 24500, training MLE loss is 0.006137589003334291, train CRF loss is 0.004018123426108364
Validation:At training steps 24500, training MLE loss is 0.006137589003334291, train CRF loss is 0.004018123426108364, validation MLE loss is 8.692100242564553, validation ppl is 5955.677, validation CRF loss is 8.62780975668054, validation BLEU is 61.51
Training:At training steps 24600, training MLE loss is 0.0057054928535401825, train CRF loss is 0.0027478225962317814
Training:At training steps 24700, training MLE loss is 0.004846717523529005, train CRF loss is 0.00280162515957874
Training:At training steps 24800, training MLE loss is 0.004407260143128907, train CRF loss is 0.002640772281311503
Training:At training steps 24900, training MLE loss is 0.004386081729684407, train CRF loss is 0.0026950924405634404
Training:At training steps 25000, training MLE loss is 0.004257964514918999, train CRF loss is 0.0025356991918154994
Validation:At training steps 25000, training MLE loss is 0.004257964514918999, train CRF loss is 0.0025356991918154994, validation MLE loss is 8.868985477246737, validation ppl is 7108.066, validation CRF loss is 8.775565282294625, validation BLEU is 59.87
Training:At training steps 25100, training MLE loss is 0.004276390907955646, train CRF loss is 0.002057032702853032
Training:At training steps 25200, training MLE loss is 0.004680996004177851, train CRF loss is 0.002517517079940472
Training:At training steps 25300, training MLE loss is 0.004275634161467988, train CRF loss is 0.002417470187541942
Training:At training steps 25400, training MLE loss is 0.003988246120539593, train CRF loss is 0.0023222679181953564
Training:At training steps 25500, training MLE loss is 0.003917544523464768, train CRF loss is 0.0022772082197503336
Validation:At training steps 25500, training MLE loss is 0.003917544523464768, train CRF loss is 0.0022772082197503336, validation MLE loss is 9.046151324322349, validation ppl is 8485.816, validation CRF loss is 8.938045501708984, validation BLEU is 60.55
Training:At training steps 25600, training MLE loss is 0.003695287802101206, train CRF loss is 0.002007682339541086
Training:At training steps 25700, training MLE loss is 0.003686504942831402, train CRF loss is 0.002156487573070165
Training:At training steps 25800, training MLE loss is 0.0038509904541634693, train CRF loss is 0.0021472200794332504
Training:At training steps 25900, training MLE loss is 0.0037327702297518433, train CRF loss is 0.002120983296036967
Training:At training steps 26000, training MLE loss is 0.003445652382748268, train CRF loss is 0.0019887349954194724
Validation:At training steps 26000, training MLE loss is 0.003445652382748268, train CRF loss is 0.0019887349954194724, validation MLE loss is 8.979322307988218, validation ppl is 7937.251, validation CRF loss is 9.004903849802519, validation BLEU is 60.79
Training:At training steps 26100, training MLE loss is 0.0025502396419998293, train CRF loss is 0.0018595346565427517
Training:At training steps 26200, training MLE loss is 0.002678074081484554, train CRF loss is 0.0018551876786299236
Training:At training steps 26300, training MLE loss is 0.002820048095605587, train CRF loss is 0.001963488628054143
Training:At training steps 26400, training MLE loss is 0.0028011879187021405, train CRF loss is 0.001848061942515632
Training:At training steps 26500, training MLE loss is 0.002648124346560041, train CRF loss is 0.001723541238822154
Validation:At training steps 26500, training MLE loss is 0.002648124346560041, train CRF loss is 0.001723541238822154, validation MLE loss is 9.158129346998114, validation ppl is 9491.286, validation CRF loss is 9.180725044325778, validation BLEU is 60.45
Training:At training steps 26600, training MLE loss is 0.003280944286687454, train CRF loss is 0.0021385078888925247
Training:At training steps 26700, training MLE loss is 0.003015490714264705, train CRF loss is 0.0018760366428638875
Training:At training steps 26800, training MLE loss is 0.0030056558494925014, train CRF loss is 0.0018494342755577353
Training:At training steps 26900, training MLE loss is 0.0027697367925134425, train CRF loss is 0.0017007148355216916
Training:At training steps 27000, training MLE loss is 0.0025424340207581566, train CRF loss is 0.0015408151578929505
Validation:At training steps 27000, training MLE loss is 0.0025424340207581566, train CRF loss is 0.0015408151578929505, validation MLE loss is 9.217054950563531, validation ppl is 10067.372, validation CRF loss is 9.194199750297948, validation BLEU is 60.75
Training:At training steps 27100, training MLE loss is 0.0024642269783236217, train CRF loss is 0.0013858834581064094
Training:At training steps 27200, training MLE loss is 0.0023224512495607095, train CRF loss is 0.0012348800447995578
Training:At training steps 27300, training MLE loss is 0.001991923827420419, train CRF loss is 0.001095751971369731
Training:At training steps 27400, training MLE loss is 0.0018699756907674214, train CRF loss is 0.0010458271624301753
Training:At training steps 27500, training MLE loss is 0.0018247642553186865, train CRF loss is 0.0010403937279013452
Validation:At training steps 27500, training MLE loss is 0.0018247642553186865, train CRF loss is 0.0010403937279013452, validation MLE loss is 9.2837056988164, validation ppl is 10761.236, validation CRF loss is 9.238595096688522, validation BLEU is 60.09
Training:At training steps 27600, training MLE loss is 0.0015212469740340556, train CRF loss is 0.0006322161513428703
Training:At training steps 27700, training MLE loss is 0.0019949345825185797, train CRF loss is 0.0009705125635514578
Training:At training steps 27800, training MLE loss is 0.002350737611269068, train CRF loss is 0.0010980545701740807
Training:At training steps 27900, training MLE loss is 0.0021628422794889727, train CRF loss is 0.0011252578635290267
Training:At training steps 28000, training MLE loss is 0.001989857983789544, train CRF loss is 0.0011109737622708176
Validation:At training steps 28000, training MLE loss is 0.001989857983789544, train CRF loss is 0.0011109737622708176, validation MLE loss is 9.190084501316672, validation ppl is 9799.479, validation CRF loss is 9.175359032656017, validation BLEU is 61.36
Training:At training steps 28100, training MLE loss is 0.001355675436137612, train CRF loss is 0.0008138005192752384
Training:At training steps 28200, training MLE loss is 0.0013193811431166558, train CRF loss is 0.0007650522746707189
Training:At training steps 28300, training MLE loss is 0.0013668041602224541, train CRF loss is 0.0007625109991045805
Training:At training steps 28400, training MLE loss is 0.001447613463830034, train CRF loss is 0.0008651816853351502
Training:At training steps 28500, training MLE loss is 0.001460378371255245, train CRF loss is 0.0008422372894100211
Validation:At training steps 28500, training MLE loss is 0.001460378371255245, train CRF loss is 0.0008422372894100211, validation MLE loss is 9.364394947102195, validation ppl is 11665.545, validation CRF loss is 9.285821826834427, validation BLEU is 60.87
Training:At training steps 28600, training MLE loss is 0.001040381965453372, train CRF loss is 0.0006537775008401781
Training:At training steps 28700, training MLE loss is 0.001070986628361712, train CRF loss is 0.000693329298942742
Training:At training steps 28800, training MLE loss is 0.001206288527745656, train CRF loss is 0.0008154735138696371
Training:At training steps 28900, training MLE loss is 0.0011772760155807332, train CRF loss is 0.0008059490488670407
Training:At training steps 29000, training MLE loss is 0.001183729060305046, train CRF loss is 0.000792020517777396
Validation:At training steps 29000, training MLE loss is 0.001183729060305046, train CRF loss is 0.000792020517777396, validation MLE loss is 9.413018050946688, validation ppl is 12246.777, validation CRF loss is 9.338696580184134, validation BLEU is 61.38
Training:At training steps 29100, training MLE loss is 0.0009632638271033717, train CRF loss is 0.0005157232372483644
Training:At training steps 29200, training MLE loss is 0.0009516951984891758, train CRF loss is 0.00047316838173917076
Training:At training steps 29300, training MLE loss is 0.0010882861904523327, train CRF loss is 0.000569401514582597
Training:At training steps 29400, training MLE loss is 0.0010397495494490856, train CRF loss is 0.0005568942791559061
Training:At training steps 29500, training MLE loss is 0.0010803206638199394, train CRF loss is 0.0005930825790176533
Validation:At training steps 29500, training MLE loss is 0.0010803206638199394, train CRF loss is 0.0005930825790176533, validation MLE loss is 9.535820402597126, validation ppl is 13846.952, validation CRF loss is 9.457702605347885, validation BLEU is 60.93
Training:At training steps 29600, training MLE loss is 0.0014837551363592711, train CRF loss is 0.0010817649093758108
Training:At training steps 29700, training MLE loss is 0.0013780047825735816, train CRF loss is 0.0008551681821887324
Training:At training steps 29800, training MLE loss is 0.0013078833550590194, train CRF loss is 0.0008332984453542054
Training:At training steps 29900, training MLE loss is 0.0012355600724511776, train CRF loss is 0.0007488150734724853
Training:At training steps 30000, training MLE loss is 0.0011405022574334138, train CRF loss is 0.000727129353450219
Validation:At training steps 30000, training MLE loss is 0.0011405022574334138, train CRF loss is 0.000727129353450219, validation MLE loss is 9.638162023142764, validation ppl is 15339.125, validation CRF loss is 9.572040677070618, validation BLEU is 61.08
Training:At training steps 30100, training MLE loss is 0.0010453180960561961, train CRF loss is 0.0006713166089435507
Training:At training steps 30200, training MLE loss is 0.0010932583183681243, train CRF loss is 0.0007488170705855679
Training:At training steps 30300, training MLE loss is 0.0009876335653621704, train CRF loss is 0.0006063619781037962
Training:At training steps 30400, training MLE loss is 0.0010151631255071332, train CRF loss is 0.0005939314444131616
Training:At training steps 30500, training MLE loss is 0.0009690209511304897, train CRF loss is 0.0005670660688312275
Validation:At training steps 30500, training MLE loss is 0.0009690209511304897, train CRF loss is 0.0005670660688312275, validation MLE loss is 9.64294675776833, validation ppl is 15412.694, validation CRF loss is 9.594075331562443, validation BLEU is 61.78
Training:At training steps 30600, training MLE loss is 0.0010422022035198536, train CRF loss is 0.0007663870685691121
Training:At training steps 30700, training MLE loss is 0.0009263238636777597, train CRF loss is 0.0006129482997008239
Training:At training steps 30800, training MLE loss is 0.0008159712588062943, train CRF loss is 0.000510191772277994
Training:At training steps 30900, training MLE loss is 0.0008200029285304683, train CRF loss is 0.0005054999277120775
Training:At training steps 31000, training MLE loss is 0.0007810352526148085, train CRF loss is 0.0004441984309751028
Validation:At training steps 31000, training MLE loss is 0.0007810352526148085, train CRF loss is 0.0004441984309751028, validation MLE loss is 9.60170328617096, validation ppl is 14789.952, validation CRF loss is 9.598769893771724, validation BLEU is 61.27
Training:At training steps 31100, training MLE loss is 0.0010024526842388436, train CRF loss is 0.00034850632782057467
Training:At training steps 31200, training MLE loss is 0.0008863368745682977, train CRF loss is 0.00042416467435338887
Training:At training steps 31300, training MLE loss is 0.0007970814968997844, train CRF loss is 0.00044863110597271217
Training:At training steps 31400, training MLE loss is 0.0008369685496476101, train CRF loss is 0.00044121720447424015
Training:At training steps 31500, training MLE loss is 0.0008130812914816192, train CRF loss is 0.0004989881868291874
Validation:At training steps 31500, training MLE loss is 0.0008130812914816192, train CRF loss is 0.0004989881868291874, validation MLE loss is 9.559792800953513, validation ppl is 14182.907, validation CRF loss is 9.558710637845492, validation BLEU is 60.11
Training:At training steps 31600, training MLE loss is 0.0007735083233046844, train CRF loss is 0.0006124676830949927
Training:At training steps 31700, training MLE loss is 0.0007086367730287556, train CRF loss is 0.0005227509958066512
Training:At training steps 31800, training MLE loss is 0.0005549361868524583, train CRF loss is 0.00045517499430632806
Training:At training steps 31900, training MLE loss is 0.0004953277528529275, train CRF loss is 0.00040074402805074795
Training:At training steps 32000, training MLE loss is 0.0004762014007636294, train CRF loss is 0.00038902674164893367
Validation:At training steps 32000, training MLE loss is 0.0004762014007636294, train CRF loss is 0.00038902674164893367, validation MLE loss is 9.697299166729575, validation ppl is 16273.596, validation CRF loss is 9.630972034052798, validation BLEU is 60.71
Training:At training steps 32100, training MLE loss is 0.0007169062373930257, train CRF loss is 0.00034359939823747077
Training:At training steps 32200, training MLE loss is 0.0006314806888474024, train CRF loss is 0.0003206885018435579
Training:At training steps 32300, training MLE loss is 0.0006245398279402938, train CRF loss is 0.00030445022689677676
Training:At training steps 32400, training MLE loss is 0.0006447768024629768, train CRF loss is 0.00032632104812216014
Training:At training steps 32500, training MLE loss is 0.000599495260156036, train CRF loss is 0.0003149160504347197
Validation:At training steps 32500, training MLE loss is 0.000599495260156036, train CRF loss is 0.0003149160504347197, validation MLE loss is 9.76208435861688, validation ppl is 17362.784, validation CRF loss is 9.620432721941095, validation BLEU is 60.37
Training:At training steps 32600, training MLE loss is 0.0005047448865399674, train CRF loss is 0.0003014114464217732
Training:At training steps 32700, training MLE loss is 0.000504623145778383, train CRF loss is 0.0003451547853642989
Training:At training steps 32800, training MLE loss is 0.0004996167254043715, train CRF loss is 0.00031857095268276553
Training:At training steps 32900, training MLE loss is 0.0005115694277385758, train CRF loss is 0.000340487821089277
Training:At training steps 33000, training MLE loss is 0.0004711385574026314, train CRF loss is 0.0003265254509905944
Validation:At training steps 33000, training MLE loss is 0.0004711385574026314, train CRF loss is 0.0003265254509905944, validation MLE loss is 9.802544869874653, validation ppl is 18079.697, validation CRF loss is 9.754774984560514, validation BLEU is 60.96
Training:At training steps 33100, training MLE loss is 0.0003206946495297477, train CRF loss is 0.00026200362993352046
Training:At training steps 33200, training MLE loss is 0.0003680654768654538, train CRF loss is 0.00027268886081520226
Training:At training steps 33300, training MLE loss is 0.0003209060027795585, train CRF loss is 0.0002244330437674617
Training:At training steps 33400, training MLE loss is 0.0003419376430238053, train CRF loss is 0.0002093135209431729
Training:At training steps 33500, training MLE loss is 0.00034404167307130264, train CRF loss is 0.0002231658613797123
Validation:At training steps 33500, training MLE loss is 0.00034404167307130264, train CRF loss is 0.0002231658613797123, validation MLE loss is 9.747199133822793, validation ppl is 17106.249, validation CRF loss is 9.707024530360574, validation BLEU is 60.61
Training:At training steps 33600, training MLE loss is 0.0005807600548155445, train CRF loss is 0.0006224867083561137
Training:At training steps 33700, training MLE loss is 0.000447318074102368, train CRF loss is 0.00041478204415379416
Training:At training steps 33800, training MLE loss is 0.00045423014069865126, train CRF loss is 0.0003354489022644458
Training:At training steps 33900, training MLE loss is 0.000484105415071455, train CRF loss is 0.00036244906341274086
Training:At training steps 34000, training MLE loss is 0.00044334265394553824, train CRF loss is 0.0003227237697323657
Validation:At training steps 34000, training MLE loss is 0.00044334265394553824, train CRF loss is 0.0003227237697323657, validation MLE loss is 9.79210923219982, validation ppl is 17892.005, validation CRF loss is 9.702482348994204, validation BLEU is 61.13
Training:At training steps 34100, training MLE loss is 0.00031183010329565266, train CRF loss is 0.00014565413327529697
Training:At training steps 34200, training MLE loss is 0.0002504202316216523, train CRF loss is 9.327365677042065e-05
Training:At training steps 34300, training MLE loss is 0.0002934207678663095, train CRF loss is 9.943798958122289e-05
Training:At training steps 34400, training MLE loss is 0.0003500906563731089, train CRF loss is 0.00016833506415514954
Training:At training steps 34500, training MLE loss is 0.00034435750003175645, train CRF loss is 0.00018208915727655216
Validation:At training steps 34500, training MLE loss is 0.00034435750003175645, train CRF loss is 0.00018208915727655216, validation MLE loss is 9.89977174683621, validation ppl is 19925.822, validation CRF loss is 9.794579091824984, validation BLEU is 60.76
Training:At training steps 34600, training MLE loss is 0.0002636633473037567, train CRF loss is 0.00024900119717506273
Training:At training steps 34700, training MLE loss is 0.00025391262717864313, train CRF loss is 0.000182000255088679
Training:At training steps 34800, training MLE loss is 0.0002923463664419103, train CRF loss is 0.0002452532546318779
Training:At training steps 34900, training MLE loss is 0.0003487283387240494, train CRF loss is 0.0002252171974443895
Training:At training steps 35000, training MLE loss is 0.00034164932457156035, train CRF loss is 0.00024263661139618798
Validation:At training steps 35000, training MLE loss is 0.00034164932457156035, train CRF loss is 0.00024263661139618798, validation MLE loss is 9.833255987418326, validation ppl is 18643.559, validation CRF loss is 9.784394364607962, validation BLEU is 61.37
Training:At training steps 35100, training MLE loss is 0.0004638118056986688, train CRF loss is 0.0004239800738770949
Training:At training steps 35200, training MLE loss is 0.0003690349797205517, train CRF loss is 0.0003490947002754363
Training:At training steps 35300, training MLE loss is 0.00043481906550299157, train CRF loss is 0.0003841657892341906
Training:At training steps 35400, training MLE loss is 0.0003786025942658158, train CRF loss is 0.000305971493970757
Training:At training steps 35500, training MLE loss is 0.000337441289732618, train CRF loss is 0.0002705695741596408
Validation:At training steps 35500, training MLE loss is 0.000337441289732618, train CRF loss is 0.0002705695741596408, validation MLE loss is 9.810878960709823, validation ppl is 18231.004, validation CRF loss is 9.708735466003418, validation BLEU is 60.72
Training:At training steps 35600, training MLE loss is 0.00018434581248285884, train CRF loss is 0.00014923189772861
Training:At training steps 35700, training MLE loss is 0.00024261546163567146, train CRF loss is 0.0001688014976862462
Training:At training steps 35800, training MLE loss is 0.00030610100666153117, train CRF loss is 0.00019248613491228638
Training:At training steps 35900, training MLE loss is 0.00034967513868792364, train CRF loss is 0.00030408344654581754
Training:At training steps 36000, training MLE loss is 0.0003342243400093648, train CRF loss is 0.00027033660768220445
Validation:At training steps 36000, training MLE loss is 0.0003342243400093648, train CRF loss is 0.00027033660768220445, validation MLE loss is 9.814719909115842, validation ppl is 18301.163, validation CRF loss is 9.780684558968796, validation BLEU is 61.61
Training:At training steps 36100, training MLE loss is 0.00033675461568847057, train CRF loss is 0.00022124934467201206
Training:At training steps 36200, training MLE loss is 0.0003342222363536299, train CRF loss is 0.0001902238944702761
Training:At training steps 36300, training MLE loss is 0.0002686133421439212, train CRF loss is 0.00017215398281522187
Training:At training steps 36400, training MLE loss is 0.00024442346740250663, train CRF loss is 0.000172232037970369
Training:At training steps 36500, training MLE loss is 0.0002618969559905559, train CRF loss is 0.00017067206627161902
Validation:At training steps 36500, training MLE loss is 0.0002618969559905559, train CRF loss is 0.00017067206627161902, validation MLE loss is 9.763341847218966, validation ppl is 17384.631, validation CRF loss is 9.730069925910549, validation BLEU is 61.9
Training:At training steps 36600, training MLE loss is 0.00040980200796927933, train CRF loss is 0.00030722009392947225
Training:At training steps 36700, training MLE loss is 0.00033810234711966787, train CRF loss is 0.0002790105632355688
Training:At training steps 36800, training MLE loss is 0.00028368587709805634, train CRF loss is 0.0002273717802820648
Training:At training steps 36900, training MLE loss is 0.00025570755701597976, train CRF loss is 0.0001954674042260729
Training:At training steps 37000, training MLE loss is 0.00025138348924766494, train CRF loss is 0.00018296825597635813
Validation:At training steps 37000, training MLE loss is 0.00025138348924766494, train CRF loss is 0.00018296825597635813, validation MLE loss is 9.861475731197157, validation ppl is 19177.169, validation CRF loss is 9.794927590771726, validation BLEU is 60.98
Training:At training steps 37100, training MLE loss is 0.0003662445870105002, train CRF loss is 0.00023677196777711185
Training:At training steps 37200, training MLE loss is 0.0003242914462790827, train CRF loss is 0.00019435707976162898
Training:At training steps 37300, training MLE loss is 0.00026858285360910155, train CRF loss is 0.00018256774709154064
Training:At training steps 37400, training MLE loss is 0.0002487923845941029, train CRF loss is 0.0001823131314608306
Training:At training steps 37500, training MLE loss is 0.0002187238964552609, train CRF loss is 0.00015900399560955504
Validation:At training steps 37500, training MLE loss is 0.0002187238964552609, train CRF loss is 0.00015900399560955504, validation MLE loss is 9.826099514961243, validation ppl is 18510.613, validation CRF loss is 9.722260450061999, validation BLEU is 59.71
Training:At training steps 37600, training MLE loss is 9.415500999028208e-05, train CRF loss is 3.738037257478233e-05
Training:At training steps 37700, training MLE loss is 8.452256476053279e-05, train CRF loss is 2.9755953776970935e-05
Training:At training steps 37800, training MLE loss is 9.496620114859145e-05, train CRF loss is 3.735455193809134e-05
Training:At training steps 37900, training MLE loss is 8.450326373417912e-05, train CRF loss is 3.008256786196206e-05
Training:At training steps 38000, training MLE loss is 7.198398708339278e-05, train CRF loss is 2.8265512250309043e-05
Validation:At training steps 38000, training MLE loss is 7.198398708339278e-05, train CRF loss is 2.8265512250309043e-05, validation MLE loss is 9.803883816066541, validation ppl is 18103.921, validation CRF loss is 9.747668109442058, validation BLEU is 60.74
Training:At training steps 38100, training MLE loss is 0.00018803852216444012, train CRF loss is 0.00018965335954881368
Training:At training steps 38200, training MLE loss is 0.00019148612144174494, train CRF loss is 0.00016804112534827054
Training:At training steps 38300, training MLE loss is 0.00016899031332653992, train CRF loss is 0.0001370057471738558
Training:At training steps 38400, training MLE loss is 0.00016323141850236144, train CRF loss is 0.00010741043791386629
Training:At training steps 38500, training MLE loss is 0.0001382098434989501, train CRF loss is 8.857802377003843e-05
Validation:At training steps 38500, training MLE loss is 0.0001382098434989501, train CRF loss is 8.857802377003843e-05, validation MLE loss is 9.8071304245999, validation ppl is 18162.793, validation CRF loss is 9.740335232333132, validation BLEU is 60.75
Training:At training steps 38600, training MLE loss is 0.0001097190903363933, train CRF loss is 8.600163392089578e-06
Training:At training steps 38700, training MLE loss is 0.000161046015741464, train CRF loss is 4.161725205847899e-05
Training:At training steps 38800, training MLE loss is 0.0001304383556190819, train CRF loss is 2.848416190206038e-05
Training:At training steps 38900, training MLE loss is 0.00011836733595619399, train CRF loss is 2.952491579929384e-05
Training:At training steps 39000, training MLE loss is 0.00010724069502994215, train CRF loss is 3.1202775502459975e-05
Validation:At training steps 39000, training MLE loss is 0.00010724069502994215, train CRF loss is 3.1202775502459975e-05, validation MLE loss is 9.848012447357178, validation ppl is 18920.711, validation CRF loss is 9.797776793178759, validation BLEU is 60.92
Training:At training steps 39100, training MLE loss is 0.0001985062393567255, train CRF loss is 8.773445740856812e-05
Training:At training steps 39200, training MLE loss is 0.00018349392219061287, train CRF loss is 7.909885964910934e-05
Training:At training steps 39300, training MLE loss is 0.00014781230619068138, train CRF loss is 8.46070118458971e-05
Training:At training steps 39400, training MLE loss is 0.0001553739638361043, train CRF loss is 7.595514120468771e-05
Training:At training steps 39500, training MLE loss is 0.00017375867247594994, train CRF loss is 7.804126011314772e-05
Validation:At training steps 39500, training MLE loss is 0.00017375867247594994, train CRF loss is 7.804126011314772e-05, validation MLE loss is 9.840300271385594, validation ppl is 18775.353, validation CRF loss is 9.823768609448484, validation BLEU is 61.02
Training:At training steps 39600, training MLE loss is 6.894660001579219e-05, train CRF loss is 6.698218894026554e-05
Training:At training steps 39700, training MLE loss is 4.122589226051788e-05, train CRF loss is 3.760001469586216e-05
Training:At training steps 39800, training MLE loss is 0.00012044916037083075, train CRF loss is 5.708582668905245e-05
Training:At training steps 39900, training MLE loss is 0.00011010359424189809, train CRF loss is 4.7396103000303394e-05
Training:At training steps 40000, training MLE loss is 0.00010572806712051858, train CRF loss is 4.0423374469828134e-05
Validation:At training steps 40000, training MLE loss is 0.00010572806712051858, train CRF loss is 4.0423374469828134e-05, validation MLE loss is 9.885999547807794, validation ppl is 19653.28, validation CRF loss is 9.837015082961635, validation BLEU is 61.1