2022-09-22 02:59:26,802 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:26,806 Model: "SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings(
      'pt'
      (embedding): Embedding(592108, 300)
    )
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.5, inplace=False)
        (encoder): Embedding(275, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=275, bias=True)
      )
    )
    (list_embedding_2): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.5, inplace=False)
        (encoder): Embedding(275, 100)
        (rnn): LSTM(100, 2048)
        (decoder): Linear(in_features=2048, out_features=275, bias=True)
      )
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=4396, out_features=4396, bias=True)
  (rnn): LSTM(4396, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=31, bias=True)
  (loss_function): ViterbiLoss()
  (crf): CRF()
)"
2022-09-22 02:59:26,810 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:26,812 Corpus: "Corpus: 6667 train + 1429 dev + 1430 test sentences"
2022-09-22 02:59:26,815 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:26,816 Parameters:
2022-09-22 02:59:26,818  - learning_rate: "0.100000"
2022-09-22 02:59:26,820  - mini_batch_size: "32"
2022-09-22 02:59:26,822  - patience: "3"
2022-09-22 02:59:26,824  - anneal_factor: "0.5"
2022-09-22 02:59:26,826  - max_epochs: "70"
2022-09-22 02:59:26,828  - shuffle: "True"
2022-09-22 02:59:26,830  - train_with_dev: "False"
2022-09-22 02:59:26,832  - batch_growth_annealing: "False"
2022-09-22 02:59:26,834 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:26,836 Model training base path: "resources/taggers/sota-ner-flair"
2022-09-22 02:59:26,838 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:26,840 Device: cuda:0
2022-09-22 02:59:26,842 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:26,844 Embeddings storage mode: cpu
2022-09-22 02:59:26,846 ----------------------------------------------------------------------------------------------------
2022-09-22 02:59:35,823 epoch 1 - iter 20/209 - loss 1.13410593 - samples/sec: 71.33 - lr: 0.100000
2022-09-22 02:59:48,046 epoch 1 - iter 40/209 - loss 0.71784978 - samples/sec: 52.38 - lr: 0.100000
2022-09-22 02:59:57,074 epoch 1 - iter 60/209 - loss 0.61528243 - samples/sec: 70.93 - lr: 0.100000
2022-09-22 03:00:06,243 epoch 1 - iter 80/209 - loss 0.53293891 - samples/sec: 69.84 - lr: 0.100000
2022-09-22 03:00:16,481 epoch 1 - iter 100/209 - loss 0.46878947 - samples/sec: 62.54 - lr: 0.100000
2022-09-22 03:00:26,225 epoch 1 - iter 120/209 - loss 0.43495573 - samples/sec: 65.72 - lr: 0.100000
2022-09-22 03:00:35,107 epoch 1 - iter 140/209 - loss 0.40955810 - samples/sec: 72.09 - lr: 0.100000
2022-09-22 03:00:44,083 epoch 1 - iter 160/209 - loss 0.38994258 - samples/sec: 71.33 - lr: 0.100000
2022-09-22 03:00:54,927 epoch 1 - iter 180/209 - loss 0.36711456 - samples/sec: 59.05 - lr: 0.100000
2022-09-22 03:01:04,767 epoch 1 - iter 200/209 - loss 0.34815392 - samples/sec: 65.08 - lr: 0.100000
2022-09-22 03:01:09,259 ----------------------------------------------------------------------------------------------------
2022-09-22 03:01:09,261 EPOCH 1 done: loss 0.3455 - lr 0.100000
2022-09-22 03:01:31,421 Evaluating as a multi-label problem: False
2022-09-22 03:01:31,444 DEV : loss 0.16570232808589935 - f1-score (micro avg) 0.4029
2022-09-22 03:01:31,573 BAD EPOCHS (no improvement): 0
2022-09-22 03:01:31,577 saving best model
2022-09-22 03:01:36,030 ----------------------------------------------------------------------------------------------------
2022-09-22 03:01:41,307 epoch 2 - iter 20/209 - loss 0.16969945 - samples/sec: 122.21 - lr: 0.100000
2022-09-22 03:01:45,673 epoch 2 - iter 40/209 - loss 0.17413153 - samples/sec: 146.75 - lr: 0.100000
2022-09-22 03:01:50,856 epoch 2 - iter 60/209 - loss 0.16120805 - samples/sec: 123.61 - lr: 0.100000
2022-09-22 03:01:55,466 epoch 2 - iter 80/209 - loss 0.15250645 - samples/sec: 138.99 - lr: 0.100000
2022-09-22 03:01:59,781 epoch 2 - iter 100/209 - loss 0.14887640 - samples/sec: 148.50 - lr: 0.100000
2022-09-22 03:02:03,887 epoch 2 - iter 120/209 - loss 0.14730730 - samples/sec: 156.07 - lr: 0.100000
2022-09-22 03:02:07,935 epoch 2 - iter 140/209 - loss 0.14649781 - samples/sec: 158.30 - lr: 0.100000
2022-09-22 03:02:12,389 epoch 2 - iter 160/209 - loss 0.14828806 - samples/sec: 143.83 - lr: 0.100000
2022-09-22 03:02:16,377 epoch 2 - iter 180/209 - loss 0.14447967 - samples/sec: 160.67 - lr: 0.100000
2022-09-22 03:02:21,082 epoch 2 - iter 200/209 - loss 0.13988174 - samples/sec: 136.21 - lr: 0.100000
2022-09-22 03:02:22,836 ----------------------------------------------------------------------------------------------------
2022-09-22 03:02:22,839 EPOCH 2 done: loss 0.1393 - lr 0.100000
2022-09-22 03:02:34,005 Evaluating as a multi-label problem: False
2022-09-22 03:02:34,026 DEV : loss 0.09656457602977753 - f1-score (micro avg) 0.6174
2022-09-22 03:02:34,157 BAD EPOCHS (no improvement): 0
2022-09-22 03:02:34,160 saving best model
2022-09-22 03:02:38,504 ----------------------------------------------------------------------------------------------------
2022-09-22 03:02:43,414 epoch 3 - iter 20/209 - loss 0.13227681 - samples/sec: 130.61 - lr: 0.100000
2022-09-22 03:02:48,622 epoch 3 - iter 40/209 - loss 0.11212625 - samples/sec: 123.01 - lr: 0.100000
2022-09-22 03:02:52,549 epoch 3 - iter 60/209 - loss 0.11799034 - samples/sec: 163.14 - lr: 0.100000
2022-09-22 03:02:56,771 epoch 3 - iter 80/209 - loss 0.11820362 - samples/sec: 151.78 - lr: 0.100000
2022-09-22 03:03:01,536 epoch 3 - iter 100/209 - loss 0.11197771 - samples/sec: 134.42 - lr: 0.100000
2022-09-22 03:03:06,360 epoch 3 - iter 120/209 - loss 0.10942162 - samples/sec: 132.85 - lr: 0.100000
2022-09-22 03:03:11,404 epoch 3 - iter 140/209 - loss 0.10868191 - samples/sec: 127.01 - lr: 0.100000
2022-09-22 03:03:15,003 epoch 3 - iter 160/209 - loss 0.10540559 - samples/sec: 178.08 - lr: 0.100000
2022-09-22 03:03:19,070 epoch 3 - iter 180/209 - loss 0.10467736 - samples/sec: 157.57 - lr: 0.100000
2022-09-22 03:03:23,862 epoch 3 - iter 200/209 - loss 0.10299970 - samples/sec: 133.70 - lr: 0.100000
2022-09-22 03:03:25,488 ----------------------------------------------------------------------------------------------------
2022-09-22 03:03:25,491 EPOCH 3 done: loss 0.1014 - lr 0.100000
2022-09-22 03:03:36,585 Evaluating as a multi-label problem: False
2022-09-22 03:03:36,606 DEV : loss 0.0682184025645256 - f1-score (micro avg) 0.7567
2022-09-22 03:03:36,736 BAD EPOCHS (no improvement): 0
2022-09-22 03:03:36,740 saving best model
2022-09-22 03:03:41,082 ----------------------------------------------------------------------------------------------------
2022-09-22 03:03:45,687 epoch 4 - iter 20/209 - loss 0.09960185 - samples/sec: 139.22 - lr: 0.100000
2022-09-22 03:03:49,819 epoch 4 - iter 40/209 - loss 0.08214147 - samples/sec: 155.17 - lr: 0.100000
2022-09-22 03:03:54,482 epoch 4 - iter 60/209 - loss 0.08589161 - samples/sec: 137.37 - lr: 0.100000
2022-09-22 03:03:59,028 epoch 4 - iter 80/209 - loss 0.08516185 - samples/sec: 140.94 - lr: 0.100000
2022-09-22 03:04:03,151 epoch 4 - iter 100/209 - loss 0.08198608 - samples/sec: 155.46 - lr: 0.100000
2022-09-22 03:04:07,339 epoch 4 - iter 120/209 - loss 0.07909205 - samples/sec: 152.99 - lr: 0.100000
2022-09-22 03:04:11,580 epoch 4 - iter 140/209 - loss 0.07968311 - samples/sec: 151.13 - lr: 0.100000
2022-09-22 03:04:16,826 epoch 4 - iter 160/209 - loss 0.07828966 - samples/sec: 122.13 - lr: 0.100000
2022-09-22 03:04:21,217 epoch 4 - iter 180/209 - loss 0.07688004 - samples/sec: 145.91 - lr: 0.100000
2022-09-22 03:04:25,737 epoch 4 - iter 200/209 - loss 0.07680107 - samples/sec: 141.74 - lr: 0.100000
2022-09-22 03:04:27,481 ----------------------------------------------------------------------------------------------------
2022-09-22 03:04:27,483 EPOCH 4 done: loss 0.0770 - lr 0.100000
2022-09-22 03:04:39,322 Evaluating as a multi-label problem: False
2022-09-22 03:04:39,340 DEV : loss 0.05980030819773674 - f1-score (micro avg) 0.819
2022-09-22 03:04:39,471 BAD EPOCHS (no improvement): 0
2022-09-22 03:04:39,474 saving best model
2022-09-22 03:04:43,869 ----------------------------------------------------------------------------------------------------
2022-09-22 03:04:48,996 epoch 5 - iter 20/209 - loss 0.06801026 - samples/sec: 124.99 - lr: 0.100000
2022-09-22 03:04:53,385 epoch 5 - iter 40/209 - loss 0.07502768 - samples/sec: 146.01 - lr: 0.100000
2022-09-22 03:04:57,213 epoch 5 - iter 60/209 - loss 0.07149049 - samples/sec: 167.43 - lr: 0.100000
2022-09-22 03:05:01,403 epoch 5 - iter 80/209 - loss 0.07017438 - samples/sec: 152.91 - lr: 0.100000
2022-09-22 03:05:06,419 epoch 5 - iter 100/209 - loss 0.07111710 - samples/sec: 127.72 - lr: 0.100000
2022-09-22 03:05:10,537 epoch 5 - iter 120/209 - loss 0.06963243 - samples/sec: 155.58 - lr: 0.100000
2022-09-22 03:05:14,880 epoch 5 - iter 140/209 - loss 0.06989449 - samples/sec: 147.50 - lr: 0.100000
2022-09-22 03:05:19,256 epoch 5 - iter 160/209 - loss 0.06964494 - samples/sec: 146.45 - lr: 0.100000
2022-09-22 03:05:24,250 epoch 5 - iter 180/209 - loss 0.07145644 - samples/sec: 128.30 - lr: 0.100000
2022-09-22 03:05:28,834 epoch 5 - iter 200/209 - loss 0.06956947 - samples/sec: 139.78 - lr: 0.100000
2022-09-22 03:05:30,669 ----------------------------------------------------------------------------------------------------
2022-09-22 03:05:30,670 EPOCH 5 done: loss 0.0687 - lr 0.100000
2022-09-22 03:05:41,782 Evaluating as a multi-label problem: False
2022-09-22 03:05:41,806 DEV : loss 0.05544961616396904 - f1-score (micro avg) 0.7985
2022-09-22 03:05:41,962 BAD EPOCHS (no improvement): 1
2022-09-22 03:05:41,966 ----------------------------------------------------------------------------------------------------
2022-09-22 03:05:46,466 epoch 6 - iter 20/209 - loss 0.05778742 - samples/sec: 142.54 - lr: 0.100000
2022-09-22 03:05:50,638 epoch 6 - iter 40/209 - loss 0.05155533 - samples/sec: 153.58 - lr: 0.100000
2022-09-22 03:05:55,031 epoch 6 - iter 60/209 - loss 0.05197817 - samples/sec: 145.82 - lr: 0.100000
2022-09-22 03:05:59,266 epoch 6 - iter 80/209 - loss 0.05693943 - samples/sec: 151.31 - lr: 0.100000
2022-09-22 03:06:03,493 epoch 6 - iter 100/209 - loss 0.05538277 - samples/sec: 151.56 - lr: 0.100000
2022-09-22 03:06:07,936 epoch 6 - iter 120/209 - loss 0.05687833 - samples/sec: 144.21 - lr: 0.100000
2022-09-22 03:06:12,492 epoch 6 - iter 140/209 - loss 0.05894243 - samples/sec: 140.65 - lr: 0.100000
2022-09-22 03:06:17,307 epoch 6 - iter 160/209 - loss 0.05738975 - samples/sec: 133.05 - lr: 0.100000
2022-09-22 03:06:21,221 epoch 6 - iter 180/209 - loss 0.05747754 - samples/sec: 163.72 - lr: 0.100000
2022-09-22 03:06:25,981 epoch 6 - iter 200/209 - loss 0.05863656 - samples/sec: 134.58 - lr: 0.100000
2022-09-22 03:06:28,126 ----------------------------------------------------------------------------------------------------
2022-09-22 03:06:28,128 EPOCH 6 done: loss 0.0595 - lr 0.100000
2022-09-22 03:06:39,519 Evaluating as a multi-label problem: False
2022-09-22 03:06:39,538 DEV : loss 0.05082135647535324 - f1-score (micro avg) 0.8373
2022-09-22 03:06:39,670 BAD EPOCHS (no improvement): 0
2022-09-22 03:06:39,673 saving best model
2022-09-22 03:06:44,101 ----------------------------------------------------------------------------------------------------
2022-09-22 03:06:48,767 epoch 7 - iter 20/209 - loss 0.04155280 - samples/sec: 137.41 - lr: 0.100000
2022-09-22 03:06:53,385 epoch 7 - iter 40/209 - loss 0.04282031 - samples/sec: 138.76 - lr: 0.100000
2022-09-22 03:06:58,032 epoch 7 - iter 60/209 - loss 0.04579797 - samples/sec: 137.89 - lr: 0.100000
2022-09-22 03:07:03,060 epoch 7 - iter 80/209 - loss 0.05198084 - samples/sec: 127.38 - lr: 0.100000
2022-09-22 03:07:07,376 epoch 7 - iter 100/209 - loss 0.05285636 - samples/sec: 148.47 - lr: 0.100000
2022-09-22 03:07:11,814 epoch 7 - iter 120/209 - loss 0.05413436 - samples/sec: 144.40 - lr: 0.100000
2022-09-22 03:07:16,106 epoch 7 - iter 140/209 - loss 0.05600133 - samples/sec: 149.27 - lr: 0.100000
2022-09-22 03:07:20,273 epoch 7 - iter 160/209 - loss 0.05705526 - samples/sec: 153.79 - lr: 0.100000
2022-09-22 03:07:24,386 epoch 7 - iter 180/209 - loss 0.05442772 - samples/sec: 155.81 - lr: 0.100000
2022-09-22 03:07:29,084 epoch 7 - iter 200/209 - loss 0.05232375 - samples/sec: 136.35 - lr: 0.100000
2022-09-22 03:07:30,918 ----------------------------------------------------------------------------------------------------
2022-09-22 03:07:30,920 EPOCH 7 done: loss 0.0528 - lr 0.100000
2022-09-22 03:07:42,066 Evaluating as a multi-label problem: False
2022-09-22 03:07:42,086 DEV : loss 0.04711301997303963 - f1-score (micro avg) 0.8592
2022-09-22 03:07:42,221 BAD EPOCHS (no improvement): 0
2022-09-22 03:07:42,224 saving best model
2022-09-22 03:07:46,668 ----------------------------------------------------------------------------------------------------
2022-09-22 03:07:51,089 epoch 8 - iter 20/209 - loss 0.05584345 - samples/sec: 144.97 - lr: 0.100000
2022-09-22 03:07:55,025 epoch 8 - iter 40/209 - loss 0.04638842 - samples/sec: 162.83 - lr: 0.100000
2022-09-22 03:07:59,206 epoch 8 - iter 60/209 - loss 0.04746719 - samples/sec: 153.29 - lr: 0.100000
2022-09-22 03:08:03,863 epoch 8 - iter 80/209 - loss 0.04660045 - samples/sec: 137.57 - lr: 0.100000
2022-09-22 03:08:08,202 epoch 8 - iter 100/209 - loss 0.04566145 - samples/sec: 147.68 - lr: 0.100000
2022-09-22 03:08:12,931 epoch 8 - iter 120/209 - loss 0.04524970 - samples/sec: 135.52 - lr: 0.100000
2022-09-22 03:08:17,953 epoch 8 - iter 140/209 - loss 0.04495774 - samples/sec: 127.59 - lr: 0.100000
2022-09-22 03:08:22,227 epoch 8 - iter 160/209 - loss 0.04542328 - samples/sec: 149.96 - lr: 0.100000
2022-09-22 03:08:26,753 epoch 8 - iter 180/209 - loss 0.04475461 - samples/sec: 141.56 - lr: 0.100000
2022-09-22 03:08:31,644 epoch 8 - iter 200/209 - loss 0.04409748 - samples/sec: 131.00 - lr: 0.100000
2022-09-22 03:08:33,763 ----------------------------------------------------------------------------------------------------
2022-09-22 03:08:33,765 EPOCH 8 done: loss 0.0443 - lr 0.100000
2022-09-22 03:08:45,397 Evaluating as a multi-label problem: False
2022-09-22 03:08:45,428 DEV : loss 0.045138511806726456 - f1-score (micro avg) 0.8627
2022-09-22 03:08:45,561 BAD EPOCHS (no improvement): 0
2022-09-22 03:08:45,565 saving best model
2022-09-22 03:08:50,012 ----------------------------------------------------------------------------------------------------
2022-09-22 03:08:54,324 epoch 9 - iter 20/209 - loss 0.04168603 - samples/sec: 148.66 - lr: 0.100000
2022-09-22 03:08:58,457 epoch 9 - iter 40/209 - loss 0.04548085 - samples/sec: 155.12 - lr: 0.100000
2022-09-22 03:09:02,791 epoch 9 - iter 60/209 - loss 0.04478620 - samples/sec: 147.91 - lr: 0.100000
2022-09-22 03:09:07,289 epoch 9 - iter 80/209 - loss 0.03972187 - samples/sec: 142.45 - lr: 0.100000
2022-09-22 03:09:11,521 epoch 9 - iter 100/209 - loss 0.03903134 - samples/sec: 151.37 - lr: 0.100000
2022-09-22 03:09:16,081 epoch 9 - iter 120/209 - loss 0.04208798 - samples/sec: 140.47 - lr: 0.100000
2022-09-22 03:09:20,681 epoch 9 - iter 140/209 - loss 0.04107881 - samples/sec: 139.26 - lr: 0.100000
2022-09-22 03:09:25,298 epoch 9 - iter 160/209 - loss 0.04004892 - samples/sec: 138.77 - lr: 0.100000
2022-09-22 03:09:29,337 epoch 9 - iter 180/209 - loss 0.03873920 - samples/sec: 158.60 - lr: 0.100000
2022-09-22 03:09:33,948 epoch 9 - iter 200/209 - loss 0.03941991 - samples/sec: 138.96 - lr: 0.100000
2022-09-22 03:09:36,011 ----------------------------------------------------------------------------------------------------
2022-09-22 03:09:36,013 EPOCH 9 done: loss 0.0413 - lr 0.100000
2022-09-22 03:09:47,378 Evaluating as a multi-label problem: False
2022-09-22 03:09:47,397 DEV : loss 0.0582578182220459 - f1-score (micro avg) 0.825
2022-09-22 03:09:47,530 BAD EPOCHS (no improvement): 1
2022-09-22 03:09:47,533 ----------------------------------------------------------------------------------------------------
2022-09-22 03:09:51,557 epoch 10 - iter 20/209 - loss 0.04889341 - samples/sec: 159.34 - lr: 0.100000
2022-09-22 03:09:56,292 epoch 10 - iter 40/209 - loss 0.03783871 - samples/sec: 135.34 - lr: 0.100000
2022-09-22 03:10:00,788 epoch 10 - iter 60/209 - loss 0.04071849 - samples/sec: 142.50 - lr: 0.100000
2022-09-22 03:10:05,150 epoch 10 - iter 80/209 - loss 0.03980560 - samples/sec: 146.90 - lr: 0.100000
2022-09-22 03:10:09,603 epoch 10 - iter 100/209 - loss 0.04074105 - samples/sec: 143.89 - lr: 0.100000
2022-09-22 03:10:14,558 epoch 10 - iter 120/209 - loss 0.04303124 - samples/sec: 129.32 - lr: 0.100000
2022-09-22 03:10:19,306 epoch 10 - iter 140/209 - loss 0.04142848 - samples/sec: 134.91 - lr: 0.100000
2022-09-22 03:10:24,147 epoch 10 - iter 160/209 - loss 0.04076667 - samples/sec: 132.36 - lr: 0.100000
2022-09-22 03:10:28,333 epoch 10 - iter 180/209 - loss 0.04036369 - samples/sec: 153.04 - lr: 0.100000
2022-09-22 03:10:32,527 epoch 10 - iter 200/209 - loss 0.03912577 - samples/sec: 152.78 - lr: 0.100000
2022-09-22 03:10:34,448 ----------------------------------------------------------------------------------------------------
2022-09-22 03:10:34,450 EPOCH 10 done: loss 0.0389 - lr 0.100000
2022-09-22 03:10:46,141 Evaluating as a multi-label problem: False
2022-09-22 03:10:46,163 DEV : loss 0.04387445002794266 - f1-score (micro avg) 0.8498
2022-09-22 03:10:46,321 BAD EPOCHS (no improvement): 2
2022-09-22 03:10:46,325 ----------------------------------------------------------------------------------------------------
2022-09-22 03:10:50,671 epoch 11 - iter 20/209 - loss 0.03399664 - samples/sec: 147.50 - lr: 0.100000
2022-09-22 03:10:55,088 epoch 11 - iter 40/209 - loss 0.03131688 - samples/sec: 145.06 - lr: 0.100000
2022-09-22 03:11:00,152 epoch 11 - iter 60/209 - loss 0.03091839 - samples/sec: 126.52 - lr: 0.100000
2022-09-22 03:11:04,818 epoch 11 - iter 80/209 - loss 0.03239518 - samples/sec: 137.33 - lr: 0.100000
2022-09-22 03:11:09,837 epoch 11 - iter 100/209 - loss 0.03346184 - samples/sec: 127.64 - lr: 0.100000
2022-09-22 03:11:14,092 epoch 11 - iter 120/209 - loss 0.03342324 - samples/sec: 150.59 - lr: 0.100000
2022-09-22 03:11:17,908 epoch 11 - iter 140/209 - loss 0.03411783 - samples/sec: 167.94 - lr: 0.100000
2022-09-22 03:11:21,884 epoch 11 - iter 160/209 - loss 0.03394769 - samples/sec: 161.15 - lr: 0.100000
2022-09-22 03:11:26,103 epoch 11 - iter 180/209 - loss 0.03349948 - samples/sec: 151.90 - lr: 0.100000
2022-09-22 03:11:31,724 epoch 11 - iter 200/209 - loss 0.03339439 - samples/sec: 113.93 - lr: 0.100000
2022-09-22 03:11:33,182 ----------------------------------------------------------------------------------------------------
2022-09-22 03:11:33,185 EPOCH 11 done: loss 0.0331 - lr 0.100000
2022-09-22 03:11:44,376 Evaluating as a multi-label problem: False
2022-09-22 03:11:44,397 DEV : loss 0.04172874242067337 - f1-score (micro avg) 0.8649
2022-09-22 03:11:44,529 BAD EPOCHS (no improvement): 0
2022-09-22 03:11:44,532 saving best model
2022-09-22 03:11:49,019 ----------------------------------------------------------------------------------------------------
2022-09-22 03:11:53,423 epoch 12 - iter 20/209 - loss 0.03209337 - samples/sec: 145.55 - lr: 0.100000
2022-09-22 03:11:57,849 epoch 12 - iter 40/209 - loss 0.03765221 - samples/sec: 144.78 - lr: 0.100000
2022-09-22 03:12:01,697 epoch 12 - iter 60/209 - loss 0.03676529 - samples/sec: 166.49 - lr: 0.100000
2022-09-22 03:12:06,234 epoch 12 - iter 80/209 - loss 0.03352467 - samples/sec: 141.22 - lr: 0.100000
2022-09-22 03:12:10,700 epoch 12 - iter 100/209 - loss 0.03318846 - samples/sec: 143.45 - lr: 0.100000
2022-09-22 03:12:15,597 epoch 12 - iter 120/209 - loss 0.03373384 - samples/sec: 130.85 - lr: 0.100000
2022-09-22 03:12:20,033 epoch 12 - iter 140/209 - loss 0.03208537 - samples/sec: 144.46 - lr: 0.100000
2022-09-22 03:12:24,862 epoch 12 - iter 160/209 - loss 0.03136539 - samples/sec: 132.66 - lr: 0.100000
2022-09-22 03:12:28,630 epoch 12 - iter 180/209 - loss 0.03168936 - samples/sec: 170.14 - lr: 0.100000
2022-09-22 03:12:33,467 epoch 12 - iter 200/209 - loss 0.03222253 - samples/sec: 132.43 - lr: 0.100000
2022-09-22 03:12:35,409 ----------------------------------------------------------------------------------------------------
2022-09-22 03:12:35,412 EPOCH 12 done: loss 0.0329 - lr 0.100000
2022-09-22 03:12:47,124 Evaluating as a multi-label problem: False
2022-09-22 03:12:47,143 DEV : loss 0.044125996530056 - f1-score (micro avg) 0.8548
2022-09-22 03:12:47,272 BAD EPOCHS (no improvement): 1
2022-09-22 03:12:47,275 ----------------------------------------------------------------------------------------------------
2022-09-22 03:12:51,380 epoch 13 - iter 20/209 - loss 0.01850640 - samples/sec: 156.21 - lr: 0.100000
2022-09-22 03:12:56,144 epoch 13 - iter 40/209 - loss 0.02025949 - samples/sec: 134.48 - lr: 0.100000
2022-09-22 03:13:00,634 epoch 13 - iter 60/209 - loss 0.03145947 - samples/sec: 142.71 - lr: 0.100000
2022-09-22 03:13:05,311 epoch 13 - iter 80/209 - loss 0.02793298 - samples/sec: 136.97 - lr: 0.100000
2022-09-22 03:13:09,732 epoch 13 - iter 100/209 - loss 0.02735722 - samples/sec: 144.91 - lr: 0.100000
2022-09-22 03:13:15,138 epoch 13 - iter 120/209 - loss 0.02688652 - samples/sec: 118.52 - lr: 0.100000
2022-09-22 03:13:19,352 epoch 13 - iter 140/209 - loss 0.02739783 - samples/sec: 152.05 - lr: 0.100000
2022-09-22 03:13:23,835 epoch 13 - iter 160/209 - loss 0.02696826 - samples/sec: 142.92 - lr: 0.100000
2022-09-22 03:13:28,282 epoch 13 - iter 180/209 - loss 0.02885942 - samples/sec: 144.08 - lr: 0.100000
2022-09-22 03:13:32,429 epoch 13 - iter 200/209 - loss 0.02839591 - samples/sec: 154.54 - lr: 0.100000
2022-09-22 03:13:34,092 ----------------------------------------------------------------------------------------------------
2022-09-22 03:13:34,094 EPOCH 13 done: loss 0.0283 - lr 0.100000
2022-09-22 03:13:45,097 Evaluating as a multi-label problem: False
2022-09-22 03:13:45,116 DEV : loss 0.037993188947439194 - f1-score (micro avg) 0.8622
2022-09-22 03:13:45,249 BAD EPOCHS (no improvement): 2
2022-09-22 03:13:45,253 ----------------------------------------------------------------------------------------------------
2022-09-22 03:13:49,947 epoch 14 - iter 20/209 - loss 0.02833990 - samples/sec: 136.58 - lr: 0.100000
2022-09-22 03:13:54,099 epoch 14 - iter 40/209 - loss 0.02638818 - samples/sec: 154.28 - lr: 0.100000
2022-09-22 03:13:58,434 epoch 14 - iter 60/209 - loss 0.02421293 - samples/sec: 147.80 - lr: 0.100000
2022-09-22 03:14:02,147 epoch 14 - iter 80/209 - loss 0.02510074 - samples/sec: 172.66 - lr: 0.100000
2022-09-22 03:14:06,891 epoch 14 - iter 100/209 - loss 0.02529249 - samples/sec: 135.05 - lr: 0.100000
2022-09-22 03:14:11,574 epoch 14 - iter 120/209 - loss 0.02500208 - samples/sec: 136.83 - lr: 0.100000
2022-09-22 03:14:15,898 epoch 14 - iter 140/209 - loss 0.02530272 - samples/sec: 148.20 - lr: 0.100000
2022-09-22 03:14:19,970 epoch 14 - iter 160/209 - loss 0.02610835 - samples/sec: 157.35 - lr: 0.100000
2022-09-22 03:14:24,559 epoch 14 - iter 180/209 - loss 0.02647797 - samples/sec: 139.63 - lr: 0.100000
2022-09-22 03:14:29,301 epoch 14 - iter 200/209 - loss 0.02788817 - samples/sec: 135.11 - lr: 0.100000
2022-09-22 03:14:31,566 ----------------------------------------------------------------------------------------------------
2022-09-22 03:14:31,568 EPOCH 14 done: loss 0.0280 - lr 0.100000
2022-09-22 03:14:43,022 Evaluating as a multi-label problem: False
2022-09-22 03:14:43,042 DEV : loss 0.04168141633272171 - f1-score (micro avg) 0.8661
2022-09-22 03:14:43,180 BAD EPOCHS (no improvement): 0
2022-09-22 03:14:43,183 saving best model
2022-09-22 03:14:48,045 ----------------------------------------------------------------------------------------------------
2022-09-22 03:14:52,468 epoch 15 - iter 20/209 - loss 0.02555100 - samples/sec: 144.91 - lr: 0.100000
2022-09-22 03:14:56,985 epoch 15 - iter 40/209 - loss 0.02594713 - samples/sec: 141.88 - lr: 0.100000
2022-09-22 03:15:01,001 epoch 15 - iter 60/209 - loss 0.02638328 - samples/sec: 159.55 - lr: 0.100000
2022-09-22 03:15:05,444 epoch 15 - iter 80/209 - loss 0.02518495 - samples/sec: 144.23 - lr: 0.100000
2022-09-22 03:15:09,244 epoch 15 - iter 100/209 - loss 0.02656325 - samples/sec: 168.66 - lr: 0.100000
2022-09-22 03:15:13,656 epoch 15 - iter 120/209 - loss 0.02707653 - samples/sec: 145.23 - lr: 0.100000
2022-09-22 03:15:18,179 epoch 15 - iter 140/209 - loss 0.02631382 - samples/sec: 141.68 - lr: 0.100000
2022-09-22 03:15:22,609 epoch 15 - iter 160/209 - loss 0.02754826 - samples/sec: 144.62 - lr: 0.100000
2022-09-22 03:15:27,259 epoch 15 - iter 180/209 - loss 0.02701192 - samples/sec: 137.79 - lr: 0.100000
2022-09-22 03:15:33,061 epoch 15 - iter 200/209 - loss 0.02856526 - samples/sec: 110.40 - lr: 0.100000
2022-09-22 03:15:34,705 ----------------------------------------------------------------------------------------------------
2022-09-22 03:15:34,708 EPOCH 15 done: loss 0.0288 - lr 0.100000
2022-09-22 03:15:45,763 Evaluating as a multi-label problem: False
2022-09-22 03:15:45,782 DEV : loss 0.03653959184885025 - f1-score (micro avg) 0.875
2022-09-22 03:15:45,916 BAD EPOCHS (no improvement): 0
2022-09-22 03:15:45,920 saving best model
2022-09-22 03:15:50,311 ----------------------------------------------------------------------------------------------------
2022-09-22 03:15:54,737 epoch 16 - iter 20/209 - loss 0.02922124 - samples/sec: 144.77 - lr: 0.100000
2022-09-22 03:15:59,412 epoch 16 - iter 40/209 - loss 0.02256063 - samples/sec: 137.05 - lr: 0.100000
2022-09-22 03:16:04,055 epoch 16 - iter 60/209 - loss 0.02163891 - samples/sec: 137.98 - lr: 0.100000
2022-09-22 03:16:09,082 epoch 16 - iter 80/209 - loss 0.02234348 - samples/sec: 127.42 - lr: 0.100000
2022-09-22 03:16:13,319 epoch 16 - iter 100/209 - loss 0.02246260 - samples/sec: 151.25 - lr: 0.100000
2022-09-22 03:16:18,182 epoch 16 - iter 120/209 - loss 0.02408038 - samples/sec: 131.77 - lr: 0.100000
2022-09-22 03:16:21,749 epoch 16 - iter 140/209 - loss 0.02427657 - samples/sec: 179.71 - lr: 0.100000
2022-09-22 03:16:26,811 epoch 16 - iter 160/209 - loss 0.02413024 - samples/sec: 126.55 - lr: 0.100000
2022-09-22 03:16:30,647 epoch 16 - iter 180/209 - loss 0.02375161 - samples/sec: 167.00 - lr: 0.100000
2022-09-22 03:16:35,107 epoch 16 - iter 200/209 - loss 0.02385697 - samples/sec: 143.64 - lr: 0.100000
2022-09-22 03:16:36,563 ----------------------------------------------------------------------------------------------------
2022-09-22 03:16:36,565 EPOCH 16 done: loss 0.0237 - lr 0.100000
2022-09-22 03:16:47,939 Evaluating as a multi-label problem: False
2022-09-22 03:16:47,958 DEV : loss 0.04710310697555542 - f1-score (micro avg) 0.8652
2022-09-22 03:16:48,089 BAD EPOCHS (no improvement): 1
2022-09-22 03:16:48,092 ----------------------------------------------------------------------------------------------------
2022-09-22 03:16:53,110 epoch 17 - iter 20/209 - loss 0.02529678 - samples/sec: 127.77 - lr: 0.100000
2022-09-22 03:16:57,207 epoch 17 - iter 40/209 - loss 0.02088542 - samples/sec: 156.42 - lr: 0.100000
2022-09-22 03:17:01,501 epoch 17 - iter 60/209 - loss 0.01962150 - samples/sec: 149.20 - lr: 0.100000
2022-09-22 03:17:05,798 epoch 17 - iter 80/209 - loss 0.01834150 - samples/sec: 149.19 - lr: 0.100000
2022-09-22 03:17:10,367 epoch 17 - iter 100/209 - loss 0.02077948 - samples/sec: 140.26 - lr: 0.100000
2022-09-22 03:17:14,737 epoch 17 - iter 120/209 - loss 0.02139814 - samples/sec: 146.58 - lr: 0.100000
2022-09-22 03:17:18,773 epoch 17 - iter 140/209 - loss 0.02183886 - samples/sec: 158.75 - lr: 0.100000
2022-09-22 03:17:23,177 epoch 17 - iter 160/209 - loss 0.02412608 - samples/sec: 145.48 - lr: 0.100000
2022-09-22 03:17:28,316 epoch 17 - iter 180/209 - loss 0.02398935 - samples/sec: 124.68 - lr: 0.100000
2022-09-22 03:17:32,876 epoch 17 - iter 200/209 - loss 0.02359558 - samples/sec: 140.53 - lr: 0.100000
2022-09-22 03:17:34,803 ----------------------------------------------------------------------------------------------------
2022-09-22 03:17:34,805 EPOCH 17 done: loss 0.0237 - lr 0.100000
2022-09-22 03:17:45,882 Evaluating as a multi-label problem: False
2022-09-22 03:17:45,904 DEV : loss 0.04203850403428078 - f1-score (micro avg) 0.8826
2022-09-22 03:17:46,053 BAD EPOCHS (no improvement): 0
2022-09-22 03:17:46,057 saving best model
2022-09-22 03:17:50,508 ----------------------------------------------------------------------------------------------------
2022-09-22 03:17:54,723 epoch 18 - iter 20/209 - loss 0.02792289 - samples/sec: 152.38 - lr: 0.100000
2022-09-22 03:17:59,255 epoch 18 - iter 40/209 - loss 0.02052743 - samples/sec: 141.38 - lr: 0.100000
2022-09-22 03:18:04,860 epoch 18 - iter 60/209 - loss 0.01707682 - samples/sec: 114.30 - lr: 0.100000
2022-09-22 03:18:09,278 epoch 18 - iter 80/209 - loss 0.01725821 - samples/sec: 145.03 - lr: 0.100000
2022-09-22 03:18:13,523 epoch 18 - iter 100/209 - loss 0.01864592 - samples/sec: 150.91 - lr: 0.100000
2022-09-22 03:18:17,623 epoch 18 - iter 120/209 - loss 0.01847996 - samples/sec: 156.26 - lr: 0.100000
2022-09-22 03:18:21,698 epoch 18 - iter 140/209 - loss 0.01980050 - samples/sec: 157.25 - lr: 0.100000
2022-09-22 03:18:25,980 epoch 18 - iter 160/209 - loss 0.02024245 - samples/sec: 149.66 - lr: 0.100000
2022-09-22 03:18:31,492 epoch 18 - iter 180/209 - loss 0.02047137 - samples/sec: 116.23 - lr: 0.100000
2022-09-22 03:18:35,780 epoch 18 - iter 200/209 - loss 0.02043419 - samples/sec: 149.43 - lr: 0.100000
2022-09-22 03:18:38,023 ----------------------------------------------------------------------------------------------------
2022-09-22 03:18:38,026 EPOCH 18 done: loss 0.0209 - lr 0.100000
2022-09-22 03:18:49,518 Evaluating as a multi-label problem: False
2022-09-22 03:18:49,537 DEV : loss 0.04047093912959099 - f1-score (micro avg) 0.8811
2022-09-22 03:18:49,673 BAD EPOCHS (no improvement): 1
2022-09-22 03:18:49,679 ----------------------------------------------------------------------------------------------------
2022-09-22 03:18:53,569 epoch 19 - iter 20/209 - loss 0.03078881 - samples/sec: 164.85 - lr: 0.100000
2022-09-22 03:18:57,449 epoch 19 - iter 40/209 - loss 0.02421323 - samples/sec: 165.20 - lr: 0.100000
2022-09-22 03:19:01,390 epoch 19 - iter 60/209 - loss 0.02726126 - samples/sec: 162.62 - lr: 0.100000
2022-09-22 03:19:06,212 epoch 19 - iter 80/209 - loss 0.02399136 - samples/sec: 132.89 - lr: 0.100000
2022-09-22 03:19:10,419 epoch 19 - iter 100/209 - loss 0.02286999 - samples/sec: 152.35 - lr: 0.100000
2022-09-22 03:19:14,946 epoch 19 - iter 120/209 - loss 0.02318129 - samples/sec: 141.51 - lr: 0.100000
2022-09-22 03:19:19,940 epoch 19 - iter 140/209 - loss 0.02224270 - samples/sec: 128.29 - lr: 0.100000
2022-09-22 03:19:24,355 epoch 19 - iter 160/209 - loss 0.02165349 - samples/sec: 145.12 - lr: 0.100000
2022-09-22 03:19:28,534 epoch 19 - iter 180/209 - loss 0.02222071 - samples/sec: 153.34 - lr: 0.100000
2022-09-22 03:19:33,020 epoch 19 - iter 200/209 - loss 0.02132964 - samples/sec: 142.83 - lr: 0.100000
2022-09-22 03:19:34,665 ----------------------------------------------------------------------------------------------------
2022-09-22 03:19:34,667 EPOCH 19 done: loss 0.0213 - lr 0.100000
2022-09-22 03:19:45,476 Evaluating as a multi-label problem: False
2022-09-22 03:19:45,496 DEV : loss 0.04064437001943588 - f1-score (micro avg) 0.8897
2022-09-22 03:19:45,629 BAD EPOCHS (no improvement): 0
2022-09-22 03:19:45,633 saving best model
2022-09-22 03:19:50,081 ----------------------------------------------------------------------------------------------------
2022-09-22 03:19:54,611 epoch 20 - iter 20/209 - loss 0.01137673 - samples/sec: 141.53 - lr: 0.100000
2022-09-22 03:19:58,760 epoch 20 - iter 40/209 - loss 0.01445903 - samples/sec: 154.44 - lr: 0.100000
2022-09-22 03:20:02,917 epoch 20 - iter 60/209 - loss 0.01739111 - samples/sec: 154.17 - lr: 0.100000
2022-09-22 03:20:06,772 epoch 20 - iter 80/209 - loss 0.01849754 - samples/sec: 166.21 - lr: 0.100000
2022-09-22 03:20:11,507 epoch 20 - iter 100/209 - loss 0.01745315 - samples/sec: 135.35 - lr: 0.100000
2022-09-22 03:20:16,184 epoch 20 - iter 120/209 - loss 0.01940659 - samples/sec: 137.00 - lr: 0.100000
2022-09-22 03:20:21,251 epoch 20 - iter 140/209 - loss 0.01918274 - samples/sec: 126.46 - lr: 0.100000
2022-09-22 03:20:25,876 epoch 20 - iter 160/209 - loss 0.01855748 - samples/sec: 138.53 - lr: 0.100000
2022-09-22 03:20:30,180 epoch 20 - iter 180/209 - loss 0.01890405 - samples/sec: 148.84 - lr: 0.100000
2022-09-22 03:20:34,439 epoch 20 - iter 200/209 - loss 0.01866614 - samples/sec: 150.45 - lr: 0.100000
2022-09-22 03:20:37,190 ----------------------------------------------------------------------------------------------------
2022-09-22 03:20:37,193 EPOCH 20 done: loss 0.0192 - lr 0.100000
2022-09-22 03:20:48,580 Evaluating as a multi-label problem: False
2022-09-22 03:20:48,599 DEV : loss 0.039903800934553146 - f1-score (micro avg) 0.8726
2022-09-22 03:20:48,732 BAD EPOCHS (no improvement): 1
2022-09-22 03:20:48,736 ----------------------------------------------------------------------------------------------------
2022-09-22 03:20:52,508 epoch 21 - iter 20/209 - loss 0.01266603 - samples/sec: 170.09 - lr: 0.100000
2022-09-22 03:20:56,412 epoch 21 - iter 40/209 - loss 0.01581440 - samples/sec: 164.18 - lr: 0.100000
2022-09-22 03:21:00,473 epoch 21 - iter 60/209 - loss 0.01775818 - samples/sec: 157.82 - lr: 0.100000
2022-09-22 03:21:04,739 epoch 21 - iter 80/209 - loss 0.01763207 - samples/sec: 150.23 - lr: 0.100000
2022-09-22 03:21:08,876 epoch 21 - iter 100/209 - loss 0.01726971 - samples/sec: 154.86 - lr: 0.100000
2022-09-22 03:21:13,328 epoch 21 - iter 120/209 - loss 0.01734956 - samples/sec: 143.94 - lr: 0.100000
2022-09-22 03:21:18,060 epoch 21 - iter 140/209 - loss 0.01846277 - samples/sec: 135.41 - lr: 0.100000
2022-09-22 03:21:22,592 epoch 21 - iter 160/209 - loss 0.01927382 - samples/sec: 141.35 - lr: 0.100000
2022-09-22 03:21:27,342 epoch 21 - iter 180/209 - loss 0.01900023 - samples/sec: 134.89 - lr: 0.100000
2022-09-22 03:21:32,267 epoch 21 - iter 200/209 - loss 0.01915269 - samples/sec: 130.11 - lr: 0.100000
2022-09-22 03:21:34,592 ----------------------------------------------------------------------------------------------------
2022-09-22 03:21:34,594 EPOCH 21 done: loss 0.0195 - lr 0.100000
2022-09-22 03:21:45,919 Evaluating as a multi-label problem: False
2022-09-22 03:21:45,940 DEV : loss 0.038961514830589294 - f1-score (micro avg) 0.874
2022-09-22 03:21:46,086 BAD EPOCHS (no improvement): 2
2022-09-22 03:21:46,089 ----------------------------------------------------------------------------------------------------
2022-09-22 03:21:50,103 epoch 22 - iter 20/209 - loss 0.01817463 - samples/sec: 159.75 - lr: 0.100000
2022-09-22 03:21:54,366 epoch 22 - iter 40/209 - loss 0.01881625 - samples/sec: 150.33 - lr: 0.100000
2022-09-22 03:21:58,751 epoch 22 - iter 60/209 - loss 0.01917286 - samples/sec: 146.20 - lr: 0.100000
2022-09-22 03:22:02,842 epoch 22 - iter 80/209 - loss 0.01929330 - samples/sec: 156.67 - lr: 0.100000
2022-09-22 03:22:07,557 epoch 22 - iter 100/209 - loss
0.01848071 - samples/sec: 135.88 - lr: 0.100000 2022-09-22 03:22:12,365 epoch 22 - iter 120/209 - loss 0.02016769 - samples/sec: 133.24 - lr: 0.100000 2022-09-22 03:22:17,275 epoch 22 - iter 140/209 - loss 0.01973406 - samples/sec: 130.48 - lr: 0.100000 2022-09-22 03:22:22,289 epoch 22 - iter 160/209 - loss 0.01991412 - samples/sec: 127.81 - lr: 0.100000 2022-09-22 03:22:26,816 epoch 22 - iter 180/209 - loss 0.01952125 - samples/sec: 141.54 - lr: 0.100000 2022-09-22 03:22:31,098 epoch 22 - iter 200/209 - loss 0.01896066 - samples/sec: 149.65 - lr: 0.100000 2022-09-22 03:22:32,933 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:22:32,934 EPOCH 22 done: loss 0.0193 - lr 0.100000 2022-09-22 03:22:43,924 Evaluating as a multi-label problem: False 2022-09-22 03:22:43,945 DEV : loss 0.04197212681174278 - f1-score (micro avg) 0.8839 2022-09-22 03:22:44,082 BAD EPOCHS (no improvement): 3 2022-09-22 03:22:44,085 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:22:48,564 epoch 23 - iter 20/209 - loss 0.02074136 - samples/sec: 143.13 - lr: 0.100000 2022-09-22 03:22:53,303 epoch 23 - iter 40/209 - loss 0.02026053 - samples/sec: 135.18 - lr: 0.100000 2022-09-22 03:22:57,560 epoch 23 - iter 60/209 - loss 0.01924981 - samples/sec: 150.56 - lr: 0.100000 2022-09-22 03:23:01,992 epoch 23 - iter 80/209 - loss 0.01738982 - samples/sec: 144.57 - lr: 0.100000 2022-09-22 03:23:06,877 epoch 23 - iter 100/209 - loss 0.01833485 - samples/sec: 131.14 - lr: 0.100000 2022-09-22 03:23:11,378 epoch 23 - iter 120/209 - loss 0.01763437 - samples/sec: 142.36 - lr: 0.100000 2022-09-22 03:23:16,203 epoch 23 - iter 140/209 - loss 0.01793427 - samples/sec: 132.78 - lr: 0.100000 2022-09-22 03:23:20,972 epoch 23 - iter 160/209 - loss 0.01833928 - samples/sec: 134.33 - lr: 0.100000 2022-09-22 03:23:24,469 epoch 23 - iter 180/209 - loss 0.01776556 - samples/sec: 183.30 - lr: 
0.100000 2022-09-22 03:23:28,548 epoch 23 - iter 200/209 - loss 0.01799355 - samples/sec: 157.10 - lr: 0.100000 2022-09-22 03:23:30,368 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:23:30,370 EPOCH 23 done: loss 0.0179 - lr 0.100000 2022-09-22 03:23:41,490 Evaluating as a multi-label problem: False 2022-09-22 03:23:41,512 DEV : loss 0.0412365198135376 - f1-score (micro avg) 0.885 2022-09-22 03:23:41,651 Epoch 23: reducing learning rate of group 0 to 5.0000e-02. 2022-09-22 03:23:41,653 BAD EPOCHS (no improvement): 4 2022-09-22 03:23:41,655 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:23:45,833 epoch 24 - iter 20/209 - loss 0.01508962 - samples/sec: 153.52 - lr: 0.050000 2022-09-22 03:23:50,378 epoch 24 - iter 40/209 - loss 0.01396168 - samples/sec: 140.96 - lr: 0.050000 2022-09-22 03:23:55,368 epoch 24 - iter 60/209 - loss 0.01346801 - samples/sec: 128.39 - lr: 0.050000 2022-09-22 03:23:59,814 epoch 24 - iter 80/209 - loss 0.01407210 - samples/sec: 144.09 - lr: 0.050000 2022-09-22 03:24:04,064 epoch 24 - iter 100/209 - loss 0.01374968 - samples/sec: 150.77 - lr: 0.050000 2022-09-22 03:24:08,839 epoch 24 - iter 120/209 - loss 0.01344891 - samples/sec: 134.20 - lr: 0.050000 2022-09-22 03:24:13,532 epoch 24 - iter 140/209 - loss 0.01292338 - samples/sec: 136.49 - lr: 0.050000 2022-09-22 03:24:18,572 epoch 24 - iter 160/209 - loss 0.01219410 - samples/sec: 127.14 - lr: 0.050000 2022-09-22 03:24:22,441 epoch 24 - iter 180/209 - loss 0.01226887 - samples/sec: 165.62 - lr: 0.050000 2022-09-22 03:24:26,523 epoch 24 - iter 200/209 - loss 0.01245995 - samples/sec: 157.01 - lr: 0.050000 2022-09-22 03:24:28,238 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:24:28,241 EPOCH 24 done: loss 0.0122 - lr 0.050000 2022-09-22 03:24:39,493 Evaluating as a multi-label problem: False 
2022-09-22 03:24:39,512 DEV : loss 0.037337951362133026 - f1-score (micro avg) 0.8753 2022-09-22 03:24:39,668 BAD EPOCHS (no improvement): 1 2022-09-22 03:24:39,671 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:24:44,837 epoch 25 - iter 20/209 - loss 0.01313769 - samples/sec: 124.06 - lr: 0.050000 2022-09-22 03:24:49,782 epoch 25 - iter 40/209 - loss 0.01469976 - samples/sec: 129.58 - lr: 0.050000 2022-09-22 03:24:54,748 epoch 25 - iter 60/209 - loss 0.01407628 - samples/sec: 129.01 - lr: 0.050000 2022-09-22 03:24:59,295 epoch 25 - iter 80/209 - loss 0.01390184 - samples/sec: 140.89 - lr: 0.050000 2022-09-22 03:25:03,625 epoch 25 - iter 100/209 - loss 0.01354144 - samples/sec: 147.99 - lr: 0.050000 2022-09-22 03:25:07,401 epoch 25 - iter 120/209 - loss 0.01337000 - samples/sec: 169.74 - lr: 0.050000 2022-09-22 03:25:11,487 epoch 25 - iter 140/209 - loss 0.01293394 - samples/sec: 156.82 - lr: 0.050000 2022-09-22 03:25:15,333 epoch 25 - iter 160/209 - loss 0.01276609 - samples/sec: 166.62 - lr: 0.050000 2022-09-22 03:25:19,615 epoch 25 - iter 180/209 - loss 0.01269660 - samples/sec: 149.68 - lr: 0.050000 2022-09-22 03:25:24,451 epoch 25 - iter 200/209 - loss 0.01255503 - samples/sec: 132.51 - lr: 0.050000 2022-09-22 03:25:26,290 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:25:26,292 EPOCH 25 done: loss 0.0124 - lr 0.050000 2022-09-22 03:25:37,383 Evaluating as a multi-label problem: False 2022-09-22 03:25:37,404 DEV : loss 0.03606007620692253 - f1-score (micro avg) 0.8842 2022-09-22 03:25:37,535 BAD EPOCHS (no improvement): 2 2022-09-22 03:25:37,540 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:25:42,941 epoch 26 - iter 20/209 - loss 0.01154679 - samples/sec: 118.69 - lr: 0.050000 2022-09-22 03:25:47,119 epoch 26 - iter 40/209 - loss 0.01014419 - samples/sec: 
153.39 - lr: 0.050000 2022-09-22 03:25:51,608 epoch 26 - iter 60/209 - loss 0.01011476 - samples/sec: 142.79 - lr: 0.050000 2022-09-22 03:25:56,278 epoch 26 - iter 80/209 - loss 0.01159675 - samples/sec: 137.19 - lr: 0.050000 2022-09-22 03:26:00,429 epoch 26 - iter 100/209 - loss 0.01196109 - samples/sec: 154.34 - lr: 0.050000 2022-09-22 03:26:04,279 epoch 26 - iter 120/209 - loss 0.01142487 - samples/sec: 166.48 - lr: 0.050000 2022-09-22 03:26:08,656 epoch 26 - iter 140/209 - loss 0.01145025 - samples/sec: 146.39 - lr: 0.050000 2022-09-22 03:26:12,901 epoch 26 - iter 160/209 - loss 0.01174816 - samples/sec: 150.97 - lr: 0.050000 2022-09-22 03:26:17,545 epoch 26 - iter 180/209 - loss 0.01143065 - samples/sec: 137.95 - lr: 0.050000 2022-09-22 03:26:21,815 epoch 26 - iter 200/209 - loss 0.01079720 - samples/sec: 150.11 - lr: 0.050000 2022-09-22 03:26:23,572 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:26:23,574 EPOCH 26 done: loss 0.0112 - lr 0.050000 2022-09-22 03:26:34,920 Evaluating as a multi-label problem: False 2022-09-22 03:26:34,941 DEV : loss 0.041908808052539825 - f1-score (micro avg) 0.8828 2022-09-22 03:26:35,078 BAD EPOCHS (no improvement): 3 2022-09-22 03:26:35,081 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:26:39,239 epoch 27 - iter 20/209 - loss 0.01529717 - samples/sec: 154.14 - lr: 0.050000 2022-09-22 03:26:44,529 epoch 27 - iter 40/209 - loss 0.01069379 - samples/sec: 121.12 - lr: 0.050000 2022-09-22 03:26:49,377 epoch 27 - iter 60/209 - loss 0.01000276 - samples/sec: 132.22 - lr: 0.050000 2022-09-22 03:26:54,021 epoch 27 - iter 80/209 - loss 0.01101574 - samples/sec: 137.94 - lr: 0.050000 2022-09-22 03:26:57,891 epoch 27 - iter 100/209 - loss 0.01019044 - samples/sec: 165.60 - lr: 0.050000 2022-09-22 03:27:02,434 epoch 27 - iter 120/209 - loss 0.00994911 - samples/sec: 140.99 - lr: 0.050000 2022-09-22 
03:27:06,760 epoch 27 - iter 140/209 - loss 0.01092331 - samples/sec: 148.12 - lr: 0.050000 2022-09-22 03:27:11,171 epoch 27 - iter 160/209 - loss 0.01148409 - samples/sec: 145.28 - lr: 0.050000 2022-09-22 03:27:15,562 epoch 27 - iter 180/209 - loss 0.01142766 - samples/sec: 145.94 - lr: 0.050000 2022-09-22 03:27:20,019 epoch 27 - iter 200/209 - loss 0.01102674 - samples/sec: 143.75 - lr: 0.050000 2022-09-22 03:27:22,376 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:27:22,378 EPOCH 27 done: loss 0.0111 - lr 0.050000 2022-09-22 03:27:33,735 Evaluating as a multi-label problem: False 2022-09-22 03:27:33,754 DEV : loss 0.04194819927215576 - f1-score (micro avg) 0.8857 2022-09-22 03:27:33,895 Epoch 27: reducing learning rate of group 0 to 2.5000e-02. 2022-09-22 03:27:33,897 BAD EPOCHS (no improvement): 4 2022-09-22 03:27:33,900 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:27:37,927 epoch 28 - iter 20/209 - loss 0.01196667 - samples/sec: 159.26 - lr: 0.025000 2022-09-22 03:27:42,246 epoch 28 - iter 40/209 - loss 0.00963387 - samples/sec: 148.34 - lr: 0.025000 2022-09-22 03:27:46,178 epoch 28 - iter 60/209 - loss 0.00828059 - samples/sec: 162.98 - lr: 0.025000 2022-09-22 03:27:50,532 epoch 28 - iter 80/209 - loss 0.01003947 - samples/sec: 147.14 - lr: 0.025000 2022-09-22 03:27:55,264 epoch 28 - iter 100/209 - loss 0.01104576 - samples/sec: 135.43 - lr: 0.025000 2022-09-22 03:27:59,623 epoch 28 - iter 120/209 - loss 0.01058190 - samples/sec: 147.01 - lr: 0.025000 2022-09-22 03:28:03,947 epoch 28 - iter 140/209 - loss 0.01024795 - samples/sec: 148.18 - lr: 0.025000 2022-09-22 03:28:08,788 epoch 28 - iter 160/209 - loss 0.00989437 - samples/sec: 132.34 - lr: 0.025000 2022-09-22 03:28:14,450 epoch 28 - iter 180/209 - loss 0.01042305 - samples/sec: 113.14 - lr: 0.025000 2022-09-22 03:28:18,526 epoch 28 - iter 200/209 - loss 0.01036047 - 
samples/sec: 157.18 - lr: 0.025000 2022-09-22 03:28:20,562 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:28:20,565 EPOCH 28 done: loss 0.0103 - lr 0.025000 2022-09-22 03:28:31,722 Evaluating as a multi-label problem: False 2022-09-22 03:28:31,745 DEV : loss 0.03861014544963837 - f1-score (micro avg) 0.8931 2022-09-22 03:28:31,892 BAD EPOCHS (no improvement): 0 2022-09-22 03:28:31,896 saving best model 2022-09-22 03:28:36,355 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:28:40,941 epoch 29 - iter 20/209 - loss 0.00588863 - samples/sec: 139.78 - lr: 0.025000 2022-09-22 03:28:45,093 epoch 29 - iter 40/209 - loss 0.00811146 - samples/sec: 154.32 - lr: 0.025000 2022-09-22 03:28:50,095 epoch 29 - iter 60/209 - loss 0.01010255 - samples/sec: 128.06 - lr: 0.025000 2022-09-22 03:28:54,921 epoch 29 - iter 80/209 - loss 0.00871076 - samples/sec: 132.77 - lr: 0.025000 2022-09-22 03:28:59,879 epoch 29 - iter 100/209 - loss 0.00973383 - samples/sec: 129.22 - lr: 0.025000 2022-09-22 03:29:03,958 epoch 29 - iter 120/209 - loss 0.00955961 - samples/sec: 157.08 - lr: 0.025000 2022-09-22 03:29:07,869 epoch 29 - iter 140/209 - loss 0.00883156 - samples/sec: 163.91 - lr: 0.025000 2022-09-22 03:29:12,534 epoch 29 - iter 160/209 - loss 0.00951916 - samples/sec: 137.38 - lr: 0.025000 2022-09-22 03:29:17,162 epoch 29 - iter 180/209 - loss 0.00977117 - samples/sec: 138.40 - lr: 0.025000 2022-09-22 03:29:21,126 epoch 29 - iter 200/209 - loss 0.00964606 - samples/sec: 161.68 - lr: 0.025000 2022-09-22 03:29:22,585 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:29:22,588 EPOCH 29 done: loss 0.0096 - lr 0.025000 2022-09-22 03:29:33,561 Evaluating as a multi-label problem: False 2022-09-22 03:29:33,582 DEV : loss 0.04020700231194496 - f1-score (micro avg) 0.8843 2022-09-22 03:29:33,712 BAD 
EPOCHS (no improvement): 1 2022-09-22 03:29:33,715 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:29:37,935 epoch 30 - iter 20/209 - loss 0.00780922 - samples/sec: 151.93 - lr: 0.025000 2022-09-22 03:29:42,007 epoch 30 - iter 40/209 - loss 0.00751670 - samples/sec: 157.37 - lr: 0.025000 2022-09-22 03:29:46,107 epoch 30 - iter 60/209 - loss 0.00764725 - samples/sec: 156.27 - lr: 0.025000 2022-09-22 03:29:50,513 epoch 30 - iter 80/209 - loss 0.00800249 - samples/sec: 145.41 - lr: 0.025000 2022-09-22 03:29:55,157 epoch 30 - iter 100/209 - loss 0.01018016 - samples/sec: 137.98 - lr: 0.025000 2022-09-22 03:30:00,098 epoch 30 - iter 120/209 - loss 0.01053769 - samples/sec: 129.67 - lr: 0.025000 2022-09-22 03:30:04,734 epoch 30 - iter 140/209 - loss 0.01058227 - samples/sec: 138.19 - lr: 0.025000 2022-09-22 03:30:09,259 epoch 30 - iter 160/209 - loss 0.01020763 - samples/sec: 141.59 - lr: 0.025000 2022-09-22 03:30:13,231 epoch 30 - iter 180/209 - loss 0.00978194 - samples/sec: 161.36 - lr: 0.025000 2022-09-22 03:30:17,709 epoch 30 - iter 200/209 - loss 0.00976860 - samples/sec: 143.08 - lr: 0.025000 2022-09-22 03:30:19,641 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:30:19,644 EPOCH 30 done: loss 0.0098 - lr 0.025000 2022-09-22 03:30:30,732 Evaluating as a multi-label problem: False 2022-09-22 03:30:30,750 DEV : loss 0.040246278047561646 - f1-score (micro avg) 0.8824 2022-09-22 03:30:30,882 BAD EPOCHS (no improvement): 2 2022-09-22 03:30:30,885 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:30:35,678 epoch 31 - iter 20/209 - loss 0.00562730 - samples/sec: 133.74 - lr: 0.025000 2022-09-22 03:30:39,821 epoch 31 - iter 40/209 - loss 0.00589160 - samples/sec: 154.62 - lr: 0.025000 2022-09-22 03:30:43,553 epoch 31 - iter 60/209 - loss 0.00523552 - samples/sec: 171.67 - lr: 
0.025000 2022-09-22 03:30:47,818 epoch 31 - iter 80/209 - loss 0.00642315 - samples/sec: 150.26 - lr: 0.025000 2022-09-22 03:30:52,412 epoch 31 - iter 100/209 - loss 0.00661123 - samples/sec: 139.56 - lr: 0.025000 2022-09-22 03:30:56,671 epoch 31 - iter 120/209 - loss 0.00655441 - samples/sec: 150.41 - lr: 0.025000 2022-09-22 03:31:01,613 epoch 31 - iter 140/209 - loss 0.00797238 - samples/sec: 129.64 - lr: 0.025000 2022-09-22 03:31:06,262 epoch 31 - iter 160/209 - loss 0.00853243 - samples/sec: 137.81 - lr: 0.025000 2022-09-22 03:31:10,693 epoch 31 - iter 180/209 - loss 0.00839161 - samples/sec: 144.61 - lr: 0.025000 2022-09-22 03:31:15,187 epoch 31 - iter 200/209 - loss 0.00813122 - samples/sec: 142.57 - lr: 0.025000 2022-09-22 03:31:17,811 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:31:17,812 EPOCH 31 done: loss 0.0082 - lr 0.025000 2022-09-22 03:31:28,917 Evaluating as a multi-label problem: False 2022-09-22 03:31:28,934 DEV : loss 0.03930843994021416 - f1-score (micro avg) 0.891 2022-09-22 03:31:29,078 BAD EPOCHS (no improvement): 3 2022-09-22 03:31:29,081 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:31:33,550 epoch 32 - iter 20/209 - loss 0.00757585 - samples/sec: 143.37 - lr: 0.025000 2022-09-22 03:31:37,804 epoch 32 - iter 40/209 - loss 0.01084779 - samples/sec: 150.62 - lr: 0.025000 2022-09-22 03:31:42,425 epoch 32 - iter 60/209 - loss 0.00932148 - samples/sec: 138.63 - lr: 0.025000 2022-09-22 03:31:46,593 epoch 32 - iter 80/209 - loss 0.00880151 - samples/sec: 153.73 - lr: 0.025000 2022-09-22 03:31:51,068 epoch 32 - iter 100/209 - loss 0.00872236 - samples/sec: 143.18 - lr: 0.025000 2022-09-22 03:31:55,995 epoch 32 - iter 120/209 - loss 0.00898136 - samples/sec: 130.02 - lr: 0.025000 2022-09-22 03:31:59,908 epoch 32 - iter 140/209 - loss 0.00875114 - samples/sec: 163.78 - lr: 0.025000 2022-09-22 03:32:04,529 epoch 32 
- iter 160/209 - loss 0.00878848 - samples/sec: 138.65 - lr: 0.025000 2022-09-22 03:32:08,764 epoch 32 - iter 180/209 - loss 0.00836800 - samples/sec: 151.31 - lr: 0.025000 2022-09-22 03:32:13,127 epoch 32 - iter 200/209 - loss 0.00839624 - samples/sec: 146.90 - lr: 0.025000 2022-09-22 03:32:15,123 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:32:15,124 EPOCH 32 done: loss 0.0087 - lr 0.025000 2022-09-22 03:32:26,135 Evaluating as a multi-label problem: False 2022-09-22 03:32:26,153 DEV : loss 0.03956405445933342 - f1-score (micro avg) 0.899 2022-09-22 03:32:26,308 BAD EPOCHS (no improvement): 0 2022-09-22 03:32:26,311 saving best model 2022-09-22 03:32:30,752 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:32:35,580 epoch 33 - iter 20/209 - loss 0.00704804 - samples/sec: 132.72 - lr: 0.025000 2022-09-22 03:32:40,567 epoch 33 - iter 40/209 - loss 0.01173646 - samples/sec: 128.46 - lr: 0.025000 2022-09-22 03:32:45,004 epoch 33 - iter 60/209 - loss 0.01066679 - samples/sec: 144.48 - lr: 0.025000 2022-09-22 03:32:49,522 epoch 33 - iter 80/209 - loss 0.01020648 - samples/sec: 141.83 - lr: 0.025000 2022-09-22 03:32:53,730 epoch 33 - iter 100/209 - loss 0.00934505 - samples/sec: 152.33 - lr: 0.025000 2022-09-22 03:32:57,861 epoch 33 - iter 120/209 - loss 0.00909380 - samples/sec: 155.11 - lr: 0.025000 2022-09-22 03:33:02,194 epoch 33 - iter 140/209 - loss 0.00903168 - samples/sec: 147.87 - lr: 0.025000 2022-09-22 03:33:06,882 epoch 33 - iter 160/209 - loss 0.00874628 - samples/sec: 136.64 - lr: 0.025000 2022-09-22 03:33:11,443 epoch 33 - iter 180/209 - loss 0.00843199 - samples/sec: 140.51 - lr: 0.025000 2022-09-22 03:33:15,647 epoch 33 - iter 200/209 - loss 0.00860312 - samples/sec: 152.43 - lr: 0.025000 2022-09-22 03:33:17,302 ---------------------------------------------------------------------------------------------------- 
2022-09-22 03:33:17,304 EPOCH 33 done: loss 0.0084 - lr 0.025000 2022-09-22 03:33:28,501 Evaluating as a multi-label problem: False 2022-09-22 03:33:28,523 DEV : loss 0.04015888273715973 - f1-score (micro avg) 0.89 2022-09-22 03:33:28,655 BAD EPOCHS (no improvement): 1 2022-09-22 03:33:28,660 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:33:33,304 epoch 34 - iter 20/209 - loss 0.00569447 - samples/sec: 138.07 - lr: 0.025000 2022-09-22 03:33:37,938 epoch 34 - iter 40/209 - loss 0.00712331 - samples/sec: 138.25 - lr: 0.025000 2022-09-22 03:33:42,781 epoch 34 - iter 60/209 - loss 0.00842634 - samples/sec: 132.26 - lr: 0.025000 2022-09-22 03:33:46,691 epoch 34 - iter 80/209 - loss 0.00907594 - samples/sec: 163.93 - lr: 0.025000 2022-09-22 03:33:51,987 epoch 34 - iter 100/209 - loss 0.00858608 - samples/sec: 120.94 - lr: 0.025000 2022-09-22 03:33:55,963 epoch 34 - iter 120/209 - loss 0.00804898 - samples/sec: 161.13 - lr: 0.025000 2022-09-22 03:34:00,021 epoch 34 - iter 140/209 - loss 0.00796750 - samples/sec: 157.91 - lr: 0.025000 2022-09-22 03:34:03,634 epoch 34 - iter 160/209 - loss 0.00803767 - samples/sec: 177.37 - lr: 0.025000 2022-09-22 03:34:08,365 epoch 34 - iter 180/209 - loss 0.00788262 - samples/sec: 135.42 - lr: 0.025000 2022-09-22 03:34:12,978 epoch 34 - iter 200/209 - loss 0.00795304 - samples/sec: 138.93 - lr: 0.025000 2022-09-22 03:34:14,541 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:34:14,543 EPOCH 34 done: loss 0.0080 - lr 0.025000 2022-09-22 03:34:25,748 Evaluating as a multi-label problem: False 2022-09-22 03:34:25,768 DEV : loss 0.0419330857694149 - f1-score (micro avg) 0.8893 2022-09-22 03:34:25,899 BAD EPOCHS (no improvement): 2 2022-09-22 03:34:25,903 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:34:30,610 epoch 35 - iter 20/209 - loss 
0.00908179 - samples/sec: 136.15 - lr: 0.025000 2022-09-22 03:34:35,209 epoch 35 - iter 40/209 - loss 0.00796643 - samples/sec: 139.31 - lr: 0.025000 2022-09-22 03:34:39,618 epoch 35 - iter 60/209 - loss 0.00923109 - samples/sec: 145.30 - lr: 0.025000 2022-09-22 03:34:43,970 epoch 35 - iter 80/209 - loss 0.00834655 - samples/sec: 147.24 - lr: 0.025000 2022-09-22 03:34:49,127 epoch 35 - iter 100/209 - loss 0.00850180 - samples/sec: 124.22 - lr: 0.025000 2022-09-22 03:34:53,477 epoch 35 - iter 120/209 - loss 0.00885653 - samples/sec: 147.28 - lr: 0.025000 2022-09-22 03:34:57,808 epoch 35 - iter 140/209 - loss 0.00854671 - samples/sec: 147.94 - lr: 0.025000 2022-09-22 03:35:02,784 epoch 35 - iter 160/209 - loss 0.00858381 - samples/sec: 128.76 - lr: 0.025000 2022-09-22 03:35:07,097 epoch 35 - iter 180/209 - loss 0.00824478 - samples/sec: 148.61 - lr: 0.025000 2022-09-22 03:35:11,502 epoch 35 - iter 200/209 - loss 0.00837521 - samples/sec: 145.46 - lr: 0.025000 2022-09-22 03:35:13,138 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:35:13,141 EPOCH 35 done: loss 0.0085 - lr 0.025000 2022-09-22 03:35:24,274 Evaluating as a multi-label problem: False 2022-09-22 03:35:24,294 DEV : loss 0.04051949828863144 - f1-score (micro avg) 0.8966 2022-09-22 03:35:24,424 BAD EPOCHS (no improvement): 3 2022-09-22 03:35:24,426 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:35:29,012 epoch 36 - iter 20/209 - loss 0.00696232 - samples/sec: 139.83 - lr: 0.025000 2022-09-22 03:35:33,404 epoch 36 - iter 40/209 - loss 0.00621393 - samples/sec: 145.96 - lr: 0.025000 2022-09-22 03:35:37,516 epoch 36 - iter 60/209 - loss 0.00661304 - samples/sec: 155.82 - lr: 0.025000 2022-09-22 03:35:41,499 epoch 36 - iter 80/209 - loss 0.00694613 - samples/sec: 160.88 - lr: 0.025000 2022-09-22 03:35:46,056 epoch 36 - iter 100/209 - loss 0.00696939 - samples/sec: 140.62 - lr: 
0.025000 2022-09-22 03:35:50,834 epoch 36 - iter 120/209 - loss 0.00778882 - samples/sec: 134.11 - lr: 0.025000 2022-09-22 03:35:54,966 epoch 36 - iter 140/209 - loss 0.00770968 - samples/sec: 155.14 - lr: 0.025000 2022-09-22 03:35:59,385 epoch 36 - iter 160/209 - loss 0.00819376 - samples/sec: 144.97 - lr: 0.025000 2022-09-22 03:36:04,219 epoch 36 - iter 180/209 - loss 0.00843044 - samples/sec: 132.55 - lr: 0.025000 2022-09-22 03:36:08,773 epoch 36 - iter 200/209 - loss 0.00837520 - samples/sec: 140.69 - lr: 0.025000 2022-09-22 03:36:10,596 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:36:10,599 EPOCH 36 done: loss 0.0084 - lr 0.025000 2022-09-22 03:36:21,690 Evaluating as a multi-label problem: False 2022-09-22 03:36:21,714 DEV : loss 0.04092669114470482 - f1-score (micro avg) 0.8879 2022-09-22 03:36:21,851 Epoch 36: reducing learning rate of group 0 to 1.2500e-02. 2022-09-22 03:36:21,852 BAD EPOCHS (no improvement): 4 2022-09-22 03:36:21,857 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:36:27,184 epoch 37 - iter 20/209 - loss 0.00706402 - samples/sec: 120.30 - lr: 0.012500 2022-09-22 03:36:32,133 epoch 37 - iter 40/209 - loss 0.00726995 - samples/sec: 129.47 - lr: 0.012500 2022-09-22 03:36:37,555 epoch 37 - iter 60/209 - loss 0.00699793 - samples/sec: 118.18 - lr: 0.012500 2022-09-22 03:36:42,054 epoch 37 - iter 80/209 - loss 0.00683974 - samples/sec: 142.42 - lr: 0.012500 2022-09-22 03:36:45,954 epoch 37 - iter 100/209 - loss 0.00746173 - samples/sec: 164.32 - lr: 0.012500 2022-09-22 03:36:50,077 epoch 37 - iter 120/209 - loss 0.00727686 - samples/sec: 155.42 - lr: 0.012500 2022-09-22 03:36:53,946 epoch 37 - iter 140/209 - loss 0.00734845 - samples/sec: 165.65 - lr: 0.012500 2022-09-22 03:36:58,294 epoch 37 - iter 160/209 - loss 0.00739597 - samples/sec: 147.35 - lr: 0.012500 2022-09-22 03:37:02,248 epoch 37 - iter 180/209 - 
loss 0.00730706 - samples/sec: 162.04 - lr: 0.012500 2022-09-22 03:37:06,535 epoch 37 - iter 200/209 - loss 0.00775786 - samples/sec: 149.48 - lr: 0.012500 2022-09-22 03:37:08,885 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:37:08,887 EPOCH 37 done: loss 0.0078 - lr 0.012500 2022-09-22 03:37:20,120 Evaluating as a multi-label problem: False 2022-09-22 03:37:20,140 DEV : loss 0.03935045003890991 - f1-score (micro avg) 0.8951 2022-09-22 03:37:20,277 BAD EPOCHS (no improvement): 1 2022-09-22 03:37:20,281 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:37:24,882 epoch 38 - iter 20/209 - loss 0.00733453 - samples/sec: 139.28 - lr: 0.012500 2022-09-22 03:37:29,521 epoch 38 - iter 40/209 - loss 0.00529715 - samples/sec: 138.12 - lr: 0.012500 2022-09-22 03:37:34,397 epoch 38 - iter 60/209 - loss 0.00530575 - samples/sec: 131.40 - lr: 0.012500 2022-09-22 03:37:38,900 epoch 38 - iter 80/209 - loss 0.00528524 - samples/sec: 142.33 - lr: 0.012500 2022-09-22 03:37:43,204 epoch 38 - iter 100/209 - loss 0.00528770 - samples/sec: 148.84 - lr: 0.012500 2022-09-22 03:37:47,467 epoch 38 - iter 120/209 - loss 0.00568610 - samples/sec: 150.31 - lr: 0.012500 2022-09-22 03:37:52,089 epoch 38 - iter 140/209 - loss 0.00617930 - samples/sec: 138.65 - lr: 0.012500 2022-09-22 03:37:56,289 epoch 38 - iter 160/209 - loss 0.00678712 - samples/sec: 152.53 - lr: 0.012500 2022-09-22 03:38:00,771 epoch 38 - iter 180/209 - loss 0.00684389 - samples/sec: 142.94 - lr: 0.012500 2022-09-22 03:38:04,959 epoch 38 - iter 200/209 - loss 0.00654574 - samples/sec: 153.03 - lr: 0.012500 2022-09-22 03:38:07,003 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:38:07,006 EPOCH 38 done: loss 0.0065 - lr 0.012500 2022-09-22 03:38:18,184 Evaluating as a multi-label problem: False 2022-09-22 03:38:18,204 DEV : loss 
0.03920961171388626 - f1-score (micro avg) 0.8976 2022-09-22 03:38:18,333 BAD EPOCHS (no improvement): 2 2022-09-22 03:38:18,336 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:38:23,073 epoch 39 - iter 20/209 - loss 0.00435913 - samples/sec: 135.32 - lr: 0.012500 2022-09-22 03:38:28,433 epoch 39 - iter 40/209 - loss 0.00626423 - samples/sec: 119.51 - lr: 0.012500 2022-09-22 03:38:32,681 epoch 39 - iter 60/209 - loss 0.00744142 - samples/sec: 150.87 - lr: 0.012500 2022-09-22 03:38:36,974 epoch 39 - iter 80/209 - loss 0.00685123 - samples/sec: 149.27 - lr: 0.012500 2022-09-22 03:38:41,282 epoch 39 - iter 100/209 - loss 0.00690466 - samples/sec: 148.72 - lr: 0.012500 2022-09-22 03:38:46,230 epoch 39 - iter 120/209 - loss 0.00668960 - samples/sec: 129.50 - lr: 0.012500 2022-09-22 03:38:50,851 epoch 39 - iter 140/209 - loss 0.00654694 - samples/sec: 138.66 - lr: 0.012500 2022-09-22 03:38:54,684 epoch 39 - iter 160/209 - loss 0.00689269 - samples/sec: 167.23 - lr: 0.012500 2022-09-22 03:38:59,591 epoch 39 - iter 180/209 - loss 0.00695870 - samples/sec: 130.56 - lr: 0.012500 2022-09-22 03:39:04,255 epoch 39 - iter 200/209 - loss 0.00722959 - samples/sec: 137.38 - lr: 0.012500 2022-09-22 03:39:05,956 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:39:05,957 EPOCH 39 done: loss 0.0072 - lr 0.012500 2022-09-22 03:39:17,113 Evaluating as a multi-label problem: False 2022-09-22 03:39:17,133 DEV : loss 0.03949600085616112 - f1-score (micro avg) 0.8906 2022-09-22 03:39:17,265 BAD EPOCHS (no improvement): 3 2022-09-22 03:39:17,269 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:39:21,746 epoch 40 - iter 20/209 - loss 0.00374776 - samples/sec: 143.21 - lr: 0.012500 2022-09-22 03:39:26,904 epoch 40 - iter 40/209 - loss 0.00589579 - samples/sec: 124.20 - lr: 0.012500 2022-09-22 
03:39:31,540 epoch 40 - iter 60/209 - loss 0.00631217 - samples/sec: 138.21 - lr: 0.012500 2022-09-22 03:39:35,341 epoch 40 - iter 80/209 - loss 0.00624166 - samples/sec: 168.60 - lr: 0.012500 2022-09-22 03:39:39,355 epoch 40 - iter 100/209 - loss 0.00616046 - samples/sec: 159.72 - lr: 0.012500 2022-09-22 03:39:43,921 epoch 40 - iter 120/209 - loss 0.00626057 - samples/sec: 140.34 - lr: 0.012500 2022-09-22 03:39:48,057 epoch 40 - iter 140/209 - loss 0.00600218 - samples/sec: 154.97 - lr: 0.012500 2022-09-22 03:39:53,111 epoch 40 - iter 160/209 - loss 0.00641384 - samples/sec: 126.78 - lr: 0.012500 2022-09-22 03:39:57,431 epoch 40 - iter 180/209 - loss 0.00658546 - samples/sec: 148.33 - lr: 0.012500 2022-09-22 03:40:02,003 epoch 40 - iter 200/209 - loss 0.00657651 - samples/sec: 140.13 - lr: 0.012500 2022-09-22 03:40:03,707 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:40:03,709 EPOCH 40 done: loss 0.0067 - lr 0.012500 2022-09-22 03:40:14,726 Evaluating as a multi-label problem: False 2022-09-22 03:40:14,751 DEV : loss 0.04123305529356003 - f1-score (micro avg) 0.8948 2022-09-22 03:40:14,900 Epoch 40: reducing learning rate of group 0 to 6.2500e-03. 
2022-09-22 03:40:14,902 BAD EPOCHS (no improvement): 4 2022-09-22 03:40:14,905 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:40:19,029 epoch 41 - iter 20/209 - loss 0.00408827 - samples/sec: 155.47 - lr: 0.006250 2022-09-22 03:40:23,402 epoch 41 - iter 40/209 - loss 0.00580532 - samples/sec: 146.52 - lr: 0.006250 2022-09-22 03:40:28,999 epoch 41 - iter 60/209 - loss 0.00508256 - samples/sec: 114.44 - lr: 0.006250 2022-09-22 03:40:34,120 epoch 41 - iter 80/209 - loss 0.00581536 - samples/sec: 125.12 - lr: 0.006250 2022-09-22 03:40:38,419 epoch 41 - iter 100/209 - loss 0.00561195 - samples/sec: 149.06 - lr: 0.006250 2022-09-22 03:40:42,900 epoch 41 - iter 120/209 - loss 0.00644950 - samples/sec: 143.01 - lr: 0.006250 2022-09-22 03:40:46,692 epoch 41 - iter 140/209 - loss 0.00640956 - samples/sec: 168.99 - lr: 0.006250 2022-09-22 03:40:50,833 epoch 41 - iter 160/209 - loss 0.00657211 - samples/sec: 154.75 - lr: 0.006250 2022-09-22 03:40:55,498 epoch 41 - iter 180/209 - loss 0.00632891 - samples/sec: 137.35 - lr: 0.006250 2022-09-22 03:41:00,448 epoch 41 - iter 200/209 - loss 0.00602183 - samples/sec: 129.43 - lr: 0.006250 2022-09-22 03:41:02,128 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:41:02,130 EPOCH 41 done: loss 0.0059 - lr 0.006250 2022-09-22 03:41:13,135 Evaluating as a multi-label problem: False 2022-09-22 03:41:13,156 DEV : loss 0.04026409983634949 - f1-score (micro avg) 0.8918 2022-09-22 03:41:13,292 BAD EPOCHS (no improvement): 1 2022-09-22 03:41:13,295 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:41:18,266 epoch 42 - iter 20/209 - loss 0.00921877 - samples/sec: 128.93 - lr: 0.006250 2022-09-22 03:41:23,135 epoch 42 - iter 40/209 - loss 0.00845366 - samples/sec: 131.60 - lr: 0.006250 2022-09-22 03:41:27,768 epoch 42 - iter 60/209 - loss 0.00768829 - 
samples/sec: 138.29 - lr: 0.006250 2022-09-22 03:41:32,150 epoch 42 - iter 80/209 - loss 0.00740244 - samples/sec: 146.24 - lr: 0.006250 2022-09-22 03:41:36,663 epoch 42 - iter 100/209 - loss 0.00759450 - samples/sec: 141.96 - lr: 0.006250 2022-09-22 03:41:40,941 epoch 42 - iter 120/209 - loss 0.00732379 - samples/sec: 149.82 - lr: 0.006250 2022-09-22 03:41:44,693 epoch 42 - iter 140/209 - loss 0.00735407 - samples/sec: 170.83 - lr: 0.006250 2022-09-22 03:41:48,547 epoch 42 - iter 160/209 - loss 0.00714062 - samples/sec: 166.27 - lr: 0.006250 2022-09-22 03:41:52,651 epoch 42 - iter 180/209 - loss 0.00724010 - samples/sec: 156.13 - lr: 0.006250 2022-09-22 03:41:56,790 epoch 42 - iter 200/209 - loss 0.00701223 - samples/sec: 154.87 - lr: 0.006250 2022-09-22 03:41:58,876 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:41:58,879 EPOCH 42 done: loss 0.0068 - lr 0.006250 2022-09-22 03:42:10,128 Evaluating as a multi-label problem: False 2022-09-22 03:42:10,148 DEV : loss 0.041221924126148224 - f1-score (micro avg) 0.8945 2022-09-22 03:42:10,279 BAD EPOCHS (no improvement): 2 2022-09-22 03:42:10,282 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:42:15,093 epoch 43 - iter 20/209 - loss 0.00822164 - samples/sec: 133.26 - lr: 0.006250 2022-09-22 03:42:19,435 epoch 43 - iter 40/209 - loss 0.01032494 - samples/sec: 147.62 - lr: 0.006250 2022-09-22 03:42:24,018 epoch 43 - iter 60/209 - loss 0.00982694 - samples/sec: 139.82 - lr: 0.006250 2022-09-22 03:42:28,591 epoch 43 - iter 80/209 - loss 0.00821454 - samples/sec: 140.09 - lr: 0.006250 2022-09-22 03:42:33,560 epoch 43 - iter 100/209 - loss 0.00759173 - samples/sec: 128.92 - lr: 0.006250 2022-09-22 03:42:38,174 epoch 43 - iter 120/209 - loss 0.00722677 - samples/sec: 138.87 - lr: 0.006250 2022-09-22 03:42:42,359 epoch 43 - iter 140/209 - loss 0.00683628 - samples/sec: 153.12 - lr: 0.006250 
2022-09-22 03:42:46,989 epoch 43 - iter 160/209 - loss 0.00722811 - samples/sec: 138.38 - lr: 0.006250 2022-09-22 03:42:51,261 epoch 43 - iter 180/209 - loss 0.00725975 - samples/sec: 150.03 - lr: 0.006250 2022-09-22 03:42:55,690 epoch 43 - iter 200/209 - loss 0.00746218 - samples/sec: 144.69 - lr: 0.006250 2022-09-22 03:42:57,535 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:42:57,537 EPOCH 43 done: loss 0.0078 - lr 0.006250 2022-09-22 03:43:08,630 Evaluating as a multi-label problem: False 2022-09-22 03:43:08,650 DEV : loss 0.04122209921479225 - f1-score (micro avg) 0.8951 2022-09-22 03:43:08,785 BAD EPOCHS (no improvement): 3 2022-09-22 03:43:08,788 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:43:12,595 epoch 44 - iter 20/209 - loss 0.00551673 - samples/sec: 168.46 - lr: 0.006250 2022-09-22 03:43:17,696 epoch 44 - iter 40/209 - loss 0.00577496 - samples/sec: 125.59 - lr: 0.006250 2022-09-22 03:43:22,426 epoch 44 - iter 60/209 - loss 0.00728249 - samples/sec: 135.44 - lr: 0.006250 2022-09-22 03:43:26,619 epoch 44 - iter 80/209 - loss 0.00794143 - samples/sec: 152.79 - lr: 0.006250 2022-09-22 03:43:31,003 epoch 44 - iter 100/209 - loss 0.00836823 - samples/sec: 146.16 - lr: 0.006250 2022-09-22 03:43:35,801 epoch 44 - iter 120/209 - loss 0.00816591 - samples/sec: 133.55 - lr: 0.006250 2022-09-22 03:43:39,493 epoch 44 - iter 140/209 - loss 0.00806715 - samples/sec: 173.61 - lr: 0.006250 2022-09-22 03:43:44,601 epoch 44 - iter 160/209 - loss 0.00786756 - samples/sec: 125.44 - lr: 0.006250 2022-09-22 03:43:48,483 epoch 44 - iter 180/209 - loss 0.00759055 - samples/sec: 165.12 - lr: 0.006250 2022-09-22 03:43:52,362 epoch 44 - iter 200/209 - loss 0.00746701 - samples/sec: 165.24 - lr: 0.006250 2022-09-22 03:43:54,494 ---------------------------------------------------------------------------------------------------- 2022-09-22 
03:43:54,497 EPOCH 44 done: loss 0.0073 - lr 0.006250 2022-09-22 03:44:05,744 Evaluating as a multi-label problem: False 2022-09-22 03:44:05,767 DEV : loss 0.04190916195511818 - f1-score (micro avg) 0.898 2022-09-22 03:44:05,908 Epoch 44: reducing learning rate of group 0 to 3.1250e-03. 2022-09-22 03:44:05,910 BAD EPOCHS (no improvement): 4 2022-09-22 03:44:05,914 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:44:11,279 epoch 45 - iter 20/209 - loss 0.00518840 - samples/sec: 119.50 - lr: 0.003125 2022-09-22 03:44:14,974 epoch 45 - iter 40/209 - loss 0.00474588 - samples/sec: 173.45 - lr: 0.003125 2022-09-22 03:44:19,375 epoch 45 - iter 60/209 - loss 0.00654423 - samples/sec: 145.62 - lr: 0.003125 2022-09-22 03:44:23,812 epoch 45 - iter 80/209 - loss 0.00607472 - samples/sec: 144.41 - lr: 0.003125 2022-09-22 03:44:28,130 epoch 45 - iter 100/209 - loss 0.00575411 - samples/sec: 148.38 - lr: 0.003125 2022-09-22 03:44:32,348 epoch 45 - iter 120/209 - loss 0.00591917 - samples/sec: 151.94 - lr: 0.003125 2022-09-22 03:44:37,338 epoch 45 - iter 140/209 - loss 0.00616937 - samples/sec: 128.37 - lr: 0.003125 2022-09-22 03:44:42,073 epoch 45 - iter 160/209 - loss 0.00578197 - samples/sec: 135.32 - lr: 0.003125 2022-09-22 03:44:46,292 epoch 45 - iter 180/209 - loss 0.00572235 - samples/sec: 151.90 - lr: 0.003125 2022-09-22 03:44:51,020 epoch 45 - iter 200/209 - loss 0.00565739 - samples/sec: 135.49 - lr: 0.003125 2022-09-22 03:44:52,784 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:44:52,788 EPOCH 45 done: loss 0.0056 - lr 0.003125 2022-09-22 03:45:03,692 Evaluating as a multi-label problem: False 2022-09-22 03:45:03,710 DEV : loss 0.0416376069188118 - f1-score (micro avg) 0.8977 2022-09-22 03:45:03,842 BAD EPOCHS (no improvement): 1 2022-09-22 03:45:03,845 
---------------------------------------------------------------------------------------------------- 2022-09-22 03:45:08,383 epoch 46 - iter 20/209 - loss 0.00384870 - samples/sec: 141.20 - lr: 0.003125 2022-09-22 03:45:12,404 epoch 46 - iter 40/209 - loss 0.00489894 - samples/sec: 159.34 - lr: 0.003125 2022-09-22 03:45:17,443 epoch 46 - iter 60/209 - loss 0.00590516 - samples/sec: 127.14 - lr: 0.003125 2022-09-22 03:45:21,806 epoch 46 - iter 80/209 - loss 0.00644530 - samples/sec: 146.87 - lr: 0.003125 2022-09-22 03:45:25,972 epoch 46 - iter 100/209 - loss 0.00647438 - samples/sec: 153.78 - lr: 0.003125 2022-09-22 03:45:30,941 epoch 46 - iter 120/209 - loss 0.00616354 - samples/sec: 128.96 - lr: 0.003125 2022-09-22 03:45:34,825 epoch 46 - iter 140/209 - loss 0.00626058 - samples/sec: 164.96 - lr: 0.003125 2022-09-22 03:45:39,075 epoch 46 - iter 160/209 - loss 0.00611243 - samples/sec: 150.76 - lr: 0.003125 2022-09-22 03:45:43,092 epoch 46 - iter 180/209 - loss 0.00618923 - samples/sec: 159.57 - lr: 0.003125 2022-09-22 03:45:47,829 epoch 46 - iter 200/209 - loss 0.00631804 - samples/sec: 135.24 - lr: 0.003125 2022-09-22 03:45:49,548 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:45:49,550 EPOCH 46 done: loss 0.0065 - lr 0.003125 2022-09-22 03:46:00,573 Evaluating as a multi-label problem: False 2022-09-22 03:46:00,597 DEV : loss 0.0413595512509346 - f1-score (micro avg) 0.8987 2022-09-22 03:46:00,729 BAD EPOCHS (no improvement): 2 2022-09-22 03:46:00,732 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:46:05,603 epoch 47 - iter 20/209 - loss 0.00472012 - samples/sec: 131.57 - lr: 0.003125 2022-09-22 03:46:10,033 epoch 47 - iter 40/209 - loss 0.00551437 - samples/sec: 144.67 - lr: 0.003125 2022-09-22 03:46:13,862 epoch 47 - iter 60/209 - loss 0.00537817 - samples/sec: 167.31 - lr: 0.003125 2022-09-22 03:46:17,557 epoch 47 - iter 
80/209 - loss 0.00642726 - samples/sec: 173.49 - lr: 0.003125 2022-09-22 03:46:21,623 epoch 47 - iter 100/209 - loss 0.00713598 - samples/sec: 157.55 - lr: 0.003125 2022-09-22 03:46:26,483 epoch 47 - iter 120/209 - loss 0.00688335 - samples/sec: 131.81 - lr: 0.003125 2022-09-22 03:46:31,147 epoch 47 - iter 140/209 - loss 0.00666171 - samples/sec: 137.34 - lr: 0.003125 2022-09-22 03:46:35,557 epoch 47 - iter 160/209 - loss 0.00648260 - samples/sec: 145.29 - lr: 0.003125 2022-09-22 03:46:40,265 epoch 47 - iter 180/209 - loss 0.00646361 - samples/sec: 136.09 - lr: 0.003125 2022-09-22 03:46:45,209 epoch 47 - iter 200/209 - loss 0.00658443 - samples/sec: 129.59 - lr: 0.003125 2022-09-22 03:46:47,207 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:46:47,208 EPOCH 47 done: loss 0.0064 - lr 0.003125 2022-09-22 03:46:58,368 Evaluating as a multi-label problem: False 2022-09-22 03:46:58,390 DEV : loss 0.04147793725132942 - f1-score (micro avg) 0.8954 2022-09-22 03:46:58,523 BAD EPOCHS (no improvement): 3 2022-09-22 03:46:58,527 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:47:02,592 epoch 48 - iter 20/209 - loss 0.00531851 - samples/sec: 157.74 - lr: 0.003125 2022-09-22 03:47:06,990 epoch 48 - iter 40/209 - loss 0.00447967 - samples/sec: 145.73 - lr: 0.003125 2022-09-22 03:47:11,151 epoch 48 - iter 60/209 - loss 0.00485587 - samples/sec: 154.02 - lr: 0.003125 2022-09-22 03:47:14,961 epoch 48 - iter 80/209 - loss 0.00517134 - samples/sec: 168.20 - lr: 0.003125 2022-09-22 03:47:19,992 epoch 48 - iter 100/209 - loss 0.00611644 - samples/sec: 127.35 - lr: 0.003125 2022-09-22 03:47:24,068 epoch 48 - iter 120/209 - loss 0.00582773 - samples/sec: 157.24 - lr: 0.003125 2022-09-22 03:47:28,971 epoch 48 - iter 140/209 - loss 0.00556429 - samples/sec: 130.66 - lr: 0.003125 2022-09-22 03:47:33,667 epoch 48 - iter 160/209 - loss 0.00545854 - samples/sec: 
136.45 - lr: 0.003125 2022-09-22 03:47:38,170 epoch 48 - iter 180/209 - loss 0.00605017 - samples/sec: 142.31 - lr: 0.003125 2022-09-22 03:47:42,540 epoch 48 - iter 200/209 - loss 0.00607457 - samples/sec: 146.62 - lr: 0.003125 2022-09-22 03:47:44,559 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:47:44,561 EPOCH 48 done: loss 0.0060 - lr 0.003125 2022-09-22 03:47:55,550 Evaluating as a multi-label problem: False 2022-09-22 03:47:55,570 DEV : loss 0.04214434698224068 - f1-score (micro avg) 0.8962 2022-09-22 03:47:55,703 Epoch 48: reducing learning rate of group 0 to 1.5625e-03. 2022-09-22 03:47:55,707 BAD EPOCHS (no improvement): 4 2022-09-22 03:47:55,716 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:48:00,026 epoch 49 - iter 20/209 - loss 0.00476515 - samples/sec: 148.73 - lr: 0.001563 2022-09-22 03:48:05,117 epoch 49 - iter 40/209 - loss 0.00615691 - samples/sec: 125.82 - lr: 0.001563 2022-09-22 03:48:10,209 epoch 49 - iter 60/209 - loss 0.00543685 - samples/sec: 125.79 - lr: 0.001563 2022-09-22 03:48:15,035 epoch 49 - iter 80/209 - loss 0.00544056 - samples/sec: 132.78 - lr: 0.001563 2022-09-22 03:48:19,295 epoch 49 - iter 100/209 - loss 0.00573142 - samples/sec: 150.41 - lr: 0.001563 2022-09-22 03:48:23,659 epoch 49 - iter 120/209 - loss 0.00649771 - samples/sec: 146.90 - lr: 0.001563 2022-09-22 03:48:28,023 epoch 49 - iter 140/209 - loss 0.00648309 - samples/sec: 146.82 - lr: 0.001563 2022-09-22 03:48:32,674 epoch 49 - iter 160/209 - loss 0.00627615 - samples/sec: 137.76 - lr: 0.001563 2022-09-22 03:48:36,949 epoch 49 - iter 180/209 - loss 0.00605254 - samples/sec: 149.85 - lr: 0.001563 2022-09-22 03:48:40,968 epoch 49 - iter 200/209 - loss 0.00640203 - samples/sec: 159.47 - lr: 0.001563 2022-09-22 03:48:42,658 ---------------------------------------------------------------------------------------------------- 2022-09-22 
03:48:42,661 EPOCH 49 done: loss 0.0063 - lr 0.001563 2022-09-22 03:48:53,604 Evaluating as a multi-label problem: False 2022-09-22 03:48:53,630 DEV : loss 0.04169485345482826 - f1-score (micro avg) 0.897 2022-09-22 03:48:53,761 BAD EPOCHS (no improvement): 1 2022-09-22 03:48:53,765 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:48:58,976 epoch 50 - iter 20/209 - loss 0.01247537 - samples/sec: 122.99 - lr: 0.001563 2022-09-22 03:49:03,209 epoch 50 - iter 40/209 - loss 0.00830603 - samples/sec: 151.38 - lr: 0.001563 2022-09-22 03:49:07,125 epoch 50 - iter 60/209 - loss 0.00740389 - samples/sec: 163.60 - lr: 0.001563 2022-09-22 03:49:11,732 epoch 50 - iter 80/209 - loss 0.00799464 - samples/sec: 139.06 - lr: 0.001563 2022-09-22 03:49:16,263 epoch 50 - iter 100/209 - loss 0.00694069 - samples/sec: 141.43 - lr: 0.001563 2022-09-22 03:49:20,398 epoch 50 - iter 120/209 - loss 0.00731619 - samples/sec: 155.00 - lr: 0.001563 2022-09-22 03:49:25,004 epoch 50 - iter 140/209 - loss 0.00710708 - samples/sec: 139.11 - lr: 0.001563 2022-09-22 03:49:29,330 epoch 50 - iter 160/209 - loss 0.00662856 - samples/sec: 148.14 - lr: 0.001563 2022-09-22 03:49:33,761 epoch 50 - iter 180/209 - loss 0.00636110 - samples/sec: 144.58 - lr: 0.001563 2022-09-22 03:49:37,902 epoch 50 - iter 200/209 - loss 0.00640348 - samples/sec: 154.77 - lr: 0.001563 2022-09-22 03:49:39,694 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:49:39,696 EPOCH 50 done: loss 0.0069 - lr 0.001563 2022-09-22 03:49:51,043 Evaluating as a multi-label problem: False 2022-09-22 03:49:51,060 DEV : loss 0.041619885712862015 - f1-score (micro avg) 0.8968 2022-09-22 03:49:51,199 BAD EPOCHS (no improvement): 2 2022-09-22 03:49:51,202 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:49:55,438 epoch 51 - iter 20/209 - loss 0.00323729 
- samples/sec: 151.31 - lr: 0.001563 2022-09-22 03:49:59,949 epoch 51 - iter 40/209 - loss 0.00423640 - samples/sec: 142.04 - lr: 0.001563 2022-09-22 03:50:04,502 epoch 51 - iter 60/209 - loss 0.00422087 - samples/sec: 140.73 - lr: 0.001563 2022-09-22 03:50:08,990 epoch 51 - iter 80/209 - loss 0.00499427 - samples/sec: 142.85 - lr: 0.001563 2022-09-22 03:50:12,596 epoch 51 - iter 100/209 - loss 0.00493329 - samples/sec: 177.76 - lr: 0.001563 2022-09-22 03:50:17,034 epoch 51 - iter 120/209 - loss 0.00533605 - samples/sec: 144.32 - lr: 0.001563 2022-09-22 03:50:21,480 epoch 51 - iter 140/209 - loss 0.00524836 - samples/sec: 144.15 - lr: 0.001563 2022-09-22 03:50:26,515 epoch 51 - iter 160/209 - loss 0.00569826 - samples/sec: 127.23 - lr: 0.001563 2022-09-22 03:50:31,028 epoch 51 - iter 180/209 - loss 0.00605276 - samples/sec: 141.99 - lr: 0.001563 2022-09-22 03:50:35,135 epoch 51 - iter 200/209 - loss 0.00600363 - samples/sec: 156.04 - lr: 0.001563 2022-09-22 03:50:37,241 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:50:37,243 EPOCH 51 done: loss 0.0059 - lr 0.001563 2022-09-22 03:50:48,402 Evaluating as a multi-label problem: False 2022-09-22 03:50:48,423 DEV : loss 0.041253745555877686 - f1-score (micro avg) 0.8966 2022-09-22 03:50:48,560 BAD EPOCHS (no improvement): 3 2022-09-22 03:50:48,563 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:50:52,765 epoch 52 - iter 20/209 - loss 0.00501658 - samples/sec: 152.57 - lr: 0.001563 2022-09-22 03:50:56,869 epoch 52 - iter 40/209 - loss 0.00567355 - samples/sec: 156.14 - lr: 0.001563 2022-09-22 03:51:01,175 epoch 52 - iter 60/209 - loss 0.00527932 - samples/sec: 148.82 - lr: 0.001563 2022-09-22 03:51:06,098 epoch 52 - iter 80/209 - loss 0.00633136 - samples/sec: 130.17 - lr: 0.001563 2022-09-22 03:51:09,634 epoch 52 - iter 100/209 - loss 0.00633831 - samples/sec: 181.21 - lr: 0.001563 
2022-09-22 03:51:13,917 epoch 52 - iter 120/209 - loss 0.00624763 - samples/sec: 149.66 - lr: 0.001563 2022-09-22 03:51:18,816 epoch 52 - iter 140/209 - loss 0.00664413 - samples/sec: 130.78 - lr: 0.001563 2022-09-22 03:51:22,991 epoch 52 - iter 160/209 - loss 0.00641601 - samples/sec: 153.43 - lr: 0.001563 2022-09-22 03:51:27,768 epoch 52 - iter 180/209 - loss 0.00656821 - samples/sec: 134.13 - lr: 0.001563 2022-09-22 03:51:31,899 epoch 52 - iter 200/209 - loss 0.00666688 - samples/sec: 155.16 - lr: 0.001563 2022-09-22 03:51:34,206 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:51:34,208 EPOCH 52 done: loss 0.0067 - lr 0.001563 2022-09-22 03:51:45,201 Evaluating as a multi-label problem: False 2022-09-22 03:51:45,224 DEV : loss 0.04099021106958389 - f1-score (micro avg) 0.8974 2022-09-22 03:51:45,377 Epoch 52: reducing learning rate of group 0 to 7.8125e-04. 2022-09-22 03:51:45,378 BAD EPOCHS (no improvement): 4 2022-09-22 03:51:45,381 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:51:50,262 epoch 53 - iter 20/209 - loss 0.00620492 - samples/sec: 131.29 - lr: 0.000781 2022-09-22 03:51:54,829 epoch 53 - iter 40/209 - loss 0.00473102 - samples/sec: 140.29 - lr: 0.000781 2022-09-22 03:51:59,076 epoch 53 - iter 60/209 - loss 0.00568016 - samples/sec: 150.93 - lr: 0.000781 2022-09-22 03:52:03,056 epoch 53 - iter 80/209 - loss 0.00541270 - samples/sec: 160.96 - lr: 0.000781 2022-09-22 03:52:06,787 epoch 53 - iter 100/209 - loss 0.00523814 - samples/sec: 171.76 - lr: 0.000781 2022-09-22 03:52:12,054 epoch 53 - iter 120/209 - loss 0.00518712 - samples/sec: 121.64 - lr: 0.000781 2022-09-22 03:52:16,705 epoch 53 - iter 140/209 - loss 0.00500942 - samples/sec: 137.75 - lr: 0.000781 2022-09-22 03:52:20,907 epoch 53 - iter 160/209 - loss 0.00475860 - samples/sec: 152.51 - lr: 0.000781 2022-09-22 03:52:25,114 epoch 53 - iter 180/209 - loss 
0.00483390 - samples/sec: 152.30 - lr: 0.000781 2022-09-22 03:52:29,928 epoch 53 - iter 200/209 - loss 0.00554696 - samples/sec: 133.07 - lr: 0.000781 2022-09-22 03:52:31,680 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:52:31,682 EPOCH 53 done: loss 0.0054 - lr 0.000781 2022-09-22 03:52:42,767 Evaluating as a multi-label problem: False 2022-09-22 03:52:42,786 DEV : loss 0.04100096598267555 - f1-score (micro avg) 0.8966 2022-09-22 03:52:42,921 BAD EPOCHS (no improvement): 1 2022-09-22 03:52:42,924 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:52:47,130 epoch 54 - iter 20/209 - loss 0.00740737 - samples/sec: 152.35 - lr: 0.000781 2022-09-22 03:52:51,033 epoch 54 - iter 40/209 - loss 0.00619026 - samples/sec: 164.15 - lr: 0.000781 2022-09-22 03:52:55,702 epoch 54 - iter 60/209 - loss 0.00576693 - samples/sec: 137.20 - lr: 0.000781 2022-09-22 03:52:59,630 epoch 54 - iter 80/209 - loss 0.00634621 - samples/sec: 163.14 - lr: 0.000781 2022-09-22 03:53:05,148 epoch 54 - iter 100/209 - loss 0.00650624 - samples/sec: 116.07 - lr: 0.000781 2022-09-22 03:53:09,973 epoch 54 - iter 120/209 - loss 0.00641571 - samples/sec: 132.79 - lr: 0.000781 2022-09-22 03:53:14,203 epoch 54 - iter 140/209 - loss 0.00586754 - samples/sec: 151.49 - lr: 0.000781 2022-09-22 03:53:18,175 epoch 54 - iter 160/209 - loss 0.00591058 - samples/sec: 161.33 - lr: 0.000781 2022-09-22 03:53:22,317 epoch 54 - iter 180/209 - loss 0.00577589 - samples/sec: 154.68 - lr: 0.000781 2022-09-22 03:53:27,019 epoch 54 - iter 200/209 - loss 0.00602536 - samples/sec: 136.27 - lr: 0.000781 2022-09-22 03:53:29,034 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:53:29,036 EPOCH 54 done: loss 0.0059 - lr 0.000781 2022-09-22 03:53:39,916 Evaluating as a multi-label problem: False 2022-09-22 03:53:39,935 DEV : loss 
0.041073162108659744 - f1-score (micro avg) 0.8958 2022-09-22 03:53:40,077 BAD EPOCHS (no improvement): 2 2022-09-22 03:53:40,080 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:53:45,201 epoch 55 - iter 20/209 - loss 0.00331707 - samples/sec: 125.16 - lr: 0.000781 2022-09-22 03:53:49,603 epoch 55 - iter 40/209 - loss 0.00636928 - samples/sec: 145.55 - lr: 0.000781 2022-09-22 03:53:54,430 epoch 55 - iter 60/209 - loss 0.00557866 - samples/sec: 132.72 - lr: 0.000781 2022-09-22 03:53:58,477 epoch 55 - iter 80/209 - loss 0.00629021 - samples/sec: 158.38 - lr: 0.000781 2022-09-22 03:54:02,694 epoch 55 - iter 100/209 - loss 0.00595054 - samples/sec: 151.92 - lr: 0.000781 2022-09-22 03:54:06,796 epoch 55 - iter 120/209 - loss 0.00589230 - samples/sec: 156.21 - lr: 0.000781 2022-09-22 03:54:10,785 epoch 55 - iter 140/209 - loss 0.00603926 - samples/sec: 160.68 - lr: 0.000781 2022-09-22 03:54:14,773 epoch 55 - iter 160/209 - loss 0.00564119 - samples/sec: 160.64 - lr: 0.000781 2022-09-22 03:54:19,398 epoch 55 - iter 180/209 - loss 0.00553303 - samples/sec: 138.56 - lr: 0.000781 2022-09-22 03:54:24,409 epoch 55 - iter 200/209 - loss 0.00555741 - samples/sec: 127.87 - lr: 0.000781 2022-09-22 03:54:25,812 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:54:25,814 EPOCH 55 done: loss 0.0055 - lr 0.000781 2022-09-22 03:54:36,746 Evaluating as a multi-label problem: False 2022-09-22 03:54:36,771 DEV : loss 0.04101268947124481 - f1-score (micro avg) 0.8968 2022-09-22 03:54:36,900 BAD EPOCHS (no improvement): 3 2022-09-22 03:54:36,903 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:54:40,917 epoch 56 - iter 20/209 - loss 0.00453760 - samples/sec: 159.73 - lr: 0.000781 2022-09-22 03:54:44,719 epoch 56 - iter 40/209 - loss 0.00597451 - samples/sec: 168.57 - lr: 0.000781 2022-09-22 
03:54:49,243 epoch 56 - iter 60/209 - loss 0.00656413 - samples/sec: 141.61 - lr: 0.000781 2022-09-22 03:54:53,497 epoch 56 - iter 80/209 - loss 0.00677738 - samples/sec: 150.62 - lr: 0.000781 2022-09-22 03:54:58,125 epoch 56 - iter 100/209 - loss 0.00625281 - samples/sec: 138.46 - lr: 0.000781 2022-09-22 03:55:02,593 epoch 56 - iter 120/209 - loss 0.00603710 - samples/sec: 143.39 - lr: 0.000781 2022-09-22 03:55:07,152 epoch 56 - iter 140/209 - loss 0.00587802 - samples/sec: 140.51 - lr: 0.000781 2022-09-22 03:55:11,561 epoch 56 - iter 160/209 - loss 0.00594629 - samples/sec: 145.32 - lr: 0.000781 2022-09-22 03:55:16,546 epoch 56 - iter 180/209 - loss 0.00608648 - samples/sec: 128.50 - lr: 0.000781 2022-09-22 03:55:21,191 epoch 56 - iter 200/209 - loss 0.00616108 - samples/sec: 137.94 - lr: 0.000781 2022-09-22 03:55:22,691 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:55:22,693 EPOCH 56 done: loss 0.0061 - lr 0.000781 2022-09-22 03:55:33,875 Evaluating as a multi-label problem: False 2022-09-22 03:55:33,895 DEV : loss 0.04110672324895859 - f1-score (micro avg) 0.8976 2022-09-22 03:55:34,029 Epoch 56: reducing learning rate of group 0 to 3.9063e-04. 
2022-09-22 03:55:34,032 BAD EPOCHS (no improvement): 4 2022-09-22 03:55:34,035 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:55:38,408 epoch 57 - iter 20/209 - loss 0.00800472 - samples/sec: 146.61 - lr: 0.000391 2022-09-22 03:55:42,977 epoch 57 - iter 40/209 - loss 0.00588092 - samples/sec: 140.24 - lr: 0.000391 2022-09-22 03:55:47,356 epoch 57 - iter 60/209 - loss 0.00545936 - samples/sec: 146.31 - lr: 0.000391 2022-09-22 03:55:51,517 epoch 57 - iter 80/209 - loss 0.00526302 - samples/sec: 153.93 - lr: 0.000391 2022-09-22 03:55:55,704 epoch 57 - iter 100/209 - loss 0.00609277 - samples/sec: 153.05 - lr: 0.000391 2022-09-22 03:56:00,654 epoch 57 - iter 120/209 - loss 0.00663245 - samples/sec: 129.45 - lr: 0.000391 2022-09-22 03:56:04,841 epoch 57 - iter 140/209 - loss 0.00605833 - samples/sec: 153.08 - lr: 0.000391 2022-09-22 03:56:09,175 epoch 57 - iter 160/209 - loss 0.00611017 - samples/sec: 147.87 - lr: 0.000391 2022-09-22 03:56:14,002 epoch 57 - iter 180/209 - loss 0.00609829 - samples/sec: 132.71 - lr: 0.000391 2022-09-22 03:56:18,423 epoch 57 - iter 200/209 - loss 0.00574898 - samples/sec: 144.95 - lr: 0.000391 2022-09-22 03:56:20,198 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:56:20,200 EPOCH 57 done: loss 0.0058 - lr 0.000391 2022-09-22 03:56:31,221 Evaluating as a multi-label problem: False 2022-09-22 03:56:31,240 DEV : loss 0.04113282635807991 - f1-score (micro avg) 0.896 2022-09-22 03:56:31,373 BAD EPOCHS (no improvement): 1 2022-09-22 03:56:31,376 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:56:35,744 epoch 58 - iter 20/209 - loss 0.00921994 - samples/sec: 146.76 - lr: 0.000391 2022-09-22 03:56:40,525 epoch 58 - iter 40/209 - loss 0.00730079 - samples/sec: 134.02 - lr: 0.000391 2022-09-22 03:56:44,800 epoch 58 - iter 60/209 - loss 0.00619064 - 
samples/sec: 149.85 - lr: 0.000391 2022-09-22 03:56:49,095 epoch 58 - iter 80/209 - loss 0.00672812 - samples/sec: 149.22 - lr: 0.000391 2022-09-22 03:56:53,306 epoch 58 - iter 100/209 - loss 0.00665815 - samples/sec: 152.17 - lr: 0.000391 2022-09-22 03:56:57,824 epoch 58 - iter 120/209 - loss 0.00655682 - samples/sec: 141.78 - lr: 0.000391 2022-09-22 03:57:02,074 epoch 58 - iter 140/209 - loss 0.00657187 - samples/sec: 150.72 - lr: 0.000391 2022-09-22 03:57:06,700 epoch 58 - iter 160/209 - loss 0.00653902 - samples/sec: 138.50 - lr: 0.000391 2022-09-22 03:57:11,359 epoch 58 - iter 180/209 - loss 0.00639477 - samples/sec: 137.48 - lr: 0.000391 2022-09-22 03:57:15,654 epoch 58 - iter 200/209 - loss 0.00617299 - samples/sec: 149.21 - lr: 0.000391 2022-09-22 03:57:17,279 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:57:17,280 EPOCH 58 done: loss 0.0062 - lr 0.000391 2022-09-22 03:57:28,295 Evaluating as a multi-label problem: False 2022-09-22 03:57:28,318 DEV : loss 0.041114162653684616 - f1-score (micro avg) 0.896 2022-09-22 03:57:28,467 BAD EPOCHS (no improvement): 2 2022-09-22 03:57:28,470 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:57:33,086 epoch 59 - iter 20/209 - loss 0.00437776 - samples/sec: 138.85 - lr: 0.000391 2022-09-22 03:57:37,362 epoch 59 - iter 40/209 - loss 0.00492228 - samples/sec: 149.85 - lr: 0.000391 2022-09-22 03:57:41,977 epoch 59 - iter 60/209 - loss 0.00572025 - samples/sec: 138.80 - lr: 0.000391 2022-09-22 03:57:47,580 epoch 59 - iter 80/209 - loss 0.00584043 - samples/sec: 114.34 - lr: 0.000391 2022-09-22 03:57:51,688 epoch 59 - iter 100/209 - loss 0.00644478 - samples/sec: 155.98 - lr: 0.000391 2022-09-22 03:57:55,937 epoch 59 - iter 120/209 - loss 0.00640490 - samples/sec: 150.80 - lr: 0.000391 2022-09-22 03:58:00,652 epoch 59 - iter 140/209 - loss 0.00627387 - samples/sec: 135.91 - lr: 0.000391 
2022-09-22 03:58:04,950 epoch 59 - iter 160/209 - loss 0.00615403 - samples/sec: 149.09 - lr: 0.000391 2022-09-22 03:58:09,252 epoch 59 - iter 180/209 - loss 0.00614340 - samples/sec: 148.94 - lr: 0.000391 2022-09-22 03:58:13,164 epoch 59 - iter 200/209 - loss 0.00618012 - samples/sec: 163.77 - lr: 0.000391 2022-09-22 03:58:14,912 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:58:14,913 EPOCH 59 done: loss 0.0060 - lr 0.000391 2022-09-22 03:58:26,061 Evaluating as a multi-label problem: False 2022-09-22 03:58:26,084 DEV : loss 0.04115418344736099 - f1-score (micro avg) 0.896 2022-09-22 03:58:26,216 BAD EPOCHS (no improvement): 3 2022-09-22 03:58:26,219 ---------------------------------------------------------------------------------------------------- 2022-09-22 03:58:31,350 epoch 60 - iter 20/209 - loss 0.01034365 - samples/sec: 124.94 - lr: 0.000391 2022-09-22 03:58:35,473 epoch 60 - iter 40/209 - loss 0.00841921 - samples/sec: 155.41 - lr: 0.000391 2022-09-22 03:58:39,503 epoch 60 - iter 60/209 - loss 0.00751487 - samples/sec: 158.98 - lr: 0.000391 2022-09-22 03:58:43,836 epoch 60 - iter 80/209 - loss 0.00668013 - samples/sec: 147.86 - lr: 0.000391 2022-09-22 03:58:48,456 epoch 60 - iter 100/209 - loss 0.00595166 - samples/sec: 138.70 - lr: 0.000391 2022-09-22 03:58:52,787 epoch 60 - iter 120/209 - loss 0.00619094 - samples/sec: 147.95 - lr: 0.000391 2022-09-22 03:58:57,295 epoch 60 - iter 140/209 - loss 0.00693912 - samples/sec: 142.13 - lr: 0.000391 2022-09-22 03:59:01,618 epoch 60 - iter 160/209 - loss 0.00669440 - samples/sec: 148.24 - lr: 0.000391 2022-09-22 03:59:05,323 epoch 60 - iter 180/209 - loss 0.00674939 - samples/sec: 172.99 - lr: 0.000391 2022-09-22 03:59:09,914 epoch 60 - iter 200/209 - loss 0.00686810 - samples/sec: 139.55 - lr: 0.000391 2022-09-22 03:59:11,941 ---------------------------------------------------------------------------------------------------- 2022-09-22 
03:59:11,944 EPOCH 60 done: loss 0.0067 - lr 0.000391
2022-09-22 03:59:22,855 Evaluating as a multi-label problem: False
2022-09-22 03:59:22,877 DEV : loss 0.04105373099446297 - f1-score (micro avg) 0.8976
2022-09-22 03:59:23,022 Epoch 60: reducing learning rate of group 0 to 1.9531e-04.
2022-09-22 03:59:23,024 BAD EPOCHS (no improvement): 4
2022-09-22 03:59:23,029 ----------------------------------------------------------------------------------------------------
2022-09-22 03:59:27,451 epoch 61 - iter 20/209 - loss 0.00430089 - samples/sec: 144.94 - lr: 0.000195
2022-09-22 03:59:31,734 epoch 61 - iter 40/209 - loss 0.00616997 - samples/sec: 149.60 - lr: 0.000195
2022-09-22 03:59:35,533 epoch 61 - iter 60/209 - loss 0.00665133 - samples/sec: 168.69 - lr: 0.000195
2022-09-22 03:59:40,005 epoch 61 - iter 80/209 - loss 0.00624606 - samples/sec: 143.25 - lr: 0.000195
2022-09-22 03:59:44,860 epoch 61 - iter 100/209 - loss 0.00682087 - samples/sec: 131.97 - lr: 0.000195
2022-09-22 03:59:49,198 epoch 61 - iter 120/209 - loss 0.00716552 - samples/sec: 147.67 - lr: 0.000195
2022-09-22 03:59:53,354 epoch 61 - iter 140/209 - loss 0.00674058 - samples/sec: 154.23 - lr: 0.000195
2022-09-22 03:59:58,018 epoch 61 - iter 160/209 - loss 0.00627423 - samples/sec: 137.38 - lr: 0.000195
2022-09-22 04:00:02,506 epoch 61 - iter 180/209 - loss 0.00598420 - samples/sec: 142.76 - lr: 0.000195
2022-09-22 04:00:07,204 epoch 61 - iter 200/209 - loss 0.00624811 - samples/sec: 136.36 - lr: 0.000195
2022-09-22 04:00:09,424 ----------------------------------------------------------------------------------------------------
2022-09-22 04:00:09,426 EPOCH 61 done: loss 0.0062 - lr 0.000195
2022-09-22 04:00:20,463 Evaluating as a multi-label problem: False
2022-09-22 04:00:20,483 DEV : loss 0.04105890542268753 - f1-score (micro avg) 0.8976
2022-09-22 04:00:20,619 BAD EPOCHS (no improvement): 1
2022-09-22 04:00:20,622 ----------------------------------------------------------------------------------------------------
2022-09-22 04:00:24,866 epoch 62 - iter 20/209 - loss 0.00543478 - samples/sec: 151.01 - lr: 0.000195
2022-09-22 04:00:28,684 epoch 62 - iter 40/209 - loss 0.00548842 - samples/sec: 167.87 - lr: 0.000195
2022-09-22 04:00:33,669 epoch 62 - iter 60/209 - loss 0.00570884 - samples/sec: 128.51 - lr: 0.000195
2022-09-22 04:00:37,570 epoch 62 - iter 80/209 - loss 0.00522488 - samples/sec: 164.25 - lr: 0.000195
2022-09-22 04:00:42,441 epoch 62 - iter 100/209 - loss 0.00491201 - samples/sec: 131.52 - lr: 0.000195
2022-09-22 04:00:47,289 epoch 62 - iter 120/209 - loss 0.00558944 - samples/sec: 132.17 - lr: 0.000195
2022-09-22 04:00:51,190 epoch 62 - iter 140/209 - loss 0.00618133 - samples/sec: 164.32 - lr: 0.000195
2022-09-22 04:00:55,259 epoch 62 - iter 160/209 - loss 0.00611068 - samples/sec: 157.51 - lr: 0.000195
2022-09-22 04:01:00,030 epoch 62 - iter 180/209 - loss 0.00643276 - samples/sec: 134.28 - lr: 0.000195
2022-09-22 04:01:03,950 epoch 62 - iter 200/209 - loss 0.00678147 - samples/sec: 163.49 - lr: 0.000195
2022-09-22 04:01:06,005 ----------------------------------------------------------------------------------------------------
2022-09-22 04:01:06,008 EPOCH 62 done: loss 0.0067 - lr 0.000195
2022-09-22 04:01:17,016 Evaluating as a multi-label problem: False
2022-09-22 04:01:17,036 DEV : loss 0.041022274643182755 - f1-score (micro avg) 0.8976
2022-09-22 04:01:17,171 BAD EPOCHS (no improvement): 2
2022-09-22 04:01:17,175 ----------------------------------------------------------------------------------------------------
2022-09-22 04:01:21,055 epoch 63 - iter 20/209 - loss 0.00854410 - samples/sec: 165.28 - lr: 0.000195
2022-09-22 04:01:25,308 epoch 63 - iter 40/209 - loss 0.00655488 - samples/sec: 150.65 - lr: 0.000195
2022-09-22 04:01:30,018 epoch 63 - iter 60/209 - loss 0.00593736 - samples/sec: 136.03 - lr: 0.000195
2022-09-22 04:01:35,072 epoch 63 - iter 80/209 - loss 0.00701346 - samples/sec: 126.74 - lr: 0.000195
2022-09-22 04:01:39,910 epoch 63 - iter 100/209 - loss 0.00607717 - samples/sec: 132.43 - lr: 0.000195
2022-09-22 04:01:44,504 epoch 63 - iter 120/209 - loss 0.00571147 - samples/sec: 139.45 - lr: 0.000195
2022-09-22 04:01:48,799 epoch 63 - iter 140/209 - loss 0.00624759 - samples/sec: 149.16 - lr: 0.000195
2022-09-22 04:01:53,726 epoch 63 - iter 160/209 - loss 0.00624797 - samples/sec: 130.03 - lr: 0.000195
2022-09-22 04:01:57,699 epoch 63 - iter 180/209 - loss 0.00596065 - samples/sec: 161.26 - lr: 0.000195
2022-09-22 04:02:02,311 epoch 63 - iter 200/209 - loss 0.00568658 - samples/sec: 138.94 - lr: 0.000195
2022-09-22 04:02:04,152 ----------------------------------------------------------------------------------------------------
2022-09-22 04:02:04,155 EPOCH 63 done: loss 0.0056 - lr 0.000195
2022-09-22 04:02:15,231 Evaluating as a multi-label problem: False
2022-09-22 04:02:15,251 DEV : loss 0.041041262447834015 - f1-score (micro avg) 0.8976
2022-09-22 04:02:15,380 BAD EPOCHS (no improvement): 3
2022-09-22 04:02:15,384 ----------------------------------------------------------------------------------------------------
2022-09-22 04:02:19,806 epoch 64 - iter 20/209 - loss 0.00676565 - samples/sec: 144.96 - lr: 0.000195
2022-09-22 04:02:24,540 epoch 64 - iter 40/209 - loss 0.00678591 - samples/sec: 135.34 - lr: 0.000195
2022-09-22 04:02:29,510 epoch 64 - iter 60/209 - loss 0.00613092 - samples/sec: 128.91 - lr: 0.000195
2022-09-22 04:02:33,494 epoch 64 - iter 80/209 - loss 0.00634589 - samples/sec: 160.87 - lr: 0.000195
2022-09-22 04:02:37,394 epoch 64 - iter 100/209 - loss 0.00577269 - samples/sec: 164.25 - lr: 0.000195
2022-09-22 04:02:42,312 epoch 64 - iter 120/209 - loss 0.00566276 - samples/sec: 130.25 - lr: 0.000195
2022-09-22 04:02:46,941 epoch 64 - iter 140/209 - loss 0.00593892 - samples/sec: 138.39 - lr: 0.000195
2022-09-22 04:02:51,130 epoch 64 - iter 160/209 - loss 0.00567450 - samples/sec: 152.97 - lr: 0.000195
2022-09-22 04:02:55,177 epoch 64 - iter 180/209 - loss 0.00563552 - samples/sec: 158.34 - lr: 0.000195
2022-09-22 04:02:59,248 epoch 64 - iter 200/209 - loss 0.00579475 - samples/sec: 157.39 - lr: 0.000195
2022-09-22 04:03:00,764 ----------------------------------------------------------------------------------------------------
2022-09-22 04:03:00,766 EPOCH 64 done: loss 0.0058 - lr 0.000195
2022-09-22 04:03:11,954 Evaluating as a multi-label problem: False
2022-09-22 04:03:11,974 DEV : loss 0.04105839133262634 - f1-score (micro avg) 0.8951
2022-09-22 04:03:12,131 Epoch 64: reducing learning rate of group 0 to 9.7656e-05.
2022-09-22 04:03:12,134 BAD EPOCHS (no improvement): 4
2022-09-22 04:03:12,136 ----------------------------------------------------------------------------------------------------
2022-09-22 04:03:12,138 ----------------------------------------------------------------------------------------------------
2022-09-22 04:03:12,140 learning rate too small - quitting training!
2022-09-22 04:03:12,143 ----------------------------------------------------------------------------------------------------
2022-09-22 04:03:16,189 ----------------------------------------------------------------------------------------------------
2022-09-22 04:03:16,193 loading file resources/taggers/sota-ner-flair/best-model.pt
2022-09-22 04:03:18,123 SequenceTagger predicts: Dictionary with 31 tags: O, S-PESSOA, B-PESSOA, E-PESSOA, I-PESSOA, S-FUNDAMENTO, B-FUNDAMENTO, E-FUNDAMENTO, I-FUNDAMENTO, S-ORGANIZACAO, B-ORGANIZACAO, E-ORGANIZACAO, I-ORGANIZACAO, S-DATA, B-DATA, E-DATA, I-DATA, S-LOCAL, B-LOCAL, E-LOCAL, I-LOCAL, S-PRODUTODELEI, B-PRODUTODELEI, E-PRODUTODELEI, I-PRODUTODELEI, S-EVENTO, B-EVENTO, E-EVENTO, I-EVENTO, <START>, <STOP>
2022-09-22 04:03:41,873 Evaluating as a multi-label problem: False
2022-09-22 04:03:41,896 0.8952 0.8982 0.8967 0.8201
2022-09-22 04:03:41,898
Results:
- F-score (micro) 0.8967
- F-score (macro) 0.8686
- Accuracy 0.8201

By class:
              precision    recall  f1-score   support

  FUNDAMENTO     0.9421    0.9194    0.9306       124
      PESSOA     0.9492    0.9412    0.9451       119
       LOCAL     0.8113    0.8515    0.8309       101
        DATA     0.9600    0.9796    0.9697        98
 ORGANIZACAO     0.8367    0.8723    0.8542        94
PRODUTODELEI     0.8235    0.7778    0.8000        54
      EVENTO     0.8571    0.6667    0.7500         9

   micro avg     0.8952    0.8982    0.8967       599
   macro avg     0.8829    0.8583    0.8686       599
weighted avg     0.8959    0.8982    0.8966       599

2022-09-22 04:03:41,899 ----------------------------------------------------------------------------------------------------