2023-10-17 23:46:24,650 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,651 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 23:46:24,651 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,651 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 23:46:24,651 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,651 Train: 5901 sentences
2023-10-17 23:46:24,651 (train_with_dev=False, train_with_test=False)
2023-10-17 23:46:24,651 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,651 Training Params:
2023-10-17 23:46:24,651 - learning_rate: "5e-05"
2023-10-17 23:46:24,651 - mini_batch_size: "4"
2023-10-17 23:46:24,651 - max_epochs: "10"
2023-10-17 23:46:24,651 - shuffle: "True"
2023-10-17 23:46:24,651 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,651 Plugins:
2023-10-17 23:46:24,651 - TensorboardLogger
2023-10-17 23:46:24,651 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 23:46:24,651 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,652 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 23:46:24,652 - metric: "('micro avg', 'f1-score')"
2023-10-17 23:46:24,652 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,652 Computation:
2023-10-17 23:46:24,652 - compute on device: cuda:0
2023-10-17 23:46:24,652 - embedding storage: none
2023-10-17 23:46:24,652 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,652 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 23:46:24,652 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,652 ----------------------------------------------------------------------------------------------------
2023-10-17 23:46:24,652 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 23:46:31,930 epoch 1 - iter 147/1476 - loss 2.31899626 - time (sec): 7.28 - samples/sec: 2329.85 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:46:39,105 epoch 1 - iter 294/1476 - loss 1.45048025 - time (sec): 14.45 - samples/sec: 2365.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:46:47,005 epoch 1 - iter 441/1476 - loss 1.07259867 - time (sec): 22.35 - samples/sec: 2346.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:46:54,154 epoch 1 - iter 588/1476 - loss 0.88392950 - time (sec): 29.50 - samples/sec: 2345.91 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:47:01,280 epoch 1 - iter 735/1476 - loss 0.76313219 - time (sec): 36.63 - samples/sec: 2335.02 - lr: 0.000025 - momentum: 0.000000
2023-10-17 23:47:08,128 epoch 1 - iter 882/1476 - loss 0.67915084 - time (sec): 43.48 - samples/sec: 2311.84 - lr: 0.000030 - momentum: 0.000000
2023-10-17 23:47:15,216 epoch 1 - iter 1029/1476 - loss 0.61481908 - time (sec): 50.56 - samples/sec: 2290.47 - lr: 0.000035 - momentum: 0.000000
2023-10-17 23:47:22,755 epoch 1 - iter 1176/1476 - loss 0.55954865 - time (sec): 58.10 - samples/sec: 2319.20 - lr: 0.000040 - momentum: 0.000000
2023-10-17 23:47:29,643 epoch 1 - iter 1323/1476 - loss 0.52079959 - time (sec): 64.99 - samples/sec: 2305.27 - lr: 0.000045 - momentum: 0.000000
2023-10-17 23:47:36,760 epoch 1 - iter 1470/1476 - loss 0.48589065 - time (sec): 72.11 - samples/sec: 2296.69 - lr: 0.000050 - momentum: 0.000000
2023-10-17 23:47:37,047 ----------------------------------------------------------------------------------------------------
2023-10-17 23:47:37,047 EPOCH 1 done: loss 0.4841 - lr: 0.000050
2023-10-17 23:47:43,227 DEV : loss 0.21548223495483398 - f1-score (micro avg) 0.711
2023-10-17 23:47:43,257 saving best model
2023-10-17 23:47:43,646 ----------------------------------------------------------------------------------------------------
2023-10-17 23:47:51,091 epoch 2 - iter 147/1476 - loss 0.16619350 - time (sec): 7.44 - samples/sec: 2288.68 - lr: 0.000049 - momentum: 0.000000
2023-10-17 23:47:58,312 epoch 2 - iter 294/1476 - loss 0.15471014 - time (sec): 14.66 - samples/sec: 2261.06 - lr: 0.000049 - momentum: 0.000000
2023-10-17 23:48:05,399 epoch 2 - iter 441/1476 - loss 0.14876748 - time (sec): 21.75 - samples/sec: 2282.38 - lr: 0.000048 - momentum: 0.000000
2023-10-17 23:48:12,345 epoch 2 - iter 588/1476 - loss 0.14471294 - time (sec): 28.70 - samples/sec: 2289.61 - lr: 0.000048 - momentum: 0.000000
2023-10-17 23:48:19,229 epoch 2 - iter 735/1476 - loss 0.14973764 - time (sec): 35.58 - samples/sec: 2273.89 - lr: 0.000047 - momentum: 0.000000
2023-10-17 23:48:26,217 epoch 2 - iter 882/1476 - loss 0.14410801 - time (sec): 42.57 - samples/sec: 2298.46 - lr: 0.000047 - momentum: 0.000000
2023-10-17 23:48:33,474 epoch 2 - iter 1029/1476 - loss 0.14102429 - time (sec): 49.83 - samples/sec: 2307.16 - lr: 0.000046 - momentum: 0.000000
2023-10-17 23:48:41,170 epoch 2 - iter 1176/1476 - loss 0.13898171 - time (sec): 57.52 - samples/sec: 2336.77 - lr: 0.000046 - momentum: 0.000000
2023-10-17 23:48:48,266 epoch 2 - iter 1323/1476 - loss 0.13989011 - time (sec): 64.62 - samples/sec: 2330.81 - lr: 0.000045 - momentum: 0.000000
2023-10-17 23:48:55,244 epoch 2 - iter 1470/1476 - loss 0.14015703 - time (sec): 71.60 - samples/sec: 2317.38 - lr: 0.000044 - momentum: 0.000000
2023-10-17 23:48:55,511 ----------------------------------------------------------------------------------------------------
2023-10-17 23:48:55,512 EPOCH 2 done: loss 0.1399 - lr: 0.000044
2023-10-17 23:49:07,371 DEV : loss 0.14165526628494263 - f1-score (micro avg) 0.7864
2023-10-17 23:49:07,403 saving best model
2023-10-17 23:49:07,920 ----------------------------------------------------------------------------------------------------
2023-10-17 23:49:15,672 epoch 3 - iter 147/1476 - loss 0.07240309 - time (sec): 7.75 - samples/sec: 2092.81 - lr: 0.000044 - momentum: 0.000000
2023-10-17 23:49:22,806 epoch 3 - iter 294/1476 - loss 0.07477988 - time (sec): 14.88 - samples/sec: 2162.41 - lr: 0.000043 - momentum: 0.000000
2023-10-17 23:49:30,056 epoch 3 - iter 441/1476 - loss 0.08692481 - time (sec): 22.13 - samples/sec: 2209.79 - lr: 0.000043 - momentum: 0.000000
2023-10-17 23:49:37,169 epoch 3 - iter 588/1476 - loss 0.09465275 - time (sec): 29.25 - samples/sec: 2210.62 - lr: 0.000042 - momentum: 0.000000
2023-10-17 23:49:44,545 epoch 3 - iter 735/1476 - loss 0.09240217 - time (sec): 36.62 - samples/sec: 2250.45 - lr: 0.000042 - momentum: 0.000000
2023-10-17 23:49:51,744 epoch 3 - iter 882/1476 - loss 0.09338741 - time (sec): 43.82 - samples/sec: 2241.95 - lr: 0.000041 - momentum: 0.000000
2023-10-17 23:49:59,135 epoch 3 - iter 1029/1476 - loss 0.09287578 - time (sec): 51.21 - samples/sec: 2272.20 - lr: 0.000041 - momentum: 0.000000
2023-10-17 23:50:06,210 epoch 3 - iter 1176/1476 - loss 0.09212955 - time (sec): 58.29 - samples/sec: 2271.37 - lr: 0.000040 - momentum: 0.000000
2023-10-17 23:50:13,318 epoch 3 - iter 1323/1476 - loss 0.09128009 - time (sec): 65.40 - samples/sec: 2274.94 - lr: 0.000039 - momentum: 0.000000
2023-10-17 23:50:20,611 epoch 3 - iter 1470/1476 - loss 0.09325358 - time (sec): 72.69 - samples/sec: 2282.28 - lr: 0.000039 - momentum: 0.000000
2023-10-17 23:50:20,877 ----------------------------------------------------------------------------------------------------
2023-10-17 23:50:20,877 EPOCH 3 done: loss 0.0934 - lr: 0.000039
2023-10-17 23:50:32,390 DEV : loss 0.1794406771659851 - f1-score (micro avg) 0.7849
2023-10-17 23:50:32,426 ----------------------------------------------------------------------------------------------------
2023-10-17 23:50:39,656 epoch 4 - iter 147/1476 - loss 0.04189508 - time (sec): 7.23 - samples/sec: 2373.29 - lr: 0.000038 - momentum: 0.000000
2023-10-17 23:50:46,610 epoch 4 - iter 294/1476 - loss 0.05999984 - time (sec): 14.18 - samples/sec: 2299.84 - lr: 0.000038 - momentum: 0.000000
2023-10-17 23:50:54,227 epoch 4 - iter 441/1476 - loss 0.06517904 - time (sec): 21.80 - samples/sec: 2341.18 - lr: 0.000037 - momentum: 0.000000
2023-10-17 23:51:01,702 epoch 4 - iter 588/1476 - loss 0.06966660 - time (sec): 29.27 - samples/sec: 2336.69 - lr: 0.000037 - momentum: 0.000000
2023-10-17 23:51:08,935 epoch 4 - iter 735/1476 - loss 0.06697795 - time (sec): 36.51 - samples/sec: 2292.75 - lr: 0.000036 - momentum: 0.000000
2023-10-17 23:51:16,058 epoch 4 - iter 882/1476 - loss 0.06639582 - time (sec): 43.63 - samples/sec: 2258.93 - lr: 0.000036 - momentum: 0.000000
2023-10-17 23:51:23,549 epoch 4 - iter 1029/1476 - loss 0.06530104 - time (sec): 51.12 - samples/sec: 2260.47 - lr: 0.000035 - momentum: 0.000000
2023-10-17 23:51:30,834 epoch 4 - iter 1176/1476 - loss 0.06432796 - time (sec): 58.41 - samples/sec: 2266.96 - lr: 0.000034 - momentum: 0.000000
2023-10-17 23:51:38,572 epoch 4 - iter 1323/1476 - loss 0.06400953 - time (sec): 66.14 - samples/sec: 2237.08 - lr: 0.000034 - momentum: 0.000000
2023-10-17 23:51:47,005 epoch 4 - iter 1470/1476 - loss 0.06365817 - time (sec): 74.58 - samples/sec: 2224.59 - lr: 0.000033 - momentum: 0.000000
2023-10-17 23:51:47,328 ----------------------------------------------------------------------------------------------------
2023-10-17 23:51:47,329 EPOCH 4 done: loss 0.0635 - lr: 0.000033
2023-10-17 23:51:59,569 DEV : loss 0.1764398068189621 - f1-score (micro avg) 0.814
2023-10-17 23:51:59,617 saving best model
2023-10-17 23:52:00,200 ----------------------------------------------------------------------------------------------------
2023-10-17 23:52:07,564 epoch 5 - iter 147/1476 - loss 0.03148151 - time (sec): 7.36 - samples/sec: 2163.68 - lr: 0.000033 - momentum: 0.000000
2023-10-17 23:52:14,508 epoch 5 - iter 294/1476 - loss 0.03598276 - time (sec): 14.31 - samples/sec: 2220.63 - lr: 0.000032 - momentum: 0.000000
2023-10-17 23:52:21,489 epoch 5 - iter 441/1476 - loss 0.04162358 - time (sec): 21.29 - samples/sec: 2266.15 - lr: 0.000032 - momentum: 0.000000
2023-10-17 23:52:28,796 epoch 5 - iter 588/1476 - loss 0.04510912 - time (sec): 28.59 - samples/sec: 2280.82 - lr: 0.000031 - momentum: 0.000000
2023-10-17 23:52:35,875 epoch 5 - iter 735/1476 - loss 0.04497694 - time (sec): 35.67 - samples/sec: 2274.20 - lr: 0.000031 - momentum: 0.000000
2023-10-17 23:52:43,598 epoch 5 - iter 882/1476 - loss 0.04344617 - time (sec): 43.40 - samples/sec: 2329.01 - lr: 0.000030 - momentum: 0.000000
2023-10-17 23:52:50,968 epoch 5 - iter 1029/1476 - loss 0.04421788 - time (sec): 50.77 - samples/sec: 2336.93 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:52:58,041 epoch 5 - iter 1176/1476 - loss 0.04460316 - time (sec): 57.84 - samples/sec: 2314.04 - lr: 0.000029 - momentum: 0.000000
2023-10-17 23:53:04,814 epoch 5 - iter 1323/1476 - loss 0.04403883 - time (sec): 64.61 - samples/sec: 2289.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:53:12,210 epoch 5 - iter 1470/1476 - loss 0.04436491 - time (sec): 72.01 - samples/sec: 2304.02 - lr: 0.000028 - momentum: 0.000000
2023-10-17 23:53:12,479 ----------------------------------------------------------------------------------------------------
2023-10-17 23:53:12,479 EPOCH 5 done: loss 0.0442 - lr: 0.000028
2023-10-17 23:53:24,386 DEV : loss 0.18140539526939392 - f1-score (micro avg) 0.8241
2023-10-17 23:53:24,422 saving best model
2023-10-17 23:53:25,011 ----------------------------------------------------------------------------------------------------
2023-10-17 23:53:32,585 epoch 6 - iter 147/1476 - loss 0.04187100 - time (sec): 7.57 - samples/sec: 2403.54 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:53:39,670 epoch 6 - iter 294/1476 - loss 0.03020420 - time (sec): 14.66 - samples/sec: 2352.90 - lr: 0.000027 - momentum: 0.000000
2023-10-17 23:53:47,148 epoch 6 - iter 441/1476 - loss 0.02675017 - time (sec): 22.14 - samples/sec: 2266.45 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:53:55,819 epoch 6 - iter 588/1476 - loss 0.02815533 - time (sec): 30.81 - samples/sec: 2278.66 - lr: 0.000026 - momentum: 0.000000
2023-10-17 23:54:02,951 epoch 6 - iter 735/1476 - loss 0.02827554 - time (sec): 37.94 - samples/sec: 2258.53 - lr: 0.000025 - momentum: 0.000000
2023-10-17 23:54:10,133 epoch 6 - iter 882/1476 - loss 0.02887525 - time (sec): 45.12 - samples/sec: 2256.53 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:54:17,347 epoch 6 - iter 1029/1476 - loss 0.02838570 - time (sec): 52.33 - samples/sec: 2275.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 23:54:24,315 epoch 6 - iter 1176/1476 - loss 0.02791635 - time (sec): 59.30 - samples/sec: 2257.69 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:54:31,332 epoch 6 - iter 1323/1476 - loss 0.02743435 - time (sec): 66.32 - samples/sec: 2256.11 - lr: 0.000023 - momentum: 0.000000
2023-10-17 23:54:38,380 epoch 6 - iter 1470/1476 - loss 0.02754388 - time (sec): 73.37 - samples/sec: 2261.26 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:54:38,648 ----------------------------------------------------------------------------------------------------
2023-10-17 23:54:38,648 EPOCH 6 done: loss 0.0275 - lr: 0.000022
2023-10-17 23:54:50,572 DEV : loss 0.21165505051612854 - f1-score (micro avg) 0.8189
2023-10-17 23:54:50,611 ----------------------------------------------------------------------------------------------------
2023-10-17 23:54:57,604 epoch 7 - iter 147/1476 - loss 0.03336044 - time (sec): 6.99 - samples/sec: 2202.23 - lr: 0.000022 - momentum: 0.000000
2023-10-17 23:55:04,867 epoch 7 - iter 294/1476 - loss 0.02830500 - time (sec): 14.25 - samples/sec: 2233.45 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:55:11,662 epoch 7 - iter 441/1476 - loss 0.02519746 - time (sec): 21.05 - samples/sec: 2242.77 - lr: 0.000021 - momentum: 0.000000
2023-10-17 23:55:18,747 epoch 7 - iter 588/1476 - loss 0.02427425 - time (sec): 28.13 - samples/sec: 2246.44 - lr: 0.000020 - momentum: 0.000000
2023-10-17 23:55:26,660 epoch 7 - iter 735/1476 - loss 0.02353020 - time (sec): 36.05 - samples/sec: 2264.62 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:55:34,371 epoch 7 - iter 882/1476 - loss 0.02322784 - time (sec): 43.76 - samples/sec: 2205.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 23:55:41,845 epoch 7 - iter 1029/1476 - loss 0.02477380 - time (sec): 51.23 - samples/sec: 2214.26 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:55:49,162 epoch 7 - iter 1176/1476 - loss 0.02374289 - time (sec): 58.55 - samples/sec: 2238.37 - lr: 0.000018 - momentum: 0.000000
2023-10-17 23:55:56,331 epoch 7 - iter 1323/1476 - loss 0.02328721 - time (sec): 65.72 - samples/sec: 2249.63 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:56:03,765 epoch 7 - iter 1470/1476 - loss 0.02252800 - time (sec): 73.15 - samples/sec: 2264.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 23:56:04,039 ----------------------------------------------------------------------------------------------------
2023-10-17 23:56:04,039 EPOCH 7 done: loss 0.0225 - lr: 0.000017
2023-10-17 23:56:15,766 DEV : loss 0.22102110087871552 - f1-score (micro avg) 0.8288
2023-10-17 23:56:15,801 saving best model
2023-10-17 23:56:16,352 ----------------------------------------------------------------------------------------------------
2023-10-17 23:56:23,364 epoch 8 - iter 147/1476 - loss 0.01157989 - time (sec): 7.01 - samples/sec: 2322.40 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:56:30,121 epoch 8 - iter 294/1476 - loss 0.01167708 - time (sec): 13.77 - samples/sec: 2240.73 - lr: 0.000016 - momentum: 0.000000
2023-10-17 23:56:37,509 epoch 8 - iter 441/1476 - loss 0.00968561 - time (sec): 21.16 - samples/sec: 2317.22 - lr: 0.000015 - momentum: 0.000000
2023-10-17 23:56:44,454 epoch 8 - iter 588/1476 - loss 0.00985733 - time (sec): 28.10 - samples/sec: 2278.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:56:51,836 epoch 8 - iter 735/1476 - loss 0.01133488 - time (sec): 35.48 - samples/sec: 2325.78 - lr: 0.000014 - momentum: 0.000000
2023-10-17 23:56:59,035 epoch 8 - iter 882/1476 - loss 0.01248439 - time (sec): 42.68 - samples/sec: 2316.77 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:57:06,073 epoch 8 - iter 1029/1476 - loss 0.01289164 - time (sec): 49.72 - samples/sec: 2299.82 - lr: 0.000013 - momentum: 0.000000
2023-10-17 23:57:12,782 epoch 8 - iter 1176/1476 - loss 0.01234089 - time (sec): 56.43 - samples/sec: 2296.02 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:57:20,061 epoch 8 - iter 1323/1476 - loss 0.01252386 - time (sec): 63.71 - samples/sec: 2282.19 - lr: 0.000012 - momentum: 0.000000
2023-10-17 23:57:27,897 epoch 8 - iter 1470/1476 - loss 0.01253658 - time (sec): 71.54 - samples/sec: 2315.07 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:57:28,179 ----------------------------------------------------------------------------------------------------
2023-10-17 23:57:28,180 EPOCH 8 done: loss 0.0125 - lr: 0.000011
2023-10-17 23:57:39,811 DEV : loss 0.21121571958065033 - f1-score (micro avg) 0.8457
2023-10-17 23:57:39,845 saving best model
2023-10-17 23:57:40,402 ----------------------------------------------------------------------------------------------------
2023-10-17 23:57:47,741 epoch 9 - iter 147/1476 - loss 0.00964780 - time (sec): 7.34 - samples/sec: 2199.97 - lr: 0.000011 - momentum: 0.000000
2023-10-17 23:57:54,762 epoch 9 - iter 294/1476 - loss 0.00914997 - time (sec): 14.36 - samples/sec: 2240.19 - lr: 0.000010 - momentum: 0.000000
2023-10-17 23:58:02,050 epoch 9 - iter 441/1476 - loss 0.00923034 - time (sec): 21.65 - samples/sec: 2313.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:58:08,743 epoch 9 - iter 588/1476 - loss 0.00767134 - time (sec): 28.34 - samples/sec: 2324.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 23:58:15,646 epoch 9 - iter 735/1476 - loss 0.00728748 - time (sec): 35.24 - samples/sec: 2370.99 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:58:22,401 epoch 9 - iter 882/1476 - loss 0.00929956 - time (sec): 42.00 - samples/sec: 2362.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 23:58:29,273 epoch 9 - iter 1029/1476 - loss 0.00884665 - time (sec): 48.87 - samples/sec: 2346.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:58:36,688 epoch 9 - iter 1176/1476 - loss 0.00813902 - time (sec): 56.28 - samples/sec: 2351.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 23:58:43,773 epoch 9 - iter 1323/1476 - loss 0.00784879 - time (sec): 63.37 - samples/sec: 2344.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:58:50,973 epoch 9 - iter 1470/1476 - loss 0.00820699 - time (sec): 70.57 - samples/sec: 2340.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 23:58:51,378 ----------------------------------------------------------------------------------------------------
2023-10-17 23:58:51,379 EPOCH 9 done: loss 0.0082 - lr: 0.000006
2023-10-17 23:59:02,756 DEV : loss 0.2453046441078186 - f1-score (micro avg) 0.8427
2023-10-17 23:59:02,789 ----------------------------------------------------------------------------------------------------
2023-10-17 23:59:10,200 epoch 10 - iter 147/1476 - loss 0.00176823 - time (sec): 7.41 - samples/sec: 2383.35 - lr: 0.000005 - momentum: 0.000000
2023-10-17 23:59:17,026 epoch 10 - iter 294/1476 - loss 0.00524086 - time (sec): 14.24 - samples/sec: 2414.74 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:59:23,824 epoch 10 - iter 441/1476 - loss 0.00516859 - time (sec): 21.03 - samples/sec: 2405.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 23:59:31,060 epoch 10 - iter 588/1476 - loss 0.00530312 - time (sec): 28.27 - samples/sec: 2434.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:59:38,111 epoch 10 - iter 735/1476 - loss 0.00452613 - time (sec): 35.32 - samples/sec: 2403.30 - lr: 0.000003 - momentum: 0.000000
2023-10-17 23:59:45,220 epoch 10 - iter 882/1476 - loss 0.00426851 - time (sec): 42.43 - samples/sec: 2392.74 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:59:52,215 epoch 10 - iter 1029/1476 - loss 0.00407151 - time (sec): 49.42 - samples/sec: 2365.78 - lr: 0.000002 - momentum: 0.000000
2023-10-17 23:59:59,493 epoch 10 - iter 1176/1476 - loss 0.00383502 - time (sec): 56.70 - samples/sec: 2341.96 - lr: 0.000001 - momentum: 0.000000
2023-10-18 00:00:06,798 epoch 10 - iter 1323/1476 - loss 0.00389736 - time (sec): 64.01 - samples/sec: 2331.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 00:00:13,976 epoch 10 - iter 1470/1476 - loss 0.00414410 - time (sec): 71.19 - samples/sec: 2329.56 - lr: 0.000000 - momentum: 0.000000
2023-10-18 00:00:14,268 ----------------------------------------------------------------------------------------------------
2023-10-18 00:00:14,268 EPOCH 10 done: loss 0.0042 - lr: 0.000000
2023-10-18 00:00:26,564 DEV : loss 0.2395576387643814 - f1-score (micro avg) 0.8488
2023-10-18 00:00:26,608 saving best model
2023-10-18 00:00:27,721 ----------------------------------------------------------------------------------------------------
2023-10-18 00:00:27,723 Loading model from best epoch ...
2023-10-18 00:00:29,126 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-18 00:00:36,049 
Results:
- F-score (micro) 0.8138
- F-score (macro) 0.7187
- Accuracy 0.7058

By class:
              precision    recall  f1-score   support

         loc     0.8806    0.8858    0.8832       858
        pers     0.7688    0.7989    0.7836       537
         org     0.6385    0.6288    0.6336       132
        prod     0.7018    0.6557    0.6780        61
        time     0.5714    0.6667    0.6154        54

   micro avg     0.8067    0.8210    0.8138      1642
   macro avg     0.7122    0.7272    0.7187      1642
weighted avg     0.8078    0.8210    0.8141      1642

2023-10-18 00:00:36,049 ----------------------------------------------------------------------------------------------------