2023-10-17 10:18:58,581 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,582 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:18:58,583 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,583 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 10:18:58,583 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,583 Train: 20847 sentences
2023-10-17 10:18:58,583 (train_with_dev=False, train_with_test=False)
2023-10-17 10:18:58,583 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,583 Training Params:
2023-10-17 10:18:58,583 - learning_rate: "5e-05"
2023-10-17 10:18:58,583 - mini_batch_size: "8"
2023-10-17 10:18:58,583 - max_epochs: "10"
2023-10-17 10:18:58,584 - shuffle: "True"
2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,584 Plugins:
2023-10-17 10:18:58,584 - TensorboardLogger
2023-10-17 10:18:58,584 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,584 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:18:58,584 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,584 Computation:
2023-10-17 10:18:58,584 - compute on device: cuda:0
2023-10-17 10:18:58,584 - embedding storage: none
2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,584 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,585 ----------------------------------------------------------------------------------------------------
2023-10-17 10:18:58,585 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 10:19:25,620 epoch 1 - iter 260/2606 - loss 1.88523681 - time (sec): 27.03 - samples/sec: 1244.18 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:19:52,497 epoch 1 - iter 520/2606 - loss 1.12382223 - time (sec): 53.91 - samples/sec: 1283.76 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:20:20,300 epoch 1 - iter 780/2606 - loss 0.83719157 - time (sec): 81.71 - samples/sec: 1310.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:20:47,239 epoch 1 - iter 1040/2606 - loss 0.68970036 - time (sec): 108.65 - samples/sec: 1336.82 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:21:14,600 epoch 1 - iter 1300/2606 - loss 0.59439782 - time (sec): 136.01 - samples/sec: 1353.21 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:21:42,308 epoch 1 - iter 1560/2606 - loss 0.52755960 - time (sec): 163.72 - samples/sec: 1362.27 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:22:10,369 epoch 1 - iter 1820/2606 - loss 0.48679454 - time (sec): 191.78 - samples/sec: 1346.47 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:22:37,175 epoch 1 - iter 2080/2606 - loss 0.45521738 - time (sec): 218.59 - samples/sec: 1338.76 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:23:04,173 epoch 1 - iter 2340/2606 - loss 0.42400062 - time (sec): 245.59 - samples/sec: 1345.15 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:23:31,264 epoch 1 - iter 2600/2606 - loss 0.40097181 - time (sec): 272.68 - samples/sec: 1344.06 - lr: 0.000050 - momentum: 0.000000
2023-10-17 10:23:31,864 ----------------------------------------------------------------------------------------------------
2023-10-17 10:23:31,864 EPOCH 1 done: loss 0.4002 - lr: 0.000050
2023-10-17 10:23:39,220 DEV : loss 0.11381553113460541 - f1-score (micro avg) 0.3171
2023-10-17 10:23:39,272 saving best model
2023-10-17 10:23:39,802 ----------------------------------------------------------------------------------------------------
2023-10-17 10:24:08,030 epoch 2 - iter 260/2606 - loss 0.17285748 - time (sec): 28.23 - samples/sec: 1355.61 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:24:35,777 epoch 2 - iter 520/2606 - loss 0.20526777 - time (sec): 55.97 - samples/sec: 1329.33 - lr: 0.000049 - momentum: 0.000000
2023-10-17 10:25:03,854 epoch 2 - iter 780/2606 - loss 0.19400558 - time (sec): 84.05 - samples/sec: 1333.96 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:25:30,896 epoch 2 - iter 1040/2606 - loss 0.19145026 - time (sec): 111.09 - samples/sec: 1322.86 - lr: 0.000048 - momentum: 0.000000
2023-10-17 10:25:59,200 epoch 2 - iter 1300/2606 - loss 0.18904313 - time (sec): 139.40 - samples/sec: 1313.04 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:26:25,856 epoch 2 - iter 1560/2606 - loss 0.18594503 - time (sec): 166.05 - samples/sec: 1319.31 - lr: 0.000047 - momentum: 0.000000
2023-10-17 10:26:54,154 epoch 2 - iter 1820/2606 - loss 0.17938094 - time (sec): 194.35 - samples/sec: 1331.79 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:27:21,812 epoch 2 - iter 2080/2606 - loss 0.17717797 - time (sec): 222.01 - samples/sec: 1330.89 - lr: 0.000046 - momentum: 0.000000
2023-10-17 10:27:47,173 epoch 2 - iter 2340/2606 - loss 0.17325202 - time (sec): 247.37 - samples/sec: 1329.74 - lr: 0.000045 - momentum: 0.000000
2023-10-17 10:28:13,350 epoch 2 - iter 2600/2606 - loss 0.16972430 - time (sec): 273.55 - samples/sec: 1339.96 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:28:14,078 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:14,078 EPOCH 2 done: loss 0.1695 - lr: 0.000044
2023-10-17 10:28:26,197 DEV : loss 0.223122239112854 - f1-score (micro avg) 0.3392
2023-10-17 10:28:26,258 saving best model
2023-10-17 10:28:27,707 ----------------------------------------------------------------------------------------------------
2023-10-17 10:28:56,388 epoch 3 - iter 260/2606 - loss 0.11196216 - time (sec): 28.68 - samples/sec: 1321.95 - lr: 0.000044 - momentum: 0.000000
2023-10-17 10:29:23,668 epoch 3 - iter 520/2606 - loss 0.11653677 - time (sec): 55.96 - samples/sec: 1336.99 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:29:50,270 epoch 3 - iter 780/2606 - loss 0.11963140 - time (sec): 82.56 - samples/sec: 1343.12 - lr: 0.000043 - momentum: 0.000000
2023-10-17 10:30:17,420 epoch 3 - iter 1040/2606 - loss 0.11700179 - time (sec): 109.71 - samples/sec: 1335.04 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:30:44,113 epoch 3 - iter 1300/2606 - loss 0.11677412 - time (sec): 136.40 - samples/sec: 1341.76 - lr: 0.000042 - momentum: 0.000000
2023-10-17 10:31:10,804 epoch 3 - iter 1560/2606 - loss 0.12086406 - time (sec): 163.09 - samples/sec: 1333.15 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:31:37,531 epoch 3 - iter 1820/2606 - loss 0.11878217 - time (sec): 189.82 - samples/sec: 1335.69 - lr: 0.000041 - momentum: 0.000000
2023-10-17 10:32:05,066 epoch 3 - iter 2080/2606 - loss 0.11924890 - time (sec): 217.36 - samples/sec: 1335.00 - lr: 0.000040 - momentum: 0.000000
2023-10-17 10:32:34,760 epoch 3 - iter 2340/2606 - loss 0.11857763 - time (sec): 247.05 - samples/sec: 1333.10 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:33:03,869 epoch 3 - iter 2600/2606 - loss 0.11654712 - time (sec): 276.16 - samples/sec: 1327.20 - lr: 0.000039 - momentum: 0.000000
2023-10-17 10:33:04,518 ----------------------------------------------------------------------------------------------------
2023-10-17 10:33:04,518 EPOCH 3 done: loss 0.1163 - lr: 0.000039
2023-10-17 10:33:16,361 DEV : loss 0.18563708662986755 - f1-score (micro avg) 0.356
2023-10-17 10:33:16,417 saving best model
2023-10-17 10:33:17,825 ----------------------------------------------------------------------------------------------------
2023-10-17 10:33:45,981 epoch 4 - iter 260/2606 - loss 0.08809572 - time (sec): 28.15 - samples/sec: 1322.94 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:34:12,346 epoch 4 - iter 520/2606 - loss 0.08738428 - time (sec): 54.52 - samples/sec: 1348.70 - lr: 0.000038 - momentum: 0.000000
2023-10-17 10:34:37,823 epoch 4 - iter 780/2606 - loss 0.08778704 - time (sec): 79.99 - samples/sec: 1355.87 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:35:04,756 epoch 4 - iter 1040/2606 - loss 0.08611474 - time (sec): 106.93 - samples/sec: 1346.45 - lr: 0.000037 - momentum: 0.000000
2023-10-17 10:35:30,915 epoch 4 - iter 1300/2606 - loss 0.08825421 - time (sec): 133.09 - samples/sec: 1343.11 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:35:56,677 epoch 4 - iter 1560/2606 - loss 0.08795449 - time (sec): 158.85 - samples/sec: 1342.10 - lr: 0.000036 - momentum: 0.000000
2023-10-17 10:36:24,361 epoch 4 - iter 1820/2606 - loss 0.08855088 - time (sec): 186.53 - samples/sec: 1348.57 - lr: 0.000035 - momentum: 0.000000
2023-10-17 10:36:51,682 epoch 4 - iter 2080/2606 - loss 0.08701037 - time (sec): 213.85 - samples/sec: 1356.64 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:37:19,341 epoch 4 - iter 2340/2606 - loss 0.08716051 - time (sec): 241.51 - samples/sec: 1361.96 - lr: 0.000034 - momentum: 0.000000
2023-10-17 10:37:47,256 epoch 4 - iter 2600/2606 - loss 0.08575731 - time (sec): 269.43 - samples/sec: 1361.07 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:37:47,818 ----------------------------------------------------------------------------------------------------
2023-10-17 10:37:47,818 EPOCH 4 done: loss 0.0856 - lr: 0.000033
2023-10-17 10:37:58,718 DEV : loss 0.2579371929168701 - f1-score (micro avg) 0.3733
2023-10-17 10:37:58,776 saving best model
2023-10-17 10:38:00,188 ----------------------------------------------------------------------------------------------------
2023-10-17 10:38:27,739 epoch 5 - iter 260/2606 - loss 0.05100938 - time (sec): 27.55 - samples/sec: 1367.58 - lr: 0.000033 - momentum: 0.000000
2023-10-17 10:38:53,642 epoch 5 - iter 520/2606 - loss 0.04902557 - time (sec): 53.45 - samples/sec: 1337.47 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:39:20,996 epoch 5 - iter 780/2606 - loss 0.05197574 - time (sec): 80.80 - samples/sec: 1340.12 - lr: 0.000032 - momentum: 0.000000
2023-10-17 10:39:47,084 epoch 5 - iter 1040/2606 - loss 0.05396406 - time (sec): 106.89 - samples/sec: 1332.05 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:40:16,075 epoch 5 - iter 1300/2606 - loss 0.05874220 - time (sec): 135.88 - samples/sec: 1339.43 - lr: 0.000031 - momentum: 0.000000
2023-10-17 10:40:43,125 epoch 5 - iter 1560/2606 - loss 0.05952086 - time (sec): 162.93 - samples/sec: 1361.63 - lr: 0.000030 - momentum: 0.000000
2023-10-17 10:41:09,653 epoch 5 - iter 1820/2606 - loss 0.06088136 - time (sec): 189.46 - samples/sec: 1363.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:41:36,811 epoch 5 - iter 2080/2606 - loss 0.06080196 - time (sec): 216.62 - samples/sec: 1365.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 10:42:02,485 epoch 5 - iter 2340/2606 - loss 0.06099854 - time (sec): 242.29 - samples/sec: 1363.26 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:42:29,443 epoch 5 - iter 2600/2606 - loss 0.06045113 - time (sec): 269.25 - samples/sec: 1362.05 - lr: 0.000028 - momentum: 0.000000
2023-10-17 10:42:29,970 ----------------------------------------------------------------------------------------------------
2023-10-17 10:42:29,970 EPOCH 5 done: loss 0.0604 - lr: 0.000028
2023-10-17 10:42:40,709 DEV : loss 0.3068985044956207 - f1-score (micro avg) 0.4156
2023-10-17 10:42:40,761 saving best model
2023-10-17 10:42:42,155 ----------------------------------------------------------------------------------------------------
2023-10-17 10:43:08,497 epoch 6 - iter 260/2606 - loss 0.04273978 - time (sec): 26.34 - samples/sec: 1415.83 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:43:34,516 epoch 6 - iter 520/2606 - loss 0.04270582 - time (sec): 52.36 - samples/sec: 1379.74 - lr: 0.000027 - momentum: 0.000000
2023-10-17 10:44:00,440 epoch 6 - iter 780/2606 - loss 0.04178911 - time (sec): 78.28 - samples/sec: 1368.55 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:44:27,949 epoch 6 - iter 1040/2606 - loss 0.03962571 - time (sec): 105.79 - samples/sec: 1385.07 - lr: 0.000026 - momentum: 0.000000
2023-10-17 10:44:55,668 epoch 6 - iter 1300/2606 - loss 0.04045489 - time (sec): 133.51 - samples/sec: 1387.29 - lr: 0.000025 - momentum: 0.000000
2023-10-17 10:45:22,831 epoch 6 - iter 1560/2606 - loss 0.04097251 - time (sec): 160.67 - samples/sec: 1383.91 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:45:50,498 epoch 6 - iter 1820/2606 - loss 0.04012529 - time (sec): 188.34 - samples/sec: 1371.39 - lr: 0.000024 - momentum: 0.000000
2023-10-17 10:46:17,568 epoch 6 - iter 2080/2606 - loss 0.04115226 - time (sec): 215.41 - samples/sec: 1363.74 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:46:46,617 epoch 6 - iter 2340/2606 - loss 0.04163284 - time (sec): 244.46 - samples/sec: 1352.33 - lr: 0.000023 - momentum: 0.000000
2023-10-17 10:47:13,563 epoch 6 - iter 2600/2606 - loss 0.04297284 - time (sec): 271.40 - samples/sec: 1350.77 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:47:14,120 ----------------------------------------------------------------------------------------------------
2023-10-17 10:47:14,120 EPOCH 6 done: loss 0.0429 - lr: 0.000022
2023-10-17 10:47:24,944 DEV : loss 0.3329330384731293 - f1-score (micro avg) 0.3972
2023-10-17 10:47:24,995 ----------------------------------------------------------------------------------------------------
2023-10-17 10:47:52,020 epoch 7 - iter 260/2606 - loss 0.03145306 - time (sec): 27.02 - samples/sec: 1399.79 - lr: 0.000022 - momentum: 0.000000
2023-10-17 10:48:19,096 epoch 7 - iter 520/2606 - loss 0.02828661 - time (sec): 54.10 - samples/sec: 1391.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:48:46,758 epoch 7 - iter 780/2606 - loss 0.02853558 - time (sec): 81.76 - samples/sec: 1374.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 10:49:16,621 epoch 7 - iter 1040/2606 - loss 0.02992094 - time (sec): 111.62 - samples/sec: 1335.70 - lr: 0.000020 - momentum: 0.000000
2023-10-17 10:49:44,050 epoch 7 - iter 1300/2606 - loss 0.03061991 - time (sec): 139.05 - samples/sec: 1327.08 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:50:10,681 epoch 7 - iter 1560/2606 - loss 0.02982746 - time (sec): 165.68 - samples/sec: 1326.21 - lr: 0.000019 - momentum: 0.000000
2023-10-17 10:50:37,937 epoch 7 - iter 1820/2606 - loss 0.03066087 - time (sec): 192.94 - samples/sec: 1325.75 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:51:06,450 epoch 7 - iter 2080/2606 - loss 0.03008073 - time (sec): 221.45 - samples/sec: 1332.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 10:51:34,169 epoch 7 - iter 2340/2606 - loss 0.03142644 - time (sec): 249.17 - samples/sec: 1331.26 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:52:00,747 epoch 7 - iter 2600/2606 - loss 0.03141193 - time (sec): 275.75 - samples/sec: 1330.53 - lr: 0.000017 - momentum: 0.000000
2023-10-17 10:52:01,309 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:01,309 EPOCH 7 done: loss 0.0314 - lr: 0.000017
2023-10-17 10:52:12,353 DEV : loss 0.363223135471344 - f1-score (micro avg) 0.4096
2023-10-17 10:52:12,417 ----------------------------------------------------------------------------------------------------
2023-10-17 10:52:39,557 epoch 8 - iter 260/2606 - loss 0.01939361 - time (sec): 27.14 - samples/sec: 1321.47 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:07,789 epoch 8 - iter 520/2606 - loss 0.01739651 - time (sec): 55.37 - samples/sec: 1290.53 - lr: 0.000016 - momentum: 0.000000
2023-10-17 10:53:36,222 epoch 8 - iter 780/2606 - loss 0.01873777 - time (sec): 83.80 - samples/sec: 1267.00 - lr: 0.000015 - momentum: 0.000000
2023-10-17 10:54:05,116 epoch 8 - iter 1040/2606 - loss 0.01864960 - time (sec): 112.70 - samples/sec: 1263.19 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:54:32,882 epoch 8 - iter 1300/2606 - loss 0.01961418 - time (sec): 140.46 - samples/sec: 1267.02 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:55:00,732 epoch 8 - iter 1560/2606 - loss 0.01972925 - time (sec): 168.31 - samples/sec: 1280.96 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:55:28,259 epoch 8 - iter 1820/2606 - loss 0.01993508 - time (sec): 195.84 - samples/sec: 1291.10 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:55:55,992 epoch 8 - iter 2080/2606 - loss 0.02035476 - time (sec): 223.57 - samples/sec: 1305.07 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:56:23,834 epoch 8 - iter 2340/2606 - loss 0.02030965 - time (sec): 251.41 - samples/sec: 1312.25 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:56:52,233 epoch 8 - iter 2600/2606 - loss 0.02034546 - time (sec): 279.81 - samples/sec: 1310.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:56:52,873 ----------------------------------------------------------------------------------------------------
2023-10-17 10:56:52,874 EPOCH 8 done: loss 0.0203 - lr: 0.000011
2023-10-17 10:57:05,587 DEV : loss 0.393916517496109 - f1-score (micro avg) 0.406
2023-10-17 10:57:05,657 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:32,201 epoch 9 - iter 260/2606 - loss 0.00999242 - time (sec): 26.54 - samples/sec: 1281.27 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:57:59,545 epoch 9 - iter 520/2606 - loss 0.01555837 - time (sec): 53.89 - samples/sec: 1322.90 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:58:26,352 epoch 9 - iter 780/2606 - loss 0.01445469 - time (sec): 80.69 - samples/sec: 1300.78 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:58:54,326 epoch 9 - iter 1040/2606 - loss 0.01476011 - time (sec): 108.67 - samples/sec: 1284.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:59:21,870 epoch 9 - iter 1300/2606 - loss 0.01578290 - time (sec): 136.21 - samples/sec: 1293.88 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:59:50,972 epoch 9 - iter 1560/2606 - loss 0.01579451 - time (sec): 165.31 - samples/sec: 1287.08 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:00:20,149 epoch 9 - iter 1820/2606 - loss 0.01527993 - time (sec): 194.49 - samples/sec: 1286.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:00:49,010 epoch 9 - iter 2080/2606 - loss 0.01478612 - time (sec): 223.35 - samples/sec: 1291.50 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:01:17,393 epoch 9 - iter 2340/2606 - loss 0.01475537 - time (sec): 251.73 - samples/sec: 1297.41 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:01:46,450 epoch 9 - iter 2600/2606 - loss 0.01458389 - time (sec): 280.79 - samples/sec: 1305.91 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:01:47,114 ----------------------------------------------------------------------------------------------------
2023-10-17 11:01:47,114 EPOCH 9 done: loss 0.0147 - lr: 0.000006
2023-10-17 11:02:00,212 DEV : loss 0.5024428367614746 - f1-score (micro avg) 0.3891
2023-10-17 11:02:00,276 ----------------------------------------------------------------------------------------------------
2023-10-17 11:02:28,755 epoch 10 - iter 260/2606 - loss 0.00777663 - time (sec): 28.48 - samples/sec: 1304.69 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:02:56,881 epoch 10 - iter 520/2606 - loss 0.00852951 - time (sec): 56.60 - samples/sec: 1283.24 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:03:24,114 epoch 10 - iter 780/2606 - loss 0.00914261 - time (sec): 83.84 - samples/sec: 1274.15 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:03:50,849 epoch 10 - iter 1040/2606 - loss 0.00961380 - time (sec): 110.57 - samples/sec: 1302.75 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:04:20,070 epoch 10 - iter 1300/2606 - loss 0.00995380 - time (sec): 139.79 - samples/sec: 1298.95 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:04:47,597 epoch 10 - iter 1560/2606 - loss 0.00946199 - time (sec): 167.32 - samples/sec: 1295.54 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:05:14,358 epoch 10 - iter 1820/2606 - loss 0.00930366 - time (sec): 194.08 - samples/sec: 1303.79 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:05:41,338 epoch 10 - iter 2080/2606 - loss 0.00918369 - time (sec): 221.06 - samples/sec: 1310.36 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:06:09,838 epoch 10 - iter 2340/2606 - loss 0.00912808 - time (sec): 249.56 - samples/sec: 1315.21 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:06:39,144 epoch 10 - iter 2600/2606 - loss 0.00928698 - time (sec): 278.87 - samples/sec: 1315.25 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:06:39,685 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:39,685 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 11:06:51,588 DEV : loss 0.5263164043426514 - f1-score (micro avg) 0.3942
2023-10-17 11:06:52,178 ----------------------------------------------------------------------------------------------------
2023-10-17 11:06:52,180 Loading model from best epoch ...
2023-10-17 11:06:54,538 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 11:07:15,105 Results:
- F-score (micro) 0.4845
- F-score (macro) 0.3222
- Accuracy 0.3241

By class:
              precision    recall  f1-score   support

         LOC     0.5629    0.5972    0.5795      1214
         PER     0.4140    0.4406    0.4269       808
         ORG     0.3077    0.2606    0.2822       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4784    0.4908    0.4845      2390
   macro avg     0.3211    0.3246    0.3222      2390
weighted avg     0.4713    0.4908    0.4804      2390

2023-10-17 11:07:15,105 ----------------------------------------------------------------------------------------------------