|
2023-10-17 10:18:58,581 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,582 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 10:18:58,583 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,583 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences |
|
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator |
|
2023-10-17 10:18:58,583 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,583 Train: 20847 sentences |
|
2023-10-17 10:18:58,583 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 10:18:58,583 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,583 Training Params: |
|
2023-10-17 10:18:58,583 - learning_rate: "5e-05" |
|
2023-10-17 10:18:58,583 - mini_batch_size: "8" |
|
2023-10-17 10:18:58,583 - max_epochs: "10" |
|
2023-10-17 10:18:58,584 - shuffle: "True" |
|
2023-10-17 10:18:58,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,584 Plugins: |
|
2023-10-17 10:18:58,584 - TensorboardLogger |
|
2023-10-17 10:18:58,584 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 10:18:58,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,584 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 10:18:58,584 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 10:18:58,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,584 Computation: |
|
2023-10-17 10:18:58,584 - compute on device: cuda:0 |
|
2023-10-17 10:18:58,584 - embedding storage: none |
|
2023-10-17 10:18:58,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,584 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-17 10:18:58,584 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,585 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:18:58,585 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 10:19:25,620 epoch 1 - iter 260/2606 - loss 1.88523681 - time (sec): 27.03 - samples/sec: 1244.18 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 10:19:52,497 epoch 1 - iter 520/2606 - loss 1.12382223 - time (sec): 53.91 - samples/sec: 1283.76 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 10:20:20,300 epoch 1 - iter 780/2606 - loss 0.83719157 - time (sec): 81.71 - samples/sec: 1310.34 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 10:20:47,239 epoch 1 - iter 1040/2606 - loss 0.68970036 - time (sec): 108.65 - samples/sec: 1336.82 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 10:21:14,600 epoch 1 - iter 1300/2606 - loss 0.59439782 - time (sec): 136.01 - samples/sec: 1353.21 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 10:21:42,308 epoch 1 - iter 1560/2606 - loss 0.52755960 - time (sec): 163.72 - samples/sec: 1362.27 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 10:22:10,369 epoch 1 - iter 1820/2606 - loss 0.48679454 - time (sec): 191.78 - samples/sec: 1346.47 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 10:22:37,175 epoch 1 - iter 2080/2606 - loss 0.45521738 - time (sec): 218.59 - samples/sec: 1338.76 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 10:23:04,173 epoch 1 - iter 2340/2606 - loss 0.42400062 - time (sec): 245.59 - samples/sec: 1345.15 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 10:23:31,264 epoch 1 - iter 2600/2606 - loss 0.40097181 - time (sec): 272.68 - samples/sec: 1344.06 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-17 10:23:31,864 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:23:31,864 EPOCH 1 done: loss 0.4002 - lr: 0.000050 |
|
2023-10-17 10:23:39,220 DEV : loss 0.11381553113460541 - f1-score (micro avg) 0.3171 |
|
2023-10-17 10:23:39,272 saving best model |
|
2023-10-17 10:23:39,802 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:24:08,030 epoch 2 - iter 260/2606 - loss 0.17285748 - time (sec): 28.23 - samples/sec: 1355.61 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 10:24:35,777 epoch 2 - iter 520/2606 - loss 0.20526777 - time (sec): 55.97 - samples/sec: 1329.33 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 10:25:03,854 epoch 2 - iter 780/2606 - loss 0.19400558 - time (sec): 84.05 - samples/sec: 1333.96 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 10:25:30,896 epoch 2 - iter 1040/2606 - loss 0.19145026 - time (sec): 111.09 - samples/sec: 1322.86 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 10:25:59,200 epoch 2 - iter 1300/2606 - loss 0.18904313 - time (sec): 139.40 - samples/sec: 1313.04 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 10:26:25,856 epoch 2 - iter 1560/2606 - loss 0.18594503 - time (sec): 166.05 - samples/sec: 1319.31 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 10:26:54,154 epoch 2 - iter 1820/2606 - loss 0.17938094 - time (sec): 194.35 - samples/sec: 1331.79 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 10:27:21,812 epoch 2 - iter 2080/2606 - loss 0.17717797 - time (sec): 222.01 - samples/sec: 1330.89 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 10:27:47,173 epoch 2 - iter 2340/2606 - loss 0.17325202 - time (sec): 247.37 - samples/sec: 1329.74 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 10:28:13,350 epoch 2 - iter 2600/2606 - loss 0.16972430 - time (sec): 273.55 - samples/sec: 1339.96 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 10:28:14,078 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:28:14,078 EPOCH 2 done: loss 0.1695 - lr: 0.000044 |
|
2023-10-17 10:28:26,197 DEV : loss 0.223122239112854 - f1-score (micro avg) 0.3392 |
|
2023-10-17 10:28:26,258 saving best model |
|
2023-10-17 10:28:27,707 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:28:56,388 epoch 3 - iter 260/2606 - loss 0.11196216 - time (sec): 28.68 - samples/sec: 1321.95 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 10:29:23,668 epoch 3 - iter 520/2606 - loss 0.11653677 - time (sec): 55.96 - samples/sec: 1336.99 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 10:29:50,270 epoch 3 - iter 780/2606 - loss 0.11963140 - time (sec): 82.56 - samples/sec: 1343.12 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 10:30:17,420 epoch 3 - iter 1040/2606 - loss 0.11700179 - time (sec): 109.71 - samples/sec: 1335.04 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 10:30:44,113 epoch 3 - iter 1300/2606 - loss 0.11677412 - time (sec): 136.40 - samples/sec: 1341.76 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 10:31:10,804 epoch 3 - iter 1560/2606 - loss 0.12086406 - time (sec): 163.09 - samples/sec: 1333.15 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 10:31:37,531 epoch 3 - iter 1820/2606 - loss 0.11878217 - time (sec): 189.82 - samples/sec: 1335.69 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 10:32:05,066 epoch 3 - iter 2080/2606 - loss 0.11924890 - time (sec): 217.36 - samples/sec: 1335.00 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 10:32:34,760 epoch 3 - iter 2340/2606 - loss 0.11857763 - time (sec): 247.05 - samples/sec: 1333.10 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 10:33:03,869 epoch 3 - iter 2600/2606 - loss 0.11654712 - time (sec): 276.16 - samples/sec: 1327.20 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 10:33:04,518 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:33:04,518 EPOCH 3 done: loss 0.1163 - lr: 0.000039 |
|
2023-10-17 10:33:16,361 DEV : loss 0.18563708662986755 - f1-score (micro avg) 0.356 |
|
2023-10-17 10:33:16,417 saving best model |
|
2023-10-17 10:33:17,825 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:33:45,981 epoch 4 - iter 260/2606 - loss 0.08809572 - time (sec): 28.15 - samples/sec: 1322.94 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 10:34:12,346 epoch 4 - iter 520/2606 - loss 0.08738428 - time (sec): 54.52 - samples/sec: 1348.70 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 10:34:37,823 epoch 4 - iter 780/2606 - loss 0.08778704 - time (sec): 79.99 - samples/sec: 1355.87 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 10:35:04,756 epoch 4 - iter 1040/2606 - loss 0.08611474 - time (sec): 106.93 - samples/sec: 1346.45 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 10:35:30,915 epoch 4 - iter 1300/2606 - loss 0.08825421 - time (sec): 133.09 - samples/sec: 1343.11 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 10:35:56,677 epoch 4 - iter 1560/2606 - loss 0.08795449 - time (sec): 158.85 - samples/sec: 1342.10 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 10:36:24,361 epoch 4 - iter 1820/2606 - loss 0.08855088 - time (sec): 186.53 - samples/sec: 1348.57 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 10:36:51,682 epoch 4 - iter 2080/2606 - loss 0.08701037 - time (sec): 213.85 - samples/sec: 1356.64 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 10:37:19,341 epoch 4 - iter 2340/2606 - loss 0.08716051 - time (sec): 241.51 - samples/sec: 1361.96 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 10:37:47,256 epoch 4 - iter 2600/2606 - loss 0.08575731 - time (sec): 269.43 - samples/sec: 1361.07 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 10:37:47,818 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:37:47,818 EPOCH 4 done: loss 0.0856 - lr: 0.000033 |
|
2023-10-17 10:37:58,718 DEV : loss 0.2579371929168701 - f1-score (micro avg) 0.3733 |
|
2023-10-17 10:37:58,776 saving best model |
|
2023-10-17 10:38:00,188 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:38:27,739 epoch 5 - iter 260/2606 - loss 0.05100938 - time (sec): 27.55 - samples/sec: 1367.58 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 10:38:53,642 epoch 5 - iter 520/2606 - loss 0.04902557 - time (sec): 53.45 - samples/sec: 1337.47 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 10:39:20,996 epoch 5 - iter 780/2606 - loss 0.05197574 - time (sec): 80.80 - samples/sec: 1340.12 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 10:39:47,084 epoch 5 - iter 1040/2606 - loss 0.05396406 - time (sec): 106.89 - samples/sec: 1332.05 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 10:40:16,075 epoch 5 - iter 1300/2606 - loss 0.05874220 - time (sec): 135.88 - samples/sec: 1339.43 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 10:40:43,125 epoch 5 - iter 1560/2606 - loss 0.05952086 - time (sec): 162.93 - samples/sec: 1361.63 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 10:41:09,653 epoch 5 - iter 1820/2606 - loss 0.06088136 - time (sec): 189.46 - samples/sec: 1363.89 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 10:41:36,811 epoch 5 - iter 2080/2606 - loss 0.06080196 - time (sec): 216.62 - samples/sec: 1365.76 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 10:42:02,485 epoch 5 - iter 2340/2606 - loss 0.06099854 - time (sec): 242.29 - samples/sec: 1363.26 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 10:42:29,443 epoch 5 - iter 2600/2606 - loss 0.06045113 - time (sec): 269.25 - samples/sec: 1362.05 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 10:42:29,970 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:42:29,970 EPOCH 5 done: loss 0.0604 - lr: 0.000028 |
|
2023-10-17 10:42:40,709 DEV : loss 0.3068985044956207 - f1-score (micro avg) 0.4156 |
|
2023-10-17 10:42:40,761 saving best model |
|
2023-10-17 10:42:42,155 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:43:08,497 epoch 6 - iter 260/2606 - loss 0.04273978 - time (sec): 26.34 - samples/sec: 1415.83 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 10:43:34,516 epoch 6 - iter 520/2606 - loss 0.04270582 - time (sec): 52.36 - samples/sec: 1379.74 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 10:44:00,440 epoch 6 - iter 780/2606 - loss 0.04178911 - time (sec): 78.28 - samples/sec: 1368.55 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 10:44:27,949 epoch 6 - iter 1040/2606 - loss 0.03962571 - time (sec): 105.79 - samples/sec: 1385.07 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 10:44:55,668 epoch 6 - iter 1300/2606 - loss 0.04045489 - time (sec): 133.51 - samples/sec: 1387.29 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 10:45:22,831 epoch 6 - iter 1560/2606 - loss 0.04097251 - time (sec): 160.67 - samples/sec: 1383.91 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 10:45:50,498 epoch 6 - iter 1820/2606 - loss 0.04012529 - time (sec): 188.34 - samples/sec: 1371.39 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 10:46:17,568 epoch 6 - iter 2080/2606 - loss 0.04115226 - time (sec): 215.41 - samples/sec: 1363.74 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 10:46:46,617 epoch 6 - iter 2340/2606 - loss 0.04163284 - time (sec): 244.46 - samples/sec: 1352.33 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 10:47:13,563 epoch 6 - iter 2600/2606 - loss 0.04297284 - time (sec): 271.40 - samples/sec: 1350.77 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 10:47:14,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:47:14,120 EPOCH 6 done: loss 0.0429 - lr: 0.000022 |
|
2023-10-17 10:47:24,944 DEV : loss 0.3329330384731293 - f1-score (micro avg) 0.3972 |
|
2023-10-17 10:47:24,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:47:52,020 epoch 7 - iter 260/2606 - loss 0.03145306 - time (sec): 27.02 - samples/sec: 1399.79 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 10:48:19,096 epoch 7 - iter 520/2606 - loss 0.02828661 - time (sec): 54.10 - samples/sec: 1391.68 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 10:48:46,758 epoch 7 - iter 780/2606 - loss 0.02853558 - time (sec): 81.76 - samples/sec: 1374.54 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 10:49:16,621 epoch 7 - iter 1040/2606 - loss 0.02992094 - time (sec): 111.62 - samples/sec: 1335.70 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 10:49:44,050 epoch 7 - iter 1300/2606 - loss 0.03061991 - time (sec): 139.05 - samples/sec: 1327.08 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 10:50:10,681 epoch 7 - iter 1560/2606 - loss 0.02982746 - time (sec): 165.68 - samples/sec: 1326.21 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 10:50:37,937 epoch 7 - iter 1820/2606 - loss 0.03066087 - time (sec): 192.94 - samples/sec: 1325.75 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 10:51:06,450 epoch 7 - iter 2080/2606 - loss 0.03008073 - time (sec): 221.45 - samples/sec: 1332.50 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 10:51:34,169 epoch 7 - iter 2340/2606 - loss 0.03142644 - time (sec): 249.17 - samples/sec: 1331.26 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 10:52:00,747 epoch 7 - iter 2600/2606 - loss 0.03141193 - time (sec): 275.75 - samples/sec: 1330.53 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 10:52:01,309 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:52:01,309 EPOCH 7 done: loss 0.0314 - lr: 0.000017 |
|
2023-10-17 10:52:12,353 DEV : loss 0.363223135471344 - f1-score (micro avg) 0.4096 |
|
2023-10-17 10:52:12,417 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:52:39,557 epoch 8 - iter 260/2606 - loss 0.01939361 - time (sec): 27.14 - samples/sec: 1321.47 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 10:53:07,789 epoch 8 - iter 520/2606 - loss 0.01739651 - time (sec): 55.37 - samples/sec: 1290.53 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 10:53:36,222 epoch 8 - iter 780/2606 - loss 0.01873777 - time (sec): 83.80 - samples/sec: 1267.00 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 10:54:05,116 epoch 8 - iter 1040/2606 - loss 0.01864960 - time (sec): 112.70 - samples/sec: 1263.19 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 10:54:32,882 epoch 8 - iter 1300/2606 - loss 0.01961418 - time (sec): 140.46 - samples/sec: 1267.02 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 10:55:00,732 epoch 8 - iter 1560/2606 - loss 0.01972925 - time (sec): 168.31 - samples/sec: 1280.96 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 10:55:28,259 epoch 8 - iter 1820/2606 - loss 0.01993508 - time (sec): 195.84 - samples/sec: 1291.10 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 10:55:55,992 epoch 8 - iter 2080/2606 - loss 0.02035476 - time (sec): 223.57 - samples/sec: 1305.07 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 10:56:23,834 epoch 8 - iter 2340/2606 - loss 0.02030965 - time (sec): 251.41 - samples/sec: 1312.25 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 10:56:52,233 epoch 8 - iter 2600/2606 - loss 0.02034546 - time (sec): 279.81 - samples/sec: 1310.76 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 10:56:52,873 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:56:52,874 EPOCH 8 done: loss 0.0203 - lr: 0.000011 |
|
2023-10-17 10:57:05,587 DEV : loss 0.393916517496109 - f1-score (micro avg) 0.406 |
|
2023-10-17 10:57:05,657 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 10:57:32,201 epoch 9 - iter 260/2606 - loss 0.00999242 - time (sec): 26.54 - samples/sec: 1281.27 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 10:57:59,545 epoch 9 - iter 520/2606 - loss 0.01555837 - time (sec): 53.89 - samples/sec: 1322.90 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 10:58:26,352 epoch 9 - iter 780/2606 - loss 0.01445469 - time (sec): 80.69 - samples/sec: 1300.78 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 10:58:54,326 epoch 9 - iter 1040/2606 - loss 0.01476011 - time (sec): 108.67 - samples/sec: 1284.97 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 10:59:21,870 epoch 9 - iter 1300/2606 - loss 0.01578290 - time (sec): 136.21 - samples/sec: 1293.88 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 10:59:50,972 epoch 9 - iter 1560/2606 - loss 0.01579451 - time (sec): 165.31 - samples/sec: 1287.08 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 11:00:20,149 epoch 9 - iter 1820/2606 - loss 0.01527993 - time (sec): 194.49 - samples/sec: 1286.46 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 11:00:49,010 epoch 9 - iter 2080/2606 - loss 0.01478612 - time (sec): 223.35 - samples/sec: 1291.50 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 11:01:17,393 epoch 9 - iter 2340/2606 - loss 0.01475537 - time (sec): 251.73 - samples/sec: 1297.41 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 11:01:46,450 epoch 9 - iter 2600/2606 - loss 0.01458389 - time (sec): 280.79 - samples/sec: 1305.91 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 11:01:47,114 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 11:01:47,114 EPOCH 9 done: loss 0.0147 - lr: 0.000006 |
|
2023-10-17 11:02:00,212 DEV : loss 0.5024428367614746 - f1-score (micro avg) 0.3891 |
|
2023-10-17 11:02:00,276 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 11:02:28,755 epoch 10 - iter 260/2606 - loss 0.00777663 - time (sec): 28.48 - samples/sec: 1304.69 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 11:02:56,881 epoch 10 - iter 520/2606 - loss 0.00852951 - time (sec): 56.60 - samples/sec: 1283.24 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 11:03:24,114 epoch 10 - iter 780/2606 - loss 0.00914261 - time (sec): 83.84 - samples/sec: 1274.15 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 11:03:50,849 epoch 10 - iter 1040/2606 - loss 0.00961380 - time (sec): 110.57 - samples/sec: 1302.75 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 11:04:20,070 epoch 10 - iter 1300/2606 - loss 0.00995380 - time (sec): 139.79 - samples/sec: 1298.95 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 11:04:47,597 epoch 10 - iter 1560/2606 - loss 0.00946199 - time (sec): 167.32 - samples/sec: 1295.54 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 11:05:14,358 epoch 10 - iter 1820/2606 - loss 0.00930366 - time (sec): 194.08 - samples/sec: 1303.79 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 11:05:41,338 epoch 10 - iter 2080/2606 - loss 0.00918369 - time (sec): 221.06 - samples/sec: 1310.36 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 11:06:09,838 epoch 10 - iter 2340/2606 - loss 0.00912808 - time (sec): 249.56 - samples/sec: 1315.21 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 11:06:39,144 epoch 10 - iter 2600/2606 - loss 0.00928698 - time (sec): 278.87 - samples/sec: 1315.25 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 11:06:39,685 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 11:06:39,685 EPOCH 10 done: loss 0.0093 - lr: 0.000000 |
|
2023-10-17 11:06:51,588 DEV : loss 0.5263164043426514 - f1-score (micro avg) 0.3942 |
|
2023-10-17 11:06:52,178 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 11:06:52,180 Loading model from best epoch ... |
|
2023-10-17 11:06:54,538 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-17 11:07:15,105 |
|
Results: |
|
- F-score (micro) 0.4845 |
|
- F-score (macro) 0.3222 |
|
- Accuracy 0.3241 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.5629    0.5972    0.5795      1214 |
|
         PER     0.4140    0.4406    0.4269       808 |
|
         ORG     0.3077    0.2606    0.2822       353 |
|
   HumanProd     0.0000    0.0000    0.0000        15 |
|
|
|
   micro avg     0.4784    0.4908    0.4845      2390 |
|
   macro avg     0.3211    0.3246    0.3222      2390 |
|
weighted avg     0.4713    0.4908    0.4804      2390 |
|
|
|
2023-10-17 11:07:15,105 ---------------------------------------------------------------------------------------------------- |
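The summary rows of the "By class" report can be recomputed from the per-class rows. The sketch below is not part of the logged run; it copies the numbers from the log and assumes the usual definitions (macro = unweighted mean of class F1, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). The helper names are hypothetical.

```python
# Per-class (precision, recall, f1, support), copied from the "By class" report in the log.
by_class = {
    "LOC": (0.5629, 0.5972, 0.5795, 1214),
    "PER": (0.4140, 0.4406, 0.4269, 808),
    "ORG": (0.3077, 0.2606, 0.2822, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def macro_f1(table):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1, _ in table.values()) / len(table)


def weighted_f1(table):
    """Mean of the per-class F1 scores, weighted by class support."""
    total = sum(support for _, _, _, support in table.values())
    return sum(f1 * support for _, _, f1, support in table.values()) / total


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, _, support in by_class.values())
macro = macro_f1(by_class)
weighted = weighted_f1(by_class)
# Micro precision and recall are taken from the "micro avg" row of the log.
micro = f1_from_pr(0.4784, 0.4908)
```

Because the inputs are already rounded to four decimals, the recomputed values should only be expected to match the logged 0.3222, 0.4804 and 0.4845 up to rounding. The HumanProd class, with 15 test entities and an F1 of 0, pulls the macro average well below the micro average.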
|
|
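The `lr` column in the iteration lines is consistent with a linear warmup over the first 10% of all steps followed by a linear decay to zero. The sketch below assumes that standard formula; it is not taken from Flair's `LinearScheduler` source, and the function name `linear_lr` is hypothetical. The constants (20847 training sentences, batch size 8, 10 epochs, peak rate 5e-05, warmup fraction 0.1) come from the log.

```python
import math

TRAIN_SENTENCES = 20847   # "Train: 20847 sentences"
MINI_BATCH_SIZE = 8       # mini_batch_size
MAX_EPOCHS = 10           # max_epochs
PEAK_LR = 5e-05           # learning_rate
WARMUP_FRACTION = 0.1     # LinearScheduler warmup_fraction

ITERS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_FRACTION)


def linear_lr(epoch, iteration):
    """Learning rate after `iteration` batches of 1-based `epoch`.

    Assumed schedule: linear ramp from 0 to PEAK_LR during warmup,
    then linear decay from PEAK_LR to 0 over the remaining steps.
    """
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)


# Format a value the way the log prints it (six decimal places).
def as_logged(lr):
    return f"{lr:.6f}"
```

Under these assumptions the warmup spans exactly the first epoch, which matches the log: the rate climbs to 0.000050 by the end of epoch 1 and falls to 0.000000 by the end of epoch 10. For example, `as_logged(linear_lr(2, 260))` can be compared with the `lr` printed for epoch 2, iter 260.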