2023-10-14 18:28:56,599 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,600 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 18:28:56,600 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,600 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-14 18:28:56,600 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,600 Train: 3575 sentences
2023-10-14 18:28:56,600 (train_with_dev=False, train_with_test=False)
2023-10-14 18:28:56,600 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,600 Training Params:
2023-10-14 18:28:56,600 - learning_rate: "0.00015"
2023-10-14 18:28:56,601 - mini_batch_size: "8"
2023-10-14 18:28:56,601 - max_epochs: "10"
2023-10-14 18:28:56,601 - shuffle: "True"
2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,601 Plugins:
2023-10-14 18:28:56,601 - TensorboardLogger
2023-10-14 18:28:56,601 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,601 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 18:28:56,601 - metric: "('micro avg', 'f1-score')"
2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,601 Computation:
2023-10-14 18:28:56,601 - compute on device: cuda:0
2023-10-14 18:28:56,601 - embedding storage: none
2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,601 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,601 ----------------------------------------------------------------------------------------------------
2023-10-14 18:28:56,601 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 18:29:13,026 epoch 1 - iter 44/447 - loss 3.04968209 - time (sec): 16.42 - samples/sec: 558.83 - lr: 0.000014 - momentum: 0.000000
2023-10-14 18:29:27,978 epoch 1 - iter 88/447 - loss 3.03220833 - time (sec): 31.38 - samples/sec: 550.16 - lr: 0.000029 - momentum: 0.000000
2023-10-14 18:29:43,363 epoch 1 - iter 132/447 - loss 2.98006060 - time (sec): 46.76 - samples/sec: 545.03 - lr: 0.000044 - momentum: 0.000000
2023-10-14 18:29:58,651 epoch 1 - iter 176/447 - loss 2.85571004 - time (sec): 62.05 - samples/sec: 546.60 - lr: 0.000059 - momentum: 0.000000
2023-10-14 18:30:13,328 epoch 1 - iter 220/447 - loss 2.71064544 - time (sec): 76.73 - samples/sec: 540.41 - lr: 0.000073 - momentum: 0.000000
2023-10-14 18:30:28,322 epoch 1 - iter 264/447 - loss 2.53638057 - time (sec): 91.72 - samples/sec: 539.60 - lr: 0.000088 - momentum: 0.000000
2023-10-14 18:30:44,206 epoch 1 - iter 308/447 - loss 2.32818224 - time (sec): 107.60 - samples/sec: 545.92 - lr: 0.000103 - momentum: 0.000000
2023-10-14 18:30:59,374 epoch 1 - iter 352/447 - loss 2.14903375 - time (sec): 122.77 - samples/sec: 546.91 - lr: 0.000118 - momentum: 0.000000
2023-10-14 18:31:17,104 epoch 1 - iter 396/447 - loss 1.94372981 - time (sec): 140.50 - samples/sec: 551.15 - lr: 0.000133 - momentum: 0.000000
2023-10-14 18:31:32,315 epoch 1 - iter 440/447 - loss 1.81241981 - time (sec): 155.71 - samples/sec: 547.13 - lr: 0.000147 - momentum: 0.000000
2023-10-14 18:31:34,703 ----------------------------------------------------------------------------------------------------
2023-10-14 18:31:34,703 EPOCH 1 done: loss 1.7931 - lr: 0.000147
2023-10-14 18:31:57,276 DEV : loss 0.4706111252307892 - f1-score (micro avg) 0.0
2023-10-14 18:31:57,301 ----------------------------------------------------------------------------------------------------
2023-10-14 18:32:12,593 epoch 2 - iter 44/447 - loss 0.51221402 - time (sec): 15.29 - samples/sec: 553.41 - lr: 0.000148 - momentum: 0.000000
2023-10-14 18:32:27,785 epoch 2 - iter 88/447 - loss 0.49327832 - time (sec): 30.48 - samples/sec: 552.35 - lr: 0.000147 - momentum: 0.000000
2023-10-14 18:32:43,466 epoch 2 - iter 132/447 - loss 0.45252612 - time (sec): 46.16 - samples/sec: 567.85 - lr: 0.000145 - momentum: 0.000000
2023-10-14 18:33:00,409 epoch 2 - iter 176/447 - loss 0.42537516 - time (sec): 63.11 - samples/sec: 564.16 - lr: 0.000143 - momentum: 0.000000
2023-10-14 18:33:15,852 epoch 2 - iter 220/447 - loss 0.40524817 - time (sec): 78.55 - samples/sec: 562.93 - lr: 0.000142 - momentum: 0.000000
2023-10-14 18:33:31,601 epoch 2 - iter 264/447 - loss 0.38456250 - time (sec): 94.30 - samples/sec: 561.96 - lr: 0.000140 - momentum: 0.000000
2023-10-14 18:33:46,668 epoch 2 - iter 308/447 - loss 0.38166992 - time (sec): 109.37 - samples/sec: 557.13 - lr: 0.000139 - momentum: 0.000000
2023-10-14 18:34:02,008 epoch 2 - iter 352/447 - loss 0.37044403 - time (sec): 124.71 - samples/sec: 557.20 - lr: 0.000137 - momentum: 0.000000
2023-10-14 18:34:17,604 epoch 2 - iter 396/447 - loss 0.36100765 - time (sec): 140.30 - samples/sec: 555.12 - lr: 0.000135 - momentum: 0.000000
2023-10-14 18:34:32,442 epoch 2 - iter 440/447 - loss 0.35295013 - time (sec): 155.14 - samples/sec: 551.20 - lr: 0.000134 - momentum: 0.000000
2023-10-14 18:34:34,704 ----------------------------------------------------------------------------------------------------
2023-10-14 18:34:34,705 EPOCH 2 done: loss 0.3534 - lr: 0.000134
2023-10-14 18:34:59,145 DEV : loss 0.24648237228393555 - f1-score (micro avg) 0.4615
2023-10-14 18:34:59,170 saving best model
2023-10-14 18:34:59,966 ----------------------------------------------------------------------------------------------------
2023-10-14 18:35:15,514 epoch 3 - iter 44/447 - loss 0.28031449 - time (sec): 15.55 - samples/sec: 529.00 - lr: 0.000132 - momentum: 0.000000
2023-10-14 18:35:30,601 epoch 3 - iter 88/447 - loss 0.24660222 - time (sec): 30.63 - samples/sec: 533.31 - lr: 0.000130 - momentum: 0.000000
2023-10-14 18:35:46,488 epoch 3 - iter 132/447 - loss 0.24011989 - time (sec): 46.52 - samples/sec: 535.83 - lr: 0.000128 - momentum: 0.000000
2023-10-14 18:36:01,757 epoch 3 - iter 176/447 - loss 0.23680025 - time (sec): 61.79 - samples/sec: 538.05 - lr: 0.000127 - momentum: 0.000000
2023-10-14 18:36:19,186 epoch 3 - iter 220/447 - loss 0.22759569 - time (sec): 79.22 - samples/sec: 547.03 - lr: 0.000125 - momentum: 0.000000
2023-10-14 18:36:34,393 epoch 3 - iter 264/447 - loss 0.22429489 - time (sec): 94.43 - samples/sec: 545.33 - lr: 0.000124 - momentum: 0.000000
2023-10-14 18:36:49,787 epoch 3 - iter 308/447 - loss 0.21942290 - time (sec): 109.82 - samples/sec: 543.54 - lr: 0.000122 - momentum: 0.000000
2023-10-14 18:37:04,772 epoch 3 - iter 352/447 - loss 0.21213137 - time (sec): 124.80 - samples/sec: 541.12 - lr: 0.000120 - momentum: 0.000000
2023-10-14 18:37:20,495 epoch 3 - iter 396/447 - loss 0.20683047 - time (sec): 140.53 - samples/sec: 543.46 - lr: 0.000119 - momentum: 0.000000
2023-10-14 18:37:35,931 epoch 3 - iter 440/447 - loss 0.20185030 - time (sec): 155.96 - samples/sec: 545.25 - lr: 0.000117 - momentum: 0.000000
2023-10-14 18:37:38,436 ----------------------------------------------------------------------------------------------------
2023-10-14 18:37:38,437 EPOCH 3 done: loss 0.2006 - lr: 0.000117
2023-10-14 18:38:03,004 DEV : loss 0.17484012246131897 - f1-score (micro avg) 0.6667
2023-10-14 18:38:03,030 saving best model
2023-10-14 18:38:03,867 ----------------------------------------------------------------------------------------------------
2023-10-14 18:38:19,427 epoch 4 - iter 44/447 - loss 0.16075713 - time (sec): 15.56 - samples/sec: 535.32 - lr: 0.000115 - momentum: 0.000000
2023-10-14 18:38:34,365 epoch 4 - iter 88/447 - loss 0.15141283 - time (sec): 30.50 - samples/sec: 526.90 - lr: 0.000113 - momentum: 0.000000
2023-10-14 18:38:49,421 epoch 4 - iter 132/447 - loss 0.14651235 - time (sec): 45.55 - samples/sec: 528.07 - lr: 0.000112 - momentum: 0.000000
2023-10-14 18:39:04,810 epoch 4 - iter 176/447 - loss 0.14574267 - time (sec): 60.94 - samples/sec: 528.80 - lr: 0.000110 - momentum: 0.000000
2023-10-14 18:39:19,959 epoch 4 - iter 220/447 - loss 0.13882476 - time (sec): 76.09 - samples/sec: 527.66 - lr: 0.000109 - momentum: 0.000000
2023-10-14 18:39:35,657 epoch 4 - iter 264/447 - loss 0.13265973 - time (sec): 91.79 - samples/sec: 535.86 - lr: 0.000107 - momentum: 0.000000
2023-10-14 18:39:50,775 epoch 4 - iter 308/447 - loss 0.12641890 - time (sec): 106.91 - samples/sec: 535.20 - lr: 0.000105 - momentum: 0.000000
2023-10-14 18:40:05,921 epoch 4 - iter 352/447 - loss 0.12439053 - time (sec): 122.05 - samples/sec: 535.52 - lr: 0.000104 - momentum: 0.000000
2023-10-14 18:40:23,362 epoch 4 - iter 396/447 - loss 0.12260346 - time (sec): 139.49 - samples/sec: 539.84 - lr: 0.000102 - momentum: 0.000000
2023-10-14 18:40:39,750 epoch 4 - iter 440/447 - loss 0.11757667 - time (sec): 155.88 - samples/sec: 544.25 - lr: 0.000100 - momentum: 0.000000
2023-10-14 18:40:42,378 ----------------------------------------------------------------------------------------------------
2023-10-14 18:40:42,378 EPOCH 4 done: loss 0.1161 - lr: 0.000100
2023-10-14 18:41:07,153 DEV : loss 0.16356223821640015 - f1-score (micro avg) 0.725
2023-10-14 18:41:07,179 saving best model
2023-10-14 18:41:11,734 ----------------------------------------------------------------------------------------------------
2023-10-14 18:41:26,859 epoch 5 - iter 44/447 - loss 0.07023867 - time (sec): 15.12 - samples/sec: 506.11 - lr: 0.000098 - momentum: 0.000000
2023-10-14 18:41:42,294 epoch 5 - iter 88/447 - loss 0.06478001 - time (sec): 30.56 - samples/sec: 523.37 - lr: 0.000097 - momentum: 0.000000
2023-10-14 18:41:58,072 epoch 5 - iter 132/447 - loss 0.06412258 - time (sec): 46.34 - samples/sec: 536.84 - lr: 0.000095 - momentum: 0.000000
2023-10-14 18:42:13,295 epoch 5 - iter 176/447 - loss 0.07024363 - time (sec): 61.56 - samples/sec: 538.35 - lr: 0.000094 - momentum: 0.000000
2023-10-14 18:42:28,441 epoch 5 - iter 220/447 - loss 0.06745593 - time (sec): 76.70 - samples/sec: 541.44 - lr: 0.000092 - momentum: 0.000000
2023-10-14 18:42:45,723 epoch 5 - iter 264/447 - loss 0.07148378 - time (sec): 93.99 - samples/sec: 544.05 - lr: 0.000090 - momentum: 0.000000
2023-10-14 18:43:00,596 epoch 5 - iter 308/447 - loss 0.07209831 - time (sec): 108.86 - samples/sec: 542.81 - lr: 0.000089 - momentum: 0.000000
2023-10-14 18:43:15,817 epoch 5 - iter 352/447 - loss 0.07099360 - time (sec): 124.08 - samples/sec: 544.94 - lr: 0.000087 - momentum: 0.000000
2023-10-14 18:43:31,278 epoch 5 - iter 396/447 - loss 0.07091432 - time (sec): 139.54 - samples/sec: 548.72 - lr: 0.000085 - momentum: 0.000000
2023-10-14 18:43:46,569 epoch 5 - iter 440/447 - loss 0.07134499 - time (sec): 154.83 - samples/sec: 550.12 - lr: 0.000084 - momentum: 0.000000
2023-10-14 18:43:48,967 ----------------------------------------------------------------------------------------------------
2023-10-14 18:43:48,967 EPOCH 5 done: loss 0.0719 - lr: 0.000084
2023-10-14 18:44:13,629 DEV : loss 0.1595887392759323 - f1-score (micro avg) 0.7469
2023-10-14 18:44:13,655 saving best model
2023-10-14 18:44:18,225 ----------------------------------------------------------------------------------------------------
2023-10-14 18:44:33,929 epoch 6 - iter 44/447 - loss 0.02958638 - time (sec): 15.70 - samples/sec: 542.37 - lr: 0.000082 - momentum: 0.000000
2023-10-14 18:44:49,057 epoch 6 - iter 88/447 - loss 0.03872661 - time (sec): 30.83 - samples/sec: 545.68 - lr: 0.000080 - momentum: 0.000000
2023-10-14 18:45:04,426 epoch 6 - iter 132/447 - loss 0.04324198 - time (sec): 46.20 - samples/sec: 546.14 - lr: 0.000079 - momentum: 0.000000
2023-10-14 18:45:19,712 epoch 6 - iter 176/447 - loss 0.04411436 - time (sec): 61.48 - samples/sec: 549.13 - lr: 0.000077 - momentum: 0.000000
2023-10-14 18:45:34,792 epoch 6 - iter 220/447 - loss 0.04536235 - time (sec): 76.56 - samples/sec: 545.39 - lr: 0.000075 - momentum: 0.000000
2023-10-14 18:45:51,942 epoch 6 - iter 264/447 - loss 0.04612584 - time (sec): 93.71 - samples/sec: 546.82 - lr: 0.000074 - momentum: 0.000000
2023-10-14 18:46:07,599 epoch 6 - iter 308/447 - loss 0.04566728 - time (sec): 109.37 - samples/sec: 551.32 - lr: 0.000072 - momentum: 0.000000
2023-10-14 18:46:23,331 epoch 6 - iter 352/447 - loss 0.04571960 - time (sec): 125.10 - samples/sec: 549.12 - lr: 0.000070 - momentum: 0.000000
2023-10-14 18:46:38,316 epoch 6 - iter 396/447 - loss 0.04798319 - time (sec): 140.09 - samples/sec: 546.83 - lr: 0.000069 - momentum: 0.000000
2023-10-14 18:46:53,874 epoch 6 - iter 440/447 - loss 0.04832325 - time (sec): 155.65 - samples/sec: 547.34 - lr: 0.000067 - momentum: 0.000000
2023-10-14 18:46:56,276 ----------------------------------------------------------------------------------------------------
2023-10-14 18:46:56,277 EPOCH 6 done: loss 0.0481 - lr: 0.000067
2023-10-14 18:47:20,963 DEV : loss 0.1803191751241684 - f1-score (micro avg) 0.7481
2023-10-14 18:47:20,988 saving best model
2023-10-14 18:47:25,327 ----------------------------------------------------------------------------------------------------
2023-10-14 18:47:42,568 epoch 7 - iter 44/447 - loss 0.04512986 - time (sec): 17.24 - samples/sec: 561.18 - lr: 0.000065 - momentum: 0.000000
2023-10-14 18:47:58,148 epoch 7 - iter 88/447 - loss 0.03940830 - time (sec): 32.82 - samples/sec: 558.82 - lr: 0.000064 - momentum: 0.000000
2023-10-14 18:48:13,106 epoch 7 - iter 132/447 - loss 0.04551032 - time (sec): 47.78 - samples/sec: 547.80 - lr: 0.000062 - momentum: 0.000000
2023-10-14 18:48:28,259 epoch 7 - iter 176/447 - loss 0.04250563 - time (sec): 62.93 - samples/sec: 547.24 - lr: 0.000060 - momentum: 0.000000
2023-10-14 18:48:43,733 epoch 7 - iter 220/447 - loss 0.03900809 - time (sec): 78.40 - samples/sec: 549.82 - lr: 0.000059 - momentum: 0.000000
2023-10-14 18:48:59,886 epoch 7 - iter 264/447 - loss 0.03682204 - time (sec): 94.56 - samples/sec: 549.51 - lr: 0.000057 - momentum: 0.000000
2023-10-14 18:49:15,131 epoch 7 - iter 308/447 - loss 0.03689685 - time (sec): 109.80 - samples/sec: 549.10 - lr: 0.000055 - momentum: 0.000000
2023-10-14 18:49:30,170 epoch 7 - iter 352/447 - loss 0.03473793 - time (sec): 124.84 - samples/sec: 548.47 - lr: 0.000054 - momentum: 0.000000
2023-10-14 18:49:45,552 epoch 7 - iter 396/447 - loss 0.03496683 - time (sec): 140.22 - samples/sec: 550.10 - lr: 0.000052 - momentum: 0.000000
2023-10-14 18:50:00,748 epoch 7 - iter 440/447 - loss 0.03374463 - time (sec): 155.42 - samples/sec: 548.61 - lr: 0.000050 - momentum: 0.000000
2023-10-14 18:50:03,115 ----------------------------------------------------------------------------------------------------
2023-10-14 18:50:03,116 EPOCH 7 done: loss 0.0338 - lr: 0.000050
2023-10-14 18:50:27,933 DEV : loss 0.1989319771528244 - f1-score (micro avg) 0.7592
2023-10-14 18:50:27,958 saving best model
2023-10-14 18:50:32,398 ----------------------------------------------------------------------------------------------------
2023-10-14 18:50:47,481 epoch 8 - iter 44/447 - loss 0.02816068 - time (sec): 15.08 - samples/sec: 542.31 - lr: 0.000049 - momentum: 0.000000
2023-10-14 18:51:03,151 epoch 8 - iter 88/447 - loss 0.03674012 - time (sec): 30.75 - samples/sec: 546.65 - lr: 0.000047 - momentum: 0.000000
2023-10-14 18:51:18,134 epoch 8 - iter 132/447 - loss 0.03190669 - time (sec): 45.73 - samples/sec: 540.12 - lr: 0.000045 - momentum: 0.000000
2023-10-14 18:51:33,859 epoch 8 - iter 176/447 - loss 0.02926721 - time (sec): 61.46 - samples/sec: 552.77 - lr: 0.000044 - momentum: 0.000000
2023-10-14 18:51:49,605 epoch 8 - iter 220/447 - loss 0.02779927 - time (sec): 77.21 - samples/sec: 558.46 - lr: 0.000042 - momentum: 0.000000
2023-10-14 18:52:05,003 epoch 8 - iter 264/447 - loss 0.02652766 - time (sec): 92.60 - samples/sec: 551.05 - lr: 0.000040 - momentum: 0.000000
2023-10-14 18:52:21,852 epoch 8 - iter 308/447 - loss 0.02801775 - time (sec): 109.45 - samples/sec: 549.38 - lr: 0.000039 - momentum: 0.000000
2023-10-14 18:52:36,870 epoch 8 - iter 352/447 - loss 0.02681118 - time (sec): 124.47 - samples/sec: 548.52 - lr: 0.000037 - momentum: 0.000000
2023-10-14 18:52:52,122 epoch 8 - iter 396/447 - loss 0.02608349 - time (sec): 139.72 - samples/sec: 548.04 - lr: 0.000035 - momentum: 0.000000
2023-10-14 18:53:07,402 epoch 8 - iter 440/447 - loss 0.02505040 - time (sec): 155.00 - samples/sec: 549.68 - lr: 0.000034 - momentum: 0.000000
2023-10-14 18:53:09,819 ----------------------------------------------------------------------------------------------------
2023-10-14 18:53:09,819 EPOCH 8 done: loss 0.0249 - lr: 0.000034
2023-10-14 18:53:34,610 DEV : loss 0.20397181808948517 - f1-score (micro avg) 0.7593
2023-10-14 18:53:34,635 saving best model
2023-10-14 18:53:38,825 ----------------------------------------------------------------------------------------------------
2023-10-14 18:53:56,216 epoch 9 - iter 44/447 - loss 0.03413553 - time (sec): 17.39 - samples/sec: 559.05 - lr: 0.000032 - momentum: 0.000000
2023-10-14 18:54:12,137 epoch 9 - iter 88/447 - loss 0.02537551 - time (sec): 33.31 - samples/sec: 563.45 - lr: 0.000030 - momentum: 0.000000
2023-10-14 18:54:27,605 epoch 9 - iter 132/447 - loss 0.02230186 - time (sec): 48.78 - samples/sec: 561.25 - lr: 0.000029 - momentum: 0.000000
2023-10-14 18:54:43,093 epoch 9 - iter 176/447 - loss 0.02191161 - time (sec): 64.27 - samples/sec: 562.30 - lr: 0.000027 - momentum: 0.000000
2023-10-14 18:54:57,966 epoch 9 - iter 220/447 - loss 0.02003936 - time (sec): 79.14 - samples/sec: 553.96 - lr: 0.000025 - momentum: 0.000000
2023-10-14 18:55:13,466 epoch 9 - iter 264/447 - loss 0.02206598 - time (sec): 94.64 - samples/sec: 550.03 - lr: 0.000024 - momentum: 0.000000
2023-10-14 18:55:28,451 epoch 9 - iter 308/447 - loss 0.02063661 - time (sec): 109.62 - samples/sec: 545.60 - lr: 0.000022 - momentum: 0.000000
2023-10-14 18:55:43,831 epoch 9 - iter 352/447 - loss 0.02036125 - time (sec): 125.00 - samples/sec: 545.41 - lr: 0.000020 - momentum: 0.000000
2023-10-14 18:55:59,355 epoch 9 - iter 396/447 - loss 0.01930381 - time (sec): 140.53 - samples/sec: 546.14 - lr: 0.000019 - momentum: 0.000000
2023-10-14 18:56:14,934 epoch 9 - iter 440/447 - loss 0.02015557 - time (sec): 156.11 - samples/sec: 545.94 - lr: 0.000017 - momentum: 0.000000
2023-10-14 18:56:17,335 ----------------------------------------------------------------------------------------------------
2023-10-14 18:56:17,335 EPOCH 9 done: loss 0.0201 - lr: 0.000017
2023-10-14 18:56:42,347 DEV : loss 0.2131572663784027 - f1-score (micro avg) 0.7503
2023-10-14 18:56:42,372 ----------------------------------------------------------------------------------------------------
2023-10-14 18:56:57,773 epoch 10 - iter 44/447 - loss 0.02127213 - time (sec): 15.40 - samples/sec: 568.28 - lr: 0.000015 - momentum: 0.000000
2023-10-14 18:57:12,525 epoch 10 - iter 88/447 - loss 0.01811628 - time (sec): 30.15 - samples/sec: 544.58 - lr: 0.000014 - momentum: 0.000000
2023-10-14 18:57:27,600 epoch 10 - iter 132/447 - loss 0.01611064 - time (sec): 45.23 - samples/sec: 545.36 - lr: 0.000012 - momentum: 0.000000
2023-10-14 18:57:43,311 epoch 10 - iter 176/447 - loss 0.01554251 - time (sec): 60.94 - samples/sec: 551.21 - lr: 0.000010 - momentum: 0.000000
2023-10-14 18:58:01,072 epoch 10 - iter 220/447 - loss 0.01884512 - time (sec): 78.70 - samples/sec: 556.12 - lr: 0.000009 - momentum: 0.000000
2023-10-14 18:58:16,786 epoch 10 - iter 264/447 - loss 0.01782274 - time (sec): 94.41 - samples/sec: 551.69 - lr: 0.000007 - momentum: 0.000000
2023-10-14 18:58:31,821 epoch 10 - iter 308/447 - loss 0.01717610 - time (sec): 109.45 - samples/sec: 547.71 - lr: 0.000005 - momentum: 0.000000
2023-10-14 18:58:46,654 epoch 10 - iter 352/447 - loss 0.01630284 - time (sec): 124.28 - samples/sec: 543.75 - lr: 0.000004 - momentum: 0.000000
2023-10-14 18:59:02,118 epoch 10 - iter 396/447 - loss 0.01612965 - time (sec): 139.74 - samples/sec: 545.16 - lr: 0.000002 - momentum: 0.000000
2023-10-14 18:59:18,081 epoch 10 - iter 440/447 - loss 0.01766576 - time (sec): 155.71 - samples/sec: 547.01 - lr: 0.000001 - momentum: 0.000000
2023-10-14 18:59:20,506 ----------------------------------------------------------------------------------------------------
2023-10-14 18:59:20,506 EPOCH 10 done: loss 0.0180 - lr: 0.000001
2023-10-14 18:59:45,871 DEV : loss 0.22141988575458527 - f1-score (micro avg) 0.7508
2023-10-14 18:59:46,688 ----------------------------------------------------------------------------------------------------
2023-10-14 18:59:46,689 Loading model from best epoch ...
2023-10-14 18:59:49,776 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-14 19:00:11,248 
Results:
- F-score (micro) 0.7519
- F-score (macro) 0.6501
- Accuracy 0.6163

By class:
              precision    recall  f1-score   support

         loc     0.8339    0.8674    0.8503       596
        pers     0.6772    0.7748    0.7227       333
         org     0.5263    0.5303    0.5283       132
        prod     0.6296    0.5152    0.5667        66
        time     0.5556    0.6122    0.5825        49

   micro avg     0.7319    0.7730    0.7519      1176
   macro avg     0.6445    0.6600    0.6501      1176
weighted avg     0.7319    0.7730    0.7510      1176

2023-10-14 19:00:11,249 ----------------------------------------------------------------------------------------------------