2023-10-11 09:39:50,455 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,457 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
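The module dump above fixes the encoder geometry: d_model = 1472, attention projection width 384 (6 heads × 64), gated-FF inner size 3584, 12 T5 blocks, a 384-entry byte vocabulary, and a 17-tag linear classifier. A small sanity check of the shapes and approximate parameter counts implied by those printed dimensions (computed from the printout only, nothing is loaded from disk):

```python
# Sanity-check tensor shapes implied by the SequenceTagger printout above.
d_model, d_inner, d_ff = 1472, 384, 3584   # hidden size, attention proj, FF inner size
n_heads, d_kv = 6, 64                      # 6 heads * 64 = 384 (q/k/v out_features)
n_blocks, vocab, n_tags = 12, 384, 17

assert n_heads * d_kv == d_inner

# Per-block parameters: q/k/v project d_model -> 384, o projects back to d_model;
# the gated FF has two input projections (wi_0, wi_1) and one output (wo).
attn = 3 * d_model * d_inner + d_inner * d_model
ff = 2 * d_model * d_ff + d_ff * d_model
block = attn + ff

encoder = vocab * d_model + n_blocks * block    # shared byte embedding + blocks
classifier = d_model * n_tags + n_tags          # final Linear(1472 -> 17, bias=True)

print(f"per-block params : {block:,}")       # 18,087,936
print(f"encoder (approx) : {encoder:,}")     # layer norms / relative bias omitted
print(f"classifier       : {classifier:,}")  # 25,041
```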
2023-10-11 09:39:50,457 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,458 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-11 09:39:50,458 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,458 Train: 1085 sentences
2023-10-11 09:39:50,458 (train_with_dev=False, train_with_test=False)
2023-10-11 09:39:50,458 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,458 Training Params:
2023-10-11 09:39:50,458 - learning_rate: "0.00015"
2023-10-11 09:39:50,458 - mini_batch_size: "4"
2023-10-11 09:39:50,458 - max_epochs: "10"
2023-10-11 09:39:50,458 - shuffle: "True"
2023-10-11 09:39:50,458 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Plugins:
2023-10-11 09:39:50,459 - TensorboardLogger
2023-10-11 09:39:50,459 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
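With mini_batch_size 4 over 1085 train sentences there are 272 updates per epoch, i.e. 2720 steps in total; warmup_fraction 0.1 ramps the learning rate from 0 to the 0.00015 peak over the first 272 steps, then decays it linearly to 0. That matches the lr column in the epoch logs below, up to an off-by-one in step counting. A minimal sketch of the schedule (an illustration, not Flair's actual LinearScheduler code):

```python
# Linear warmup + linear decay, as configured by the LinearScheduler plugin above.
PEAK_LR = 0.00015
TOTAL_STEPS = 272 * 10           # 272 updates/epoch * 10 epochs
WARMUP = int(0.1 * TOTAL_STEPS)  # warmup_fraction 0.1 -> 272 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (0-based)."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP)

print(round(lr_at(26), 6))           # early warmup, ~0.000014 as in the epoch-1 log
print(round(lr_at(WARMUP), 6))       # peak: 0.00015
print(round(lr_at(TOTAL_STEPS), 6))  # fully decayed: 0.0
```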
2023-10-11 09:39:50,459 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 09:39:50,459 - metric: "('micro avg', 'f1-score')"
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Computation:
2023-10-11 09:39:50,459 - compute on device: cuda:0
2023-10-11 09:39:50,459 - embedding storage: none
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,459 ----------------------------------------------------------------------------------------------------
2023-10-11 09:39:50,460 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 09:40:00,360 epoch 1 - iter 27/272 - loss 2.84978598 - time (sec): 9.90 - samples/sec: 546.87 - lr: 0.000014 - momentum: 0.000000
2023-10-11 09:40:09,671 epoch 1 - iter 54/272 - loss 2.83993499 - time (sec): 19.21 - samples/sec: 513.66 - lr: 0.000029 - momentum: 0.000000
2023-10-11 09:40:19,438 epoch 1 - iter 81/272 - loss 2.82028870 - time (sec): 28.98 - samples/sec: 522.42 - lr: 0.000044 - momentum: 0.000000
2023-10-11 09:40:29,609 epoch 1 - iter 108/272 - loss 2.75600317 - time (sec): 39.15 - samples/sec: 536.66 - lr: 0.000059 - momentum: 0.000000
2023-10-11 09:40:39,237 epoch 1 - iter 135/272 - loss 2.66472267 - time (sec): 48.78 - samples/sec: 540.07 - lr: 0.000074 - momentum: 0.000000
2023-10-11 09:40:48,005 epoch 1 - iter 162/272 - loss 2.58070954 - time (sec): 57.54 - samples/sec: 532.22 - lr: 0.000089 - momentum: 0.000000
2023-10-11 09:40:57,344 epoch 1 - iter 189/272 - loss 2.47427847 - time (sec): 66.88 - samples/sec: 532.94 - lr: 0.000104 - momentum: 0.000000
2023-10-11 09:41:06,659 epoch 1 - iter 216/272 - loss 2.36367265 - time (sec): 76.20 - samples/sec: 534.61 - lr: 0.000119 - momentum: 0.000000
2023-10-11 09:41:17,028 epoch 1 - iter 243/272 - loss 2.22144061 - time (sec): 86.57 - samples/sec: 539.03 - lr: 0.000133 - momentum: 0.000000
2023-10-11 09:41:26,523 epoch 1 - iter 270/272 - loss 2.10748412 - time (sec): 96.06 - samples/sec: 540.43 - lr: 0.000148 - momentum: 0.000000
2023-10-11 09:41:26,853 ----------------------------------------------------------------------------------------------------
2023-10-11 09:41:26,853 EPOCH 1 done: loss 2.1060 - lr: 0.000148
2023-10-11 09:41:32,029 DEV : loss 0.8020414710044861 - f1-score (micro avg) 0.0
2023-10-11 09:41:32,037 ----------------------------------------------------------------------------------------------------
2023-10-11 09:41:42,064 epoch 2 - iter 27/272 - loss 0.76893037 - time (sec): 10.03 - samples/sec: 577.84 - lr: 0.000148 - momentum: 0.000000
2023-10-11 09:41:51,392 epoch 2 - iter 54/272 - loss 0.69007728 - time (sec): 19.35 - samples/sec: 556.09 - lr: 0.000147 - momentum: 0.000000
2023-10-11 09:42:01,289 epoch 2 - iter 81/272 - loss 0.67431997 - time (sec): 29.25 - samples/sec: 568.69 - lr: 0.000145 - momentum: 0.000000
2023-10-11 09:42:10,880 epoch 2 - iter 108/272 - loss 0.63443527 - time (sec): 38.84 - samples/sec: 561.67 - lr: 0.000143 - momentum: 0.000000
2023-10-11 09:42:20,670 epoch 2 - iter 135/272 - loss 0.61291708 - time (sec): 48.63 - samples/sec: 554.52 - lr: 0.000142 - momentum: 0.000000
2023-10-11 09:42:29,890 epoch 2 - iter 162/272 - loss 0.58952666 - time (sec): 57.85 - samples/sec: 542.86 - lr: 0.000140 - momentum: 0.000000
2023-10-11 09:42:39,479 epoch 2 - iter 189/272 - loss 0.57453427 - time (sec): 67.44 - samples/sec: 534.95 - lr: 0.000138 - momentum: 0.000000
2023-10-11 09:42:49,535 epoch 2 - iter 216/272 - loss 0.54891050 - time (sec): 77.50 - samples/sec: 533.83 - lr: 0.000137 - momentum: 0.000000
2023-10-11 09:42:59,160 epoch 2 - iter 243/272 - loss 0.53267587 - time (sec): 87.12 - samples/sec: 531.19 - lr: 0.000135 - momentum: 0.000000
2023-10-11 09:43:09,206 epoch 2 - iter 270/272 - loss 0.52022697 - time (sec): 97.17 - samples/sec: 533.36 - lr: 0.000134 - momentum: 0.000000
2023-10-11 09:43:09,606 ----------------------------------------------------------------------------------------------------
2023-10-11 09:43:09,606 EPOCH 2 done: loss 0.5193 - lr: 0.000134
2023-10-11 09:43:15,513 DEV : loss 0.3020303547382355 - f1-score (micro avg) 0.2903
2023-10-11 09:43:15,522 saving best model
2023-10-11 09:43:16,375 ----------------------------------------------------------------------------------------------------
2023-10-11 09:43:25,815 epoch 3 - iter 27/272 - loss 0.38410607 - time (sec): 9.44 - samples/sec: 557.12 - lr: 0.000132 - momentum: 0.000000
2023-10-11 09:43:35,064 epoch 3 - iter 54/272 - loss 0.37333646 - time (sec): 18.69 - samples/sec: 549.52 - lr: 0.000130 - momentum: 0.000000
2023-10-11 09:43:44,340 epoch 3 - iter 81/272 - loss 0.34926921 - time (sec): 27.96 - samples/sec: 543.47 - lr: 0.000128 - momentum: 0.000000
2023-10-11 09:43:53,690 epoch 3 - iter 108/272 - loss 0.34301949 - time (sec): 37.31 - samples/sec: 544.64 - lr: 0.000127 - momentum: 0.000000
2023-10-11 09:44:03,560 epoch 3 - iter 135/272 - loss 0.33694811 - time (sec): 47.18 - samples/sec: 551.32 - lr: 0.000125 - momentum: 0.000000
2023-10-11 09:44:12,886 epoch 3 - iter 162/272 - loss 0.32617109 - time (sec): 56.51 - samples/sec: 548.75 - lr: 0.000123 - momentum: 0.000000
2023-10-11 09:44:23,467 epoch 3 - iter 189/272 - loss 0.32458657 - time (sec): 67.09 - samples/sec: 555.56 - lr: 0.000122 - momentum: 0.000000
2023-10-11 09:44:33,403 epoch 3 - iter 216/272 - loss 0.31370703 - time (sec): 77.03 - samples/sec: 554.78 - lr: 0.000120 - momentum: 0.000000
2023-10-11 09:44:42,131 epoch 3 - iter 243/272 - loss 0.31287050 - time (sec): 85.75 - samples/sec: 546.57 - lr: 0.000119 - momentum: 0.000000
2023-10-11 09:44:51,444 epoch 3 - iter 270/272 - loss 0.31168254 - time (sec): 95.07 - samples/sec: 544.09 - lr: 0.000117 - momentum: 0.000000
2023-10-11 09:44:51,938 ----------------------------------------------------------------------------------------------------
2023-10-11 09:44:51,938 EPOCH 3 done: loss 0.3122 - lr: 0.000117
2023-10-11 09:44:57,957 DEV : loss 0.2577632665634155 - f1-score (micro avg) 0.3305
2023-10-11 09:44:57,965 saving best model
2023-10-11 09:45:00,499 ----------------------------------------------------------------------------------------------------
2023-10-11 09:45:09,969 epoch 4 - iter 27/272 - loss 0.29354878 - time (sec): 9.47 - samples/sec: 530.04 - lr: 0.000115 - momentum: 0.000000
2023-10-11 09:45:18,936 epoch 4 - iter 54/272 - loss 0.26285013 - time (sec): 18.43 - samples/sec: 518.92 - lr: 0.000113 - momentum: 0.000000
2023-10-11 09:45:28,972 epoch 4 - iter 81/272 - loss 0.24810936 - time (sec): 28.47 - samples/sec: 548.02 - lr: 0.000112 - momentum: 0.000000
2023-10-11 09:45:38,709 epoch 4 - iter 108/272 - loss 0.24358018 - time (sec): 38.21 - samples/sec: 552.25 - lr: 0.000110 - momentum: 0.000000
2023-10-11 09:45:47,860 epoch 4 - iter 135/272 - loss 0.23901215 - time (sec): 47.36 - samples/sec: 549.15 - lr: 0.000108 - momentum: 0.000000
2023-10-11 09:45:57,856 epoch 4 - iter 162/272 - loss 0.23297876 - time (sec): 57.35 - samples/sec: 551.33 - lr: 0.000107 - momentum: 0.000000
2023-10-11 09:46:06,942 epoch 4 - iter 189/272 - loss 0.23528726 - time (sec): 66.44 - samples/sec: 546.24 - lr: 0.000105 - momentum: 0.000000
2023-10-11 09:46:16,464 epoch 4 - iter 216/272 - loss 0.23243463 - time (sec): 75.96 - samples/sec: 545.52 - lr: 0.000103 - momentum: 0.000000
2023-10-11 09:46:26,378 epoch 4 - iter 243/272 - loss 0.23708393 - time (sec): 85.87 - samples/sec: 543.87 - lr: 0.000102 - momentum: 0.000000
2023-10-11 09:46:35,978 epoch 4 - iter 270/272 - loss 0.23306336 - time (sec): 95.47 - samples/sec: 542.45 - lr: 0.000100 - momentum: 0.000000
2023-10-11 09:46:36,427 ----------------------------------------------------------------------------------------------------
2023-10-11 09:46:36,428 EPOCH 4 done: loss 0.2329 - lr: 0.000100
2023-10-11 09:46:42,295 DEV : loss 0.19623495638370514 - f1-score (micro avg) 0.5471
2023-10-11 09:46:42,304 saving best model
2023-10-11 09:46:44,891 ----------------------------------------------------------------------------------------------------
2023-10-11 09:46:54,037 epoch 5 - iter 27/272 - loss 0.18848352 - time (sec): 9.14 - samples/sec: 510.71 - lr: 0.000098 - momentum: 0.000000
2023-10-11 09:47:03,653 epoch 5 - iter 54/272 - loss 0.17630539 - time (sec): 18.76 - samples/sec: 535.19 - lr: 0.000097 - momentum: 0.000000
2023-10-11 09:47:13,136 epoch 5 - iter 81/272 - loss 0.16835017 - time (sec): 28.24 - samples/sec: 543.11 - lr: 0.000095 - momentum: 0.000000
2023-10-11 09:47:22,285 epoch 5 - iter 108/272 - loss 0.16954062 - time (sec): 37.39 - samples/sec: 541.39 - lr: 0.000093 - momentum: 0.000000
2023-10-11 09:47:31,421 epoch 5 - iter 135/272 - loss 0.15990292 - time (sec): 46.53 - samples/sec: 537.48 - lr: 0.000092 - momentum: 0.000000
2023-10-11 09:47:41,467 epoch 5 - iter 162/272 - loss 0.15873693 - time (sec): 56.57 - samples/sec: 546.17 - lr: 0.000090 - momentum: 0.000000
2023-10-11 09:47:50,820 epoch 5 - iter 189/272 - loss 0.16239154 - time (sec): 65.93 - samples/sec: 544.54 - lr: 0.000088 - momentum: 0.000000
2023-10-11 09:48:00,587 epoch 5 - iter 216/272 - loss 0.16394956 - time (sec): 75.69 - samples/sec: 545.05 - lr: 0.000087 - momentum: 0.000000
2023-10-11 09:48:10,258 epoch 5 - iter 243/272 - loss 0.16439694 - time (sec): 85.36 - samples/sec: 547.01 - lr: 0.000085 - momentum: 0.000000
2023-10-11 09:48:19,506 epoch 5 - iter 270/272 - loss 0.16219348 - time (sec): 94.61 - samples/sec: 546.56 - lr: 0.000084 - momentum: 0.000000
2023-10-11 09:48:20,008 ----------------------------------------------------------------------------------------------------
2023-10-11 09:48:20,009 EPOCH 5 done: loss 0.1620 - lr: 0.000084
2023-10-11 09:48:25,523 DEV : loss 0.16873042285442352 - f1-score (micro avg) 0.6128
2023-10-11 09:48:25,531 saving best model
2023-10-11 09:48:28,074 ----------------------------------------------------------------------------------------------------
2023-10-11 09:48:37,600 epoch 6 - iter 27/272 - loss 0.14413262 - time (sec): 9.52 - samples/sec: 575.59 - lr: 0.000082 - momentum: 0.000000
2023-10-11 09:48:46,412 epoch 6 - iter 54/272 - loss 0.14208677 - time (sec): 18.33 - samples/sec: 556.61 - lr: 0.000080 - momentum: 0.000000
2023-10-11 09:48:55,754 epoch 6 - iter 81/272 - loss 0.13479681 - time (sec): 27.68 - samples/sec: 563.64 - lr: 0.000078 - momentum: 0.000000
2023-10-11 09:49:05,615 epoch 6 - iter 108/272 - loss 0.13231184 - time (sec): 37.54 - samples/sec: 572.69 - lr: 0.000077 - momentum: 0.000000
2023-10-11 09:49:14,856 epoch 6 - iter 135/272 - loss 0.12931757 - time (sec): 46.78 - samples/sec: 553.45 - lr: 0.000075 - momentum: 0.000000
2023-10-11 09:49:24,535 epoch 6 - iter 162/272 - loss 0.12321844 - time (sec): 56.46 - samples/sec: 558.64 - lr: 0.000073 - momentum: 0.000000
2023-10-11 09:49:33,780 epoch 6 - iter 189/272 - loss 0.11952690 - time (sec): 65.70 - samples/sec: 557.41 - lr: 0.000072 - momentum: 0.000000
2023-10-11 09:49:42,914 epoch 6 - iter 216/272 - loss 0.12316648 - time (sec): 74.84 - samples/sec: 556.15 - lr: 0.000070 - momentum: 0.000000
2023-10-11 09:49:52,434 epoch 6 - iter 243/272 - loss 0.12004095 - time (sec): 84.36 - samples/sec: 555.46 - lr: 0.000069 - momentum: 0.000000
2023-10-11 09:50:01,550 epoch 6 - iter 270/272 - loss 0.12026909 - time (sec): 93.47 - samples/sec: 553.33 - lr: 0.000067 - momentum: 0.000000
2023-10-11 09:50:02,085 ----------------------------------------------------------------------------------------------------
2023-10-11 09:50:02,085 EPOCH 6 done: loss 0.1200 - lr: 0.000067
2023-10-11 09:50:07,714 DEV : loss 0.1488361954689026 - f1-score (micro avg) 0.6112
2023-10-11 09:50:07,722 ----------------------------------------------------------------------------------------------------
2023-10-11 09:50:17,152 epoch 7 - iter 27/272 - loss 0.09847727 - time (sec): 9.43 - samples/sec: 511.97 - lr: 0.000065 - momentum: 0.000000
2023-10-11 09:50:27,127 epoch 7 - iter 54/272 - loss 0.09016802 - time (sec): 19.40 - samples/sec: 541.04 - lr: 0.000063 - momentum: 0.000000
2023-10-11 09:50:37,100 epoch 7 - iter 81/272 - loss 0.08601675 - time (sec): 29.38 - samples/sec: 544.14 - lr: 0.000062 - momentum: 0.000000
2023-10-11 09:50:46,118 epoch 7 - iter 108/272 - loss 0.08647169 - time (sec): 38.39 - samples/sec: 549.45 - lr: 0.000060 - momentum: 0.000000
2023-10-11 09:50:55,554 epoch 7 - iter 135/272 - loss 0.09090164 - time (sec): 47.83 - samples/sec: 554.50 - lr: 0.000058 - momentum: 0.000000
2023-10-11 09:51:04,902 epoch 7 - iter 162/272 - loss 0.08989021 - time (sec): 57.18 - samples/sec: 549.21 - lr: 0.000057 - momentum: 0.000000
2023-10-11 09:51:13,584 epoch 7 - iter 189/272 - loss 0.08994632 - time (sec): 65.86 - samples/sec: 543.30 - lr: 0.000055 - momentum: 0.000000
2023-10-11 09:51:23,911 epoch 7 - iter 216/272 - loss 0.08845138 - time (sec): 76.19 - samples/sec: 544.27 - lr: 0.000053 - momentum: 0.000000
2023-10-11 09:51:34,418 epoch 7 - iter 243/272 - loss 0.09342558 - time (sec): 86.69 - samples/sec: 534.59 - lr: 0.000052 - momentum: 0.000000
2023-10-11 09:51:45,463 epoch 7 - iter 270/272 - loss 0.09226835 - time (sec): 97.74 - samples/sec: 529.60 - lr: 0.000050 - momentum: 0.000000
2023-10-11 09:51:46,024 ----------------------------------------------------------------------------------------------------
2023-10-11 09:51:46,025 EPOCH 7 done: loss 0.0920 - lr: 0.000050
2023-10-11 09:51:51,630 DEV : loss 0.14688394963741302 - f1-score (micro avg) 0.6654
2023-10-11 09:51:51,638 saving best model
2023-10-11 09:51:54,157 ----------------------------------------------------------------------------------------------------
2023-10-11 09:52:03,922 epoch 8 - iter 27/272 - loss 0.08937485 - time (sec): 9.76 - samples/sec: 470.65 - lr: 0.000048 - momentum: 0.000000
2023-10-11 09:52:13,039 epoch 8 - iter 54/272 - loss 0.07136459 - time (sec): 18.88 - samples/sec: 497.63 - lr: 0.000047 - momentum: 0.000000
2023-10-11 09:52:23,152 epoch 8 - iter 81/272 - loss 0.07727213 - time (sec): 28.99 - samples/sec: 530.55 - lr: 0.000045 - momentum: 0.000000
2023-10-11 09:52:32,583 epoch 8 - iter 108/272 - loss 0.07738875 - time (sec): 38.42 - samples/sec: 538.24 - lr: 0.000043 - momentum: 0.000000
2023-10-11 09:52:41,991 epoch 8 - iter 135/272 - loss 0.07911098 - time (sec): 47.83 - samples/sec: 541.09 - lr: 0.000042 - momentum: 0.000000
2023-10-11 09:52:50,431 epoch 8 - iter 162/272 - loss 0.08340609 - time (sec): 56.27 - samples/sec: 533.28 - lr: 0.000040 - momentum: 0.000000
2023-10-11 09:53:00,002 epoch 8 - iter 189/272 - loss 0.08129659 - time (sec): 65.84 - samples/sec: 541.02 - lr: 0.000038 - momentum: 0.000000
2023-10-11 09:53:10,477 epoch 8 - iter 216/272 - loss 0.07877295 - time (sec): 76.32 - samples/sec: 551.20 - lr: 0.000037 - momentum: 0.000000
2023-10-11 09:53:19,525 epoch 8 - iter 243/272 - loss 0.07667042 - time (sec): 85.36 - samples/sec: 548.01 - lr: 0.000035 - momentum: 0.000000
2023-10-11 09:53:29,019 epoch 8 - iter 270/272 - loss 0.07426787 - time (sec): 94.86 - samples/sec: 546.00 - lr: 0.000034 - momentum: 0.000000
2023-10-11 09:53:29,435 ----------------------------------------------------------------------------------------------------
2023-10-11 09:53:29,435 EPOCH 8 done: loss 0.0742 - lr: 0.000034
2023-10-11 09:53:34,876 DEV : loss 0.14787980914115906 - f1-score (micro avg) 0.7432
2023-10-11 09:53:34,884 saving best model
2023-10-11 09:53:37,399 ----------------------------------------------------------------------------------------------------
2023-10-11 09:53:46,140 epoch 9 - iter 27/272 - loss 0.09102105 - time (sec): 8.74 - samples/sec: 491.25 - lr: 0.000032 - momentum: 0.000000
2023-10-11 09:53:55,379 epoch 9 - iter 54/272 - loss 0.08935916 - time (sec): 17.98 - samples/sec: 526.14 - lr: 0.000030 - momentum: 0.000000
2023-10-11 09:54:04,769 epoch 9 - iter 81/272 - loss 0.07481512 - time (sec): 27.37 - samples/sec: 529.16 - lr: 0.000028 - momentum: 0.000000
2023-10-11 09:54:14,186 epoch 9 - iter 108/272 - loss 0.07867968 - time (sec): 36.78 - samples/sec: 531.27 - lr: 0.000027 - momentum: 0.000000
2023-10-11 09:54:22,976 epoch 9 - iter 135/272 - loss 0.07834757 - time (sec): 45.57 - samples/sec: 525.22 - lr: 0.000025 - momentum: 0.000000
2023-10-11 09:54:33,025 epoch 9 - iter 162/272 - loss 0.07369543 - time (sec): 55.62 - samples/sec: 537.67 - lr: 0.000023 - momentum: 0.000000
2023-10-11 09:54:42,330 epoch 9 - iter 189/272 - loss 0.07182289 - time (sec): 64.93 - samples/sec: 539.05 - lr: 0.000022 - momentum: 0.000000
2023-10-11 09:54:52,050 epoch 9 - iter 216/272 - loss 0.06958838 - time (sec): 74.65 - samples/sec: 540.33 - lr: 0.000020 - momentum: 0.000000
2023-10-11 09:55:02,157 epoch 9 - iter 243/272 - loss 0.06633126 - time (sec): 84.75 - samples/sec: 545.83 - lr: 0.000019 - momentum: 0.000000
2023-10-11 09:55:11,700 epoch 9 - iter 270/272 - loss 0.06401853 - time (sec): 94.30 - samples/sec: 548.89 - lr: 0.000017 - momentum: 0.000000
2023-10-11 09:55:12,130 ----------------------------------------------------------------------------------------------------
2023-10-11 09:55:12,130 EPOCH 9 done: loss 0.0641 - lr: 0.000017
2023-10-11 09:55:17,955 DEV : loss 0.14741504192352295 - f1-score (micro avg) 0.7505
2023-10-11 09:55:17,963 saving best model
2023-10-11 09:55:20,524 ----------------------------------------------------------------------------------------------------
2023-10-11 09:55:29,829 epoch 10 - iter 27/272 - loss 0.07188521 - time (sec): 9.30 - samples/sec: 575.75 - lr: 0.000015 - momentum: 0.000000
2023-10-11 09:55:38,625 epoch 10 - iter 54/272 - loss 0.07716434 - time (sec): 18.10 - samples/sec: 554.04 - lr: 0.000013 - momentum: 0.000000
2023-10-11 09:55:47,519 epoch 10 - iter 81/272 - loss 0.06791166 - time (sec): 26.99 - samples/sec: 553.49 - lr: 0.000012 - momentum: 0.000000
2023-10-11 09:55:56,393 epoch 10 - iter 108/272 - loss 0.06458565 - time (sec): 35.86 - samples/sec: 549.24 - lr: 0.000010 - momentum: 0.000000
2023-10-11 09:56:06,420 epoch 10 - iter 135/272 - loss 0.06332167 - time (sec): 45.89 - samples/sec: 565.03 - lr: 0.000008 - momentum: 0.000000
2023-10-11 09:56:15,810 epoch 10 - iter 162/272 - loss 0.05986959 - time (sec): 55.28 - samples/sec: 557.44 - lr: 0.000007 - momentum: 0.000000
2023-10-11 09:56:25,615 epoch 10 - iter 189/272 - loss 0.05919230 - time (sec): 65.09 - samples/sec: 553.40 - lr: 0.000005 - momentum: 0.000000
2023-10-11 09:56:35,088 epoch 10 - iter 216/272 - loss 0.05792051 - time (sec): 74.56 - samples/sec: 555.55 - lr: 0.000003 - momentum: 0.000000
2023-10-11 09:56:45,050 epoch 10 - iter 243/272 - loss 0.05609555 - time (sec): 84.52 - samples/sec: 555.36 - lr: 0.000002 - momentum: 0.000000
2023-10-11 09:56:54,243 epoch 10 - iter 270/272 - loss 0.05776757 - time (sec): 93.71 - samples/sec: 551.58 - lr: 0.000000 - momentum: 0.000000
2023-10-11 09:56:54,775 ----------------------------------------------------------------------------------------------------
2023-10-11 09:56:54,775 EPOCH 10 done: loss 0.0576 - lr: 0.000000
2023-10-11 09:57:00,361 DEV : loss 0.14719465374946594 - f1-score (micro avg) 0.7401
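The "saving best model" lines above fire exactly when the dev micro-F1 (the configured selection metric) improves: after epochs 2–5 and 7–9, but not 6 or 10. Reconstructing the selection from the per-epoch dev scores shows the reloaded checkpoint is the one from epoch 9:

```python
# Reproduce best-epoch selection from the per-epoch dev f1-scores in the log above.
dev_f1 = [0.0, 0.2903, 0.3305, 0.5471, 0.6128,
          0.6112, 0.6654, 0.7432, 0.7505, 0.7401]   # epochs 1..10

best_epoch = max(range(len(dev_f1)), key=dev_f1.__getitem__) + 1
print(best_epoch, dev_f1[best_epoch - 1])   # epoch 9, dev f1 0.7505
```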
2023-10-11 09:57:01,192 ----------------------------------------------------------------------------------------------------
2023-10-11 09:57:01,194 Loading model from best epoch ...
2023-10-11 09:57:05,952 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
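The 17-tag dictionary is a BIOES encoding over the four entity types (S- single-token span, B-/I-/E- begin/inside/end of a multi-token span, O outside). A minimal sketch of decoding such a tag sequence into entity spans (a hypothetical helper for illustration, not Flair's own decoder):

```python
# Decode a BIOES tag sequence into (start, end_exclusive, entity_type) spans.
def bioes_spans(tags):
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":
            spans.append((i, i + 1, label))
            start, etype = None, None
        elif prefix == "B":
            start, etype = i, label
        elif prefix == "E" and start is not None and label == etype:
            spans.append((start, i + 1, etype))
            start, etype = None, None
        elif prefix == "I" and label == etype:
            continue
        else:  # "O" or an inconsistent sequence: drop any open span
            start, etype = None, None
    return spans

tags = ["S-LOC", "O", "B-PER", "I-PER", "E-PER", "O", "S-HumanProd"]
print(bioes_spans(tags))
# [(0, 1, 'LOC'), (2, 5, 'PER'), (6, 7, 'HumanProd')]
```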
2023-10-11 09:57:18,000
Results:
- F-score (micro) 0.7087
- F-score (macro) 0.6243
- Accuracy 0.5811
By class:
                  precision    recall  f1-score   support

            LOC      0.6899    0.8558    0.7639       312
            PER      0.6667    0.7212    0.6928       208
            ORG      0.5263    0.3636    0.4301        55
      HumanProd      0.4865    0.8182    0.6102        22

      micro avg      0.6623    0.7621    0.7087       597
      macro avg      0.5923    0.6897    0.6243       597
   weighted avg      0.6593    0.7621    0.7028       597
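The summary rows follow directly from the per-class numbers: recall × support gives the true positives per class (e.g. 0.8558 × 312 ≈ 267 for LOC), and TP / precision gives each class's prediction count. Recomputing the micro, macro, and weighted F-scores from those derived counts (the 0.5811 "Accuracy" line uses Flair's own definition and is not recomputed here):

```python
# Recompute the summary rows of the test evaluation from per-class counts.
# (tp, predicted, support) are derived from the printed precision/recall/support.
classes = {            # tp, predicted, support
    "LOC":       (267, 387, 312),
    "PER":       (150, 225, 208),
    "ORG":       ( 20,  38,  55),
    "HumanProd": ( 18,  37,  22),
}

f1 = {c: 2 * tp / (pred + sup) for c, (tp, pred, sup) in classes.items()}

tp = sum(v[0] for v in classes.values())
pred = sum(v[1] for v in classes.values())
sup = sum(v[2] for v in classes.values())

micro_f1 = 2 * tp / (pred + sup)                              # 0.7087
macro_f1 = sum(f1.values()) / len(f1)                         # 0.6243
weighted = sum(f1[c] * classes[c][2] for c in classes) / sup  # 0.7028

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted, 4))
```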
2023-10-11 09:57:18,000 ----------------------------------------------------------------------------------------------------