2023-10-11 13:09:29,492 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,495 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 13:09:29,495 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,495 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-11 13:09:29,495 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,495 Train: 1085 sentences
2023-10-11 13:09:29,495 (train_with_dev=False, train_with_test=False)
2023-10-11 13:09:29,495 ----------------------------------------------------------------------------------------------------
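[Note] The model summary and corpus statistics above correspond to a standard Flair setup. The following is a minimal, hedged sketch of how a comparable model could be assembled with stock Flair classes: ByT5Embeddings in the log is a custom wrapper from the hmBench training code, so TransformerWordEmbeddings is used here as a stand-in, and the Hugging Face model id is inferred from the training base path further down.

    # Hedged sketch of the setup behind the log above (not the exact hmBench script).
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # NewsEye Swedish split of HIPE-2022: 1085 train / 148 dev / 364 test sentences.
    corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
    label_dict = corpus.make_label_dictionary(label_type="ner")  # yields the 17-tag BIOES dictionary seen later

    # Byte-level T5 encoder; TransformerWordEmbeddings approximates the custom ByT5Embeddings class.
    embeddings = TransformerWordEmbeddings(
        model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # assumed from the base path
        layers="-1",               # "layers-1" in the base path: last encoder layer only
        subtoken_pooling="first",  # "poolingfirst" in the base path
        fine_tune=True,
    )

    # Matches the printed head: LockedDropout(0.5) -> Linear(1472, 17), no RNN, no CRF.
    tagger = SequenceTagger(
        hidden_size=256,  # ignored when use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_rnn=False,
        use_crf=False,               # "crfFalse" in the base path
        reproject_embeddings=False,
    )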
2023-10-11 13:09:29,495 Training Params:
2023-10-11 13:09:29,495 - learning_rate: "0.00015"
2023-10-11 13:09:29,496 - mini_batch_size: "4"
2023-10-11 13:09:29,496 - max_epochs: "10"
2023-10-11 13:09:29,496 - shuffle: "True"
2023-10-11 13:09:29,496 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,496 Plugins:
2023-10-11 13:09:29,496 - TensorboardLogger
2023-10-11 13:09:29,496 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 13:09:29,496 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,496 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 13:09:29,496 - metric: "('micro avg', 'f1-score')"
2023-10-11 13:09:29,496 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,496 Computation:
2023-10-11 13:09:29,496 - compute on device: cuda:0
2023-10-11 13:09:29,496 - embedding storage: none
2023-10-11 13:09:29,496 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,496 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-11 13:09:29,497 ----------------------------------------------------------------------------------------------------
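[Note] The hyperparameters, plugins, evaluation metric, and base path listed above map onto Flair's fine-tuning entry point roughly as follows; a minimal sketch assuming the corpus and tagger objects from the previous sketch and Flair's ModelTrainer.fine_tune API. The exact way the TensorBoard plugin is attached varies across Flair versions, so it is only noted in a comment.

    # Hedged sketch of the training call matching the logged parameters.
    import torch
    import flair
    from flair.trainers import ModelTrainer

    flair.device = torch.device("cuda:0")  # "compute on device: cuda:0"

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/sv-hmbyt5-preliminary/...",        # training base path (shortened; full path in the log line above)
        learning_rate=0.00015,
        mini_batch_size=4,
        max_epochs=10,
        shuffle=True,
        warmup_fraction=0.1,                                # LinearScheduler plugin with 10% warmup
        embeddings_storage_mode="none",                     # "embedding storage: none"
        main_evaluation_metric=("micro avg", "f1-score"),   # metric used to pick best-model.pt
        # A TensorboardLogger plugin produces the TensorBoard output referenced below;
        # its import path / argument name differs between Flair releases, so it is omitted here.
    )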
2023-10-11 13:09:29,497 ----------------------------------------------------------------------------------------------------
2023-10-11 13:09:29,497 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 13:09:39,376 epoch 1 - iter 27/272 - loss 2.82017975 - time (sec): 9.88 - samples/sec: 571.41 - lr: 0.000014 - momentum: 0.000000
2023-10-11 13:09:48,429 epoch 1 - iter 54/272 - loss 2.81198645 - time (sec): 18.93 - samples/sec: 547.73 - lr: 0.000029 - momentum: 0.000000
2023-10-11 13:09:57,442 epoch 1 - iter 81/272 - loss 2.79406251 - time (sec): 27.94 - samples/sec: 534.30 - lr: 0.000044 - momentum: 0.000000
2023-10-11 13:10:08,054 epoch 1 - iter 108/272 - loss 2.72687206 - time (sec): 38.56 - samples/sec: 550.43 - lr: 0.000059 - momentum: 0.000000
2023-10-11 13:10:17,491 epoch 1 - iter 135/272 - loss 2.64533462 - time (sec): 47.99 - samples/sec: 550.78 - lr: 0.000074 - momentum: 0.000000
2023-10-11 13:10:27,175 epoch 1 - iter 162/272 - loss 2.54392629 - time (sec): 57.68 - samples/sec: 549.91 - lr: 0.000089 - momentum: 0.000000
2023-10-11 13:10:36,492 epoch 1 - iter 189/272 - loss 2.44259599 - time (sec): 66.99 - samples/sec: 547.29 - lr: 0.000104 - momentum: 0.000000
2023-10-11 13:10:45,661 epoch 1 - iter 216/272 - loss 2.34019983 - time (sec): 76.16 - samples/sec: 543.57 - lr: 0.000119 - momentum: 0.000000
2023-10-11 13:10:54,498 epoch 1 - iter 243/272 - loss 2.24446431 - time (sec): 85.00 - samples/sec: 539.80 - lr: 0.000133 - momentum: 0.000000
2023-10-11 13:11:04,895 epoch 1 - iter 270/272 - loss 2.10357772 - time (sec): 95.40 - samples/sec: 543.29 - lr: 0.000148 - momentum: 0.000000
2023-10-11 13:11:05,300 ----------------------------------------------------------------------------------------------------
2023-10-11 13:11:05,300 EPOCH 1 done: loss 2.0999 - lr: 0.000148
2023-10-11 13:11:10,495 DEV : loss 0.7871720194816589 - f1-score (micro avg) 0.0
2023-10-11 13:11:10,504 ----------------------------------------------------------------------------------------------------
2023-10-11 13:11:19,799 epoch 2 - iter 27/272 - loss 0.79101376 - time (sec): 9.29 - samples/sec: 513.17 - lr: 0.000148 - momentum: 0.000000
2023-10-11 13:11:29,461 epoch 2 - iter 54/272 - loss 0.71052680 - time (sec): 18.96 - samples/sec: 529.15 - lr: 0.000147 - momentum: 0.000000
2023-10-11 13:11:38,965 epoch 2 - iter 81/272 - loss 0.67490954 - time (sec): 28.46 - samples/sec: 531.82 - lr: 0.000145 - momentum: 0.000000
2023-10-11 13:11:49,057 epoch 2 - iter 108/272 - loss 0.62296579 - time (sec): 38.55 - samples/sec: 540.97 - lr: 0.000143 - momentum: 0.000000
2023-10-11 13:11:58,488 epoch 2 - iter 135/272 - loss 0.60641748 - time (sec): 47.98 - samples/sec: 540.39 - lr: 0.000142 - momentum: 0.000000
2023-10-11 13:12:08,344 epoch 2 - iter 162/272 - loss 0.58390681 - time (sec): 57.84 - samples/sec: 542.86 - lr: 0.000140 - momentum: 0.000000
2023-10-11 13:12:18,273 epoch 2 - iter 189/272 - loss 0.55506002 - time (sec): 67.77 - samples/sec: 541.84 - lr: 0.000138 - momentum: 0.000000
2023-10-11 13:12:27,349 epoch 2 - iter 216/272 - loss 0.53417108 - time (sec): 76.84 - samples/sec: 537.71 - lr: 0.000137 - momentum: 0.000000
2023-10-11 13:12:36,566 epoch 2 - iter 243/272 - loss 0.52025397 - time (sec): 86.06 - samples/sec: 537.01 - lr: 0.000135 - momentum: 0.000000
2023-10-11 13:12:46,295 epoch 2 - iter 270/272 - loss 0.49807868 - time (sec): 95.79 - samples/sec: 538.43 - lr: 0.000134 - momentum: 0.000000
2023-10-11 13:12:46,928 ----------------------------------------------------------------------------------------------------
2023-10-11 13:12:46,929 EPOCH 2 done: loss 0.4972 - lr: 0.000134
2023-10-11 13:12:52,875 DEV : loss 0.2955166697502136 - f1-score (micro avg) 0.4394
2023-10-11 13:12:52,884 saving best model
2023-10-11 13:12:53,727 ----------------------------------------------------------------------------------------------------
2023-10-11 13:13:02,616 epoch 3 - iter 27/272 - loss 0.34336911 - time (sec): 8.89 - samples/sec: 523.49 - lr: 0.000132 - momentum: 0.000000
2023-10-11 13:13:13,494 epoch 3 - iter 54/272 - loss 0.30648810 - time (sec): 19.76 - samples/sec: 564.39 - lr: 0.000130 - momentum: 0.000000
2023-10-11 13:13:23,468 epoch 3 - iter 81/272 - loss 0.29198034 - time (sec): 29.74 - samples/sec: 562.43 - lr: 0.000128 - momentum: 0.000000
2023-10-11 13:13:32,799 epoch 3 - iter 108/272 - loss 0.28951492 - time (sec): 39.07 - samples/sec: 546.13 - lr: 0.000127 - momentum: 0.000000
2023-10-11 13:13:42,226 epoch 3 - iter 135/272 - loss 0.28229197 - time (sec): 48.50 - samples/sec: 545.36 - lr: 0.000125 - momentum: 0.000000
2023-10-11 13:13:52,082 epoch 3 - iter 162/272 - loss 0.28226115 - time (sec): 58.35 - samples/sec: 547.41 - lr: 0.000123 - momentum: 0.000000
2023-10-11 13:14:01,329 epoch 3 - iter 189/272 - loss 0.28080431 - time (sec): 67.60 - samples/sec: 543.23 - lr: 0.000122 - momentum: 0.000000
2023-10-11 13:14:10,768 epoch 3 - iter 216/272 - loss 0.27438629 - time (sec): 77.04 - samples/sec: 542.60 - lr: 0.000120 - momentum: 0.000000
2023-10-11 13:14:20,209 epoch 3 - iter 243/272 - loss 0.26560305 - time (sec): 86.48 - samples/sec: 539.20 - lr: 0.000119 - momentum: 0.000000
2023-10-11 13:14:29,635 epoch 3 - iter 270/272 - loss 0.26630953 - time (sec): 95.91 - samples/sec: 540.48 - lr: 0.000117 - momentum: 0.000000
2023-10-11 13:14:30,020 ----------------------------------------------------------------------------------------------------
2023-10-11 13:14:30,020 EPOCH 3 done: loss 0.2661 - lr: 0.000117
2023-10-11 13:14:35,743 DEV : loss 0.1891184151172638 - f1-score (micro avg) 0.6248
2023-10-11 13:14:35,752 saving best model
2023-10-11 13:14:38,296 ----------------------------------------------------------------------------------------------------
2023-10-11 13:14:47,491 epoch 4 - iter 27/272 - loss 0.19267975 - time (sec): 9.19 - samples/sec: 513.04 - lr: 0.000115 - momentum: 0.000000
2023-10-11 13:14:57,471 epoch 4 - iter 54/272 - loss 0.19054158 - time (sec): 19.17 - samples/sec: 524.71 - lr: 0.000113 - momentum: 0.000000
2023-10-11 13:15:08,023 epoch 4 - iter 81/272 - loss 0.18246952 - time (sec): 29.72 - samples/sec: 536.80 - lr: 0.000112 - momentum: 0.000000
2023-10-11 13:15:17,799 epoch 4 - iter 108/272 - loss 0.17446918 - time (sec): 39.50 - samples/sec: 542.94 - lr: 0.000110 - momentum: 0.000000
2023-10-11 13:15:26,973 epoch 4 - iter 135/272 - loss 0.17469825 - time (sec): 48.67 - samples/sec: 541.13 - lr: 0.000108 - momentum: 0.000000
2023-10-11 13:15:36,607 epoch 4 - iter 162/272 - loss 0.16413413 - time (sec): 58.31 - samples/sec: 545.12 - lr: 0.000107 - momentum: 0.000000
2023-10-11 13:15:46,200 epoch 4 - iter 189/272 - loss 0.16415812 - time (sec): 67.90 - samples/sec: 539.69 - lr: 0.000105 - momentum: 0.000000
2023-10-11 13:15:55,944 epoch 4 - iter 216/272 - loss 0.16298361 - time (sec): 77.64 - samples/sec: 539.85 - lr: 0.000103 - momentum: 0.000000
2023-10-11 13:16:05,439 epoch 4 - iter 243/272 - loss 0.16548331 - time (sec): 87.14 - samples/sec: 537.75 - lr: 0.000102 - momentum: 0.000000
2023-10-11 13:16:14,765 epoch 4 - iter 270/272 - loss 0.16290655 - time (sec): 96.46 - samples/sec: 537.15 - lr: 0.000100 - momentum: 0.000000
2023-10-11 13:16:15,180 ----------------------------------------------------------------------------------------------------
2023-10-11 13:16:15,180 EPOCH 4 done: loss 0.1633 - lr: 0.000100
2023-10-11 13:16:20,930 DEV : loss 0.14617015421390533 - f1-score (micro avg) 0.686
2023-10-11 13:16:20,939 saving best model
2023-10-11 13:16:23,475 ----------------------------------------------------------------------------------------------------
2023-10-11 13:16:33,728 epoch 5 - iter 27/272 - loss 0.15319500 - time (sec): 10.25 - samples/sec: 570.34 - lr: 0.000098 - momentum: 0.000000
2023-10-11 13:16:43,416 epoch 5 - iter 54/272 - loss 0.14820840 - time (sec): 19.94 - samples/sec: 562.38 - lr: 0.000097 - momentum: 0.000000
2023-10-11 13:16:52,122 epoch 5 - iter 81/272 - loss 0.13818758 - time (sec): 28.64 - samples/sec: 542.20 - lr: 0.000095 - momentum: 0.000000
2023-10-11 13:17:01,635 epoch 5 - iter 108/272 - loss 0.13056305 - time (sec): 38.16 - samples/sec: 543.01 - lr: 0.000093 - momentum: 0.000000
2023-10-11 13:17:10,347 epoch 5 - iter 135/272 - loss 0.12686995 - time (sec): 46.87 - samples/sec: 534.15 - lr: 0.000092 - momentum: 0.000000
2023-10-11 13:17:20,047 epoch 5 - iter 162/272 - loss 0.11740312 - time (sec): 56.57 - samples/sec: 538.35 - lr: 0.000090 - momentum: 0.000000
2023-10-11 13:17:29,507 epoch 5 - iter 189/272 - loss 0.11502572 - time (sec): 66.03 - samples/sec: 538.50 - lr: 0.000088 - momentum: 0.000000
2023-10-11 13:17:39,913 epoch 5 - iter 216/272 - loss 0.11581681 - time (sec): 76.43 - samples/sec: 544.73 - lr: 0.000087 - momentum: 0.000000
2023-10-11 13:17:49,112 epoch 5 - iter 243/272 - loss 0.11020025 - time (sec): 85.63 - samples/sec: 541.12 - lr: 0.000085 - momentum: 0.000000
2023-10-11 13:17:58,535 epoch 5 - iter 270/272 - loss 0.10938645 - time (sec): 95.06 - samples/sec: 539.87 - lr: 0.000084 - momentum: 0.000000
2023-10-11 13:17:59,373 ----------------------------------------------------------------------------------------------------
2023-10-11 13:17:59,373 EPOCH 5 done: loss 0.1087 - lr: 0.000084
2023-10-11 13:18:05,260 DEV : loss 0.12970557808876038 - f1-score (micro avg) 0.7782
2023-10-11 13:18:05,268 saving best model
2023-10-11 13:18:07,813 ----------------------------------------------------------------------------------------------------
2023-10-11 13:18:17,449 epoch 6 - iter 27/272 - loss 0.08697393 - time (sec): 9.63 - samples/sec: 566.07 - lr: 0.000082 - momentum: 0.000000
2023-10-11 13:18:26,302 epoch 6 - iter 54/272 - loss 0.08490523 - time (sec): 18.48 - samples/sec: 537.15 - lr: 0.000080 - momentum: 0.000000
2023-10-11 13:18:36,045 epoch 6 - iter 81/272 - loss 0.08898586 - time (sec): 28.23 - samples/sec: 546.35 - lr: 0.000078 - momentum: 0.000000
2023-10-11 13:18:45,268 epoch 6 - iter 108/272 - loss 0.08812781 - time (sec): 37.45 - samples/sec: 547.61 - lr: 0.000077 - momentum: 0.000000
2023-10-11 13:18:54,730 epoch 6 - iter 135/272 - loss 0.08125660 - time (sec): 46.91 - samples/sec: 547.04 - lr: 0.000075 - momentum: 0.000000
2023-10-11 13:19:03,883 epoch 6 - iter 162/272 - loss 0.08079255 - time (sec): 56.07 - samples/sec: 542.30 - lr: 0.000073 - momentum: 0.000000
2023-10-11 13:19:13,710 epoch 6 - iter 189/272 - loss 0.07658614 - time (sec): 65.89 - samples/sec: 539.76 - lr: 0.000072 - momentum: 0.000000
2023-10-11 13:19:24,302 epoch 6 - iter 216/272 - loss 0.07746535 - time (sec): 76.48 - samples/sec: 539.77 - lr: 0.000070 - momentum: 0.000000
2023-10-11 13:19:34,054 epoch 6 - iter 243/272 - loss 0.07630159 - time (sec): 86.24 - samples/sec: 536.17 - lr: 0.000069 - momentum: 0.000000
2023-10-11 13:19:44,094 epoch 6 - iter 270/272 - loss 0.07449806 - time (sec): 96.28 - samples/sec: 537.63 - lr: 0.000067 - momentum: 0.000000
2023-10-11 13:19:44,548 ----------------------------------------------------------------------------------------------------
2023-10-11 13:19:44,548 EPOCH 6 done: loss 0.0755 - lr: 0.000067
2023-10-11 13:19:50,727 DEV : loss 0.13596650958061218 - f1-score (micro avg) 0.7802
2023-10-11 13:19:50,741 saving best model
2023-10-11 13:19:53,401 ----------------------------------------------------------------------------------------------------
2023-10-11 13:20:04,340 epoch 7 - iter 27/272 - loss 0.07151511 - time (sec): 10.94 - samples/sec: 544.36 - lr: 0.000065 - momentum: 0.000000
2023-10-11 13:20:14,460 epoch 7 - iter 54/272 - loss 0.06298137 - time (sec): 21.06 - samples/sec: 535.86 - lr: 0.000063 - momentum: 0.000000
2023-10-11 13:20:24,037 epoch 7 - iter 81/272 - loss 0.06852612 - time (sec): 30.63 - samples/sec: 529.97 - lr: 0.000062 - momentum: 0.000000
2023-10-11 13:20:33,314 epoch 7 - iter 108/272 - loss 0.06310370 - time (sec): 39.91 - samples/sec: 525.06 - lr: 0.000060 - momentum: 0.000000
2023-10-11 13:20:43,363 epoch 7 - iter 135/272 - loss 0.06476083 - time (sec): 49.96 - samples/sec: 528.83 - lr: 0.000058 - momentum: 0.000000
2023-10-11 13:20:53,660 epoch 7 - iter 162/272 - loss 0.05980500 - time (sec): 60.26 - samples/sec: 535.29 - lr: 0.000057 - momentum: 0.000000
2023-10-11 13:21:03,338 epoch 7 - iter 189/272 - loss 0.05858969 - time (sec): 69.93 - samples/sec: 535.33 - lr: 0.000055 - momentum: 0.000000
2023-10-11 13:21:12,953 epoch 7 - iter 216/272 - loss 0.06293113 - time (sec): 79.55 - samples/sec: 532.76 - lr: 0.000053 - momentum: 0.000000
2023-10-11 13:21:21,594 epoch 7 - iter 243/272 - loss 0.06038448 - time (sec): 88.19 - samples/sec: 525.24 - lr: 0.000052 - momentum: 0.000000
2023-10-11 13:21:31,424 epoch 7 - iter 270/272 - loss 0.05738546 - time (sec): 98.02 - samples/sec: 527.79 - lr: 0.000050 - momentum: 0.000000
2023-10-11 13:21:31,922 ----------------------------------------------------------------------------------------------------
2023-10-11 13:21:31,922 EPOCH 7 done: loss 0.0574 - lr: 0.000050
2023-10-11 13:21:37,845 DEV : loss 0.12629856169223785 - f1-score (micro avg) 0.8
2023-10-11 13:21:37,854 saving best model
2023-10-11 13:21:40,425 ----------------------------------------------------------------------------------------------------
2023-10-11 13:21:49,568 epoch 8 - iter 27/272 - loss 0.03692180 - time (sec): 9.14 - samples/sec: 522.07 - lr: 0.000048 - momentum: 0.000000
2023-10-11 13:21:58,535 epoch 8 - iter 54/272 - loss 0.04304127 - time (sec): 18.11 - samples/sec: 515.86 - lr: 0.000047 - momentum: 0.000000
2023-10-11 13:22:08,106 epoch 8 - iter 81/272 - loss 0.04492744 - time (sec): 27.68 - samples/sec: 524.05 - lr: 0.000045 - momentum: 0.000000
2023-10-11 13:22:19,082 epoch 8 - iter 108/272 - loss 0.04289346 - time (sec): 38.65 - samples/sec: 533.91 - lr: 0.000043 - momentum: 0.000000
2023-10-11 13:22:28,087 epoch 8 - iter 135/272 - loss 0.04671101 - time (sec): 47.66 - samples/sec: 531.12 - lr: 0.000042 - momentum: 0.000000
2023-10-11 13:22:37,505 epoch 8 - iter 162/272 - loss 0.04694481 - time (sec): 57.08 - samples/sec: 535.95 - lr: 0.000040 - momentum: 0.000000
2023-10-11 13:22:47,043 epoch 8 - iter 189/272 - loss 0.04703120 - time (sec): 66.61 - samples/sec: 539.57 - lr: 0.000038 - momentum: 0.000000
2023-10-11 13:22:56,431 epoch 8 - iter 216/272 - loss 0.04473500 - time (sec): 76.00 - samples/sec: 541.82 - lr: 0.000037 - momentum: 0.000000
2023-10-11 13:23:06,007 epoch 8 - iter 243/272 - loss 0.04459354 - time (sec): 85.58 - samples/sec: 544.67 - lr: 0.000035 - momentum: 0.000000
2023-10-11 13:23:15,526 epoch 8 - iter 270/272 - loss 0.04481176 - time (sec): 95.10 - samples/sec: 545.90 - lr: 0.000034 - momentum: 0.000000
2023-10-11 13:23:15,869 ----------------------------------------------------------------------------------------------------
2023-10-11 13:23:15,869 EPOCH 8 done: loss 0.0450 - lr: 0.000034
2023-10-11 13:23:21,558 DEV : loss 0.12960414588451385 - f1-score (micro avg) 0.7782
2023-10-11 13:23:21,566 ----------------------------------------------------------------------------------------------------
2023-10-11 13:23:30,857 epoch 9 - iter 27/272 - loss 0.03796750 - time (sec): 9.29 - samples/sec: 544.75 - lr: 0.000032 - momentum: 0.000000
2023-10-11 13:23:40,261 epoch 9 - iter 54/272 - loss 0.04594063 - time (sec): 18.69 - samples/sec: 556.90 - lr: 0.000030 - momentum: 0.000000
2023-10-11 13:23:49,865 epoch 9 - iter 81/272 - loss 0.04313347 - time (sec): 28.30 - samples/sec: 559.29 - lr: 0.000028 - momentum: 0.000000
2023-10-11 13:23:59,378 epoch 9 - iter 108/272 - loss 0.03968736 - time (sec): 37.81 - samples/sec: 552.16 - lr: 0.000027 - momentum: 0.000000
2023-10-11 13:24:09,034 epoch 9 - iter 135/272 - loss 0.03928521 - time (sec): 47.47 - samples/sec: 550.93 - lr: 0.000025 - momentum: 0.000000
2023-10-11 13:24:18,685 epoch 9 - iter 162/272 - loss 0.03901635 - time (sec): 57.12 - samples/sec: 551.63 - lr: 0.000023 - momentum: 0.000000
2023-10-11 13:24:28,210 epoch 9 - iter 189/272 - loss 0.03840262 - time (sec): 66.64 - samples/sec: 547.60 - lr: 0.000022 - momentum: 0.000000
2023-10-11 13:24:37,304 epoch 9 - iter 216/272 - loss 0.04002943 - time (sec): 75.74 - samples/sec: 546.89 - lr: 0.000020 - momentum: 0.000000
2023-10-11 13:24:46,482 epoch 9 - iter 243/272 - loss 0.03766633 - time (sec): 84.91 - samples/sec: 547.65 - lr: 0.000019 - momentum: 0.000000
2023-10-11 13:24:55,673 epoch 9 - iter 270/272 - loss 0.03760281 - time (sec): 94.10 - samples/sec: 547.45 - lr: 0.000017 - momentum: 0.000000
2023-10-11 13:24:56,337 ----------------------------------------------------------------------------------------------------
2023-10-11 13:24:56,338 EPOCH 9 done: loss 0.0378 - lr: 0.000017
2023-10-11 13:25:01,878 DEV : loss 0.12975578010082245 - f1-score (micro avg) 0.7883
2023-10-11 13:25:01,888 ----------------------------------------------------------------------------------------------------
2023-10-11 13:25:11,078 epoch 10 - iter 27/272 - loss 0.02470685 - time (sec): 9.19 - samples/sec: 560.90 - lr: 0.000015 - momentum: 0.000000
2023-10-11 13:25:19,920 epoch 10 - iter 54/272 - loss 0.02211500 - time (sec): 18.03 - samples/sec: 543.76 - lr: 0.000013 - momentum: 0.000000
2023-10-11 13:25:29,801 epoch 10 - iter 81/272 - loss 0.02680971 - time (sec): 27.91 - samples/sec: 559.69 - lr: 0.000012 - momentum: 0.000000
2023-10-11 13:25:39,117 epoch 10 - iter 108/272 - loss 0.02701104 - time (sec): 37.23 - samples/sec: 562.59 - lr: 0.000010 - momentum: 0.000000
2023-10-11 13:25:48,499 epoch 10 - iter 135/272 - loss 0.03203950 - time (sec): 46.61 - samples/sec: 567.91 - lr: 0.000008 - momentum: 0.000000
2023-10-11 13:25:58,719 epoch 10 - iter 162/272 - loss 0.03547743 - time (sec): 56.83 - samples/sec: 578.80 - lr: 0.000007 - momentum: 0.000000
2023-10-11 13:26:07,029 epoch 10 - iter 189/272 - loss 0.03546825 - time (sec): 65.14 - samples/sec: 565.07 - lr: 0.000005 - momentum: 0.000000
2023-10-11 13:26:16,231 epoch 10 - iter 216/272 - loss 0.03456324 - time (sec): 74.34 - samples/sec: 563.35 - lr: 0.000003 - momentum: 0.000000
2023-10-11 13:26:25,608 epoch 10 - iter 243/272 - loss 0.03387184 - time (sec): 83.72 - samples/sec: 558.24 - lr: 0.000002 - momentum: 0.000000
2023-10-11 13:26:34,835 epoch 10 - iter 270/272 - loss 0.03411250 - time (sec): 92.95 - samples/sec: 555.26 - lr: 0.000000 - momentum: 0.000000
2023-10-11 13:26:35,403 ----------------------------------------------------------------------------------------------------
2023-10-11 13:26:35,403 EPOCH 10 done: loss 0.0340 - lr: 0.000000
2023-10-11 13:26:40,958 DEV : loss 0.13090862333774567 - f1-score (micro avg) 0.7847
2023-10-11 13:26:41,799 ----------------------------------------------------------------------------------------------------
2023-10-11 13:26:41,800 Loading model from best epoch ...
2023-10-11 13:26:45,973 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-11 13:26:58,353
Results:
- F-score (micro) 0.7799
- F-score (macro) 0.6981
- Accuracy 0.657
By class:
                precision    recall  f1-score   support

         LOC       0.7977    0.8718    0.8331       312
         PER       0.7061    0.8894    0.7872       208
         ORG       0.4419    0.3455    0.3878        55
   HumanProd       0.6897    0.9091    0.7843        22

   micro avg       0.7348    0.8308    0.7799       597
   macro avg       0.6588    0.7539    0.6981       597
weighted avg       0.7290    0.8308    0.7743       597
2023-10-11 13:26:58,353 ----------------------------------------------------------------------------------------------------
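[Note] A minimal sketch of how the saved best-model.pt from this run could be loaded and applied, assuming Flair's standard SequenceTagger.load / predict API; the checkpoint path is the training base path from the log, and the Swedish sentence is a hypothetical example.

    # Hedged sketch: load the best checkpoint from this run and tag a sentence.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Path assumed from the "Model training base path" line in the log.
    tagger = SequenceTagger.load(
        "hmbench-newseye/sv-hmbyt5-preliminary/"
        "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5/"
        "best-model.pt"
    )

    sentence = Sentence("Stockholms stad ligger i Sverige.")  # hypothetical example
    tagger.predict(sentence)

    # Predictions come back as spans labeled via the 17-tag BIOES dictionary shown above,
    # collapsed to span-level classes (LOC, PER, ORG, HumanProd).
    for span in sentence.get_spans("ner"):
        print(span.text, span.get_label("ner").value, span.get_label("ner").score)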