|
2023-10-11 13:09:29,492 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,495 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-11 13:09:29,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,495 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
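The corpus is the Swedish NewsEye subset of HIPE-2022. A loading sketch, assuming Flair's bundled NER_HIPE_2022 loader (argument names may differ slightly across Flair versions; document-separator handling is left at its default here):

```python
from flair.datasets import NER_HIPE_2022

# Assumed argument names; expect the split sizes reported above
# (1085 train / 148 dev / 364 test sentences).
corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
print(corpus)

# Label dictionary for the "ner" layer (the 17 BIOES tags listed at the end of the log).
label_dict = corpus.make_label_dictionary(label_type="ner")
```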
|
2023-10-11 13:09:29,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,495 Train: 1085 sentences |
|
2023-10-11 13:09:29,495 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 13:09:29,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,495 Training Params: |
|
2023-10-11 13:09:29,495 - learning_rate: "0.00015" |
|
2023-10-11 13:09:29,496 - mini_batch_size: "4" |
|
2023-10-11 13:09:29,496 - max_epochs: "10" |
|
2023-10-11 13:09:29,496 - shuffle: "True" |
|
2023-10-11 13:09:29,496 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,496 Plugins: |
|
2023-10-11 13:09:29,496 - TensorboardLogger |
|
2023-10-11 13:09:29,496 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 13:09:29,496 ---------------------------------------------------------------------------------------------------- |
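Putting the pieces together, the hyperparameters and plugins above correspond to Flair's fine-tuning path (AdamW with a linear schedule and 10% warm-up, which is why momentum is logged as 0). A hedged end-to-end sketch; the ByT5 checkpoint name is inferred from the run directory, and TransformerWordEmbeddings stands in for the log's ByT5Embeddings wrapper, which is not part of core Flair:

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Fine-tuned ByT5 encoder as embeddings; last layer only, first-subtoken pooling
# (matching "layers-1" and "poolingfirst" in the run name).
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # assumed checkpoint
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# No CRF, no RNN: the embeddings feed the 17-way linear layer directly.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses AdamW; recent Flair versions attach the LinearScheduler plugin
# (warmup_fraction 0.1) automatically. The TensorboardLogger plugin shown above is
# added separately and its import path varies by Flair version.
trainer.fine_tune(
    "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-"
    "span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00015,
    mini_batch_size=4,
    max_epochs=10,
)
```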
|
2023-10-11 13:09:29,496 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 13:09:29,496 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 13:09:29,496 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,496 Computation: |
|
2023-10-11 13:09:29,496 - compute on device: cuda:0 |
|
2023-10-11 13:09:29,496 - embedding storage: none |
|
2023-10-11 13:09:29,496 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,496 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-11 13:09:29,497 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,497 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:09:29,497 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-11 13:09:39,376 epoch 1 - iter 27/272 - loss 2.82017975 - time (sec): 9.88 - samples/sec: 571.41 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 13:09:48,429 epoch 1 - iter 54/272 - loss 2.81198645 - time (sec): 18.93 - samples/sec: 547.73 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 13:09:57,442 epoch 1 - iter 81/272 - loss 2.79406251 - time (sec): 27.94 - samples/sec: 534.30 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 13:10:08,054 epoch 1 - iter 108/272 - loss 2.72687206 - time (sec): 38.56 - samples/sec: 550.43 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-11 13:10:17,491 epoch 1 - iter 135/272 - loss 2.64533462 - time (sec): 47.99 - samples/sec: 550.78 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-11 13:10:27,175 epoch 1 - iter 162/272 - loss 2.54392629 - time (sec): 57.68 - samples/sec: 549.91 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-11 13:10:36,492 epoch 1 - iter 189/272 - loss 2.44259599 - time (sec): 66.99 - samples/sec: 547.29 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-11 13:10:45,661 epoch 1 - iter 216/272 - loss 2.34019983 - time (sec): 76.16 - samples/sec: 543.57 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 13:10:54,498 epoch 1 - iter 243/272 - loss 2.24446431 - time (sec): 85.00 - samples/sec: 539.80 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-11 13:11:04,895 epoch 1 - iter 270/272 - loss 2.10357772 - time (sec): 95.40 - samples/sec: 543.29 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 13:11:05,300 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:11:05,300 EPOCH 1 done: loss 2.0999 - lr: 0.000148 |
|
2023-10-11 13:11:10,495 DEV : loss 0.7871720194816589 - f1-score (micro avg) 0.0 |
|
2023-10-11 13:11:10,504 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:11:19,799 epoch 2 - iter 27/272 - loss 0.79101376 - time (sec): 9.29 - samples/sec: 513.17 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 13:11:29,461 epoch 2 - iter 54/272 - loss 0.71052680 - time (sec): 18.96 - samples/sec: 529.15 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-11 13:11:38,965 epoch 2 - iter 81/272 - loss 0.67490954 - time (sec): 28.46 - samples/sec: 531.82 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-11 13:11:49,057 epoch 2 - iter 108/272 - loss 0.62296579 - time (sec): 38.55 - samples/sec: 540.97 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-11 13:11:58,488 epoch 2 - iter 135/272 - loss 0.60641748 - time (sec): 47.98 - samples/sec: 540.39 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-11 13:12:08,344 epoch 2 - iter 162/272 - loss 0.58390681 - time (sec): 57.84 - samples/sec: 542.86 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-11 13:12:18,273 epoch 2 - iter 189/272 - loss 0.55506002 - time (sec): 67.77 - samples/sec: 541.84 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-11 13:12:27,349 epoch 2 - iter 216/272 - loss 0.53417108 - time (sec): 76.84 - samples/sec: 537.71 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 13:12:36,566 epoch 2 - iter 243/272 - loss 0.52025397 - time (sec): 86.06 - samples/sec: 537.01 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-11 13:12:46,295 epoch 2 - iter 270/272 - loss 0.49807868 - time (sec): 95.79 - samples/sec: 538.43 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-11 13:12:46,928 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:12:46,929 EPOCH 2 done: loss 0.4972 - lr: 0.000134 |
|
2023-10-11 13:12:52,875 DEV : loss 0.2955166697502136 - f1-score (micro avg) 0.4394 |
|
2023-10-11 13:12:52,884 saving best model |
|
2023-10-11 13:12:53,727 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:13:02,616 epoch 3 - iter 27/272 - loss 0.34336911 - time (sec): 8.89 - samples/sec: 523.49 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 13:13:13,494 epoch 3 - iter 54/272 - loss 0.30648810 - time (sec): 19.76 - samples/sec: 564.39 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-11 13:13:23,468 epoch 3 - iter 81/272 - loss 0.29198034 - time (sec): 29.74 - samples/sec: 562.43 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-11 13:13:32,799 epoch 3 - iter 108/272 - loss 0.28951492 - time (sec): 39.07 - samples/sec: 546.13 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-11 13:13:42,226 epoch 3 - iter 135/272 - loss 0.28229197 - time (sec): 48.50 - samples/sec: 545.36 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-11 13:13:52,082 epoch 3 - iter 162/272 - loss 0.28226115 - time (sec): 58.35 - samples/sec: 547.41 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 13:14:01,329 epoch 3 - iter 189/272 - loss 0.28080431 - time (sec): 67.60 - samples/sec: 543.23 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-11 13:14:10,768 epoch 3 - iter 216/272 - loss 0.27438629 - time (sec): 77.04 - samples/sec: 542.60 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-11 13:14:20,209 epoch 3 - iter 243/272 - loss 0.26560305 - time (sec): 86.48 - samples/sec: 539.20 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 13:14:29,635 epoch 3 - iter 270/272 - loss 0.26630953 - time (sec): 95.91 - samples/sec: 540.48 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-11 13:14:30,020 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:14:30,020 EPOCH 3 done: loss 0.2661 - lr: 0.000117 |
|
2023-10-11 13:14:35,743 DEV : loss 0.1891184151172638 - f1-score (micro avg) 0.6248 |
|
2023-10-11 13:14:35,752 saving best model |
|
2023-10-11 13:14:38,296 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:14:47,491 epoch 4 - iter 27/272 - loss 0.19267975 - time (sec): 9.19 - samples/sec: 513.04 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-11 13:14:57,471 epoch 4 - iter 54/272 - loss 0.19054158 - time (sec): 19.17 - samples/sec: 524.71 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-11 13:15:08,023 epoch 4 - iter 81/272 - loss 0.18246952 - time (sec): 29.72 - samples/sec: 536.80 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 13:15:17,799 epoch 4 - iter 108/272 - loss 0.17446918 - time (sec): 39.50 - samples/sec: 542.94 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-11 13:15:26,973 epoch 4 - iter 135/272 - loss 0.17469825 - time (sec): 48.67 - samples/sec: 541.13 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-11 13:15:36,607 epoch 4 - iter 162/272 - loss 0.16413413 - time (sec): 58.31 - samples/sec: 545.12 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-11 13:15:46,200 epoch 4 - iter 189/272 - loss 0.16415812 - time (sec): 67.90 - samples/sec: 539.69 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 13:15:55,944 epoch 4 - iter 216/272 - loss 0.16298361 - time (sec): 77.64 - samples/sec: 539.85 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-11 13:16:05,439 epoch 4 - iter 243/272 - loss 0.16548331 - time (sec): 87.14 - samples/sec: 537.75 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-11 13:16:14,765 epoch 4 - iter 270/272 - loss 0.16290655 - time (sec): 96.46 - samples/sec: 537.15 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 13:16:15,180 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:16:15,180 EPOCH 4 done: loss 0.1633 - lr: 0.000100 |
|
2023-10-11 13:16:20,930 DEV : loss 0.14617015421390533 - f1-score (micro avg) 0.686 |
|
2023-10-11 13:16:20,939 saving best model |
|
2023-10-11 13:16:23,475 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:16:33,728 epoch 5 - iter 27/272 - loss 0.15319500 - time (sec): 10.25 - samples/sec: 570.34 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-11 13:16:43,416 epoch 5 - iter 54/272 - loss 0.14820840 - time (sec): 19.94 - samples/sec: 562.38 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-11 13:16:52,122 epoch 5 - iter 81/272 - loss 0.13818758 - time (sec): 28.64 - samples/sec: 542.20 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-11 13:17:01,635 epoch 5 - iter 108/272 - loss 0.13056305 - time (sec): 38.16 - samples/sec: 543.01 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 13:17:10,347 epoch 5 - iter 135/272 - loss 0.12686995 - time (sec): 46.87 - samples/sec: 534.15 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-11 13:17:20,047 epoch 5 - iter 162/272 - loss 0.11740312 - time (sec): 56.57 - samples/sec: 538.35 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-11 13:17:29,507 epoch 5 - iter 189/272 - loss 0.11502572 - time (sec): 66.03 - samples/sec: 538.50 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-11 13:17:39,913 epoch 5 - iter 216/272 - loss 0.11581681 - time (sec): 76.43 - samples/sec: 544.73 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-11 13:17:49,112 epoch 5 - iter 243/272 - loss 0.11020025 - time (sec): 85.63 - samples/sec: 541.12 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-11 13:17:58,535 epoch 5 - iter 270/272 - loss 0.10938645 - time (sec): 95.06 - samples/sec: 539.87 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 13:17:59,373 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:17:59,373 EPOCH 5 done: loss 0.1087 - lr: 0.000084 |
|
2023-10-11 13:18:05,260 DEV : loss 0.12970557808876038 - f1-score (micro avg) 0.7782 |
|
2023-10-11 13:18:05,268 saving best model |
|
2023-10-11 13:18:07,813 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:18:17,449 epoch 6 - iter 27/272 - loss 0.08697393 - time (sec): 9.63 - samples/sec: 566.07 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-11 13:18:26,302 epoch 6 - iter 54/272 - loss 0.08490523 - time (sec): 18.48 - samples/sec: 537.15 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 13:18:36,045 epoch 6 - iter 81/272 - loss 0.08898586 - time (sec): 28.23 - samples/sec: 546.35 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-11 13:18:45,268 epoch 6 - iter 108/272 - loss 0.08812781 - time (sec): 37.45 - samples/sec: 547.61 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-11 13:18:54,730 epoch 6 - iter 135/272 - loss 0.08125660 - time (sec): 46.91 - samples/sec: 547.04 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-11 13:19:03,883 epoch 6 - iter 162/272 - loss 0.08079255 - time (sec): 56.07 - samples/sec: 542.30 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-11 13:19:13,710 epoch 6 - iter 189/272 - loss 0.07658614 - time (sec): 65.89 - samples/sec: 539.76 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-11 13:19:24,302 epoch 6 - iter 216/272 - loss 0.07746535 - time (sec): 76.48 - samples/sec: 539.77 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-11 13:19:34,054 epoch 6 - iter 243/272 - loss 0.07630159 - time (sec): 86.24 - samples/sec: 536.17 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-11 13:19:44,094 epoch 6 - iter 270/272 - loss 0.07449806 - time (sec): 96.28 - samples/sec: 537.63 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-11 13:19:44,548 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:19:44,548 EPOCH 6 done: loss 0.0755 - lr: 0.000067 |
|
2023-10-11 13:19:50,727 DEV : loss 0.13596650958061218 - f1-score (micro avg) 0.7802 |
|
2023-10-11 13:19:50,741 saving best model |
|
2023-10-11 13:19:53,401 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:20:04,340 epoch 7 - iter 27/272 - loss 0.07151511 - time (sec): 10.94 - samples/sec: 544.36 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-11 13:20:14,460 epoch 7 - iter 54/272 - loss 0.06298137 - time (sec): 21.06 - samples/sec: 535.86 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-11 13:20:24,037 epoch 7 - iter 81/272 - loss 0.06852612 - time (sec): 30.63 - samples/sec: 529.97 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-11 13:20:33,314 epoch 7 - iter 108/272 - loss 0.06310370 - time (sec): 39.91 - samples/sec: 525.06 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 13:20:43,363 epoch 7 - iter 135/272 - loss 0.06476083 - time (sec): 49.96 - samples/sec: 528.83 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-11 13:20:53,660 epoch 7 - iter 162/272 - loss 0.05980500 - time (sec): 60.26 - samples/sec: 535.29 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-11 13:21:03,338 epoch 7 - iter 189/272 - loss 0.05858969 - time (sec): 69.93 - samples/sec: 535.33 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 13:21:12,953 epoch 7 - iter 216/272 - loss 0.06293113 - time (sec): 79.55 - samples/sec: 532.76 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-11 13:21:21,594 epoch 7 - iter 243/272 - loss 0.06038448 - time (sec): 88.19 - samples/sec: 525.24 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 13:21:31,424 epoch 7 - iter 270/272 - loss 0.05738546 - time (sec): 98.02 - samples/sec: 527.79 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-11 13:21:31,922 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:21:31,922 EPOCH 7 done: loss 0.0574 - lr: 0.000050 |
|
2023-10-11 13:21:37,845 DEV : loss 0.12629856169223785 - f1-score (micro avg) 0.8 |
|
2023-10-11 13:21:37,854 saving best model |
|
2023-10-11 13:21:40,425 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:21:49,568 epoch 8 - iter 27/272 - loss 0.03692180 - time (sec): 9.14 - samples/sec: 522.07 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 13:21:58,535 epoch 8 - iter 54/272 - loss 0.04304127 - time (sec): 18.11 - samples/sec: 515.86 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-11 13:22:08,106 epoch 8 - iter 81/272 - loss 0.04492744 - time (sec): 27.68 - samples/sec: 524.05 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 13:22:19,082 epoch 8 - iter 108/272 - loss 0.04289346 - time (sec): 38.65 - samples/sec: 533.91 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-11 13:22:28,087 epoch 8 - iter 135/272 - loss 0.04671101 - time (sec): 47.66 - samples/sec: 531.12 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-11 13:22:37,505 epoch 8 - iter 162/272 - loss 0.04694481 - time (sec): 57.08 - samples/sec: 535.95 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-11 13:22:47,043 epoch 8 - iter 189/272 - loss 0.04703120 - time (sec): 66.61 - samples/sec: 539.57 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-11 13:22:56,431 epoch 8 - iter 216/272 - loss 0.04473500 - time (sec): 76.00 - samples/sec: 541.82 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-11 13:23:06,007 epoch 8 - iter 243/272 - loss 0.04459354 - time (sec): 85.58 - samples/sec: 544.67 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-11 13:23:15,526 epoch 8 - iter 270/272 - loss 0.04481176 - time (sec): 95.10 - samples/sec: 545.90 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-11 13:23:15,869 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:23:15,869 EPOCH 8 done: loss 0.0450 - lr: 0.000034 |
|
2023-10-11 13:23:21,558 DEV : loss 0.12960414588451385 - f1-score (micro avg) 0.7782 |
|
2023-10-11 13:23:21,566 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:23:30,857 epoch 9 - iter 27/272 - loss 0.03796750 - time (sec): 9.29 - samples/sec: 544.75 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 13:23:40,261 epoch 9 - iter 54/272 - loss 0.04594063 - time (sec): 18.69 - samples/sec: 556.90 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-11 13:23:49,865 epoch 9 - iter 81/272 - loss 0.04313347 - time (sec): 28.30 - samples/sec: 559.29 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-11 13:23:59,378 epoch 9 - iter 108/272 - loss 0.03968736 - time (sec): 37.81 - samples/sec: 552.16 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-11 13:24:09,034 epoch 9 - iter 135/272 - loss 0.03928521 - time (sec): 47.47 - samples/sec: 550.93 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-11 13:24:18,685 epoch 9 - iter 162/272 - loss 0.03901635 - time (sec): 57.12 - samples/sec: 551.63 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-11 13:24:28,210 epoch 9 - iter 189/272 - loss 0.03840262 - time (sec): 66.64 - samples/sec: 547.60 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-11 13:24:37,304 epoch 9 - iter 216/272 - loss 0.04002943 - time (sec): 75.74 - samples/sec: 546.89 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-11 13:24:46,482 epoch 9 - iter 243/272 - loss 0.03766633 - time (sec): 84.91 - samples/sec: 547.65 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-11 13:24:55,673 epoch 9 - iter 270/272 - loss 0.03760281 - time (sec): 94.10 - samples/sec: 547.45 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-11 13:24:56,337 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:24:56,338 EPOCH 9 done: loss 0.0378 - lr: 0.000017 |
|
2023-10-11 13:25:01,878 DEV : loss 0.12975578010082245 - f1-score (micro avg) 0.7883 |
|
2023-10-11 13:25:01,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:25:11,078 epoch 10 - iter 27/272 - loss 0.02470685 - time (sec): 9.19 - samples/sec: 560.90 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-11 13:25:19,920 epoch 10 - iter 54/272 - loss 0.02211500 - time (sec): 18.03 - samples/sec: 543.76 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-11 13:25:29,801 epoch 10 - iter 81/272 - loss 0.02680971 - time (sec): 27.91 - samples/sec: 559.69 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-11 13:25:39,117 epoch 10 - iter 108/272 - loss 0.02701104 - time (sec): 37.23 - samples/sec: 562.59 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-11 13:25:48,499 epoch 10 - iter 135/272 - loss 0.03203950 - time (sec): 46.61 - samples/sec: 567.91 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-11 13:25:58,719 epoch 10 - iter 162/272 - loss 0.03547743 - time (sec): 56.83 - samples/sec: 578.80 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 13:26:07,029 epoch 10 - iter 189/272 - loss 0.03546825 - time (sec): 65.14 - samples/sec: 565.07 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 13:26:16,231 epoch 10 - iter 216/272 - loss 0.03456324 - time (sec): 74.34 - samples/sec: 563.35 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-11 13:26:25,608 epoch 10 - iter 243/272 - loss 0.03387184 - time (sec): 83.72 - samples/sec: 558.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 13:26:34,835 epoch 10 - iter 270/272 - loss 0.03411250 - time (sec): 92.95 - samples/sec: 555.26 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-11 13:26:35,403 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:26:35,403 EPOCH 10 done: loss 0.0340 - lr: 0.000000 |
|
2023-10-11 13:26:40,958 DEV : loss 0.13090862333774567 - f1-score (micro avg) 0.7847 |
|
2023-10-11 13:26:41,799 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:26:41,800 Loading model from best epoch ... |
|
2023-10-11 13:26:45,973 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
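Once training finishes, the saved best-model.pt can be used for tagging. A minimal inference sketch; the path repeats the base path above, and the Swedish example sentence is invented:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-"
    "span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Konungen reste från Stockholm till Uppsala.")  # hypothetical example
tagger.predict(sentence)

# Prints the predicted LOC/PER/ORG/HumanProd spans with their scores.
for span in sentence.get_spans("ner"):
    print(span)
```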
|
2023-10-11 13:26:58,353 |
|
Results: |
|
- F-score (micro) 0.7799 |
|
- F-score (macro) 0.6981 |
|
- Accuracy 0.657 |
|
|
|
By class: |
|
              precision    recall  f1-score   support

         LOC     0.7977    0.8718    0.8331       312
         PER     0.7061    0.8894    0.7872       208
         ORG     0.4419    0.3455    0.3878        55
   HumanProd     0.6897    0.9091    0.7843        22

   micro avg     0.7348    0.8308    0.7799       597
   macro avg     0.6588    0.7539    0.6981       597
weighted avg     0.7290    0.8308    0.7743       597
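As a quick sanity check, the micro-averaged F-score reported above is the harmonic mean of the micro precision and recall:

```python
p, r = 0.7348, 0.8308   # micro avg precision / recall from the table
f1 = 2 * p * r / (p + r)
print(round(f1, 4))     # 0.7799, matching "F-score (micro)" above
```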
|
|
|
2023-10-11 13:26:58,353 ---------------------------------------------------------------------------------------------------- |
|
|