2023-10-16 18:47:30,788 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
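The full parameter count of the tagger can be recovered from the shapes printed in the repr above. A minimal arithmetic sketch (the totals are computed here from those shapes, not read from the log):

```python
# Recompute the parameter count from the shapes in the repr above:
# hidden size 768, 12 layers, FFN size 3072, vocab 32001, 17 output tags.
H, LAYERS, FFN, VOCAB, TAGS = 768, 12, 3072, 32001, 17

embeddings = (
    VOCAB * H        # word_embeddings
    + 512 * H        # position_embeddings
    + 2 * H          # token_type_embeddings
    + 2 * H          # LayerNorm weight + bias
)
per_layer = (
    4 * (H * H + H)    # query, key, value, self-output dense (weights + biases)
    + 2 * 2 * H        # two LayerNorms (self-output and output)
    + (H * FFN + FFN)  # intermediate dense
    + (FFN * H + H)    # output dense
)
pooler = H * H + H
head = H * TAGS + TAGS  # final linear layer to the 17-tag space

total = embeddings + LAYERS * per_layer + pooler + head
print(f"{total:,} parameters")  # about 110.6M
```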
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Train: 1166 sentences
2023-10-16 18:47:30,789 (train_with_dev=False, train_with_test=False)
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Training Params:
2023-10-16 18:47:30,789 - learning_rate: "3e-05"
2023-10-16 18:47:30,789 - mini_batch_size: "4"
2023-10-16 18:47:30,789 - max_epochs: "10"
2023-10-16 18:47:30,789 - shuffle: "True"
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Plugins:
2023-10-16 18:47:30,789 - LinearScheduler | warmup_fraction: '0.1'
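The lr column in the iteration lines below follows this plugin: the learning rate ramps up linearly over the first warmup_fraction of all training steps (here roughly epoch 1) and then decays linearly to zero. A minimal sketch of that shape in plain Python (not Flair's implementation; the step counts are taken from the log, 292 batches x 10 epochs):

```python
def linear_warmup_lr(step: int, total_steps: int, base_lr: float,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * (step / warmup_steps)
    return base_lr * ((total_steps - step) / (total_steps - warmup_steps))

TOTAL = 292 * 10  # batches per epoch x max_epochs, per the log
BASE = 3e-05      # learning_rate, per the log

# End of epoch 1 (~10% of steps): warmup just finished, lr at its peak.
print(linear_warmup_lr(292, TOTAL, BASE))
# End of epoch 2: decayed to roughly the 0.000027 printed in the log.
print(linear_warmup_lr(584, TOTAL, BASE))
```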
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:47:30,790 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,790 Computation:
2023-10-16 18:47:30,790 - compute on device: cuda:0
2023-10-16 18:47:30,790 - embedding storage: none
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,790 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:32,650 epoch 1 - iter 29/292 - loss 2.82911046 - time (sec): 1.86 - samples/sec: 2422.25 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:47:34,389 epoch 1 - iter 58/292 - loss 2.49086620 - time (sec): 3.60 - samples/sec: 2514.87 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:47:36,229 epoch 1 - iter 87/292 - loss 1.84331573 - time (sec): 5.44 - samples/sec: 2561.21 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:47:37,911 epoch 1 - iter 116/292 - loss 1.51218308 - time (sec): 7.12 - samples/sec: 2586.19 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:47:39,496 epoch 1 - iter 145/292 - loss 1.36167935 - time (sec): 8.70 - samples/sec: 2538.62 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:47:41,263 epoch 1 - iter 174/292 - loss 1.17652481 - time (sec): 10.47 - samples/sec: 2577.89 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:47:42,961 epoch 1 - iter 203/292 - loss 1.06327340 - time (sec): 12.17 - samples/sec: 2578.85 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:47:44,670 epoch 1 - iter 232/292 - loss 0.97664361 - time (sec): 13.88 - samples/sec: 2550.83 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:47:46,367 epoch 1 - iter 261/292 - loss 0.90519041 - time (sec): 15.58 - samples/sec: 2527.96 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:47:48,221 epoch 1 - iter 290/292 - loss 0.84527451 - time (sec): 17.43 - samples/sec: 2542.47 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:47:48,318 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:48,318 EPOCH 1 done: loss 0.8436 - lr: 0.000030
2023-10-16 18:47:49,558 DEV : loss 0.21602800488471985 - f1-score (micro avg) 0.337
2023-10-16 18:47:49,563 saving best model
2023-10-16 18:47:50,004 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:51,866 epoch 2 - iter 29/292 - loss 0.24466333 - time (sec): 1.86 - samples/sec: 2815.32 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:47:53,581 epoch 2 - iter 58/292 - loss 0.20933144 - time (sec): 3.58 - samples/sec: 2721.81 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:47:55,438 epoch 2 - iter 87/292 - loss 0.21419564 - time (sec): 5.43 - samples/sec: 2543.98 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:47:57,300 epoch 2 - iter 116/292 - loss 0.22692878 - time (sec): 7.29 - samples/sec: 2538.49 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:47:59,284 epoch 2 - iter 145/292 - loss 0.23462881 - time (sec): 9.28 - samples/sec: 2553.12 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:48:01,011 epoch 2 - iter 174/292 - loss 0.22426306 - time (sec): 11.01 - samples/sec: 2592.85 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:48:02,442 epoch 2 - iter 203/292 - loss 0.22508575 - time (sec): 12.44 - samples/sec: 2565.16 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:48:03,851 epoch 2 - iter 232/292 - loss 0.22081040 - time (sec): 13.85 - samples/sec: 2543.91 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:48:05,712 epoch 2 - iter 261/292 - loss 0.21204480 - time (sec): 15.71 - samples/sec: 2564.40 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:48:07,375 epoch 2 - iter 290/292 - loss 0.20672562 - time (sec): 17.37 - samples/sec: 2551.57 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:48:07,458 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:07,458 EPOCH 2 done: loss 0.2062 - lr: 0.000027
2023-10-16 18:48:08,760 DEV : loss 0.15461082756519318 - f1-score (micro avg) 0.5798
2023-10-16 18:48:08,766 saving best model
2023-10-16 18:48:09,304 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:10,958 epoch 3 - iter 29/292 - loss 0.14468212 - time (sec): 1.65 - samples/sec: 2577.11 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:48:12,664 epoch 3 - iter 58/292 - loss 0.14204837 - time (sec): 3.36 - samples/sec: 2634.33 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:48:14,378 epoch 3 - iter 87/292 - loss 0.12917051 - time (sec): 5.07 - samples/sec: 2582.04 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:48:16,152 epoch 3 - iter 116/292 - loss 0.12946334 - time (sec): 6.85 - samples/sec: 2590.11 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:48:17,863 epoch 3 - iter 145/292 - loss 0.12001689 - time (sec): 8.56 - samples/sec: 2620.61 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:48:19,560 epoch 3 - iter 174/292 - loss 0.11169965 - time (sec): 10.25 - samples/sec: 2630.65 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:48:21,157 epoch 3 - iter 203/292 - loss 0.11960085 - time (sec): 11.85 - samples/sec: 2601.45 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:48:22,949 epoch 3 - iter 232/292 - loss 0.11698755 - time (sec): 13.64 - samples/sec: 2589.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:48:24,708 epoch 3 - iter 261/292 - loss 0.11551965 - time (sec): 15.40 - samples/sec: 2562.20 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:48:26,558 epoch 3 - iter 290/292 - loss 0.11480963 - time (sec): 17.25 - samples/sec: 2568.65 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:48:26,657 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:26,657 EPOCH 3 done: loss 0.1146 - lr: 0.000023
2023-10-16 18:48:27,913 DEV : loss 0.13052009046077728 - f1-score (micro avg) 0.679
2023-10-16 18:48:27,919 saving best model
2023-10-16 18:48:28,417 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:30,132 epoch 4 - iter 29/292 - loss 0.08537570 - time (sec): 1.71 - samples/sec: 2536.85 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:48:31,767 epoch 4 - iter 58/292 - loss 0.07028676 - time (sec): 3.35 - samples/sec: 2495.01 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:48:33,404 epoch 4 - iter 87/292 - loss 0.07384165 - time (sec): 4.99 - samples/sec: 2550.32 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:48:34,978 epoch 4 - iter 116/292 - loss 0.07106452 - time (sec): 6.56 - samples/sec: 2518.68 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:48:36,765 epoch 4 - iter 145/292 - loss 0.07166389 - time (sec): 8.35 - samples/sec: 2633.37 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:48:38,779 epoch 4 - iter 174/292 - loss 0.07127372 - time (sec): 10.36 - samples/sec: 2532.09 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:48:40,421 epoch 4 - iter 203/292 - loss 0.07109717 - time (sec): 12.00 - samples/sec: 2539.84 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:48:42,238 epoch 4 - iter 232/292 - loss 0.07908090 - time (sec): 13.82 - samples/sec: 2559.91 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:48:44,005 epoch 4 - iter 261/292 - loss 0.07663505 - time (sec): 15.59 - samples/sec: 2542.07 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:48:45,789 epoch 4 - iter 290/292 - loss 0.07642799 - time (sec): 17.37 - samples/sec: 2534.38 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:48:45,903 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:45,903 EPOCH 4 done: loss 0.0759 - lr: 0.000020
2023-10-16 18:48:47,155 DEV : loss 0.132146954536438 - f1-score (micro avg) 0.6763
2023-10-16 18:48:47,160 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:48,969 epoch 5 - iter 29/292 - loss 0.04044493 - time (sec): 1.81 - samples/sec: 2428.77 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:48:50,548 epoch 5 - iter 58/292 - loss 0.03579751 - time (sec): 3.39 - samples/sec: 2449.11 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:48:52,506 epoch 5 - iter 87/292 - loss 0.05205437 - time (sec): 5.34 - samples/sec: 2486.93 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:48:54,176 epoch 5 - iter 116/292 - loss 0.04956530 - time (sec): 7.01 - samples/sec: 2520.65 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:48:55,809 epoch 5 - iter 145/292 - loss 0.05364871 - time (sec): 8.65 - samples/sec: 2590.18 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:48:57,397 epoch 5 - iter 174/292 - loss 0.05566681 - time (sec): 10.24 - samples/sec: 2588.97 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:48:59,058 epoch 5 - iter 203/292 - loss 0.05511954 - time (sec): 11.90 - samples/sec: 2581.78 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:49:00,736 epoch 5 - iter 232/292 - loss 0.05166875 - time (sec): 13.57 - samples/sec: 2592.74 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:49:02,332 epoch 5 - iter 261/292 - loss 0.05145842 - time (sec): 15.17 - samples/sec: 2613.29 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:49:04,070 epoch 5 - iter 290/292 - loss 0.05131951 - time (sec): 16.91 - samples/sec: 2622.72 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:49:04,156 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:04,156 EPOCH 5 done: loss 0.0517 - lr: 0.000017
2023-10-16 18:49:05,439 DEV : loss 0.12989631295204163 - f1-score (micro avg) 0.7468
2023-10-16 18:49:05,445 saving best model
2023-10-16 18:49:05,950 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:07,718 epoch 6 - iter 29/292 - loss 0.03368459 - time (sec): 1.77 - samples/sec: 2588.74 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:49:09,383 epoch 6 - iter 58/292 - loss 0.03595871 - time (sec): 3.43 - samples/sec: 2661.03 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:49:11,087 epoch 6 - iter 87/292 - loss 0.03750286 - time (sec): 5.13 - samples/sec: 2663.44 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:49:12,734 epoch 6 - iter 116/292 - loss 0.03535109 - time (sec): 6.78 - samples/sec: 2658.92 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:49:14,462 epoch 6 - iter 145/292 - loss 0.03424649 - time (sec): 8.51 - samples/sec: 2679.85 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:49:16,119 epoch 6 - iter 174/292 - loss 0.03583046 - time (sec): 10.17 - samples/sec: 2686.52 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:49:17,673 epoch 6 - iter 203/292 - loss 0.03342373 - time (sec): 11.72 - samples/sec: 2643.28 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:49:19,337 epoch 6 - iter 232/292 - loss 0.03942726 - time (sec): 13.38 - samples/sec: 2666.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:49:20,924 epoch 6 - iter 261/292 - loss 0.03835690 - time (sec): 14.97 - samples/sec: 2653.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:49:22,612 epoch 6 - iter 290/292 - loss 0.03977398 - time (sec): 16.66 - samples/sec: 2657.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:49:22,700 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:22,700 EPOCH 6 done: loss 0.0397 - lr: 0.000013
2023-10-16 18:49:23,944 DEV : loss 0.14868424832820892 - f1-score (micro avg) 0.7473
2023-10-16 18:49:23,949 saving best model
2023-10-16 18:49:24,501 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:26,376 epoch 7 - iter 29/292 - loss 0.03302502 - time (sec): 1.87 - samples/sec: 3020.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:49:28,000 epoch 7 - iter 58/292 - loss 0.02701595 - time (sec): 3.49 - samples/sec: 2914.66 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:49:29,505 epoch 7 - iter 87/292 - loss 0.02743433 - time (sec): 5.00 - samples/sec: 2830.89 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:49:31,177 epoch 7 - iter 116/292 - loss 0.03183254 - time (sec): 6.67 - samples/sec: 2749.62 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:49:32,784 epoch 7 - iter 145/292 - loss 0.03388558 - time (sec): 8.28 - samples/sec: 2672.20 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:49:34,471 epoch 7 - iter 174/292 - loss 0.03382158 - time (sec): 9.97 - samples/sec: 2690.49 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:49:36,122 epoch 7 - iter 203/292 - loss 0.03367802 - time (sec): 11.62 - samples/sec: 2695.76 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:49:37,702 epoch 7 - iter 232/292 - loss 0.03454636 - time (sec): 13.20 - samples/sec: 2680.29 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:49:39,172 epoch 7 - iter 261/292 - loss 0.03328201 - time (sec): 14.67 - samples/sec: 2692.19 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:49:40,957 epoch 7 - iter 290/292 - loss 0.03164000 - time (sec): 16.45 - samples/sec: 2691.02 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:49:41,053 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:41,053 EPOCH 7 done: loss 0.0315 - lr: 0.000010
2023-10-16 18:49:42,323 DEV : loss 0.15577590465545654 - f1-score (micro avg) 0.7425
2023-10-16 18:49:42,327 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:43,926 epoch 8 - iter 29/292 - loss 0.02151028 - time (sec): 1.60 - samples/sec: 2896.40 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:49:45,528 epoch 8 - iter 58/292 - loss 0.01568049 - time (sec): 3.20 - samples/sec: 2836.41 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:49:47,028 epoch 8 - iter 87/292 - loss 0.01534683 - time (sec): 4.70 - samples/sec: 2681.00 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:49:48,699 epoch 8 - iter 116/292 - loss 0.01579191 - time (sec): 6.37 - samples/sec: 2714.41 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:49:50,308 epoch 8 - iter 145/292 - loss 0.01777071 - time (sec): 7.98 - samples/sec: 2713.12 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:49:52,025 epoch 8 - iter 174/292 - loss 0.02052391 - time (sec): 9.70 - samples/sec: 2733.60 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:49:53,814 epoch 8 - iter 203/292 - loss 0.02071842 - time (sec): 11.49 - samples/sec: 2641.29 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:49:55,480 epoch 8 - iter 232/292 - loss 0.02537783 - time (sec): 13.15 - samples/sec: 2639.35 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:49:57,248 epoch 8 - iter 261/292 - loss 0.02774566 - time (sec): 14.92 - samples/sec: 2665.19 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:49:58,766 epoch 8 - iter 290/292 - loss 0.02654013 - time (sec): 16.44 - samples/sec: 2690.86 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:49:58,850 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:58,850 EPOCH 8 done: loss 0.0264 - lr: 0.000007
2023-10-16 18:50:00,128 DEV : loss 0.15800637006759644 - f1-score (micro avg) 0.7521
2023-10-16 18:50:00,132 saving best model
2023-10-16 18:50:00,641 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:02,387 epoch 9 - iter 29/292 - loss 0.01041658 - time (sec): 1.74 - samples/sec: 2808.85 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:50:04,020 epoch 9 - iter 58/292 - loss 0.01199768 - time (sec): 3.38 - samples/sec: 2761.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:50:05,672 epoch 9 - iter 87/292 - loss 0.01928507 - time (sec): 5.03 - samples/sec: 2709.97 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:50:07,278 epoch 9 - iter 116/292 - loss 0.01843598 - time (sec): 6.63 - samples/sec: 2756.90 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:50:09,017 epoch 9 - iter 145/292 - loss 0.01796752 - time (sec): 8.37 - samples/sec: 2783.69 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:50:10,606 epoch 9 - iter 174/292 - loss 0.01908946 - time (sec): 9.96 - samples/sec: 2745.24 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:50:12,136 epoch 9 - iter 203/292 - loss 0.01923461 - time (sec): 11.49 - samples/sec: 2702.79 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:50:13,933 epoch 9 - iter 232/292 - loss 0.01906635 - time (sec): 13.29 - samples/sec: 2695.05 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:50:15,552 epoch 9 - iter 261/292 - loss 0.02406961 - time (sec): 14.91 - samples/sec: 2696.75 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:50:17,130 epoch 9 - iter 290/292 - loss 0.02290073 - time (sec): 16.49 - samples/sec: 2689.74 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:50:17,226 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:17,226 EPOCH 9 done: loss 0.0234 - lr: 0.000003
2023-10-16 18:50:18,498 DEV : loss 0.1649204045534134 - f1-score (micro avg) 0.7579
2023-10-16 18:50:18,503 saving best model
2023-10-16 18:50:18,966 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:20,566 epoch 10 - iter 29/292 - loss 0.01048869 - time (sec): 1.60 - samples/sec: 2501.04 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:50:22,232 epoch 10 - iter 58/292 - loss 0.01035586 - time (sec): 3.26 - samples/sec: 2695.63 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:50:23,973 epoch 10 - iter 87/292 - loss 0.01530515 - time (sec): 5.01 - samples/sec: 2703.40 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:50:25,635 epoch 10 - iter 116/292 - loss 0.01435463 - time (sec): 6.67 - samples/sec: 2786.24 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:50:27,430 epoch 10 - iter 145/292 - loss 0.01677783 - time (sec): 8.46 - samples/sec: 2778.38 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:50:28,954 epoch 10 - iter 174/292 - loss 0.01671529 - time (sec): 9.99 - samples/sec: 2752.26 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:50:30,653 epoch 10 - iter 203/292 - loss 0.01614914 - time (sec): 11.69 - samples/sec: 2712.33 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:50:32,184 epoch 10 - iter 232/292 - loss 0.01780397 - time (sec): 13.22 - samples/sec: 2705.59 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:50:33,826 epoch 10 - iter 261/292 - loss 0.02122159 - time (sec): 14.86 - samples/sec: 2701.08 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:50:35,339 epoch 10 - iter 290/292 - loss 0.01997268 - time (sec): 16.37 - samples/sec: 2697.63 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:50:35,433 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:35,433 EPOCH 10 done: loss 0.0199 - lr: 0.000000
2023-10-16 18:50:36,707 DEV : loss 0.16057094931602478 - f1-score (micro avg) 0.74
2023-10-16 18:50:37,097 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:37,098 Loading model from best epoch ...
2023-10-16 18:50:38,880 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
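The 17 tags above are the BIOES encoding of the four entity types (LOC, PER, ORG, HumanProd): O plus B-/I-/E-/S- variants of each. A minimal sketch of how such a tag sequence decodes back into entity spans (a hypothetical standalone helper, not part of Flair's API):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start = None
        elif prefix == "B":               # entity begins
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((label, start, i + 1))  # entity ends
            start = None
        # "I" simply continues an open span
    return spans
```

For example, `["O", "B-PER", "I-PER", "E-PER", "S-LOC"]` decodes to a three-token PER span followed by a one-token LOC span.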
2023-10-16 18:50:41,349
Results:
- F-score (micro) 0.7441
- F-score (macro) 0.6612
- Accuracy 0.6157

By class:
              precision    recall  f1-score   support

         PER     0.7775    0.8534    0.8137       348
         LOC     0.6564    0.8199    0.7291       261
         ORG     0.3393    0.3654    0.3519        52
   HumanProd     0.6923    0.8182    0.7500        22

   micro avg     0.6937    0.8023    0.7441       683
   macro avg     0.6164    0.7142    0.6612       683
weighted avg     0.6951    0.8023    0.7442       683
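As a sanity check, the aggregate rows follow directly from the per-class numbers: each F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean of the per-class F1 scores, and the micro average applies the same formula to the pooled precision/recall. A quick recomputation from the table above:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# (precision, recall, support) per class, copied from the table above
per_class = {
    "PER": (0.7775, 0.8534, 348),
    "LOC": (0.6564, 0.8199, 261),
    "ORG": (0.3393, 0.3654, 52),
    "HumanProd": (0.6923, 0.8182, 22),
}

micro = f1(0.6937, 0.8023)  # pooled precision/recall from the micro avg row
macro = sum(f1(p, r) for p, r, _ in per_class.values()) / len(per_class)

print(round(micro, 4), round(macro, 4))  # reproduces 0.7441 and 0.6612
```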
2023-10-16 18:50:41,350 ----------------------------------------------------------------------------------------------------