|
2023-10-16 17:53:10,355 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,356 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,356 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,356 Train: 1166 sentences |
|
2023-10-16 17:53:10,356 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,356 Training Params: |
|
2023-10-16 17:53:10,356 - learning_rate: "3e-05" |
|
2023-10-16 17:53:10,356 - mini_batch_size: "4" |
|
2023-10-16 17:53:10,356 - max_epochs: "10" |
|
2023-10-16 17:53:10,356 - shuffle: "True" |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
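The iteration counter in the epoch lines below ("iter …/292") follows from the corpus size and the batch size reported above. A minimal sketch of that arithmetic, assuming the final partial batch is kept (the variable names are illustrative and not taken from the training code):

```python
import math

# Values copied from the log header above.
train_sentences = 1166
mini_batch_size = 4
max_epochs = 10

# 1166 / 4 = 291.5, so the last batch holds only two sentences.
iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = iters_per_epoch * max_epochs

print(f"iterations per epoch: {iters_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```

The total step count is what the warmup fraction in the Plugins section is applied to.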
|
2023-10-16 17:53:10,356 Plugins: |
|
2023-10-16 17:53:10,356 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
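The `lr` column in the epoch lines below is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a rough sketch of such a schedule, not the scheduler's actual implementation; the function name `linear_schedule_lr` is hypothetical, and the step count of 2920 assumes 292 iterations per epoch over 10 epochs.

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: the rate grows proportionally to the step index.
        return peak_lr * step / warmup_steps
    # Decay phase: the rate shrinks linearly until total_steps is reached.
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


TOTAL_STEPS = 2920   # assumed: 292 iterations x 10 epochs
PEAK_LR = 3e-05      # learning_rate from the log
WARMUP = 0.1         # warmup_fraction from the log

# Global steps for (epoch 1, iter 29), (epoch 1, iter 290), (epoch 2, iter 290).
for step in (29, 290, 582):
    print(step, f"{linear_schedule_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP):.6f}")
```

Rounded to six decimals, these values agree with the logged rates at the corresponding iterations; the exact step at which the trainer samples the rate may differ by one.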
|
2023-10-16 17:53:10,356 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 17:53:10,356 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,356 Computation: |
|
2023-10-16 17:53:10,356 - compute on device: cuda:0 |
|
2023-10-16 17:53:10,356 - embedding storage: none |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,356 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-16 17:53:10,356 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:10,357 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:13,018 epoch 1 - iter 29/292 - loss 3.01670138 - time (sec): 2.66 - samples/sec: 1590.03 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 17:53:14,948 epoch 1 - iter 58/292 - loss 2.54580984 - time (sec): 4.59 - samples/sec: 2158.32 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 17:53:16,645 epoch 1 - iter 87/292 - loss 2.07822180 - time (sec): 6.29 - samples/sec: 2183.16 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 17:53:18,313 epoch 1 - iter 116/292 - loss 1.77670253 - time (sec): 7.96 - samples/sec: 2206.58 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 17:53:20,188 epoch 1 - iter 145/292 - loss 1.48355404 - time (sec): 9.83 - samples/sec: 2309.09 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 17:53:21,905 epoch 1 - iter 174/292 - loss 1.31635425 - time (sec): 11.55 - samples/sec: 2320.51 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 17:53:23,540 epoch 1 - iter 203/292 - loss 1.20996793 - time (sec): 13.18 - samples/sec: 2379.01 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 17:53:25,115 epoch 1 - iter 232/292 - loss 1.10738448 - time (sec): 14.76 - samples/sec: 2389.64 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 17:53:26,860 epoch 1 - iter 261/292 - loss 1.01693990 - time (sec): 16.50 - samples/sec: 2405.80 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 17:53:28,671 epoch 1 - iter 290/292 - loss 0.94433148 - time (sec): 18.31 - samples/sec: 2415.89 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 17:53:28,769 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:28,769 EPOCH 1 done: loss 0.9414 - lr: 0.000030 |
|
2023-10-16 17:53:29,849 DEV : loss 0.2120908498764038 - f1-score (micro avg) 0.4492 |
|
2023-10-16 17:53:29,853 saving best model |
|
2023-10-16 17:53:30,340 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:32,028 epoch 2 - iter 29/292 - loss 0.18672523 - time (sec): 1.69 - samples/sec: 2637.53 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 17:53:33,606 epoch 2 - iter 58/292 - loss 0.22053055 - time (sec): 3.27 - samples/sec: 2719.56 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 17:53:35,118 epoch 2 - iter 87/292 - loss 0.23027135 - time (sec): 4.78 - samples/sec: 2749.72 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 17:53:36,675 epoch 2 - iter 116/292 - loss 0.20606513 - time (sec): 6.33 - samples/sec: 2752.78 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 17:53:38,235 epoch 2 - iter 145/292 - loss 0.20734738 - time (sec): 7.89 - samples/sec: 2671.83 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 17:53:39,802 epoch 2 - iter 174/292 - loss 0.20460916 - time (sec): 9.46 - samples/sec: 2668.26 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 17:53:41,754 epoch 2 - iter 203/292 - loss 0.21184347 - time (sec): 11.41 - samples/sec: 2680.86 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 17:53:43,491 epoch 2 - iter 232/292 - loss 0.20796678 - time (sec): 13.15 - samples/sec: 2667.97 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 17:53:45,327 epoch 2 - iter 261/292 - loss 0.20176642 - time (sec): 14.99 - samples/sec: 2667.45 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 17:53:46,957 epoch 2 - iter 290/292 - loss 0.19939226 - time (sec): 16.62 - samples/sec: 2657.51 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 17:53:47,055 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:47,056 EPOCH 2 done: loss 0.1997 - lr: 0.000027 |
|
2023-10-16 17:53:48,273 DEV : loss 0.13168571889400482 - f1-score (micro avg) 0.6189 |
|
2023-10-16 17:53:48,277 saving best model |
|
2023-10-16 17:53:48,826 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:53:50,649 epoch 3 - iter 29/292 - loss 0.15474519 - time (sec): 1.82 - samples/sec: 3032.01 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 17:53:52,245 epoch 3 - iter 58/292 - loss 0.13264738 - time (sec): 3.41 - samples/sec: 2593.52 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 17:53:53,805 epoch 3 - iter 87/292 - loss 0.13055608 - time (sec): 4.97 - samples/sec: 2655.91 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 17:53:55,758 epoch 3 - iter 116/292 - loss 0.12609661 - time (sec): 6.93 - samples/sec: 2574.92 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 17:53:57,465 epoch 3 - iter 145/292 - loss 0.11999825 - time (sec): 8.63 - samples/sec: 2669.39 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 17:53:59,053 epoch 3 - iter 174/292 - loss 0.11593147 - time (sec): 10.22 - samples/sec: 2668.14 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 17:54:00,553 epoch 3 - iter 203/292 - loss 0.11595612 - time (sec): 11.72 - samples/sec: 2624.30 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 17:54:02,144 epoch 3 - iter 232/292 - loss 0.11529465 - time (sec): 13.31 - samples/sec: 2613.51 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 17:54:03,755 epoch 3 - iter 261/292 - loss 0.11497819 - time (sec): 14.92 - samples/sec: 2629.55 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 17:54:05,520 epoch 3 - iter 290/292 - loss 0.11247568 - time (sec): 16.69 - samples/sec: 2648.07 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 17:54:05,615 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:05,615 EPOCH 3 done: loss 0.1132 - lr: 0.000023 |
|
2023-10-16 17:54:06,825 DEV : loss 0.12188015133142471 - f1-score (micro avg) 0.7049 |
|
2023-10-16 17:54:06,829 saving best model |
|
2023-10-16 17:54:07,368 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:09,077 epoch 4 - iter 29/292 - loss 0.07659714 - time (sec): 1.71 - samples/sec: 2724.79 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 17:54:10,758 epoch 4 - iter 58/292 - loss 0.08198423 - time (sec): 3.39 - samples/sec: 2706.29 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 17:54:12,342 epoch 4 - iter 87/292 - loss 0.07174156 - time (sec): 4.97 - samples/sec: 2704.35 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 17:54:14,000 epoch 4 - iter 116/292 - loss 0.07292734 - time (sec): 6.63 - samples/sec: 2706.79 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 17:54:15,587 epoch 4 - iter 145/292 - loss 0.07248743 - time (sec): 8.22 - samples/sec: 2666.65 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 17:54:17,256 epoch 4 - iter 174/292 - loss 0.07455389 - time (sec): 9.88 - samples/sec: 2672.34 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 17:54:18,864 epoch 4 - iter 203/292 - loss 0.07414268 - time (sec): 11.49 - samples/sec: 2648.26 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 17:54:20,561 epoch 4 - iter 232/292 - loss 0.07838926 - time (sec): 13.19 - samples/sec: 2687.21 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 17:54:22,083 epoch 4 - iter 261/292 - loss 0.07356868 - time (sec): 14.71 - samples/sec: 2688.42 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 17:54:23,717 epoch 4 - iter 290/292 - loss 0.07163331 - time (sec): 16.35 - samples/sec: 2709.19 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 17:54:23,801 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:23,801 EPOCH 4 done: loss 0.0721 - lr: 0.000020 |
|
2023-10-16 17:54:24,988 DEV : loss 0.11146893352270126 - f1-score (micro avg) 0.7448 |
|
2023-10-16 17:54:24,992 saving best model |
|
2023-10-16 17:54:25,594 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:27,306 epoch 5 - iter 29/292 - loss 0.05024086 - time (sec): 1.71 - samples/sec: 2953.50 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 17:54:28,952 epoch 5 - iter 58/292 - loss 0.05493652 - time (sec): 3.36 - samples/sec: 2757.32 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 17:54:30,633 epoch 5 - iter 87/292 - loss 0.05660584 - time (sec): 5.04 - samples/sec: 2735.59 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 17:54:32,295 epoch 5 - iter 116/292 - loss 0.05164633 - time (sec): 6.70 - samples/sec: 2713.03 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 17:54:34,049 epoch 5 - iter 145/292 - loss 0.04760992 - time (sec): 8.45 - samples/sec: 2700.32 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 17:54:35,733 epoch 5 - iter 174/292 - loss 0.05141896 - time (sec): 10.14 - samples/sec: 2731.30 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 17:54:37,256 epoch 5 - iter 203/292 - loss 0.05190871 - time (sec): 11.66 - samples/sec: 2715.80 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 17:54:38,859 epoch 5 - iter 232/292 - loss 0.05203336 - time (sec): 13.26 - samples/sec: 2699.38 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 17:54:40,414 epoch 5 - iter 261/292 - loss 0.05031939 - time (sec): 14.82 - samples/sec: 2694.27 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 17:54:42,108 epoch 5 - iter 290/292 - loss 0.05098912 - time (sec): 16.51 - samples/sec: 2682.83 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 17:54:42,199 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:42,200 EPOCH 5 done: loss 0.0516 - lr: 0.000017 |
|
2023-10-16 17:54:43,417 DEV : loss 0.12832774221897125 - f1-score (micro avg) 0.73 |
|
2023-10-16 17:54:43,422 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:45,097 epoch 6 - iter 29/292 - loss 0.05950145 - time (sec): 1.67 - samples/sec: 2499.26 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 17:54:46,754 epoch 6 - iter 58/292 - loss 0.04628346 - time (sec): 3.33 - samples/sec: 2599.83 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 17:54:48,212 epoch 6 - iter 87/292 - loss 0.04890591 - time (sec): 4.79 - samples/sec: 2578.40 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 17:54:49,916 epoch 6 - iter 116/292 - loss 0.04392787 - time (sec): 6.49 - samples/sec: 2612.30 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 17:54:51,424 epoch 6 - iter 145/292 - loss 0.04177698 - time (sec): 8.00 - samples/sec: 2569.81 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 17:54:53,231 epoch 6 - iter 174/292 - loss 0.04185515 - time (sec): 9.81 - samples/sec: 2600.73 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 17:54:54,950 epoch 6 - iter 203/292 - loss 0.04627777 - time (sec): 11.53 - samples/sec: 2669.00 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 17:54:56,490 epoch 6 - iter 232/292 - loss 0.04466345 - time (sec): 13.07 - samples/sec: 2675.35 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 17:54:58,193 epoch 6 - iter 261/292 - loss 0.04324020 - time (sec): 14.77 - samples/sec: 2697.60 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 17:54:59,804 epoch 6 - iter 290/292 - loss 0.04093259 - time (sec): 16.38 - samples/sec: 2700.29 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 17:54:59,896 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:54:59,897 EPOCH 6 done: loss 0.0408 - lr: 0.000013 |
|
2023-10-16 17:55:01,125 DEV : loss 0.14976172149181366 - f1-score (micro avg) 0.74 |
|
2023-10-16 17:55:01,129 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:02,678 epoch 7 - iter 29/292 - loss 0.02344868 - time (sec): 1.55 - samples/sec: 2823.98 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 17:55:04,236 epoch 7 - iter 58/292 - loss 0.01915777 - time (sec): 3.11 - samples/sec: 2719.21 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 17:55:05,773 epoch 7 - iter 87/292 - loss 0.03000931 - time (sec): 4.64 - samples/sec: 2680.51 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 17:55:07,689 epoch 7 - iter 116/292 - loss 0.03479154 - time (sec): 6.56 - samples/sec: 2718.57 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 17:55:09,461 epoch 7 - iter 145/292 - loss 0.03189663 - time (sec): 8.33 - samples/sec: 2703.43 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 17:55:11,006 epoch 7 - iter 174/292 - loss 0.03017231 - time (sec): 9.88 - samples/sec: 2650.72 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 17:55:12,695 epoch 7 - iter 203/292 - loss 0.02853514 - time (sec): 11.56 - samples/sec: 2617.59 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 17:55:14,460 epoch 7 - iter 232/292 - loss 0.03262212 - time (sec): 13.33 - samples/sec: 2637.76 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 17:55:16,144 epoch 7 - iter 261/292 - loss 0.03145605 - time (sec): 15.01 - samples/sec: 2643.78 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 17:55:17,765 epoch 7 - iter 290/292 - loss 0.03161094 - time (sec): 16.63 - samples/sec: 2662.24 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 17:55:17,864 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:17,864 EPOCH 7 done: loss 0.0317 - lr: 0.000010 |
|
2023-10-16 17:55:19,109 DEV : loss 0.15802569687366486 - f1-score (micro avg) 0.7669 |
|
2023-10-16 17:55:19,115 saving best model |
|
2023-10-16 17:55:19,736 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:21,251 epoch 8 - iter 29/292 - loss 0.02308704 - time (sec): 1.51 - samples/sec: 2423.33 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 17:55:23,003 epoch 8 - iter 58/292 - loss 0.01751107 - time (sec): 3.27 - samples/sec: 2645.43 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 17:55:24,636 epoch 8 - iter 87/292 - loss 0.01868607 - time (sec): 4.90 - samples/sec: 2645.19 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 17:55:26,347 epoch 8 - iter 116/292 - loss 0.01794405 - time (sec): 6.61 - samples/sec: 2669.32 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 17:55:28,103 epoch 8 - iter 145/292 - loss 0.02452178 - time (sec): 8.37 - samples/sec: 2670.62 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 17:55:29,621 epoch 8 - iter 174/292 - loss 0.02379010 - time (sec): 9.88 - samples/sec: 2633.50 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 17:55:31,201 epoch 8 - iter 203/292 - loss 0.02414199 - time (sec): 11.46 - samples/sec: 2615.66 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 17:55:32,900 epoch 8 - iter 232/292 - loss 0.02725617 - time (sec): 13.16 - samples/sec: 2616.60 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 17:55:34,529 epoch 8 - iter 261/292 - loss 0.02585742 - time (sec): 14.79 - samples/sec: 2621.20 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 17:55:36,363 epoch 8 - iter 290/292 - loss 0.02434377 - time (sec): 16.63 - samples/sec: 2657.48 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 17:55:36,485 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:36,485 EPOCH 8 done: loss 0.0242 - lr: 0.000007 |
|
2023-10-16 17:55:37,720 DEV : loss 0.16654258966445923 - f1-score (micro avg) 0.742 |
|
2023-10-16 17:55:37,725 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:39,403 epoch 9 - iter 29/292 - loss 0.01417823 - time (sec): 1.68 - samples/sec: 2715.08 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 17:55:41,051 epoch 9 - iter 58/292 - loss 0.01232230 - time (sec): 3.33 - samples/sec: 2616.24 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 17:55:42,622 epoch 9 - iter 87/292 - loss 0.01376009 - time (sec): 4.90 - samples/sec: 2518.21 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 17:55:44,385 epoch 9 - iter 116/292 - loss 0.02555969 - time (sec): 6.66 - samples/sec: 2645.87 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 17:55:46,074 epoch 9 - iter 145/292 - loss 0.02158313 - time (sec): 8.35 - samples/sec: 2653.79 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 17:55:47,637 epoch 9 - iter 174/292 - loss 0.02019502 - time (sec): 9.91 - samples/sec: 2641.31 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 17:55:49,335 epoch 9 - iter 203/292 - loss 0.02252725 - time (sec): 11.61 - samples/sec: 2620.54 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 17:55:50,943 epoch 9 - iter 232/292 - loss 0.02172381 - time (sec): 13.22 - samples/sec: 2623.35 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 17:55:52,771 epoch 9 - iter 261/292 - loss 0.01959173 - time (sec): 15.05 - samples/sec: 2637.08 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 17:55:54,506 epoch 9 - iter 290/292 - loss 0.01884339 - time (sec): 16.78 - samples/sec: 2641.24 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 17:55:54,588 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:54,588 EPOCH 9 done: loss 0.0188 - lr: 0.000003 |
|
2023-10-16 17:55:55,813 DEV : loss 0.1669674515724182 - f1-score (micro avg) 0.7447 |
|
2023-10-16 17:55:55,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:55:57,419 epoch 10 - iter 29/292 - loss 0.00817807 - time (sec): 1.60 - samples/sec: 2510.45 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 17:55:59,136 epoch 10 - iter 58/292 - loss 0.00727258 - time (sec): 3.32 - samples/sec: 2788.28 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 17:56:01,145 epoch 10 - iter 87/292 - loss 0.01631169 - time (sec): 5.33 - samples/sec: 2808.60 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 17:56:02,812 epoch 10 - iter 116/292 - loss 0.01409848 - time (sec): 6.99 - samples/sec: 2845.86 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 17:56:04,508 epoch 10 - iter 145/292 - loss 0.01324680 - time (sec): 8.69 - samples/sec: 2765.70 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 17:56:06,061 epoch 10 - iter 174/292 - loss 0.01866032 - time (sec): 10.24 - samples/sec: 2731.65 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 17:56:07,607 epoch 10 - iter 203/292 - loss 0.01831901 - time (sec): 11.79 - samples/sec: 2690.65 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 17:56:09,171 epoch 10 - iter 232/292 - loss 0.01714936 - time (sec): 13.35 - samples/sec: 2679.21 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 17:56:10,757 epoch 10 - iter 261/292 - loss 0.01673991 - time (sec): 14.94 - samples/sec: 2656.88 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 17:56:12,321 epoch 10 - iter 290/292 - loss 0.01661350 - time (sec): 16.50 - samples/sec: 2666.97 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 17:56:12,466 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 17:56:12,466 EPOCH 10 done: loss 0.0165 - lr: 0.000000 |
|
2023-10-16 17:56:13,697 DEV : loss 0.1702689379453659 - f1-score (micro avg) 0.7368 |
|
2023-10-16 17:56:14,198 ---------------------------------------------------------------------------------------------------- |
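The checkpoint loaded next is the one with the highest dev micro F1. A small sketch of how the DEV lines above could be parsed to find that epoch; the regular expression and the `dev_lines` list are assumptions made for illustration, with the strings copied from this log.

```python
import re

# DEV lines copied from the log, one per epoch, in order.
dev_lines = [
    "DEV : loss 0.2120908498764038 - f1-score (micro avg) 0.4492",
    "DEV : loss 0.13168571889400482 - f1-score (micro avg) 0.6189",
    "DEV : loss 0.12188015133142471 - f1-score (micro avg) 0.7049",
    "DEV : loss 0.11146893352270126 - f1-score (micro avg) 0.7448",
    "DEV : loss 0.12832774221897125 - f1-score (micro avg) 0.73",
    "DEV : loss 0.14976172149181366 - f1-score (micro avg) 0.74",
    "DEV : loss 0.15802569687366486 - f1-score (micro avg) 0.7669",
    "DEV : loss 0.16654258966445923 - f1-score (micro avg) 0.742",
    "DEV : loss 0.1669674515724182 - f1-score (micro avg) 0.7447",
    "DEV : loss 0.1702689379453659 - f1-score (micro avg) 0.7368",
]

pattern = re.compile(r"DEV : loss (\S+) - f1-score \(micro avg\)\s+(\S+)")

# Collect (epoch, dev_loss, dev_f1) tuples; epochs are numbered from 1.
records = []
for epoch, line in enumerate(dev_lines, start=1):
    match = pattern.search(line)
    records.append((epoch, float(match.group(1)), float(match.group(2))))

best_epoch, _, best_f1 = max(records, key=lambda r: r[2])
lowest_loss_epoch, lowest_loss, _ = min(records, key=lambda r: r[1])

print(f"best dev F1 {best_f1} at epoch {best_epoch}")
print(f"lowest dev loss {lowest_loss:.4f} at epoch {lowest_loss_epoch}")
```

In this run the dev loss is lowest at epoch 4, while the dev F1 used for model selection peaks at epoch 7, which is the last epoch followed by "saving best model".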
|
2023-10-16 17:56:14,199 Loading model from best epoch ... |
|
2023-10-16 17:56:16,142 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
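The 17 tags are the four entity types combined with the BIOES prefixes (S, B, E, I) plus the outside tag O, which matches the 17 output features of the final linear layer. The sketch below rebuilds that tag list and decodes a tag sequence into entity spans; `bioes_to_spans` is a hypothetical helper written for illustration, not the decoder the library uses, and it simply drops malformed sequences.

```python
LABELS = ["LOC", "PER", "ORG", "HumanProd"]
TAGS = ["O"] + [f"{prefix}-{label}" for label in LABELS for prefix in "SBEI"]


def bioes_to_spans(tags):
    """Return (label, start, end) spans with an exclusive end index."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, current = None, None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":
            # Single-token entity.
            spans.append((label, i, i + 1))
            start, current = None, None
        elif prefix == "B":
            # Open a new multi-token entity.
            start, current = i, label
        elif prefix == "I":
            # Continuation is only valid inside an open entity of the same type.
            if current != label:
                start, current = None, None
        elif prefix == "E":
            # Close the entity if it was opened with a matching B- tag.
            if current == label and start is not None:
                spans.append((label, start, i + 1))
            start, current = None, None
    return spans


example = ["B-PER", "E-PER", "O", "S-LOC", "B-ORG", "I-ORG", "E-ORG"]
print(len(TAGS))
print(bioes_to_spans(example))
```

The span-level scores in the results below are computed over entities of this kind rather than over individual tokens.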
|
2023-10-16 17:56:18,744 |
|
Results: |
|
- F-score (micro) 0.7574 |
|
- F-score (macro) 0.6866 |
|
- Accuracy 0.6335 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.7936    0.8506    0.8211       348 |
|
         LOC     0.6677    0.8238    0.7376       261 |
|
         ORG     0.4130    0.3654    0.3878        52 |
|
   HumanProd     0.7826    0.8182    0.8000        22 |
|
|
|
   micro avg     0.7173    0.8023    0.7574       683 |
|
   macro avg     0.6642    0.7145    0.6866       683 |
|
weighted avg     0.7161    0.8023    0.7555       683 |
|
|
|
2023-10-16 17:56:18,744 ---------------------------------------------------------------------------------------------------- |
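The aggregate rows of the report can be cross-checked against the per-class rows. A minimal sketch, using only the rounded figures printed above, so small differences in the fourth decimal are expected; the dictionary layout and helper name are illustrative.

```python
# Per-class (precision, recall, f1, support) copied from the report above.
per_class = {
    "PER": (0.7936, 0.8506, 0.8211, 348),
    "LOC": (0.6677, 0.8238, 0.7376, 261),
    "ORG": (0.4130, 0.3654, 0.3878, 52),
    "HumanProd": (0.7826, 0.8182, 0.8000, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in per_class.values())

# Micro F1 follows from the micro-averaged precision and recall.
micro_f1 = f1(0.7173, 0.8023)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1 weights each class F1 by its support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"support:     {total_support}")
print(f"micro F1:    {micro_f1:.4f}")
print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```

The gap between micro and macro F1 comes mostly from ORG, which has low scores but only 52 of the 683 gold entities.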
|
|