|
2023-10-17 17:30:36,893 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,894 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): ElectraModel( |
|
(embeddings): ElectraEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): ElectraEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x ElectraLayer( |
|
(attention): ElectraAttention( |
|
(self): ElectraSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): ElectraSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): ElectraIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): ElectraOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-17 17:30:36,894 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,894 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-17 17:30:36,894 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,894 Train: 1166 sentences |
|
2023-10-17 17:30:36,894 (train_with_dev=False, train_with_test=False) |
|
2023-10-17 17:30:36,894 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,894 Training Params: |
|
2023-10-17 17:30:36,894 - learning_rate: "5e-05" |
|
2023-10-17 17:30:36,894 - mini_batch_size: "4" |
|
2023-10-17 17:30:36,895 - max_epochs: "10" |
|
2023-10-17 17:30:36,895 - shuffle: "True" |
|
2023-10-17 17:30:36,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,895 Plugins: |
|
2023-10-17 17:30:36,895 - TensorboardLogger |
|
2023-10-17 17:30:36,895 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-17 17:30:36,895 ---------------------------------------------------------------------------------------------------- |
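The training parameters and the LinearScheduler plugin listed above imply a learning rate that rises linearly over the first 10% of updates and then decays linearly to zero. The sketch below reproduces that shape, assuming 292 mini-batches per epoch (ceil(1166 / 4), which matches the `iter …/292` counters in the epoch lines of this log) and the usual warmup-then-linear-decay formula. The helper name `linear_schedule_lr` is hypothetical and this is not Flair's own implementation, so individual values may be off by one update relative to the logged `lr` column.

```python
import math

def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer updates (hypothetical helper)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 over the remaining updates.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# Values taken from the log: 1166 train sentences, mini_batch_size 4, 10 epochs.
steps_per_epoch = math.ceil(1166 / 4)
total_steps = steps_per_epoch * 10

for epoch_end in (1, 5, 10):
    step = epoch_end * steps_per_epoch
    lr = linear_schedule_lr(step, 5e-05, total_steps, 0.1)
    print(f"end of epoch {epoch_end}: lr = {lr:.6f}")
```

With these assumptions the warmup ends exactly at the end of epoch 1, which is consistent with the logged learning rate peaking near 0.000049 there and reaching 0.000000 at the end of epoch 10.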
|
2023-10-17 17:30:36,895 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-17 17:30:36,895 - metric: "('micro avg', 'f1-score')" |
|
2023-10-17 17:30:36,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,895 Computation: |
|
2023-10-17 17:30:36,895 - compute on device: cuda:0 |
|
2023-10-17 17:30:36,895 - embedding storage: none |
|
2023-10-17 17:30:36,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,895 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-17 17:30:36,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,895 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:36,895 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-17 17:30:38,477 epoch 1 - iter 29/292 - loss 3.30003254 - time (sec): 1.58 - samples/sec: 2413.28 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:30:40,085 epoch 1 - iter 58/292 - loss 2.46491688 - time (sec): 3.19 - samples/sec: 2550.87 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:30:42,223 epoch 1 - iter 87/292 - loss 1.78620105 - time (sec): 5.33 - samples/sec: 2612.08 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:30:43,795 epoch 1 - iter 116/292 - loss 1.44223007 - time (sec): 6.90 - samples/sec: 2692.60 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:30:45,319 epoch 1 - iter 145/292 - loss 1.26681101 - time (sec): 8.42 - samples/sec: 2676.66 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:30:46,932 epoch 1 - iter 174/292 - loss 1.11965052 - time (sec): 10.04 - samples/sec: 2657.49 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:30:48,467 epoch 1 - iter 203/292 - loss 1.00652496 - time (sec): 11.57 - samples/sec: 2649.79 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 17:30:50,183 epoch 1 - iter 232/292 - loss 0.90576718 - time (sec): 13.29 - samples/sec: 2675.23 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 17:30:51,677 epoch 1 - iter 261/292 - loss 0.84363510 - time (sec): 14.78 - samples/sec: 2669.12 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 17:30:53,437 epoch 1 - iter 290/292 - loss 0.77909776 - time (sec): 16.54 - samples/sec: 2670.90 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 17:30:53,535 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:53,536 EPOCH 1 done: loss 0.7757 - lr: 0.000049 |
|
2023-10-17 17:30:54,603 DEV : loss 0.17767061293125153 - f1-score (micro avg) 0.5031 |
|
2023-10-17 17:30:54,608 saving best model |
|
2023-10-17 17:30:54,980 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:30:56,734 epoch 2 - iter 29/292 - loss 0.21655750 - time (sec): 1.75 - samples/sec: 2628.88 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 17:30:58,381 epoch 2 - iter 58/292 - loss 0.18535360 - time (sec): 3.40 - samples/sec: 2571.11 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-17 17:31:00,019 epoch 2 - iter 87/292 - loss 0.17442449 - time (sec): 5.04 - samples/sec: 2614.31 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 17:31:01,946 epoch 2 - iter 116/292 - loss 0.17059919 - time (sec): 6.96 - samples/sec: 2627.44 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-17 17:31:03,656 epoch 2 - iter 145/292 - loss 0.17086627 - time (sec): 8.67 - samples/sec: 2655.51 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 17:31:05,276 epoch 2 - iter 174/292 - loss 0.17566071 - time (sec): 10.29 - samples/sec: 2678.53 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-17 17:31:06,827 epoch 2 - iter 203/292 - loss 0.18058009 - time (sec): 11.85 - samples/sec: 2649.44 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 17:31:08,397 epoch 2 - iter 232/292 - loss 0.19124050 - time (sec): 13.42 - samples/sec: 2617.53 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-17 17:31:10,207 epoch 2 - iter 261/292 - loss 0.18814577 - time (sec): 15.22 - samples/sec: 2640.87 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 17:31:11,880 epoch 2 - iter 290/292 - loss 0.18197479 - time (sec): 16.90 - samples/sec: 2619.81 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-17 17:31:11,977 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:31:11,977 EPOCH 2 done: loss 0.1824 - lr: 0.000045 |
|
2023-10-17 17:31:13,308 DEV : loss 0.14048103988170624 - f1-score (micro avg) 0.6466 |
|
2023-10-17 17:31:13,313 saving best model |
|
2023-10-17 17:31:13,804 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:31:15,639 epoch 3 - iter 29/292 - loss 0.11981676 - time (sec): 1.83 - samples/sec: 2485.88 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-17 17:31:17,273 epoch 3 - iter 58/292 - loss 0.10010235 - time (sec): 3.46 - samples/sec: 2594.59 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 17:31:18,927 epoch 3 - iter 87/292 - loss 0.12124707 - time (sec): 5.12 - samples/sec: 2673.04 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-17 17:31:20,473 epoch 3 - iter 116/292 - loss 0.11456222 - time (sec): 6.67 - samples/sec: 2652.45 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 17:31:22,123 epoch 3 - iter 145/292 - loss 0.11612834 - time (sec): 8.32 - samples/sec: 2626.07 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-17 17:31:23,684 epoch 3 - iter 174/292 - loss 0.10891041 - time (sec): 9.88 - samples/sec: 2616.21 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 17:31:25,443 epoch 3 - iter 203/292 - loss 0.10851194 - time (sec): 11.64 - samples/sec: 2659.50 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-17 17:31:27,180 epoch 3 - iter 232/292 - loss 0.10532314 - time (sec): 13.37 - samples/sec: 2661.79 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 17:31:28,766 epoch 3 - iter 261/292 - loss 0.10435454 - time (sec): 14.96 - samples/sec: 2649.95 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-17 17:31:30,574 epoch 3 - iter 290/292 - loss 0.10430299 - time (sec): 16.77 - samples/sec: 2632.39 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-17 17:31:30,687 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:31:30,687 EPOCH 3 done: loss 0.1054 - lr: 0.000039 |
|
2023-10-17 17:31:31,979 DEV : loss 0.14705337584018707 - f1-score (micro avg) 0.7097 |
|
2023-10-17 17:31:31,985 saving best model |
|
2023-10-17 17:31:32,456 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:31:33,879 epoch 4 - iter 29/292 - loss 0.08724733 - time (sec): 1.42 - samples/sec: 2505.49 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 17:31:35,458 epoch 4 - iter 58/292 - loss 0.06675805 - time (sec): 3.00 - samples/sec: 2619.19 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-17 17:31:37,297 epoch 4 - iter 87/292 - loss 0.06063724 - time (sec): 4.84 - samples/sec: 2651.68 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 17:31:39,062 epoch 4 - iter 116/292 - loss 0.05953907 - time (sec): 6.60 - samples/sec: 2643.44 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-17 17:31:40,676 epoch 4 - iter 145/292 - loss 0.05856854 - time (sec): 8.22 - samples/sec: 2650.73 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 17:31:42,711 epoch 4 - iter 174/292 - loss 0.06007165 - time (sec): 10.25 - samples/sec: 2603.20 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-17 17:31:44,539 epoch 4 - iter 203/292 - loss 0.06473355 - time (sec): 12.08 - samples/sec: 2603.57 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 17:31:46,265 epoch 4 - iter 232/292 - loss 0.06599594 - time (sec): 13.81 - samples/sec: 2586.77 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-17 17:31:47,862 epoch 4 - iter 261/292 - loss 0.06883779 - time (sec): 15.40 - samples/sec: 2608.51 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-17 17:31:49,518 epoch 4 - iter 290/292 - loss 0.06894707 - time (sec): 17.06 - samples/sec: 2589.38 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 17:31:49,618 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:31:49,618 EPOCH 4 done: loss 0.0689 - lr: 0.000033 |
|
2023-10-17 17:31:50,909 DEV : loss 0.1414823830127716 - f1-score (micro avg) 0.7249 |
|
2023-10-17 17:31:50,915 saving best model |
|
2023-10-17 17:31:51,410 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:31:53,319 epoch 5 - iter 29/292 - loss 0.06241394 - time (sec): 1.91 - samples/sec: 2724.21 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-17 17:31:55,075 epoch 5 - iter 58/292 - loss 0.04367683 - time (sec): 3.66 - samples/sec: 2698.48 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 17:31:56,641 epoch 5 - iter 87/292 - loss 0.04656726 - time (sec): 5.23 - samples/sec: 2727.79 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-17 17:31:58,099 epoch 5 - iter 116/292 - loss 0.05088141 - time (sec): 6.69 - samples/sec: 2702.78 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 17:31:59,745 epoch 5 - iter 145/292 - loss 0.05467026 - time (sec): 8.33 - samples/sec: 2695.21 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-17 17:32:01,457 epoch 5 - iter 174/292 - loss 0.05364881 - time (sec): 10.04 - samples/sec: 2717.40 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:32:03,097 epoch 5 - iter 203/292 - loss 0.05135010 - time (sec): 11.68 - samples/sec: 2717.29 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-17 17:32:04,745 epoch 5 - iter 232/292 - loss 0.05250977 - time (sec): 13.33 - samples/sec: 2711.88 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-17 17:32:06,387 epoch 5 - iter 261/292 - loss 0.05110472 - time (sec): 14.97 - samples/sec: 2669.84 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:32:07,968 epoch 5 - iter 290/292 - loss 0.04973646 - time (sec): 16.55 - samples/sec: 2664.60 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-17 17:32:08,086 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:32:08,086 EPOCH 5 done: loss 0.0494 - lr: 0.000028 |
|
2023-10-17 17:32:09,439 DEV : loss 0.14127209782600403 - f1-score (micro avg) 0.7167 |
|
2023-10-17 17:32:09,447 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:32:11,036 epoch 6 - iter 29/292 - loss 0.04625909 - time (sec): 1.59 - samples/sec: 2819.48 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:32:12,510 epoch 6 - iter 58/292 - loss 0.04056874 - time (sec): 3.06 - samples/sec: 2643.07 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-17 17:32:14,343 epoch 6 - iter 87/292 - loss 0.03655091 - time (sec): 4.89 - samples/sec: 2634.06 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:32:15,975 epoch 6 - iter 116/292 - loss 0.03649962 - time (sec): 6.53 - samples/sec: 2633.52 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-17 17:32:17,546 epoch 6 - iter 145/292 - loss 0.03544033 - time (sec): 8.10 - samples/sec: 2558.87 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:32:19,276 epoch 6 - iter 174/292 - loss 0.03744064 - time (sec): 9.83 - samples/sec: 2544.08 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-17 17:32:21,052 epoch 6 - iter 203/292 - loss 0.03723482 - time (sec): 11.60 - samples/sec: 2562.11 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-17 17:32:22,754 epoch 6 - iter 232/292 - loss 0.03746420 - time (sec): 13.30 - samples/sec: 2586.93 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:32:24,378 epoch 6 - iter 261/292 - loss 0.03589379 - time (sec): 14.93 - samples/sec: 2595.57 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-17 17:32:26,153 epoch 6 - iter 290/292 - loss 0.03421415 - time (sec): 16.70 - samples/sec: 2650.81 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:32:26,246 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:32:26,246 EPOCH 6 done: loss 0.0346 - lr: 0.000022 |
|
2023-10-17 17:32:27,563 DEV : loss 0.18597932159900665 - f1-score (micro avg) 0.7516 |
|
2023-10-17 17:32:27,586 saving best model |
|
2023-10-17 17:32:28,046 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:32:29,834 epoch 7 - iter 29/292 - loss 0.00818199 - time (sec): 1.78 - samples/sec: 2760.52 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-17 17:32:31,568 epoch 7 - iter 58/292 - loss 0.02426964 - time (sec): 3.52 - samples/sec: 2614.31 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:32:33,249 epoch 7 - iter 87/292 - loss 0.02134647 - time (sec): 5.20 - samples/sec: 2662.96 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-17 17:32:35,089 epoch 7 - iter 116/292 - loss 0.01965762 - time (sec): 7.04 - samples/sec: 2649.10 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:32:36,865 epoch 7 - iter 145/292 - loss 0.01930732 - time (sec): 8.81 - samples/sec: 2675.36 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-17 17:32:38,478 epoch 7 - iter 174/292 - loss 0.02383070 - time (sec): 10.43 - samples/sec: 2670.87 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-17 17:32:40,141 epoch 7 - iter 203/292 - loss 0.02392436 - time (sec): 12.09 - samples/sec: 2694.46 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:32:41,705 epoch 7 - iter 232/292 - loss 0.02288270 - time (sec): 13.65 - samples/sec: 2691.62 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-17 17:32:43,429 epoch 7 - iter 261/292 - loss 0.02486344 - time (sec): 15.38 - samples/sec: 2630.28 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:32:44,992 epoch 7 - iter 290/292 - loss 0.02465437 - time (sec): 16.94 - samples/sec: 2613.50 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-17 17:32:45,090 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:32:45,090 EPOCH 7 done: loss 0.0247 - lr: 0.000017 |
|
2023-10-17 17:32:46,390 DEV : loss 0.18214021623134613 - f1-score (micro avg) 0.745 |
|
2023-10-17 17:32:46,398 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:32:48,160 epoch 8 - iter 29/292 - loss 0.01468803 - time (sec): 1.76 - samples/sec: 2688.71 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:32:49,913 epoch 8 - iter 58/292 - loss 0.02488023 - time (sec): 3.51 - samples/sec: 2680.57 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-17 17:32:51,722 epoch 8 - iter 87/292 - loss 0.02302285 - time (sec): 5.32 - samples/sec: 2676.97 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:32:53,284 epoch 8 - iter 116/292 - loss 0.02543944 - time (sec): 6.88 - samples/sec: 2644.23 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-17 17:32:54,913 epoch 8 - iter 145/292 - loss 0.02236377 - time (sec): 8.51 - samples/sec: 2591.58 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-17 17:32:56,481 epoch 8 - iter 174/292 - loss 0.02166988 - time (sec): 10.08 - samples/sec: 2560.01 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 17:32:58,182 epoch 8 - iter 203/292 - loss 0.02064423 - time (sec): 11.78 - samples/sec: 2571.56 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-17 17:32:59,794 epoch 8 - iter 232/292 - loss 0.01874188 - time (sec): 13.39 - samples/sec: 2563.73 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:33:01,784 epoch 8 - iter 261/292 - loss 0.01891195 - time (sec): 15.38 - samples/sec: 2622.51 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-17 17:33:03,287 epoch 8 - iter 290/292 - loss 0.01868060 - time (sec): 16.89 - samples/sec: 2626.97 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:33:03,377 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:33:03,377 EPOCH 8 done: loss 0.0187 - lr: 0.000011 |
|
2023-10-17 17:33:04,692 DEV : loss 0.19712962210178375 - f1-score (micro avg) 0.7385 |
|
2023-10-17 17:33:04,697 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:33:06,387 epoch 9 - iter 29/292 - loss 0.01245605 - time (sec): 1.69 - samples/sec: 2525.20 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-17 17:33:07,995 epoch 9 - iter 58/292 - loss 0.01509373 - time (sec): 3.30 - samples/sec: 2471.69 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:33:09,564 epoch 9 - iter 87/292 - loss 0.01256178 - time (sec): 4.87 - samples/sec: 2439.19 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-17 17:33:11,182 epoch 9 - iter 116/292 - loss 0.01064407 - time (sec): 6.48 - samples/sec: 2460.99 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-17 17:33:12,919 epoch 9 - iter 145/292 - loss 0.00982040 - time (sec): 8.22 - samples/sec: 2531.73 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:33:14,624 epoch 9 - iter 174/292 - loss 0.01005871 - time (sec): 9.93 - samples/sec: 2563.45 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-17 17:33:16,259 epoch 9 - iter 203/292 - loss 0.00902004 - time (sec): 11.56 - samples/sec: 2562.04 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 17:33:18,149 epoch 9 - iter 232/292 - loss 0.00850666 - time (sec): 13.45 - samples/sec: 2571.37 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-17 17:33:19,863 epoch 9 - iter 261/292 - loss 0.00998313 - time (sec): 15.17 - samples/sec: 2599.17 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:33:21,594 epoch 9 - iter 290/292 - loss 0.01091769 - time (sec): 16.90 - samples/sec: 2601.11 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-17 17:33:21,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:33:21,788 EPOCH 9 done: loss 0.0109 - lr: 0.000006 |
|
2023-10-17 17:33:23,047 DEV : loss 0.2052023559808731 - f1-score (micro avg) 0.7632 |
|
2023-10-17 17:33:23,053 saving best model |
|
2023-10-17 17:33:23,537 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:33:25,290 epoch 10 - iter 29/292 - loss 0.01432242 - time (sec): 1.75 - samples/sec: 2794.01 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:33:26,991 epoch 10 - iter 58/292 - loss 0.01708261 - time (sec): 3.45 - samples/sec: 2798.02 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-17 17:33:28,628 epoch 10 - iter 87/292 - loss 0.01497146 - time (sec): 5.09 - samples/sec: 2727.39 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-17 17:33:30,287 epoch 10 - iter 116/292 - loss 0.01184976 - time (sec): 6.75 - samples/sec: 2682.04 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:33:31,906 epoch 10 - iter 145/292 - loss 0.00956243 - time (sec): 8.37 - samples/sec: 2689.98 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-17 17:33:33,596 epoch 10 - iter 174/292 - loss 0.00983371 - time (sec): 10.06 - samples/sec: 2667.52 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:33:35,413 epoch 10 - iter 203/292 - loss 0.01046263 - time (sec): 11.87 - samples/sec: 2678.36 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-17 17:33:37,022 epoch 10 - iter 232/292 - loss 0.01126760 - time (sec): 13.48 - samples/sec: 2647.13 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:33:38,797 epoch 10 - iter 261/292 - loss 0.01062140 - time (sec): 15.26 - samples/sec: 2661.98 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-17 17:33:40,353 epoch 10 - iter 290/292 - loss 0.00996671 - time (sec): 16.81 - samples/sec: 2632.41 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-17 17:33:40,459 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:33:40,459 EPOCH 10 done: loss 0.0099 - lr: 0.000000 |
|
2023-10-17 17:33:41,754 DEV : loss 0.19731809198856354 - f1-score (micro avg) 0.7549 |
|
2023-10-17 17:33:42,117 ---------------------------------------------------------------------------------------------------- |
|
2023-10-17 17:33:42,119 Loading model from best epoch ... |
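The "saving best model" lines in this log are consistent with a checkpoint being written whenever the dev micro F1 strictly exceeds the best value seen so far. The sketch below replays that rule over the DEV scores logged for epochs 1 to 10. The strict-improvement rule is an assumption inferred from the log, not taken from Flair's source, and the variable names are illustrative.

```python
# Dev micro-F1 per epoch, copied from the DEV lines of this log.
dev_f1 = [0.5031, 0.6466, 0.7097, 0.7249, 0.7167,
          0.7516, 0.745, 0.7385, 0.7632, 0.7549]

best_f1 = float("-inf")
saved_epochs = []  # epochs at which a checkpoint would be written

for epoch, score in enumerate(dev_f1, start=1):
    if score > best_f1:  # assumed rule: save only on strict improvement
        best_f1 = score
        saved_epochs.append(epoch)

best_epoch = saved_epochs[-1]
print("checkpoints written after epochs:", saved_epochs)
print("best epoch:", best_epoch, "dev micro-F1:", best_f1)
```

Under this rule the model reloaded for the final evaluation is the one saved after epoch 9.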
|
2023-10-17 17:33:43,724 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
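The 17 tags in the dictionary above follow from a BIOES-style scheme: four entity types, each with the prefixes S, B, E and I, plus the outside tag O. This also matches `out_features=17` on the tagger's final linear layer. A short sketch that rebuilds the list in the logged order (the variable names are illustrative):

```python
# Entity types and prefixes in the order they appear in the logged dictionary.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```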
|
2023-10-17 17:33:46,279 |
|
Results: |
|
- F-score (micro) 0.7605 |
|
- F-score (macro) 0.6957 |
|
- Accuracy 0.6295 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.8244    0.8362    0.8302       348 |
|
         LOC     0.6387    0.8467    0.7282       261 |
|
         ORG     0.4902    0.4808    0.4854        52 |
|
   HumanProd     0.7083    0.7727    0.7391        22 |
|
|
|
   micro avg     0.7158    0.8111    0.7605       683 |
|
   macro avg     0.6654    0.7341    0.6957       683 |
|
weighted avg     0.7242    0.8111    0.7621       683 |
|
|
|
2023-10-17 17:33:46,279 ---------------------------------------------------------------------------------------------------- |
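The summary rows of the "By class" table can be checked from the per-class rows. The sketch below recomputes the micro F1 as the harmonic mean of the micro precision and recall, the macro F1 as the unweighted mean of the per-class F1 scores, and the weighted F1 as the support-weighted mean. All inputs are copied from the table as rounded to four decimals, so the weighted average is only reproduced to within rounding error.

```python
# (precision, recall, f1, support) per class, copied from the "By class" table.
per_class = {
    "PER": (0.8244, 0.8362, 0.8302, 348),
    "LOC": (0.6387, 0.8467, 0.7282, 261),
    "ORG": (0.4902, 0.4808, 0.4854, 52),
    "HumanProd": (0.7083, 0.7727, 0.7391, 22),
}

# Micro F1 is the harmonic mean of micro precision and micro recall.
micro_p, micro_r = 0.7158, 0.8111
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1 weights each class F1 by its support.
total_support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f1 * n for _, _, f1, n in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f} (support {total_support})")
```

The gap between micro F1 (0.7605) and macro F1 (0.6957) comes largely from ORG, which has low support (52) and the lowest F1, so it pulls the unweighted mean down more than the micro average.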
|
|