|
2023-10-16 18:37:27,557 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 18:37:27,558 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-16 18:37:27,558 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 Train: 1166 sentences |
|
2023-10-16 18:37:27,558 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 18:37:27,558 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 Training Params: |
|
2023-10-16 18:37:27,558 - learning_rate: "5e-05" |
|
2023-10-16 18:37:27,558 - mini_batch_size: "4" |
|
2023-10-16 18:37:27,558 - max_epochs: "10" |
|
2023-10-16 18:37:27,558 - shuffle: "True" |
|
2023-10-16 18:37:27,558 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 Plugins: |
|
2023-10-16 18:37:27,558 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 18:37:27,558 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 18:37:27,558 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 18:37:27,558 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,558 Computation: |
|
2023-10-16 18:37:27,558 - compute on device: cuda:0 |
|
2023-10-16 18:37:27,558 - embedding storage: none |
|
2023-10-16 18:37:27,559 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,559 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-16 18:37:27,559 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:27,559 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:29,222 epoch 1 - iter 29/292 - loss 2.91011187 - time (sec): 1.66 - samples/sec: 2715.94 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:37:30,705 epoch 1 - iter 58/292 - loss 2.22168904 - time (sec): 3.15 - samples/sec: 2650.68 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:37:32,423 epoch 1 - iter 87/292 - loss 1.60697027 - time (sec): 4.86 - samples/sec: 2672.46 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:37:34,276 epoch 1 - iter 116/292 - loss 1.34788026 - time (sec): 6.72 - samples/sec: 2668.82 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:37:35,860 epoch 1 - iter 145/292 - loss 1.20838293 - time (sec): 8.30 - samples/sec: 2628.47 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:37:37,675 epoch 1 - iter 174/292 - loss 1.08372999 - time (sec): 10.11 - samples/sec: 2660.03 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:37:39,244 epoch 1 - iter 203/292 - loss 0.97907216 - time (sec): 11.68 - samples/sec: 2651.97 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:37:40,953 epoch 1 - iter 232/292 - loss 0.88148784 - time (sec): 13.39 - samples/sec: 2677.37 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:37:42,632 epoch 1 - iter 261/292 - loss 0.81684973 - time (sec): 15.07 - samples/sec: 2674.36 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:37:44,197 epoch 1 - iter 290/292 - loss 0.77210022 - time (sec): 16.64 - samples/sec: 2662.04 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:37:44,290 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:44,290 EPOCH 1 done: loss 0.7702 - lr: 0.000049 |
|
2023-10-16 18:37:45,123 DEV : loss 0.19686032831668854 - f1-score (micro avg) 0.493 |
|
2023-10-16 18:37:45,131 saving best model |
|
2023-10-16 18:37:45,486 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:37:47,222 epoch 2 - iter 29/292 - loss 0.25306732 - time (sec): 1.73 - samples/sec: 2531.79 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:37:49,128 epoch 2 - iter 58/292 - loss 0.28082060 - time (sec): 3.64 - samples/sec: 2729.34 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:37:50,890 epoch 2 - iter 87/292 - loss 0.26712309 - time (sec): 5.40 - samples/sec: 2640.62 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 18:37:52,650 epoch 2 - iter 116/292 - loss 0.25249370 - time (sec): 7.16 - samples/sec: 2644.03 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 18:37:54,152 epoch 2 - iter 145/292 - loss 0.23957595 - time (sec): 8.66 - samples/sec: 2642.03 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 18:37:55,790 epoch 2 - iter 174/292 - loss 0.23236470 - time (sec): 10.30 - samples/sec: 2668.36 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 18:37:57,259 epoch 2 - iter 203/292 - loss 0.22615504 - time (sec): 11.77 - samples/sec: 2663.40 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 18:37:58,791 epoch 2 - iter 232/292 - loss 0.21951834 - time (sec): 13.30 - samples/sec: 2667.54 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 18:38:00,470 epoch 2 - iter 261/292 - loss 0.20918561 - time (sec): 14.98 - samples/sec: 2662.81 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:38:02,029 epoch 2 - iter 290/292 - loss 0.20202252 - time (sec): 16.54 - samples/sec: 2660.35 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:38:02,141 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:02,142 EPOCH 2 done: loss 0.2003 - lr: 0.000045 |
|
2023-10-16 18:38:03,417 DEV : loss 0.158650204539299 - f1-score (micro avg) 0.6542 |
|
2023-10-16 18:38:03,421 saving best model |
|
2023-10-16 18:38:04,123 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:05,892 epoch 3 - iter 29/292 - loss 0.10153556 - time (sec): 1.77 - samples/sec: 2802.00 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 18:38:07,818 epoch 3 - iter 58/292 - loss 0.10831370 - time (sec): 3.69 - samples/sec: 2690.02 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 18:38:09,316 epoch 3 - iter 87/292 - loss 0.10922557 - time (sec): 5.19 - samples/sec: 2642.27 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 18:38:10,763 epoch 3 - iter 116/292 - loss 0.10600736 - time (sec): 6.64 - samples/sec: 2584.80 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 18:38:12,440 epoch 3 - iter 145/292 - loss 0.10724913 - time (sec): 8.32 - samples/sec: 2589.17 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 18:38:14,103 epoch 3 - iter 174/292 - loss 0.10441347 - time (sec): 9.98 - samples/sec: 2655.49 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 18:38:15,983 epoch 3 - iter 203/292 - loss 0.10838574 - time (sec): 11.86 - samples/sec: 2713.07 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 18:38:17,565 epoch 3 - iter 232/292 - loss 0.10772044 - time (sec): 13.44 - samples/sec: 2683.94 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:38:19,145 epoch 3 - iter 261/292 - loss 0.10838088 - time (sec): 15.02 - samples/sec: 2670.22 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:38:20,782 epoch 3 - iter 290/292 - loss 0.11248021 - time (sec): 16.66 - samples/sec: 2645.58 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 18:38:20,890 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:20,890 EPOCH 3 done: loss 0.1121 - lr: 0.000039 |
|
2023-10-16 18:38:22,165 DEV : loss 0.16866865754127502 - f1-score (micro avg) 0.6454 |
|
2023-10-16 18:38:22,169 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:23,948 epoch 4 - iter 29/292 - loss 0.07309028 - time (sec): 1.78 - samples/sec: 2907.31 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 18:38:25,692 epoch 4 - iter 58/292 - loss 0.09292286 - time (sec): 3.52 - samples/sec: 2722.63 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 18:38:27,438 epoch 4 - iter 87/292 - loss 0.07807604 - time (sec): 5.27 - samples/sec: 2734.45 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 18:38:28,839 epoch 4 - iter 116/292 - loss 0.07803060 - time (sec): 6.67 - samples/sec: 2670.69 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 18:38:30,525 epoch 4 - iter 145/292 - loss 0.07517713 - time (sec): 8.36 - samples/sec: 2705.98 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 18:38:32,284 epoch 4 - iter 174/292 - loss 0.07995214 - time (sec): 10.11 - samples/sec: 2661.50 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 18:38:33,865 epoch 4 - iter 203/292 - loss 0.07654734 - time (sec): 11.69 - samples/sec: 2627.69 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:38:35,747 epoch 4 - iter 232/292 - loss 0.07335699 - time (sec): 13.58 - samples/sec: 2669.63 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:38:37,270 epoch 4 - iter 261/292 - loss 0.07681713 - time (sec): 15.10 - samples/sec: 2664.75 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 18:38:38,836 epoch 4 - iter 290/292 - loss 0.07686933 - time (sec): 16.67 - samples/sec: 2653.19 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 18:38:38,925 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:38,925 EPOCH 4 done: loss 0.0769 - lr: 0.000033 |
|
2023-10-16 18:38:40,210 DEV : loss 0.15432444214820862 - f1-score (micro avg) 0.6467 |
|
2023-10-16 18:38:40,214 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:41,860 epoch 5 - iter 29/292 - loss 0.04301207 - time (sec): 1.64 - samples/sec: 2695.24 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 18:38:43,551 epoch 5 - iter 58/292 - loss 0.03809167 - time (sec): 3.34 - samples/sec: 2614.48 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 18:38:45,275 epoch 5 - iter 87/292 - loss 0.03839505 - time (sec): 5.06 - samples/sec: 2614.76 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 18:38:47,049 epoch 5 - iter 116/292 - loss 0.04375124 - time (sec): 6.83 - samples/sec: 2711.30 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 18:38:48,759 epoch 5 - iter 145/292 - loss 0.04266788 - time (sec): 8.54 - samples/sec: 2750.18 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 18:38:50,425 epoch 5 - iter 174/292 - loss 0.04535040 - time (sec): 10.21 - samples/sec: 2726.59 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:38:52,095 epoch 5 - iter 203/292 - loss 0.04546838 - time (sec): 11.88 - samples/sec: 2752.67 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:38:53,669 epoch 5 - iter 232/292 - loss 0.04687883 - time (sec): 13.45 - samples/sec: 2724.78 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:38:55,152 epoch 5 - iter 261/292 - loss 0.05118725 - time (sec): 14.94 - samples/sec: 2708.21 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:38:56,656 epoch 5 - iter 290/292 - loss 0.05214197 - time (sec): 16.44 - samples/sec: 2686.31 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:38:56,758 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:38:56,759 EPOCH 5 done: loss 0.0520 - lr: 0.000028 |
|
2023-10-16 18:38:58,008 DEV : loss 0.14177842438220978 - f1-score (micro avg) 0.7619 |
|
2023-10-16 18:38:58,013 saving best model |
|
2023-10-16 18:38:58,483 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:00,099 epoch 6 - iter 29/292 - loss 0.05101971 - time (sec): 1.61 - samples/sec: 2737.38 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:39:01,800 epoch 6 - iter 58/292 - loss 0.04834906 - time (sec): 3.31 - samples/sec: 2802.96 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:39:03,356 epoch 6 - iter 87/292 - loss 0.04266830 - time (sec): 4.87 - samples/sec: 2789.38 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:39:04,991 epoch 6 - iter 116/292 - loss 0.04019668 - time (sec): 6.50 - samples/sec: 2730.45 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:39:06,648 epoch 6 - iter 145/292 - loss 0.03960645 - time (sec): 8.16 - samples/sec: 2712.36 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:39:08,243 epoch 6 - iter 174/292 - loss 0.03900318 - time (sec): 9.76 - samples/sec: 2670.81 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:39:09,906 epoch 6 - iter 203/292 - loss 0.03756220 - time (sec): 11.42 - samples/sec: 2689.48 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:39:11,580 epoch 6 - iter 232/292 - loss 0.03654571 - time (sec): 13.09 - samples/sec: 2710.25 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:39:13,227 epoch 6 - iter 261/292 - loss 0.03692190 - time (sec): 14.74 - samples/sec: 2708.96 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:39:14,980 epoch 6 - iter 290/292 - loss 0.03851499 - time (sec): 16.49 - samples/sec: 2683.26 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:39:15,071 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:15,072 EPOCH 6 done: loss 0.0385 - lr: 0.000022 |
|
2023-10-16 18:39:16,426 DEV : loss 0.15027689933776855 - f1-score (micro avg) 0.7458 |
|
2023-10-16 18:39:16,432 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:18,171 epoch 7 - iter 29/292 - loss 0.02746134 - time (sec): 1.74 - samples/sec: 2521.73 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:39:19,827 epoch 7 - iter 58/292 - loss 0.02252500 - time (sec): 3.39 - samples/sec: 2513.96 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:39:21,484 epoch 7 - iter 87/292 - loss 0.02102995 - time (sec): 5.05 - samples/sec: 2608.39 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:39:23,481 epoch 7 - iter 116/292 - loss 0.02302638 - time (sec): 7.05 - samples/sec: 2551.46 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:39:25,058 epoch 7 - iter 145/292 - loss 0.02321526 - time (sec): 8.62 - samples/sec: 2542.08 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:39:26,976 epoch 7 - iter 174/292 - loss 0.02710328 - time (sec): 10.54 - samples/sec: 2593.33 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 18:39:28,492 epoch 7 - iter 203/292 - loss 0.02550259 - time (sec): 12.06 - samples/sec: 2600.79 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:39:30,206 epoch 7 - iter 232/292 - loss 0.02846380 - time (sec): 13.77 - samples/sec: 2594.47 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:39:31,916 epoch 7 - iter 261/292 - loss 0.02759356 - time (sec): 15.48 - samples/sec: 2595.48 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:39:33,466 epoch 7 - iter 290/292 - loss 0.02720347 - time (sec): 17.03 - samples/sec: 2595.07 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:39:33,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:33,565 EPOCH 7 done: loss 0.0271 - lr: 0.000017 |
|
2023-10-16 18:39:34,842 DEV : loss 0.16174526512622833 - f1-score (micro avg) 0.7617 |
|
2023-10-16 18:39:34,847 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:36,418 epoch 8 - iter 29/292 - loss 0.01140027 - time (sec): 1.57 - samples/sec: 2814.16 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:39:38,168 epoch 8 - iter 58/292 - loss 0.01308990 - time (sec): 3.32 - samples/sec: 2789.42 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:39:40,004 epoch 8 - iter 87/292 - loss 0.02439680 - time (sec): 5.16 - samples/sec: 2768.89 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:39:41,475 epoch 8 - iter 116/292 - loss 0.02707583 - time (sec): 6.63 - samples/sec: 2672.02 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:39:43,374 epoch 8 - iter 145/292 - loss 0.02509823 - time (sec): 8.53 - samples/sec: 2722.03 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:39:44,987 epoch 8 - iter 174/292 - loss 0.02364774 - time (sec): 10.14 - samples/sec: 2699.66 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:39:46,675 epoch 8 - iter 203/292 - loss 0.02165952 - time (sec): 11.83 - samples/sec: 2664.10 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:39:48,333 epoch 8 - iter 232/292 - loss 0.02096976 - time (sec): 13.49 - samples/sec: 2679.52 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:39:50,009 epoch 8 - iter 261/292 - loss 0.02043181 - time (sec): 15.16 - samples/sec: 2681.49 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:39:51,460 epoch 8 - iter 290/292 - loss 0.02046018 - time (sec): 16.61 - samples/sec: 2663.85 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:39:51,550 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:51,550 EPOCH 8 done: loss 0.0204 - lr: 0.000011 |
|
2023-10-16 18:39:52,806 DEV : loss 0.1880948841571808 - f1-score (micro avg) 0.7164 |
|
2023-10-16 18:39:52,810 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:39:54,412 epoch 9 - iter 29/292 - loss 0.02255474 - time (sec): 1.60 - samples/sec: 2980.03 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:39:56,208 epoch 9 - iter 58/292 - loss 0.02845919 - time (sec): 3.40 - samples/sec: 2836.44 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:39:57,808 epoch 9 - iter 87/292 - loss 0.02396023 - time (sec): 5.00 - samples/sec: 2698.78 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:39:59,390 epoch 9 - iter 116/292 - loss 0.02250170 - time (sec): 6.58 - samples/sec: 2704.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:40:00,927 epoch 9 - iter 145/292 - loss 0.02016379 - time (sec): 8.12 - samples/sec: 2666.79 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:40:02,535 epoch 9 - iter 174/292 - loss 0.01984362 - time (sec): 9.72 - samples/sec: 2655.54 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:40:04,175 epoch 9 - iter 203/292 - loss 0.01724021 - time (sec): 11.36 - samples/sec: 2698.32 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:40:05,675 epoch 9 - iter 232/292 - loss 0.01701978 - time (sec): 12.86 - samples/sec: 2654.89 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:40:07,426 epoch 9 - iter 261/292 - loss 0.01550812 - time (sec): 14.62 - samples/sec: 2648.98 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:40:09,246 epoch 9 - iter 290/292 - loss 0.01507845 - time (sec): 16.43 - samples/sec: 2677.62 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:40:09,380 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:40:09,380 EPOCH 9 done: loss 0.0149 - lr: 0.000006 |
|
2023-10-16 18:40:10,634 DEV : loss 0.1720294952392578 - f1-score (micro avg) 0.7479 |
|
2023-10-16 18:40:10,639 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:40:12,109 epoch 10 - iter 29/292 - loss 0.01950142 - time (sec): 1.47 - samples/sec: 2638.32 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:40:13,665 epoch 10 - iter 58/292 - loss 0.01376562 - time (sec): 3.02 - samples/sec: 2703.94 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:40:15,361 epoch 10 - iter 87/292 - loss 0.01047050 - time (sec): 4.72 - samples/sec: 2740.54 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 18:40:17,260 epoch 10 - iter 116/292 - loss 0.01107990 - time (sec): 6.62 - samples/sec: 2746.99 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:40:18,822 epoch 10 - iter 145/292 - loss 0.01088382 - time (sec): 8.18 - samples/sec: 2721.81 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:40:20,596 epoch 10 - iter 174/292 - loss 0.01209993 - time (sec): 9.96 - samples/sec: 2724.55 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:40:22,121 epoch 10 - iter 203/292 - loss 0.01153913 - time (sec): 11.48 - samples/sec: 2708.05 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:40:23,805 epoch 10 - iter 232/292 - loss 0.01083310 - time (sec): 13.16 - samples/sec: 2693.51 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:40:25,547 epoch 10 - iter 261/292 - loss 0.00986534 - time (sec): 14.91 - samples/sec: 2720.24 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:40:27,096 epoch 10 - iter 290/292 - loss 0.00929650 - time (sec): 16.46 - samples/sec: 2692.69 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 18:40:27,177 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:40:27,177 EPOCH 10 done: loss 0.0093 - lr: 0.000000 |
|
2023-10-16 18:40:28,479 DEV : loss 0.1805955320596695 - f1-score (micro avg) 0.7542 |
|
2023-10-16 18:40:28,856 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:40:28,858 Loading model from best epoch ... |
|
2023-10-16 18:40:30,628 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-16 18:40:33,999 |
|
Results: |
|
- F-score (micro) 0.7443 |
|
- F-score (macro) 0.6697 |
|
- Accuracy 0.6164 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         PER     0.8146    0.8333    0.8239       348 |
|
         LOC     0.6460    0.7969    0.7136       261 |
|
         ORG     0.3833    0.4423    0.4107        52 |
|
   HumanProd     0.6333    0.8636    0.7308        22 |
|

|
   micro avg     0.7031    0.7906    0.7443       683 |
|
   macro avg     0.6193    0.7341    0.6697       683 |
|
weighted avg     0.7115    0.7906    0.7473       683 |
|
|
|
2023-10-16 18:40:34,000 ---------------------------------------------------------------------------------------------------- |
|
|