2023-10-17 20:02:07,718 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,719 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:02:07,719 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,719 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 20:02:07,719 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,719 Train: 1085 sentences
2023-10-17 20:02:07,719 (train_with_dev=False, train_with_test=False)
2023-10-17 20:02:07,719 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,719 Training Params:
2023-10-17 20:02:07,719 - learning_rate: "5e-05"
2023-10-17 20:02:07,719 - mini_batch_size: "8"
2023-10-17 20:02:07,719 - max_epochs: "10"
2023-10-17 20:02:07,719 - shuffle: "True"
2023-10-17 20:02:07,719 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,719 Plugins:
2023-10-17 20:02:07,720 - TensorboardLogger
2023-10-17 20:02:07,720 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:02:07,720 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,720 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:02:07,720 - metric: "('micro avg', 'f1-score')"
2023-10-17 20:02:07,720 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,720 Computation:
2023-10-17 20:02:07,720 - compute on device: cuda:0
2023-10-17 20:02:07,720 - embedding storage: none
2023-10-17 20:02:07,720 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,720 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 20:02:07,720 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,720 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:07,720 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 20:02:09,052 epoch 1 - iter 13/136 - loss 3.90680085 - time (sec): 1.33 - samples/sec: 3528.83 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:02:10,322 epoch 1 - iter 26/136 - loss 3.55895351 - time (sec): 2.60 - samples/sec: 3430.69 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:02:11,749 epoch 1 - iter 39/136 - loss 2.74126999 - time (sec): 4.03 - samples/sec: 3594.16 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:02:13,066 epoch 1 - iter 52/136 - loss 2.17219974 - time (sec): 5.35 - samples/sec: 3639.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:02:14,451 epoch 1 - iter 65/136 - loss 1.87047412 - time (sec): 6.73 - samples/sec: 3551.62 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:02:15,833 epoch 1 - iter 78/136 - loss 1.60103945 - time (sec): 8.11 - samples/sec: 3604.49 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:02:17,294 epoch 1 - iter 91/136 - loss 1.41316409 - time (sec): 9.57 - samples/sec: 3599.18 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:02:18,721 epoch 1 - iter 104/136 - loss 1.25997927 - time (sec): 11.00 - samples/sec: 3626.72 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:02:20,174 epoch 1 - iter 117/136 - loss 1.14147820 - time (sec): 12.45 - samples/sec: 3637.12 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:02:21,708 epoch 1 - iter 130/136 - loss 1.05471195 - time (sec): 13.99 - samples/sec: 3567.84 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:02:22,347 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:22,348 EPOCH 1 done: loss 1.0206 - lr: 0.000047
2023-10-17 20:02:23,475 DEV : loss 0.1729203313589096 - f1-score (micro avg) 0.6042
2023-10-17 20:02:23,480 saving best model
2023-10-17 20:02:23,882 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:25,325 epoch 2 - iter 13/136 - loss 0.14807463 - time (sec): 1.44 - samples/sec: 3461.82 - lr: 0.000050 - momentum: 0.000000
2023-10-17 20:02:26,662 epoch 2 - iter 26/136 - loss 0.15887359 - time (sec): 2.78 - samples/sec: 3620.58 - lr: 0.000049 - momentum: 0.000000
2023-10-17 20:02:28,031 epoch 2 - iter 39/136 - loss 0.16702427 - time (sec): 4.15 - samples/sec: 3613.56 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:02:29,731 epoch 2 - iter 52/136 - loss 0.16318825 - time (sec): 5.85 - samples/sec: 3524.72 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:02:30,932 epoch 2 - iter 65/136 - loss 0.16136772 - time (sec): 7.05 - samples/sec: 3570.87 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:02:32,381 epoch 2 - iter 78/136 - loss 0.16561702 - time (sec): 8.50 - samples/sec: 3496.40 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:02:33,674 epoch 2 - iter 91/136 - loss 0.16472638 - time (sec): 9.79 - samples/sec: 3496.53 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:02:35,123 epoch 2 - iter 104/136 - loss 0.15378818 - time (sec): 11.24 - samples/sec: 3543.22 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:02:36,492 epoch 2 - iter 117/136 - loss 0.15058060 - time (sec): 12.61 - samples/sec: 3498.24 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:02:37,954 epoch 2 - iter 130/136 - loss 0.14512488 - time (sec): 14.07 - samples/sec: 3536.68 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:02:38,493 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:38,494 EPOCH 2 done: loss 0.1436 - lr: 0.000045
2023-10-17 20:02:39,948 DEV : loss 0.10842905938625336 - f1-score (micro avg) 0.7751
2023-10-17 20:02:39,954 saving best model
2023-10-17 20:02:40,491 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:42,041 epoch 3 - iter 13/136 - loss 0.07133591 - time (sec): 1.54 - samples/sec: 3401.08 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:02:43,550 epoch 3 - iter 26/136 - loss 0.08050463 - time (sec): 3.05 - samples/sec: 3625.60 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:02:44,850 epoch 3 - iter 39/136 - loss 0.07549434 - time (sec): 4.35 - samples/sec: 3638.66 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:02:46,389 epoch 3 - iter 52/136 - loss 0.07539455 - time (sec): 5.89 - samples/sec: 3583.23 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:02:47,824 epoch 3 - iter 65/136 - loss 0.07284452 - time (sec): 7.32 - samples/sec: 3546.17 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:02:49,109 epoch 3 - iter 78/136 - loss 0.07551226 - time (sec): 8.61 - samples/sec: 3584.68 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:02:50,507 epoch 3 - iter 91/136 - loss 0.08313272 - time (sec): 10.01 - samples/sec: 3547.32 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:02:51,670 epoch 3 - iter 104/136 - loss 0.08415168 - time (sec): 11.17 - samples/sec: 3576.19 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:02:53,075 epoch 3 - iter 117/136 - loss 0.08122647 - time (sec): 12.58 - samples/sec: 3597.21 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:02:54,397 epoch 3 - iter 130/136 - loss 0.08038663 - time (sec): 13.90 - samples/sec: 3598.01 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:02:54,935 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:54,935 EPOCH 3 done: loss 0.0800 - lr: 0.000039
2023-10-17 20:02:56,467 DEV : loss 0.12283124774694443 - f1-score (micro avg) 0.7792
2023-10-17 20:02:56,473 saving best model
2023-10-17 20:02:56,981 ----------------------------------------------------------------------------------------------------
2023-10-17 20:02:58,359 epoch 4 - iter 13/136 - loss 0.04451944 - time (sec): 1.38 - samples/sec: 3361.74 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:03:00,015 epoch 4 - iter 26/136 - loss 0.03568214 - time (sec): 3.03 - samples/sec: 3180.10 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:03:01,549 epoch 4 - iter 39/136 - loss 0.04359458 - time (sec): 4.57 - samples/sec: 3335.54 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:03:02,824 epoch 4 - iter 52/136 - loss 0.04163316 - time (sec): 5.84 - samples/sec: 3370.12 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:03:04,009 epoch 4 - iter 65/136 - loss 0.04418803 - time (sec): 7.03 - samples/sec: 3383.09 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:03:05,626 epoch 4 - iter 78/136 - loss 0.04593159 - time (sec): 8.64 - samples/sec: 3444.03 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:03:07,242 epoch 4 - iter 91/136 - loss 0.04579649 - time (sec): 10.26 - samples/sec: 3411.15 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:03:08,671 epoch 4 - iter 104/136 - loss 0.04668489 - time (sec): 11.69 - samples/sec: 3408.71 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:03:09,931 epoch 4 - iter 117/136 - loss 0.04644160 - time (sec): 12.95 - samples/sec: 3470.46 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:03:11,202 epoch 4 - iter 130/136 - loss 0.04726019 - time (sec): 14.22 - samples/sec: 3452.42 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:03:11,829 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:11,830 EPOCH 4 done: loss 0.0486 - lr: 0.000034
2023-10-17 20:03:13,341 DEV : loss 0.10291425883769989 - f1-score (micro avg) 0.7812
2023-10-17 20:03:13,348 saving best model
2023-10-17 20:03:13,865 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:15,205 epoch 5 - iter 13/136 - loss 0.03258676 - time (sec): 1.33 - samples/sec: 3792.78 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:03:16,642 epoch 5 - iter 26/136 - loss 0.03067879 - time (sec): 2.77 - samples/sec: 3721.20 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:03:18,006 epoch 5 - iter 39/136 - loss 0.03276621 - time (sec): 4.13 - samples/sec: 3770.05 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:03:19,436 epoch 5 - iter 52/136 - loss 0.03765332 - time (sec): 5.56 - samples/sec: 3678.39 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:03:21,046 epoch 5 - iter 65/136 - loss 0.03607231 - time (sec): 7.18 - samples/sec: 3570.71 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:03:22,456 epoch 5 - iter 78/136 - loss 0.03339675 - time (sec): 8.59 - samples/sec: 3564.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:03:23,830 epoch 5 - iter 91/136 - loss 0.03650890 - time (sec): 9.96 - samples/sec: 3557.82 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:03:25,217 epoch 5 - iter 104/136 - loss 0.03442960 - time (sec): 11.35 - samples/sec: 3562.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:03:26,500 epoch 5 - iter 117/136 - loss 0.03459512 - time (sec): 12.63 - samples/sec: 3533.11 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:03:27,877 epoch 5 - iter 130/136 - loss 0.03444956 - time (sec): 14.01 - samples/sec: 3536.40 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:03:28,518 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:28,518 EPOCH 5 done: loss 0.0340 - lr: 0.000028
2023-10-17 20:03:29,983 DEV : loss 0.11607682704925537 - f1-score (micro avg) 0.7934
2023-10-17 20:03:29,989 saving best model
2023-10-17 20:03:30,726 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:31,877 epoch 6 - iter 13/136 - loss 0.01514382 - time (sec): 1.15 - samples/sec: 3999.40 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:03:33,319 epoch 6 - iter 26/136 - loss 0.02288618 - time (sec): 2.59 - samples/sec: 3691.16 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:03:34,661 epoch 6 - iter 39/136 - loss 0.02429339 - time (sec): 3.93 - samples/sec: 3697.51 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:03:36,064 epoch 6 - iter 52/136 - loss 0.02463788 - time (sec): 5.34 - samples/sec: 3650.79 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:03:37,426 epoch 6 - iter 65/136 - loss 0.02298716 - time (sec): 6.70 - samples/sec: 3683.08 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:03:38,843 epoch 6 - iter 78/136 - loss 0.02247348 - time (sec): 8.12 - samples/sec: 3754.68 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:03:40,196 epoch 6 - iter 91/136 - loss 0.02314909 - time (sec): 9.47 - samples/sec: 3748.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:03:41,642 epoch 6 - iter 104/136 - loss 0.02276842 - time (sec): 10.91 - samples/sec: 3682.51 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:03:43,003 epoch 6 - iter 117/136 - loss 0.02092841 - time (sec): 12.28 - samples/sec: 3662.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:03:44,335 epoch 6 - iter 130/136 - loss 0.02176745 - time (sec): 13.61 - samples/sec: 3636.94 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:03:44,970 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:44,970 EPOCH 6 done: loss 0.0221 - lr: 0.000023
2023-10-17 20:03:46,513 DEV : loss 0.1339295357465744 - f1-score (micro avg) 0.8155
2023-10-17 20:03:46,521 saving best model
2023-10-17 20:03:47,019 ----------------------------------------------------------------------------------------------------
2023-10-17 20:03:48,905 epoch 7 - iter 13/136 - loss 0.02470688 - time (sec): 1.88 - samples/sec: 3215.00 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:03:50,305 epoch 7 - iter 26/136 - loss 0.01909740 - time (sec): 3.28 - samples/sec: 3374.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:03:51,503 epoch 7 - iter 39/136 - loss 0.01699709 - time (sec): 4.48 - samples/sec: 3470.03 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:03:52,802 epoch 7 - iter 52/136 - loss 0.01571998 - time (sec): 5.78 - samples/sec: 3403.29 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:03:54,275 epoch 7 - iter 65/136 - loss 0.01520591 - time (sec): 7.25 - samples/sec: 3387.89 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:03:55,529 epoch 7 - iter 78/136 - loss 0.01504275 - time (sec): 8.51 - samples/sec: 3398.14 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:03:56,914 epoch 7 - iter 91/136 - loss 0.01386626 - time (sec): 9.89 - samples/sec: 3452.43 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:03:58,403 epoch 7 - iter 104/136 - loss 0.01333893 - time (sec): 11.38 - samples/sec: 3487.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:03:59,839 epoch 7 - iter 117/136 - loss 0.01428210 - time (sec): 12.82 - samples/sec: 3486.38 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:04:01,239 epoch 7 - iter 130/136 - loss 0.01389741 - time (sec): 14.22 - samples/sec: 3477.85 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:04:02,009 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:02,009 EPOCH 7 done: loss 0.0150 - lr: 0.000017
2023-10-17 20:04:03,491 DEV : loss 0.1483554244041443 - f1-score (micro avg) 0.8101
2023-10-17 20:04:03,497 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:04,888 epoch 8 - iter 13/136 - loss 0.01313850 - time (sec): 1.39 - samples/sec: 3546.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:04:06,855 epoch 8 - iter 26/136 - loss 0.00694009 - time (sec): 3.36 - samples/sec: 3079.49 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:04:08,190 epoch 8 - iter 39/136 - loss 0.00596193 - time (sec): 4.69 - samples/sec: 3339.91 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:04:09,645 epoch 8 - iter 52/136 - loss 0.00751512 - time (sec): 6.15 - samples/sec: 3288.52 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:04:11,049 epoch 8 - iter 65/136 - loss 0.00859132 - time (sec): 7.55 - samples/sec: 3348.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:04:12,587 epoch 8 - iter 78/136 - loss 0.00954904 - time (sec): 9.09 - samples/sec: 3331.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:04:14,359 epoch 8 - iter 91/136 - loss 0.01015131 - time (sec): 10.86 - samples/sec: 3336.20 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:04:15,628 epoch 8 - iter 104/136 - loss 0.01038419 - time (sec): 12.13 - samples/sec: 3364.49 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:04:16,838 epoch 8 - iter 117/136 - loss 0.01034903 - time (sec): 13.34 - samples/sec: 3327.07 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:04:18,376 epoch 8 - iter 130/136 - loss 0.01005288 - time (sec): 14.88 - samples/sec: 3340.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:04:19,027 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:19,027 EPOCH 8 done: loss 0.0103 - lr: 0.000012
2023-10-17 20:04:20,496 DEV : loss 0.15991544723510742 - f1-score (micro avg) 0.8199
2023-10-17 20:04:20,502 saving best model
2023-10-17 20:04:21,106 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:22,433 epoch 9 - iter 13/136 - loss 0.00224882 - time (sec): 1.32 - samples/sec: 3563.19 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:04:23,917 epoch 9 - iter 26/136 - loss 0.00183179 - time (sec): 2.81 - samples/sec: 3419.00 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:04:25,133 epoch 9 - iter 39/136 - loss 0.00194506 - time (sec): 4.03 - samples/sec: 3360.84 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:04:26,792 epoch 9 - iter 52/136 - loss 0.00308254 - time (sec): 5.68 - samples/sec: 3396.35 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:04:28,070 epoch 9 - iter 65/136 - loss 0.01003080 - time (sec): 6.96 - samples/sec: 3442.17 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:04:29,442 epoch 9 - iter 78/136 - loss 0.01082740 - time (sec): 8.33 - samples/sec: 3427.57 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:04:31,187 epoch 9 - iter 91/136 - loss 0.00961569 - time (sec): 10.08 - samples/sec: 3468.68 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:04:32,615 epoch 9 - iter 104/136 - loss 0.01034925 - time (sec): 11.51 - samples/sec: 3533.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:04:33,821 epoch 9 - iter 117/136 - loss 0.00979864 - time (sec): 12.71 - samples/sec: 3507.74 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:04:35,280 epoch 9 - iter 130/136 - loss 0.00932433 - time (sec): 14.17 - samples/sec: 3496.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:04:35,947 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:35,947 EPOCH 9 done: loss 0.0090 - lr: 0.000006
2023-10-17 20:04:37,520 DEV : loss 0.16742360591888428 - f1-score (micro avg) 0.8066
2023-10-17 20:04:37,530 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:39,101 epoch 10 - iter 13/136 - loss 0.00275634 - time (sec): 1.57 - samples/sec: 3043.58 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:04:40,427 epoch 10 - iter 26/136 - loss 0.00255019 - time (sec): 2.90 - samples/sec: 3219.14 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:04:41,754 epoch 10 - iter 39/136 - loss 0.00297794 - time (sec): 4.22 - samples/sec: 3334.06 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:04:43,057 epoch 10 - iter 52/136 - loss 0.00301243 - time (sec): 5.53 - samples/sec: 3515.85 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:04:44,654 epoch 10 - iter 65/136 - loss 0.00244713 - time (sec): 7.12 - samples/sec: 3482.34 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:04:46,218 epoch 10 - iter 78/136 - loss 0.00342059 - time (sec): 8.69 - samples/sec: 3477.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:04:47,740 epoch 10 - iter 91/136 - loss 0.00341841 - time (sec): 10.21 - samples/sec: 3447.75 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:04:49,055 epoch 10 - iter 104/136 - loss 0.00409207 - time (sec): 11.52 - samples/sec: 3433.47 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:04:50,709 epoch 10 - iter 117/136 - loss 0.00660157 - time (sec): 13.18 - samples/sec: 3456.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:04:51,941 epoch 10 - iter 130/136 - loss 0.00640003 - time (sec): 14.41 - samples/sec: 3456.15 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:04:52,544 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:52,544 EPOCH 10 done: loss 0.0063 - lr: 0.000000
2023-10-17 20:04:54,087 DEV : loss 0.1717977672815323 - f1-score (micro avg) 0.8051
2023-10-17 20:04:54,476 ----------------------------------------------------------------------------------------------------
2023-10-17 20:04:54,477 Loading model from best epoch ...
2023-10-17 20:04:56,043 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 20:04:58,126
Results:
- F-score (micro) 0.7891
- F-score (macro) 0.7256
- Accuracy 0.6694

By class:
              precision    recall  f1-score   support

         LOC     0.8171    0.8590    0.8375       312
         PER     0.7258    0.8654    0.7895       208
         ORG     0.5217    0.4364    0.4752        55
   HumanProd     0.7143    0.9091    0.8000        22

   micro avg     0.7569    0.8241    0.7891       597
   macro avg     0.6947    0.7675    0.7256       597
weighted avg     0.7543    0.8241    0.7860       597

2023-10-17 20:04:58,126 ----------------------------------------------------------------------------------------------------