2023-10-17 19:45:28,906 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
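The layer shapes printed above are enough to sanity-check the model size. The sketch below back-computes an approximate parameter count from those shapes alone (the `linear_params` helper is illustrative; the total ignores any modules not shown in the printout, such as a pooler):

```python
# Rough parameter count reconstructed from the printed module shapes.
def linear_params(n_in, n_out, bias=True):
    return n_in * n_out + (n_out if bias else 0)

hidden, ffn, layers = 768, 3072, 12

# word + position + token-type embeddings, plus LayerNorm weight and bias
embeddings = 32001 * hidden + 512 * hidden + 2 * hidden + 2 * hidden

per_layer = (
    4 * linear_params(hidden, hidden)   # query, key, value, attention output
    + linear_params(hidden, ffn)        # intermediate dense
    + linear_params(ffn, hidden)        # output dense
    + 2 * 2 * hidden                    # two LayerNorms (weight + bias each)
)

head = linear_params(hidden, 17)        # the tagger's linear layer: 17 tags

total = embeddings + layers * per_layer + head
print(f"~{total / 1e6:.1f}M parameters")  # ~110.0M, i.e. an ELECTRA-base-sized encoder
```

The ~110M total matches what one would expect for a base-sized discriminator with a 32k vocabulary.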
2023-10-17 19:45:28,907 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 Train: 1085 sentences
2023-10-17 19:45:28,907 (train_with_dev=False, train_with_test=False)
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 Training Params:
2023-10-17 19:45:28,907 - learning_rate: "3e-05"
2023-10-17 19:45:28,907 - mini_batch_size: "8"
2023-10-17 19:45:28,907 - max_epochs: "10"
2023-10-17 19:45:28,907 - shuffle: "True"
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Plugins:
2023-10-17 19:45:28,908 - TensorboardLogger
2023-10-17 19:45:28,908 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
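The LinearScheduler plugin with `warmup_fraction: 0.1` produces the lr trace visible in the per-iteration lines below: the rate ramps up over the first 10% of steps (136 of the 1360 total, i.e. all of epoch 1) and then decays linearly to zero. A minimal sketch of that shape, assuming the usual warmup-then-decay formulation (the function name and exact endpoint handling here are illustrative, not Flair's internals):

```python
def linear_schedule_lr(step, base_lr=3e-05, total_steps=1360, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps = 136 batches/epoch * 10 epochs, as in the log below;
    warmup covers the first 10% of steps (136 of 1360).
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_schedule_lr(13))    # ~3e-06, matching "lr: 0.000003" at epoch 1, iter 13
print(linear_schedule_lr(136))   # peak: 3e-05, reached at the end of epoch 1
print(linear_schedule_lr(1360))  # 0.0 at the final step, matching "lr: 0.000000"
```

The momentum column stays at 0.000000 throughout because the run uses AdamW-style optimization rather than SGD with momentum (an inference from the log, not stated in it).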
2023-10-17 19:45:28,908 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:45:28,908 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Computation:
2023-10-17 19:45:28,908 - compute on device: cuda:0
2023-10-17 19:45:28,908 - embedding storage: none
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:45:30,157 epoch 1 - iter 13/136 - loss 3.33397868 - time (sec): 1.25 - samples/sec: 3543.27 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:45:31,618 epoch 1 - iter 26/136 - loss 3.22956761 - time (sec): 2.71 - samples/sec: 3393.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:45:33,051 epoch 1 - iter 39/136 - loss 2.74193109 - time (sec): 4.14 - samples/sec: 3491.67 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:45:34,191 epoch 1 - iter 52/136 - loss 2.31884286 - time (sec): 5.28 - samples/sec: 3598.45 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:45:35,431 epoch 1 - iter 65/136 - loss 1.92717289 - time (sec): 6.52 - samples/sec: 3698.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:45:37,072 epoch 1 - iter 78/136 - loss 1.63979275 - time (sec): 8.16 - samples/sec: 3639.22 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:45:38,366 epoch 1 - iter 91/136 - loss 1.46722015 - time (sec): 9.46 - samples/sec: 3694.78 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:45:39,666 epoch 1 - iter 104/136 - loss 1.32920820 - time (sec): 10.76 - samples/sec: 3675.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:45:41,263 epoch 1 - iter 117/136 - loss 1.21334918 - time (sec): 12.35 - samples/sec: 3660.52 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:45:42,510 epoch 1 - iter 130/136 - loss 1.12478609 - time (sec): 13.60 - samples/sec: 3660.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:43,301 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:43,302 EPOCH 1 done: loss 1.0853 - lr: 0.000028
2023-10-17 19:45:44,154 DEV : loss 0.1887417882680893 - f1-score (micro avg) 0.586
2023-10-17 19:45:44,158 saving best model
2023-10-17 19:45:44,504 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:45,778 epoch 2 - iter 13/136 - loss 0.23547845 - time (sec): 1.27 - samples/sec: 3414.02 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:45:47,241 epoch 2 - iter 26/136 - loss 0.22643586 - time (sec): 2.74 - samples/sec: 3443.08 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:48,623 epoch 2 - iter 39/136 - loss 0.22725957 - time (sec): 4.12 - samples/sec: 3473.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:50,104 epoch 2 - iter 52/136 - loss 0.20122890 - time (sec): 5.60 - samples/sec: 3494.90 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:51,460 epoch 2 - iter 65/136 - loss 0.19554214 - time (sec): 6.95 - samples/sec: 3548.04 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:52,778 epoch 2 - iter 78/136 - loss 0.19027869 - time (sec): 8.27 - samples/sec: 3542.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:54,064 epoch 2 - iter 91/136 - loss 0.18118561 - time (sec): 9.56 - samples/sec: 3544.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:55,724 epoch 2 - iter 104/136 - loss 0.17483536 - time (sec): 11.22 - samples/sec: 3587.86 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:57,243 epoch 2 - iter 117/136 - loss 0.17304087 - time (sec): 12.74 - samples/sec: 3614.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:58,451 epoch 2 - iter 130/136 - loss 0.17118168 - time (sec): 13.95 - samples/sec: 3592.97 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:58,972 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:58,972 EPOCH 2 done: loss 0.1697 - lr: 0.000027
2023-10-17 19:46:00,584 DEV : loss 0.12722007930278778 - f1-score (micro avg) 0.7203
2023-10-17 19:46:00,589 saving best model
2023-10-17 19:46:01,048 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:02,406 epoch 3 - iter 13/136 - loss 0.08552455 - time (sec): 1.36 - samples/sec: 3306.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:03,767 epoch 3 - iter 26/136 - loss 0.09740518 - time (sec): 2.72 - samples/sec: 3531.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:05,173 epoch 3 - iter 39/136 - loss 0.11284075 - time (sec): 4.12 - samples/sec: 3575.31 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:06,487 epoch 3 - iter 52/136 - loss 0.10646472 - time (sec): 5.44 - samples/sec: 3509.39 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:07,718 epoch 3 - iter 65/136 - loss 0.10566909 - time (sec): 6.67 - samples/sec: 3524.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:08,935 epoch 3 - iter 78/136 - loss 0.10377471 - time (sec): 7.89 - samples/sec: 3599.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:10,354 epoch 3 - iter 91/136 - loss 0.09934481 - time (sec): 9.30 - samples/sec: 3603.58 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:11,675 epoch 3 - iter 104/136 - loss 0.09830355 - time (sec): 10.62 - samples/sec: 3648.98 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:13,252 epoch 3 - iter 117/136 - loss 0.10073907 - time (sec): 12.20 - samples/sec: 3645.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:14,799 epoch 3 - iter 130/136 - loss 0.09879412 - time (sec): 13.75 - samples/sec: 3625.32 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:15,460 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:15,460 EPOCH 3 done: loss 0.0966 - lr: 0.000024
2023-10-17 19:46:16,898 DEV : loss 0.09972850233316422 - f1-score (micro avg) 0.7873
2023-10-17 19:46:16,902 saving best model
2023-10-17 19:46:17,348 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:18,637 epoch 4 - iter 13/136 - loss 0.07795068 - time (sec): 1.29 - samples/sec: 3810.25 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:46:19,792 epoch 4 - iter 26/136 - loss 0.06346444 - time (sec): 2.44 - samples/sec: 3703.32 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:46:21,115 epoch 4 - iter 39/136 - loss 0.05993848 - time (sec): 3.76 - samples/sec: 3705.07 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:46:22,543 epoch 4 - iter 52/136 - loss 0.06706455 - time (sec): 5.19 - samples/sec: 3569.75 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:46:23,805 epoch 4 - iter 65/136 - loss 0.06461519 - time (sec): 6.45 - samples/sec: 3592.16 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:46:25,238 epoch 4 - iter 78/136 - loss 0.06394341 - time (sec): 7.89 - samples/sec: 3569.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:26,546 epoch 4 - iter 91/136 - loss 0.06499679 - time (sec): 9.20 - samples/sec: 3567.70 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:27,903 epoch 4 - iter 104/136 - loss 0.06187012 - time (sec): 10.55 - samples/sec: 3560.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:29,513 epoch 4 - iter 117/136 - loss 0.06028076 - time (sec): 12.16 - samples/sec: 3579.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:31,140 epoch 4 - iter 130/136 - loss 0.06154054 - time (sec): 13.79 - samples/sec: 3601.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:46:31,699 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:31,700 EPOCH 4 done: loss 0.0611 - lr: 0.000020
2023-10-17 19:46:33,128 DEV : loss 0.09438183903694153 - f1-score (micro avg) 0.803
2023-10-17 19:46:33,132 saving best model
2023-10-17 19:46:33,761 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:35,228 epoch 5 - iter 13/136 - loss 0.06198827 - time (sec): 1.46 - samples/sec: 3298.59 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:46:36,638 epoch 5 - iter 26/136 - loss 0.04800064 - time (sec): 2.87 - samples/sec: 3300.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:38,124 epoch 5 - iter 39/136 - loss 0.04515388 - time (sec): 4.36 - samples/sec: 3366.74 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:39,531 epoch 5 - iter 52/136 - loss 0.03997014 - time (sec): 5.76 - samples/sec: 3421.11 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:40,813 epoch 5 - iter 65/136 - loss 0.03825992 - time (sec): 7.05 - samples/sec: 3434.66 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:41,979 epoch 5 - iter 78/136 - loss 0.03974778 - time (sec): 8.21 - samples/sec: 3503.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:43,548 epoch 5 - iter 91/136 - loss 0.03991506 - time (sec): 9.78 - samples/sec: 3510.34 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:44,816 epoch 5 - iter 104/136 - loss 0.03902788 - time (sec): 11.05 - samples/sec: 3559.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:46,465 epoch 5 - iter 117/136 - loss 0.03926071 - time (sec): 12.70 - samples/sec: 3534.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:46:47,825 epoch 5 - iter 130/136 - loss 0.03853492 - time (sec): 14.06 - samples/sec: 3542.59 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:46:48,438 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:48,439 EPOCH 5 done: loss 0.0379 - lr: 0.000017
2023-10-17 19:46:49,873 DEV : loss 0.10785163938999176 - f1-score (micro avg) 0.8104
2023-10-17 19:46:49,877 saving best model
2023-10-17 19:46:50,310 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:51,730 epoch 6 - iter 13/136 - loss 0.01830058 - time (sec): 1.42 - samples/sec: 3035.63 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:46:53,071 epoch 6 - iter 26/136 - loss 0.01862414 - time (sec): 2.76 - samples/sec: 3360.91 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:46:54,711 epoch 6 - iter 39/136 - loss 0.02509421 - time (sec): 4.40 - samples/sec: 3351.12 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:46:56,193 epoch 6 - iter 52/136 - loss 0.02464938 - time (sec): 5.88 - samples/sec: 3463.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:46:57,438 epoch 6 - iter 65/136 - loss 0.02773964 - time (sec): 7.12 - samples/sec: 3440.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:46:58,758 epoch 6 - iter 78/136 - loss 0.02642769 - time (sec): 8.44 - samples/sec: 3426.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:47:00,054 epoch 6 - iter 91/136 - loss 0.02877906 - time (sec): 9.74 - samples/sec: 3470.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:47:01,788 epoch 6 - iter 104/136 - loss 0.02708005 - time (sec): 11.47 - samples/sec: 3497.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:47:03,068 epoch 6 - iter 117/136 - loss 0.02760006 - time (sec): 12.75 - samples/sec: 3516.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:47:04,509 epoch 6 - iter 130/136 - loss 0.02671052 - time (sec): 14.20 - samples/sec: 3525.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:47:05,087 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:05,087 EPOCH 6 done: loss 0.0264 - lr: 0.000014
2023-10-17 19:47:06,521 DEV : loss 0.12122640013694763 - f1-score (micro avg) 0.7861
2023-10-17 19:47:06,526 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:07,844 epoch 7 - iter 13/136 - loss 0.02099326 - time (sec): 1.32 - samples/sec: 3110.22 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:47:09,260 epoch 7 - iter 26/136 - loss 0.02059330 - time (sec): 2.73 - samples/sec: 3263.22 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:47:10,650 epoch 7 - iter 39/136 - loss 0.01707556 - time (sec): 4.12 - samples/sec: 3229.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:12,053 epoch 7 - iter 52/136 - loss 0.01532293 - time (sec): 5.53 - samples/sec: 3391.43 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:13,382 epoch 7 - iter 65/136 - loss 0.01502420 - time (sec): 6.85 - samples/sec: 3391.53 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:14,856 epoch 7 - iter 78/136 - loss 0.01574769 - time (sec): 8.33 - samples/sec: 3422.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:16,276 epoch 7 - iter 91/136 - loss 0.01574047 - time (sec): 9.75 - samples/sec: 3493.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:47:17,808 epoch 7 - iter 104/136 - loss 0.01664688 - time (sec): 11.28 - samples/sec: 3484.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:47:19,187 epoch 7 - iter 117/136 - loss 0.01828514 - time (sec): 12.66 - samples/sec: 3536.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:47:20,540 epoch 7 - iter 130/136 - loss 0.01861858 - time (sec): 14.01 - samples/sec: 3564.77 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:47:21,234 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:21,235 EPOCH 7 done: loss 0.0195 - lr: 0.000010
2023-10-17 19:47:22,686 DEV : loss 0.13704432547092438 - f1-score (micro avg) 0.8029
2023-10-17 19:47:22,691 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:24,097 epoch 8 - iter 13/136 - loss 0.00946865 - time (sec): 1.40 - samples/sec: 3371.58 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:47:25,336 epoch 8 - iter 26/136 - loss 0.00842619 - time (sec): 2.64 - samples/sec: 3425.53 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:26,731 epoch 8 - iter 39/136 - loss 0.00904894 - time (sec): 4.04 - samples/sec: 3397.39 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:27,998 epoch 8 - iter 52/136 - loss 0.00964021 - time (sec): 5.31 - samples/sec: 3432.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:29,552 epoch 8 - iter 65/136 - loss 0.01280372 - time (sec): 6.86 - samples/sec: 3450.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:30,934 epoch 8 - iter 78/136 - loss 0.01318227 - time (sec): 8.24 - samples/sec: 3539.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:47:32,250 epoch 8 - iter 91/136 - loss 0.01418217 - time (sec): 9.56 - samples/sec: 3562.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:47:33,892 epoch 8 - iter 104/136 - loss 0.01464463 - time (sec): 11.20 - samples/sec: 3535.50 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:47:35,321 epoch 8 - iter 117/136 - loss 0.01425410 - time (sec): 12.63 - samples/sec: 3519.96 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:47:36,777 epoch 8 - iter 130/136 - loss 0.01341235 - time (sec): 14.08 - samples/sec: 3539.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:47:37,343 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:37,343 EPOCH 8 done: loss 0.0137 - lr: 0.000007
2023-10-17 19:47:38,772 DEV : loss 0.147489994764328 - f1-score (micro avg) 0.8133
2023-10-17 19:47:38,777 saving best model
2023-10-17 19:47:39,246 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:40,780 epoch 9 - iter 13/136 - loss 0.01225089 - time (sec): 1.53 - samples/sec: 3784.91 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:42,049 epoch 9 - iter 26/136 - loss 0.01027809 - time (sec): 2.80 - samples/sec: 3673.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:43,575 epoch 9 - iter 39/136 - loss 0.01343566 - time (sec): 4.33 - samples/sec: 3474.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:44,917 epoch 9 - iter 52/136 - loss 0.01351620 - time (sec): 5.67 - samples/sec: 3460.21 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:46,304 epoch 9 - iter 65/136 - loss 0.01350860 - time (sec): 7.06 - samples/sec: 3485.46 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:47:47,589 epoch 9 - iter 78/136 - loss 0.01288045 - time (sec): 8.34 - samples/sec: 3527.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:47:49,069 epoch 9 - iter 91/136 - loss 0.01276011 - time (sec): 9.82 - samples/sec: 3540.13 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:47:50,459 epoch 9 - iter 104/136 - loss 0.01242219 - time (sec): 11.21 - samples/sec: 3572.81 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:47:51,830 epoch 9 - iter 117/136 - loss 0.01140109 - time (sec): 12.58 - samples/sec: 3547.14 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:47:53,229 epoch 9 - iter 130/136 - loss 0.01123631 - time (sec): 13.98 - samples/sec: 3547.43 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:47:53,837 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:53,837 EPOCH 9 done: loss 0.0109 - lr: 0.000004
2023-10-17 19:47:55,272 DEV : loss 0.15262548625469208 - f1-score (micro avg) 0.8119
2023-10-17 19:47:55,277 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:56,911 epoch 10 - iter 13/136 - loss 0.00591935 - time (sec): 1.63 - samples/sec: 3084.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:47:58,332 epoch 10 - iter 26/136 - loss 0.00456826 - time (sec): 3.05 - samples/sec: 3228.15 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:47:59,747 epoch 10 - iter 39/136 - loss 0.00486841 - time (sec): 4.47 - samples/sec: 3354.36 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:48:01,332 epoch 10 - iter 52/136 - loss 0.00683218 - time (sec): 6.05 - samples/sec: 3357.32 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:48:02,768 epoch 10 - iter 65/136 - loss 0.00718841 - time (sec): 7.49 - samples/sec: 3390.65 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:48:04,258 epoch 10 - iter 78/136 - loss 0.00664417 - time (sec): 8.98 - samples/sec: 3381.59 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:48:05,817 epoch 10 - iter 91/136 - loss 0.00757883 - time (sec): 10.54 - samples/sec: 3403.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:48:07,143 epoch 10 - iter 104/136 - loss 0.00822163 - time (sec): 11.87 - samples/sec: 3432.20 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:48:08,483 epoch 10 - iter 117/136 - loss 0.00956093 - time (sec): 13.21 - samples/sec: 3445.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:48:09,763 epoch 10 - iter 130/136 - loss 0.00934435 - time (sec): 14.49 - samples/sec: 3452.90 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:48:10,257 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:10,258 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 19:48:11,702 DEV : loss 0.15066301822662354 - f1-score (micro avg) 0.8194
2023-10-17 19:48:11,706 saving best model
2023-10-17 19:48:12,533 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:12,534 Loading model from best epoch ...
2023-10-17 19:48:14,235 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
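The 17-tag dictionary above is a BIOES scheme over four entity types: `O` plus Single, Begin, End, and Inside tags for LOC, PER, HumanProd, and ORG (1 + 4 × 4 = 17, matching the tagger's `out_features=17` linear layer). Decoding such a tag sequence into entity spans can be sketched as follows (a minimal decoder that assumes well-formed sequences; it is not Flair's own implementation):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end_exclusive, label) spans.

    Minimal sketch: assumes a well-formed sequence and silently skips
    malformed transitions (e.g. an E- with no preceding B-).
    """
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((i, i + 1, label))
        elif prefix == "B":                     # entity opens
            start = i
        elif prefix == "E" and start is not None:  # entity closes
            spans.append((start, i + 1, label))
            start = None
    return spans

print(bioes_to_spans(["B-LOC", "E-LOC", "O", "S-PER"]))
# → [(0, 2, 'LOC'), (3, 4, 'PER')]
```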
2023-10-17 19:48:16,231
Results:
- F-score (micro) 0.7789
- F-score (macro) 0.7273
- Accuracy 0.6539

By class:
              precision    recall  f1-score   support

         LOC     0.8024    0.8590    0.8297       312
         PER     0.7087    0.8654    0.7792       208
         ORG     0.4483    0.4727    0.4602        55
   HumanProd     0.7500    0.9545    0.8400        22

   micro avg     0.7344    0.8291    0.7789       597
   macro avg     0.6773    0.7879    0.7273       597
weighted avg     0.7352    0.8291    0.7785       597

2023-10-17 19:48:16,231 ----------------------------------------------------------------------------------------------------
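The micro and macro rows of the table can be reproduced from the per-class numbers. In the sketch below the (TP, predicted, gold-support) counts are back-computed from the printed precision, recall, and support columns, so they are inferred rather than taken directly from the log:

```python
# (true positives, predicted spans, gold spans) inferred from the table above
counts = {
    "LOC":       (268, 334, 312),
    "PER":       (180, 254, 208),
    "ORG":       (26,  58,  55),
    "HumanProd": (21,  28,  22),
}

def prf(tp, pred, gold):
    p = tp / pred                  # precision
    r = tp / gold                  # recall
    return p, r, 2 * p * r / (p + r)

# micro average: pool the raw counts over all classes before computing P/R/F
tp = sum(c[0] for c in counts.values())
pred = sum(c[1] for c in counts.values())
gold = sum(c[2] for c in counts.values())
micro_p, micro_r, micro_f = prf(tp, pred, gold)

# macro average: unweighted mean of the per-class F1 scores
macro_f = sum(prf(*c)[2] for c in counts.values()) / len(counts)

print(round(micro_f, 4), round(macro_f, 4))  # 0.7789 0.7273
```

The gap between micro (0.7789) and macro (0.7273) F1 is driven by the weak ORG class, which macro averaging weights equally with the much larger LOC and PER classes.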