2023-10-17 19:45:28,906 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
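The module shapes in the repr above can be used to sanity-check the size of the backbone. A minimal sketch, pure arithmetic derived from the printed shapes, assuming every Linear and LayerNorm carries the standard bias/weight terms:

```python
# Parameter count implied by the module shapes printed above.
hidden, inter, vocab, pos, types, layers = 768, 3072, 32001, 512, 2, 12

def linear(n_in, n_out):
    # weight matrix plus bias vector
    return n_in * n_out + n_out

layer_norm = 2 * hidden  # scale + shift

per_layer = (
    3 * linear(hidden, hidden)   # query / key / value projections
    + linear(hidden, hidden)     # attention output dense
    + linear(hidden, inter)      # intermediate dense
    + linear(inter, hidden)      # output dense
    + 2 * layer_norm             # two LayerNorms per ElectraLayer
)

embeddings = (vocab + pos + types) * hidden + layer_norm
encoder = layers * per_layer
print(embeddings + encoder)  # roughly 110M, consistent with an ELECTRA-base discriminator
```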
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 Train: 1085 sentences
2023-10-17 19:45:28,907 (train_with_dev=False, train_with_test=False)
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,907 Training Params:
2023-10-17 19:45:28,907 - learning_rate: "3e-05"
2023-10-17 19:45:28,907 - mini_batch_size: "8"
2023-10-17 19:45:28,907 - max_epochs: "10"
2023-10-17 19:45:28,907 - shuffle: "True"
2023-10-17 19:45:28,907 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Plugins:
2023-10-17 19:45:28,908 - TensorboardLogger
2023-10-17 19:45:28,908 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
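The lr column in the iteration lines below is consistent with a linear warmup/decay schedule over all 1360 optimizer steps (136 iterations x 10 epochs) with the logged warmup_fraction of 0.1. A minimal sketch of that schedule, a hypothetical helper rather than Flair's actual LinearScheduler code:

```python
def linear_schedule(step, total_steps=1360, peak_lr=3e-5, warmup_fraction=0.1):
    """LR after `step` optimizer steps: linear ramp to peak, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 136 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Epoch 1, iter 13 gives ~0.000003, matching the first logged lr value.
print(round(linear_schedule(13), 6))
# The very last step decays to 0.0, matching "lr: 0.000000" at the end of epoch 10.
print(linear_schedule(1360))
```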
2023-10-17 19:45:28,908 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:45:28,908 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Computation:
2023-10-17 19:45:28,908 - compute on device: cuda:0
2023-10-17 19:45:28,908 - embedding storage: none
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:28,908 ----------------------------------------------------------------------------------------------------
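The parameters logged above (and encoded in the base path: bs8, lr 3e-05, 10 epochs, first-subtoken pooling, last layer only, no CRF) correspond roughly to the Flair fine-tuning configuration sketched below. This is reconstructed from the log, not the actual training script, and the HIPE-2022 loader arguments in particular are assumptions:

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Assumed loader arguments for the NewsEye Swedish split used in this run.
corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",               # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst"
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
    tag_type="ner",
    use_crf=False,             # "crfFalse"
    use_rnn=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)
```

`fine_tune` is what pairs the AdamW-style optimizer with the linear warmup scheduler seen in the Plugins section; this block is a configuration sketch and needs GPU plus dataset downloads to actually run.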
2023-10-17 19:45:28,908 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:45:30,157 epoch 1 - iter 13/136 - loss 3.33397868 - time (sec): 1.25 - samples/sec: 3543.27 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:45:31,618 epoch 1 - iter 26/136 - loss 3.22956761 - time (sec): 2.71 - samples/sec: 3393.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:45:33,051 epoch 1 - iter 39/136 - loss 2.74193109 - time (sec): 4.14 - samples/sec: 3491.67 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:45:34,191 epoch 1 - iter 52/136 - loss 2.31884286 - time (sec): 5.28 - samples/sec: 3598.45 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:45:35,431 epoch 1 - iter 65/136 - loss 1.92717289 - time (sec): 6.52 - samples/sec: 3698.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:45:37,072 epoch 1 - iter 78/136 - loss 1.63979275 - time (sec): 8.16 - samples/sec: 3639.22 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:45:38,366 epoch 1 - iter 91/136 - loss 1.46722015 - time (sec): 9.46 - samples/sec: 3694.78 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:45:39,666 epoch 1 - iter 104/136 - loss 1.32920820 - time (sec): 10.76 - samples/sec: 3675.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:45:41,263 epoch 1 - iter 117/136 - loss 1.21334918 - time (sec): 12.35 - samples/sec: 3660.52 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:45:42,510 epoch 1 - iter 130/136 - loss 1.12478609 - time (sec): 13.60 - samples/sec: 3660.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:43,301 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:43,302 EPOCH 1 done: loss 1.0853 - lr: 0.000028
2023-10-17 19:45:44,154 DEV : loss 0.1887417882680893 - f1-score (micro avg) 0.586
2023-10-17 19:45:44,158 saving best model
2023-10-17 19:45:44,504 ----------------------------------------------------------------------------------------------------
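Every iteration line follows a fixed format, so the metrics can be recovered with a simple regex for plotting or comparison. A minimal sketch; the pattern is an assumption based on the lines in this log, shown against a line copied from epoch 1 above:

```python
import re

# Verbatim iteration line from epoch 1 above.
LINE = ("2023-10-17 19:45:42,510 epoch 1 - iter 130/136 - loss 1.12478609"
        " - time (sec): 13.60 - samples/sec: 3660.60 - lr: 0.000028 - momentum: 0.000000")

# Capture epoch, iteration counter, running loss, and current learning rate.
PATTERN = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+).*?lr: (?P<lr>[\d.]+)"
)

m = PATTERN.search(LINE)
print(m["epoch"], m["iter"], float(m["loss"]), float(m["lr"]))
```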
2023-10-17 19:45:45,778 epoch 2 - iter 13/136 - loss 0.23547845 - time (sec): 1.27 - samples/sec: 3414.02 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:45:47,241 epoch 2 - iter 26/136 - loss 0.22643586 - time (sec): 2.74 - samples/sec: 3443.08 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:48,623 epoch 2 - iter 39/136 - loss 0.22725957 - time (sec): 4.12 - samples/sec: 3473.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:50,104 epoch 2 - iter 52/136 - loss 0.20122890 - time (sec): 5.60 - samples/sec: 3494.90 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:51,460 epoch 2 - iter 65/136 - loss 0.19554214 - time (sec): 6.95 - samples/sec: 3548.04 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:52,778 epoch 2 - iter 78/136 - loss 0.19027869 - time (sec): 8.27 - samples/sec: 3542.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:54,064 epoch 2 - iter 91/136 - loss 0.18118561 - time (sec): 9.56 - samples/sec: 3544.54 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:55,724 epoch 2 - iter 104/136 - loss 0.17483536 - time (sec): 11.22 - samples/sec: 3587.86 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:57,243 epoch 2 - iter 117/136 - loss 0.17304087 - time (sec): 12.74 - samples/sec: 3614.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:58,451 epoch 2 - iter 130/136 - loss 0.17118168 - time (sec): 13.95 - samples/sec: 3592.97 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:58,972 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:58,972 EPOCH 2 done: loss 0.1697 - lr: 0.000027
2023-10-17 19:46:00,584 DEV : loss 0.12722007930278778 - f1-score (micro avg) 0.7203
2023-10-17 19:46:00,589 saving best model
2023-10-17 19:46:01,048 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:02,406 epoch 3 - iter 13/136 - loss 0.08552455 - time (sec): 1.36 - samples/sec: 3306.82 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:03,767 epoch 3 - iter 26/136 - loss 0.09740518 - time (sec): 2.72 - samples/sec: 3531.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:05,173 epoch 3 - iter 39/136 - loss 0.11284075 - time (sec): 4.12 - samples/sec: 3575.31 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:06,487 epoch 3 - iter 52/136 - loss 0.10646472 - time (sec): 5.44 - samples/sec: 3509.39 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:07,718 epoch 3 - iter 65/136 - loss 0.10566909 - time (sec): 6.67 - samples/sec: 3524.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:08,935 epoch 3 - iter 78/136 - loss 0.10377471 - time (sec): 7.89 - samples/sec: 3599.86 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:10,354 epoch 3 - iter 91/136 - loss 0.09934481 - time (sec): 9.30 - samples/sec: 3603.58 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:11,675 epoch 3 - iter 104/136 - loss 0.09830355 - time (sec): 10.62 - samples/sec: 3648.98 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:13,252 epoch 3 - iter 117/136 - loss 0.10073907 - time (sec): 12.20 - samples/sec: 3645.18 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:14,799 epoch 3 - iter 130/136 - loss 0.09879412 - time (sec): 13.75 - samples/sec: 3625.32 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:15,460 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:15,460 EPOCH 3 done: loss 0.0966 - lr: 0.000024
2023-10-17 19:46:16,898 DEV : loss 0.09972850233316422 - f1-score (micro avg) 0.7873
2023-10-17 19:46:16,902 saving best model
2023-10-17 19:46:17,348 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:18,637 epoch 4 - iter 13/136 - loss 0.07795068 - time (sec): 1.29 - samples/sec: 3810.25 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:46:19,792 epoch 4 - iter 26/136 - loss 0.06346444 - time (sec): 2.44 - samples/sec: 3703.32 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:46:21,115 epoch 4 - iter 39/136 - loss 0.05993848 - time (sec): 3.76 - samples/sec: 3705.07 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:46:22,543 epoch 4 - iter 52/136 - loss 0.06706455 - time (sec): 5.19 - samples/sec: 3569.75 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:46:23,805 epoch 4 - iter 65/136 - loss 0.06461519 - time (sec): 6.45 - samples/sec: 3592.16 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:46:25,238 epoch 4 - iter 78/136 - loss 0.06394341 - time (sec): 7.89 - samples/sec: 3569.06 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:26,546 epoch 4 - iter 91/136 - loss 0.06499679 - time (sec): 9.20 - samples/sec: 3567.70 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:27,903 epoch 4 - iter 104/136 - loss 0.06187012 - time (sec): 10.55 - samples/sec: 3560.69 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:29,513 epoch 4 - iter 117/136 - loss 0.06028076 - time (sec): 12.16 - samples/sec: 3579.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:46:31,140 epoch 4 - iter 130/136 - loss 0.06154054 - time (sec): 13.79 - samples/sec: 3601.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:46:31,699 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:31,700 EPOCH 4 done: loss 0.0611 - lr: 0.000020
2023-10-17 19:46:33,128 DEV : loss 0.09438183903694153 - f1-score (micro avg) 0.803
2023-10-17 19:46:33,132 saving best model
2023-10-17 19:46:33,761 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:35,228 epoch 5 - iter 13/136 - loss 0.06198827 - time (sec): 1.46 - samples/sec: 3298.59 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:46:36,638 epoch 5 - iter 26/136 - loss 0.04800064 - time (sec): 2.87 - samples/sec: 3300.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:38,124 epoch 5 - iter 39/136 - loss 0.04515388 - time (sec): 4.36 - samples/sec: 3366.74 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:39,531 epoch 5 - iter 52/136 - loss 0.03997014 - time (sec): 5.76 - samples/sec: 3421.11 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:46:40,813 epoch 5 - iter 65/136 - loss 0.03825992 - time (sec): 7.05 - samples/sec: 3434.66 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:41,979 epoch 5 - iter 78/136 - loss 0.03974778 - time (sec): 8.21 - samples/sec: 3503.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:43,548 epoch 5 - iter 91/136 - loss 0.03991506 - time (sec): 9.78 - samples/sec: 3510.34 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:44,816 epoch 5 - iter 104/136 - loss 0.03902788 - time (sec): 11.05 - samples/sec: 3559.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:46:46,465 epoch 5 - iter 117/136 - loss 0.03926071 - time (sec): 12.70 - samples/sec: 3534.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:46:47,825 epoch 5 - iter 130/136 - loss 0.03853492 - time (sec): 14.06 - samples/sec: 3542.59 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:46:48,438 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:48,439 EPOCH 5 done: loss 0.0379 - lr: 0.000017
2023-10-17 19:46:49,873 DEV : loss 0.10785163938999176 - f1-score (micro avg) 0.8104
2023-10-17 19:46:49,877 saving best model
2023-10-17 19:46:50,310 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:51,730 epoch 6 - iter 13/136 - loss 0.01830058 - time (sec): 1.42 - samples/sec: 3035.63 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:46:53,071 epoch 6 - iter 26/136 - loss 0.01862414 - time (sec): 2.76 - samples/sec: 3360.91 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:46:54,711 epoch 6 - iter 39/136 - loss 0.02509421 - time (sec): 4.40 - samples/sec: 3351.12 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:46:56,193 epoch 6 - iter 52/136 - loss 0.02464938 - time (sec): 5.88 - samples/sec: 3463.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:46:57,438 epoch 6 - iter 65/136 - loss 0.02773964 - time (sec): 7.12 - samples/sec: 3440.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:46:58,758 epoch 6 - iter 78/136 - loss 0.02642769 - time (sec): 8.44 - samples/sec: 3426.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:47:00,054 epoch 6 - iter 91/136 - loss 0.02877906 - time (sec): 9.74 - samples/sec: 3470.92 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:47:01,788 epoch 6 - iter 104/136 - loss 0.02708005 - time (sec): 11.47 - samples/sec: 3497.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:47:03,068 epoch 6 - iter 117/136 - loss 0.02760006 - time (sec): 12.75 - samples/sec: 3516.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:47:04,509 epoch 6 - iter 130/136 - loss 0.02671052 - time (sec): 14.20 - samples/sec: 3525.84 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:47:05,087 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:05,087 EPOCH 6 done: loss 0.0264 - lr: 0.000014
2023-10-17 19:47:06,521 DEV : loss 0.12122640013694763 - f1-score (micro avg) 0.7861
2023-10-17 19:47:06,526 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:07,844 epoch 7 - iter 13/136 - loss 0.02099326 - time (sec): 1.32 - samples/sec: 3110.22 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:47:09,260 epoch 7 - iter 26/136 - loss 0.02059330 - time (sec): 2.73 - samples/sec: 3263.22 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:47:10,650 epoch 7 - iter 39/136 - loss 0.01707556 - time (sec): 4.12 - samples/sec: 3229.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:12,053 epoch 7 - iter 52/136 - loss 0.01532293 - time (sec): 5.53 - samples/sec: 3391.43 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:13,382 epoch 7 - iter 65/136 - loss 0.01502420 - time (sec): 6.85 - samples/sec: 3391.53 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:14,856 epoch 7 - iter 78/136 - loss 0.01574769 - time (sec): 8.33 - samples/sec: 3422.26 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:47:16,276 epoch 7 - iter 91/136 - loss 0.01574047 - time (sec): 9.75 - samples/sec: 3493.82 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:47:17,808 epoch 7 - iter 104/136 - loss 0.01664688 - time (sec): 11.28 - samples/sec: 3484.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:47:19,187 epoch 7 - iter 117/136 - loss 0.01828514 - time (sec): 12.66 - samples/sec: 3536.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:47:20,540 epoch 7 - iter 130/136 - loss 0.01861858 - time (sec): 14.01 - samples/sec: 3564.77 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:47:21,234 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:21,235 EPOCH 7 done: loss 0.0195 - lr: 0.000010
2023-10-17 19:47:22,686 DEV : loss 0.13704432547092438 - f1-score (micro avg) 0.8029
2023-10-17 19:47:22,691 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:24,097 epoch 8 - iter 13/136 - loss 0.00946865 - time (sec): 1.40 - samples/sec: 3371.58 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:47:25,336 epoch 8 - iter 26/136 - loss 0.00842619 - time (sec): 2.64 - samples/sec: 3425.53 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:26,731 epoch 8 - iter 39/136 - loss 0.00904894 - time (sec): 4.04 - samples/sec: 3397.39 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:27,998 epoch 8 - iter 52/136 - loss 0.00964021 - time (sec): 5.31 - samples/sec: 3432.30 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:29,552 epoch 8 - iter 65/136 - loss 0.01280372 - time (sec): 6.86 - samples/sec: 3450.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:47:30,934 epoch 8 - iter 78/136 - loss 0.01318227 - time (sec): 8.24 - samples/sec: 3539.62 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:47:32,250 epoch 8 - iter 91/136 - loss 0.01418217 - time (sec): 9.56 - samples/sec: 3562.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:47:33,892 epoch 8 - iter 104/136 - loss 0.01464463 - time (sec): 11.20 - samples/sec: 3535.50 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:47:35,321 epoch 8 - iter 117/136 - loss 0.01425410 - time (sec): 12.63 - samples/sec: 3519.96 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:47:36,777 epoch 8 - iter 130/136 - loss 0.01341235 - time (sec): 14.08 - samples/sec: 3539.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:47:37,343 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:37,343 EPOCH 8 done: loss 0.0137 - lr: 0.000007
2023-10-17 19:47:38,772 DEV : loss 0.147489994764328 - f1-score (micro avg) 0.8133
2023-10-17 19:47:38,777 saving best model
2023-10-17 19:47:39,246 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:40,780 epoch 9 - iter 13/136 - loss 0.01225089 - time (sec): 1.53 - samples/sec: 3784.91 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:42,049 epoch 9 - iter 26/136 - loss 0.01027809 - time (sec): 2.80 - samples/sec: 3673.88 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:43,575 epoch 9 - iter 39/136 - loss 0.01343566 - time (sec): 4.33 - samples/sec: 3474.31 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:44,917 epoch 9 - iter 52/136 - loss 0.01351620 - time (sec): 5.67 - samples/sec: 3460.21 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:47:46,304 epoch 9 - iter 65/136 - loss 0.01350860 - time (sec): 7.06 - samples/sec: 3485.46 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:47:47,589 epoch 9 - iter 78/136 - loss 0.01288045 - time (sec): 8.34 - samples/sec: 3527.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:47:49,069 epoch 9 - iter 91/136 - loss 0.01276011 - time (sec): 9.82 - samples/sec: 3540.13 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:47:50,459 epoch 9 - iter 104/136 - loss 0.01242219 - time (sec): 11.21 - samples/sec: 3572.81 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:47:51,830 epoch 9 - iter 117/136 - loss 0.01140109 - time (sec): 12.58 - samples/sec: 3547.14 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:47:53,229 epoch 9 - iter 130/136 - loss 0.01123631 - time (sec): 13.98 - samples/sec: 3547.43 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:47:53,837 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:53,837 EPOCH 9 done: loss 0.0109 - lr: 0.000004
2023-10-17 19:47:55,272 DEV : loss 0.15262548625469208 - f1-score (micro avg) 0.8119
2023-10-17 19:47:55,277 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:56,911 epoch 10 - iter 13/136 - loss 0.00591935 - time (sec): 1.63 - samples/sec: 3084.69 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:47:58,332 epoch 10 - iter 26/136 - loss 0.00456826 - time (sec): 3.05 - samples/sec: 3228.15 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:47:59,747 epoch 10 - iter 39/136 - loss 0.00486841 - time (sec): 4.47 - samples/sec: 3354.36 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:48:01,332 epoch 10 - iter 52/136 - loss 0.00683218 - time (sec): 6.05 - samples/sec: 3357.32 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:48:02,768 epoch 10 - iter 65/136 - loss 0.00718841 - time (sec): 7.49 - samples/sec: 3390.65 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:48:04,258 epoch 10 - iter 78/136 - loss 0.00664417 - time (sec): 8.98 - samples/sec: 3381.59 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:48:05,817 epoch 10 - iter 91/136 - loss 0.00757883 - time (sec): 10.54 - samples/sec: 3403.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:48:07,143 epoch 10 - iter 104/136 - loss 0.00822163 - time (sec): 11.87 - samples/sec: 3432.20 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:48:08,483 epoch 10 - iter 117/136 - loss 0.00956093 - time (sec): 13.21 - samples/sec: 3445.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:48:09,763 epoch 10 - iter 130/136 - loss 0.00934435 - time (sec): 14.49 - samples/sec: 3452.90 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:48:10,257 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:10,258 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-17 19:48:11,702 DEV : loss 0.15066301822662354 - f1-score (micro avg) 0.8194
2023-10-17 19:48:11,706 saving best model
2023-10-17 19:48:12,533 ----------------------------------------------------------------------------------------------------
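The "saving best model" lines above fire whenever the dev micro-F1 improves on the best score seen so far, and the final evaluation reloads that checkpoint. A minimal sketch of that selection logic over the DEV scores logged above (epochs 6, 7 and 9 did not improve, so no save was triggered there):

```python
# Dev micro-F1 per epoch, copied from the DEV lines above.
dev_f1 = [0.586, 0.7203, 0.7873, 0.803, 0.8104, 0.7861, 0.8029, 0.8133, 0.8119, 0.8194]

best, best_epoch = float("-inf"), None
for epoch, score in enumerate(dev_f1, start=1):
    if score > best:  # improvement triggers "saving best model"
        best, best_epoch = score, epoch

print(best_epoch, best)  # epoch 10 (0.8194) is the checkpoint loaded for the test run
```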
2023-10-17 19:48:12,534 Loading model from best epoch ...
2023-10-17 19:48:14,235 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
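The 17-tag dictionary printed above is the BIOES encoding of the four NewsEye entity types plus the outside tag. A quick sketch reproducing it in the logged order:

```python
# Entity types in the order they appear in the logged tag dictionary.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]

# BIOES: Single, Begin, End, Inside prefixes per type, plus the O tag.
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in ("S", "B", "E", "I")]
print(len(tags), tags)
```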
2023-10-17 19:48:16,231
Results:
- F-score (micro) 0.7789
- F-score (macro) 0.7273
- Accuracy 0.6539
By class:
              precision    recall  f1-score   support

         LOC     0.8024    0.8590    0.8297       312
         PER     0.7087    0.8654    0.7792       208
         ORG     0.4483    0.4727    0.4602        55
   HumanProd     0.7500    0.9545    0.8400        22

   micro avg     0.7344    0.8291    0.7789       597
   macro avg     0.6773    0.7879    0.7273       597
weighted avg     0.7352    0.8291    0.7785       597
2023-10-17 19:48:16,231 ----------------------------------------------------------------------------------------------------
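The micro-averaged F-score in the table is the harmonic mean of micro precision and recall; a one-line check against the reported 0.7789:

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.7344, 0.8291), 4))  # matches the reported micro-avg F-score 0.7789
```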