2023-10-13 09:17:33,545 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,546 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 09:17:33,546 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,546 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:17:33,546 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,546 Train: 1214 sentences
2023-10-13 09:17:33,546 (train_with_dev=False, train_with_test=False)
2023-10-13 09:17:33,546 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,546 Training Params:
2023-10-13 09:17:33,546 - learning_rate: "3e-05"
2023-10-13 09:17:33,546 - mini_batch_size: "8"
2023-10-13 09:17:33,546 - max_epochs: "10"
2023-10-13 09:17:33,547 - shuffle: "True"
2023-10-13 09:17:33,547 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,547 Plugins:
2023-10-13 09:17:33,547 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:17:33,547 ----------------------------------------------------------------------------------------------------
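The `lr` column in the iteration lines below rises during epoch 1 and then falls towards zero, which is what a linear warmup followed by linear decay would produce. The following is a minimal sketch of such a schedule, not Flair's implementation. It assumes 152 iterations per epoch (taken from the `iter .../152` lines), 10 epochs, and a warmup over the first 10% of all steps; the function name `linear_schedule_lr` and the total of 1520 steps are assumptions. The library's exact step count and logging offset are not shown in the log, so the last printed digit may differ from the logged values.

```python
def linear_schedule_lr(step, peak_lr=3e-05, total_steps=1520, warmup_fraction=0.1):
    """Learning rate after `step` optimizer updates (hypothetical helper).

    Assumption: total_steps = 152 iterations/epoch * 10 epochs.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: grow linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: shrink linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the sketched learning rate at a few points of the run.
    for epoch, iteration in [(1, 15), (1, 75), (2, 75), (5, 75), (10, 75)]:
        step = (epoch - 1) * 152 + iteration
        print(f"epoch {epoch} iter {iteration}: lr {linear_schedule_lr(step):.6f}")
```

Under these assumptions the peak of 3e-05 is reached at the end of epoch 1, which agrees with the logged `lr: 0.000029` at `iter 150/152` and `lr: 0.000030` at the start of epoch 2.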
2023-10-13 09:17:33,547 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:17:33,547 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:17:33,547 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,547 Computation:
2023-10-13 09:17:33,547 - compute on device: cuda:0
2023-10-13 09:17:33,547 - embedding storage: none
2023-10-13 09:17:33,547 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,547 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 09:17:33,547 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:33,547 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:34,442 epoch 1 - iter 15/152 - loss 3.43231563 - time (sec): 0.89 - samples/sec: 3438.53 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:17:35,325 epoch 1 - iter 30/152 - loss 3.17528117 - time (sec): 1.78 - samples/sec: 3557.69 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:17:36,192 epoch 1 - iter 45/152 - loss 2.71015130 - time (sec): 2.64 - samples/sec: 3561.95 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:17:37,060 epoch 1 - iter 60/152 - loss 2.25976386 - time (sec): 3.51 - samples/sec: 3494.66 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:17:37,946 epoch 1 - iter 75/152 - loss 1.95193778 - time (sec): 4.40 - samples/sec: 3526.36 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:17:38,764 epoch 1 - iter 90/152 - loss 1.74550379 - time (sec): 5.22 - samples/sec: 3532.06 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:17:39,597 epoch 1 - iter 105/152 - loss 1.56936473 - time (sec): 6.05 - samples/sec: 3532.03 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:17:40,463 epoch 1 - iter 120/152 - loss 1.43321747 - time (sec): 6.92 - samples/sec: 3528.73 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:17:41,309 epoch 1 - iter 135/152 - loss 1.32080472 - time (sec): 7.76 - samples/sec: 3524.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:17:42,179 epoch 1 - iter 150/152 - loss 1.21944501 - time (sec): 8.63 - samples/sec: 3543.93 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:17:42,288 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:42,288 EPOCH 1 done: loss 1.2094 - lr: 0.000029
2023-10-13 09:17:43,170 DEV : loss 0.30928874015808105 - f1-score (micro avg) 0.3768
2023-10-13 09:17:43,176 saving best model
2023-10-13 09:17:43,548 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:44,408 epoch 2 - iter 15/152 - loss 0.31603838 - time (sec): 0.86 - samples/sec: 3465.75 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:17:45,290 epoch 2 - iter 30/152 - loss 0.28922810 - time (sec): 1.74 - samples/sec: 3460.63 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:17:46,164 epoch 2 - iter 45/152 - loss 0.25182527 - time (sec): 2.61 - samples/sec: 3470.65 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:17:47,025 epoch 2 - iter 60/152 - loss 0.24330164 - time (sec): 3.48 - samples/sec: 3492.24 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:17:47,887 epoch 2 - iter 75/152 - loss 0.23361289 - time (sec): 4.34 - samples/sec: 3538.17 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:17:48,767 epoch 2 - iter 90/152 - loss 0.22340757 - time (sec): 5.22 - samples/sec: 3499.67 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:17:49,643 epoch 2 - iter 105/152 - loss 0.21549916 - time (sec): 6.09 - samples/sec: 3535.82 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:17:50,465 epoch 2 - iter 120/152 - loss 0.20648760 - time (sec): 6.92 - samples/sec: 3574.98 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:17:51,320 epoch 2 - iter 135/152 - loss 0.20531334 - time (sec): 7.77 - samples/sec: 3578.15 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:17:52,131 epoch 2 - iter 150/152 - loss 0.19773529 - time (sec): 8.58 - samples/sec: 3587.60 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:17:52,223 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:52,224 EPOCH 2 done: loss 0.1982 - lr: 0.000027
2023-10-13 09:17:53,124 DEV : loss 0.1611824333667755 - f1-score (micro avg) 0.7323
2023-10-13 09:17:53,130 saving best model
2023-10-13 09:17:53,626 ----------------------------------------------------------------------------------------------------
2023-10-13 09:17:54,497 epoch 3 - iter 15/152 - loss 0.08185510 - time (sec): 0.86 - samples/sec: 3517.84 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:17:55,365 epoch 3 - iter 30/152 - loss 0.08955699 - time (sec): 1.73 - samples/sec: 3515.77 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:17:56,218 epoch 3 - iter 45/152 - loss 0.09014888 - time (sec): 2.59 - samples/sec: 3630.68 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:17:57,035 epoch 3 - iter 60/152 - loss 0.08782678 - time (sec): 3.40 - samples/sec: 3619.08 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:17:57,904 epoch 3 - iter 75/152 - loss 0.08919998 - time (sec): 4.27 - samples/sec: 3586.16 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:17:58,721 epoch 3 - iter 90/152 - loss 0.09539873 - time (sec): 5.09 - samples/sec: 3573.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:17:59,573 epoch 3 - iter 105/152 - loss 0.09778693 - time (sec): 5.94 - samples/sec: 3594.24 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:18:00,418 epoch 3 - iter 120/152 - loss 0.10010355 - time (sec): 6.79 - samples/sec: 3588.60 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:18:01,286 epoch 3 - iter 135/152 - loss 0.09879299 - time (sec): 7.65 - samples/sec: 3627.52 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:18:02,080 epoch 3 - iter 150/152 - loss 0.09827156 - time (sec): 8.45 - samples/sec: 3616.48 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:18:02,194 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:02,194 EPOCH 3 done: loss 0.0976 - lr: 0.000023
2023-10-13 09:18:03,122 DEV : loss 0.14934414625167847 - f1-score (micro avg) 0.8032
2023-10-13 09:18:03,128 saving best model
2023-10-13 09:18:03,646 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:04,483 epoch 4 - iter 15/152 - loss 0.05991201 - time (sec): 0.83 - samples/sec: 3853.13 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:18:05,336 epoch 4 - iter 30/152 - loss 0.04771997 - time (sec): 1.69 - samples/sec: 3725.25 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:18:06,146 epoch 4 - iter 45/152 - loss 0.06036223 - time (sec): 2.50 - samples/sec: 3708.98 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:18:06,978 epoch 4 - iter 60/152 - loss 0.05672872 - time (sec): 3.33 - samples/sec: 3657.19 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:18:07,813 epoch 4 - iter 75/152 - loss 0.05143727 - time (sec): 4.16 - samples/sec: 3687.89 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:18:08,615 epoch 4 - iter 90/152 - loss 0.05838143 - time (sec): 4.96 - samples/sec: 3696.18 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:18:09,430 epoch 4 - iter 105/152 - loss 0.06027114 - time (sec): 5.78 - samples/sec: 3706.01 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:18:10,329 epoch 4 - iter 120/152 - loss 0.06281418 - time (sec): 6.68 - samples/sec: 3681.47 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:18:11,212 epoch 4 - iter 135/152 - loss 0.06135023 - time (sec): 7.56 - samples/sec: 3661.94 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:18:12,058 epoch 4 - iter 150/152 - loss 0.06515964 - time (sec): 8.41 - samples/sec: 3638.70 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:18:12,167 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:12,167 EPOCH 4 done: loss 0.0646 - lr: 0.000020
2023-10-13 09:18:13,062 DEV : loss 0.14927834272384644 - f1-score (micro avg) 0.7995
2023-10-13 09:18:13,067 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:13,928 epoch 5 - iter 15/152 - loss 0.04878445 - time (sec): 0.86 - samples/sec: 3694.55 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:18:14,754 epoch 5 - iter 30/152 - loss 0.03806395 - time (sec): 1.69 - samples/sec: 3794.04 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:18:15,616 epoch 5 - iter 45/152 - loss 0.03909102 - time (sec): 2.55 - samples/sec: 3703.43 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:18:16,456 epoch 5 - iter 60/152 - loss 0.04305523 - time (sec): 3.39 - samples/sec: 3680.61 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:18:17,327 epoch 5 - iter 75/152 - loss 0.05026496 - time (sec): 4.26 - samples/sec: 3644.88 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:18:18,172 epoch 5 - iter 90/152 - loss 0.04943306 - time (sec): 5.10 - samples/sec: 3641.02 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:18:19,049 epoch 5 - iter 105/152 - loss 0.04630032 - time (sec): 5.98 - samples/sec: 3600.12 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:18:19,943 epoch 5 - iter 120/152 - loss 0.04904589 - time (sec): 6.87 - samples/sec: 3582.56 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:18:20,801 epoch 5 - iter 135/152 - loss 0.04979173 - time (sec): 7.73 - samples/sec: 3570.64 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:18:21,666 epoch 5 - iter 150/152 - loss 0.04824478 - time (sec): 8.60 - samples/sec: 3566.34 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:18:21,764 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:21,765 EPOCH 5 done: loss 0.0484 - lr: 0.000017
2023-10-13 09:18:22,679 DEV : loss 0.16671518981456757 - f1-score (micro avg) 0.8139
2023-10-13 09:18:22,686 saving best model
2023-10-13 09:18:23,199 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:24,023 epoch 6 - iter 15/152 - loss 0.04722034 - time (sec): 0.82 - samples/sec: 3413.25 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:18:24,920 epoch 6 - iter 30/152 - loss 0.04572734 - time (sec): 1.72 - samples/sec: 3529.59 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:18:25,798 epoch 6 - iter 45/152 - loss 0.03835916 - time (sec): 2.60 - samples/sec: 3510.91 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:18:26,712 epoch 6 - iter 60/152 - loss 0.03559348 - time (sec): 3.51 - samples/sec: 3475.21 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:18:27,558 epoch 6 - iter 75/152 - loss 0.03219426 - time (sec): 4.36 - samples/sec: 3438.71 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:18:28,449 epoch 6 - iter 90/152 - loss 0.03079567 - time (sec): 5.25 - samples/sec: 3447.47 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:18:29,314 epoch 6 - iter 105/152 - loss 0.03234725 - time (sec): 6.11 - samples/sec: 3480.92 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:18:30,187 epoch 6 - iter 120/152 - loss 0.03590374 - time (sec): 6.98 - samples/sec: 3483.67 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:18:31,026 epoch 6 - iter 135/152 - loss 0.03478735 - time (sec): 7.82 - samples/sec: 3494.52 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:18:31,866 epoch 6 - iter 150/152 - loss 0.03481313 - time (sec): 8.66 - samples/sec: 3535.53 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:18:31,972 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:31,972 EPOCH 6 done: loss 0.0346 - lr: 0.000013
2023-10-13 09:18:33,026 DEV : loss 0.1832340508699417 - f1-score (micro avg) 0.8253
2023-10-13 09:18:33,032 saving best model
2023-10-13 09:18:33,662 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:34,512 epoch 7 - iter 15/152 - loss 0.01599971 - time (sec): 0.85 - samples/sec: 3634.74 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:18:35,406 epoch 7 - iter 30/152 - loss 0.03165204 - time (sec): 1.74 - samples/sec: 3515.19 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:18:36,250 epoch 7 - iter 45/152 - loss 0.03534164 - time (sec): 2.59 - samples/sec: 3512.02 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:18:37,207 epoch 7 - iter 60/152 - loss 0.03177575 - time (sec): 3.54 - samples/sec: 3534.56 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:18:38,052 epoch 7 - iter 75/152 - loss 0.02705708 - time (sec): 4.39 - samples/sec: 3527.06 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:18:38,924 epoch 7 - iter 90/152 - loss 0.02402478 - time (sec): 5.26 - samples/sec: 3541.44 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:18:39,833 epoch 7 - iter 105/152 - loss 0.02218151 - time (sec): 6.17 - samples/sec: 3523.68 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:18:40,789 epoch 7 - iter 120/152 - loss 0.02623768 - time (sec): 7.13 - samples/sec: 3521.49 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:18:41,689 epoch 7 - iter 135/152 - loss 0.02470530 - time (sec): 8.03 - samples/sec: 3466.18 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:18:42,533 epoch 7 - iter 150/152 - loss 0.02588736 - time (sec): 8.87 - samples/sec: 3463.01 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:18:42,636 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:42,637 EPOCH 7 done: loss 0.0257 - lr: 0.000010
2023-10-13 09:18:43,609 DEV : loss 0.17377883195877075 - f1-score (micro avg) 0.8494
2023-10-13 09:18:43,616 saving best model
2023-10-13 09:18:44,083 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:45,035 epoch 8 - iter 15/152 - loss 0.01736505 - time (sec): 0.95 - samples/sec: 3556.07 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:18:45,963 epoch 8 - iter 30/152 - loss 0.01369969 - time (sec): 1.88 - samples/sec: 3366.16 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:18:46,890 epoch 8 - iter 45/152 - loss 0.01979052 - time (sec): 2.80 - samples/sec: 3353.73 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:18:47,777 epoch 8 - iter 60/152 - loss 0.01937442 - time (sec): 3.69 - samples/sec: 3366.62 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:18:48,714 epoch 8 - iter 75/152 - loss 0.02290668 - time (sec): 4.63 - samples/sec: 3373.47 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:18:49,587 epoch 8 - iter 90/152 - loss 0.01974080 - time (sec): 5.50 - samples/sec: 3403.86 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:18:50,414 epoch 8 - iter 105/152 - loss 0.02119474 - time (sec): 6.33 - samples/sec: 3425.66 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:18:51,293 epoch 8 - iter 120/152 - loss 0.02094552 - time (sec): 7.21 - samples/sec: 3434.56 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:18:52,114 epoch 8 - iter 135/152 - loss 0.01963669 - time (sec): 8.03 - samples/sec: 3446.71 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:18:52,924 epoch 8 - iter 150/152 - loss 0.02179370 - time (sec): 8.84 - samples/sec: 3462.58 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:18:53,040 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:53,041 EPOCH 8 done: loss 0.0215 - lr: 0.000007
2023-10-13 09:18:54,003 DEV : loss 0.18849490582942963 - f1-score (micro avg) 0.838
2023-10-13 09:18:54,011 ----------------------------------------------------------------------------------------------------
2023-10-13 09:18:54,833 epoch 9 - iter 15/152 - loss 0.02677392 - time (sec): 0.82 - samples/sec: 3359.94 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:18:55,712 epoch 9 - iter 30/152 - loss 0.01698186 - time (sec): 1.70 - samples/sec: 3474.31 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:18:56,607 epoch 9 - iter 45/152 - loss 0.02630750 - time (sec): 2.59 - samples/sec: 3547.44 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:18:57,507 epoch 9 - iter 60/152 - loss 0.02453081 - time (sec): 3.50 - samples/sec: 3541.92 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:18:58,379 epoch 9 - iter 75/152 - loss 0.02057650 - time (sec): 4.37 - samples/sec: 3518.05 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:18:59,191 epoch 9 - iter 90/152 - loss 0.02152942 - time (sec): 5.18 - samples/sec: 3547.14 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:19:00,021 epoch 9 - iter 105/152 - loss 0.01914940 - time (sec): 6.01 - samples/sec: 3530.03 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:19:00,889 epoch 9 - iter 120/152 - loss 0.01790542 - time (sec): 6.88 - samples/sec: 3577.36 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:19:01,715 epoch 9 - iter 135/152 - loss 0.01832648 - time (sec): 7.70 - samples/sec: 3583.55 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:19:02,570 epoch 9 - iter 150/152 - loss 0.01715611 - time (sec): 8.56 - samples/sec: 3577.64 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:19:02,675 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:02,675 EPOCH 9 done: loss 0.0170 - lr: 0.000004
2023-10-13 09:19:03,622 DEV : loss 0.19795483350753784 - f1-score (micro avg) 0.8398
2023-10-13 09:19:03,628 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:04,453 epoch 10 - iter 15/152 - loss 0.00109153 - time (sec): 0.82 - samples/sec: 3542.74 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:19:05,322 epoch 10 - iter 30/152 - loss 0.00873862 - time (sec): 1.69 - samples/sec: 3624.20 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:19:06,181 epoch 10 - iter 45/152 - loss 0.01216745 - time (sec): 2.55 - samples/sec: 3725.19 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:19:07,041 epoch 10 - iter 60/152 - loss 0.01687381 - time (sec): 3.41 - samples/sec: 3669.22 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:19:07,850 epoch 10 - iter 75/152 - loss 0.01763652 - time (sec): 4.22 - samples/sec: 3638.35 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:19:08,754 epoch 10 - iter 90/152 - loss 0.01536798 - time (sec): 5.12 - samples/sec: 3646.73 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:19:09,570 epoch 10 - iter 105/152 - loss 0.01651864 - time (sec): 5.94 - samples/sec: 3667.72 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:19:10,413 epoch 10 - iter 120/152 - loss 0.01655194 - time (sec): 6.78 - samples/sec: 3626.65 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:19:11,294 epoch 10 - iter 135/152 - loss 0.01594330 - time (sec): 7.66 - samples/sec: 3614.00 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:19:12,107 epoch 10 - iter 150/152 - loss 0.01593639 - time (sec): 8.48 - samples/sec: 3630.91 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:19:12,201 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:12,201 EPOCH 10 done: loss 0.0158 - lr: 0.000000
2023-10-13 09:19:13,219 DEV : loss 0.20332136750221252 - f1-score (micro avg) 0.8303
2023-10-13 09:19:13,605 ----------------------------------------------------------------------------------------------------
2023-10-13 09:19:13,606 Loading model from best epoch ...
2023-10-13 09:19:15,087 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
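The 25 tags listed above match the `out_features=25` of the final linear layer: one `O` tag plus four span-position prefixes (`S`, `B`, `E`, `I`) for each of six entity types. A minimal sketch that rebuilds the same list, assuming only the entity types and prefix order visible in the log line (the variable names are illustrative):

```python
# Entity types in the order they appear in the logged tag dictionary.
entity_types = ["scope", "pers", "work", "loc", "date", "object"]

# Span-position prefixes in the logged order:
# S = single-token span, B = begin, E = end, I = inside.
prefixes = ("S", "B", "E", "I")

tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```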
2023-10-13 09:19:16,153
Results:
- F-score (micro) 0.7936
- F-score (macro) 0.6407
- Accuracy 0.6667
By class:
              precision    recall  f1-score   support

       scope     0.7546    0.8146    0.7834       151
        work     0.6484    0.8737    0.7444        95
        pers     0.8381    0.9167    0.8756        96
         loc     1.0000    0.6667    0.8000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7437    0.8506    0.7936       348
   macro avg     0.6482    0.6543    0.6407       348
weighted avg     0.7443    0.8506    0.7916       348
2023-10-13 09:19:16,153 ----------------------------------------------------------------------------------------------------
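The summary rows of the per-class table can be cross-checked from the per-class rows. The sketch below recomputes the micro, macro, and weighted F1 scores from the rounded precision, recall, and support values printed in the log, so results are expected to agree with the logged numbers only up to rounding. The helper name `f1` and the dictionary layout are illustrative, not part of Flair.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) per class, copied from the table above.
per_class = {
    "scope": (0.7546, 0.8146, 151),
    "work": (0.6484, 0.8737, 95),
    "pers": (0.8381, 0.9167, 96),
    "loc": (1.0000, 0.6667, 3),
    "date": (0.0000, 0.0000, 3),
}

# Micro F1: harmonic mean of the pooled (micro avg) precision and recall.
micro_f1 = f1(0.7437, 0.8506)

# Macro F1: unweighted mean of the per-class F1 scores.
class_f1 = {name: f1(p, r) for name, (p, r, _) in per_class.items()}
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(s for _, _, s in per_class.values())
weighted_f1 = sum(class_f1[n] * per_class[n][2] for n in per_class) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
```

The gap between micro F1 (0.7936) and macro F1 (0.6407) comes mostly from the `date` class: it has only 3 test examples and an F1 of 0, which pulls down the unweighted mean but barely affects the pooled score.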