2023-10-16 19:38:25,064 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,065 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 19:38:25,065 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,065 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:38:25,065 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,065 Train: 1085 sentences
2023-10-16 19:38:25,065 (train_with_dev=False, train_with_test=False)
2023-10-16 19:38:25,065 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,065 Training Params:
2023-10-16 19:38:25,066 - learning_rate: "5e-05"
2023-10-16 19:38:25,066 - mini_batch_size: "8"
2023-10-16 19:38:25,066 - max_epochs: "10"
2023-10-16 19:38:25,066 - shuffle: "True"
2023-10-16 19:38:25,066 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,066 Plugins:
2023-10-16 19:38:25,066 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 19:38:25,066 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,066 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:38:25,066 - metric: "('micro avg', 'f1-score')"
2023-10-16 19:38:25,066 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,066 Computation:
2023-10-16 19:38:25,066 - compute on device: cuda:0
2023-10-16 19:38:25,066 - embedding storage: none
2023-10-16 19:38:25,066 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,066 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 19:38:25,066 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:25,066 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:26,453 epoch 1 - iter 13/136 - loss 3.02415645 - time (sec): 1.39 - samples/sec: 3380.07 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:38:27,894 epoch 1 - iter 26/136 - loss 2.73651899 - time (sec): 2.83 - samples/sec: 3429.90 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:38:29,134 epoch 1 - iter 39/136 - loss 2.19292351 - time (sec): 4.07 - samples/sec: 3528.44 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:38:30,344 epoch 1 - iter 52/136 - loss 1.83877465 - time (sec): 5.28 - samples/sec: 3573.73 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:38:31,580 epoch 1 - iter 65/136 - loss 1.56468537 - time (sec): 6.51 - samples/sec: 3685.59 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:38:32,997 epoch 1 - iter 78/136 - loss 1.37162883 - time (sec): 7.93 - samples/sec: 3708.76 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:38:34,370 epoch 1 - iter 91/136 - loss 1.23231369 - time (sec): 9.30 - samples/sec: 3698.96 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:38:35,609 epoch 1 - iter 104/136 - loss 1.12236078 - time (sec): 10.54 - samples/sec: 3723.21 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:38:37,007 epoch 1 - iter 117/136 - loss 1.02932350 - time (sec): 11.94 - samples/sec: 3727.18 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:38:38,585 epoch 1 - iter 130/136 - loss 0.94617700 - time (sec): 13.52 - samples/sec: 3675.25 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:38:39,187 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:39,187 EPOCH 1 done: loss 0.9161 - lr: 0.000047
2023-10-16 19:38:40,204 DEV : loss 0.20288820564746857 - f1-score (micro avg) 0.4722
2023-10-16 19:38:40,208 saving best model
2023-10-16 19:38:40,522 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:42,045 epoch 2 - iter 13/136 - loss 0.23732824 - time (sec): 1.52 - samples/sec: 3734.23 - lr: 0.000050 - momentum: 0.000000
2023-10-16 19:38:43,375 epoch 2 - iter 26/136 - loss 0.20696770 - time (sec): 2.85 - samples/sec: 3603.59 - lr: 0.000049 - momentum: 0.000000
2023-10-16 19:38:44,679 epoch 2 - iter 39/136 - loss 0.19951300 - time (sec): 4.16 - samples/sec: 3714.36 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:38:46,063 epoch 2 - iter 52/136 - loss 0.21957384 - time (sec): 5.54 - samples/sec: 3616.90 - lr: 0.000048 - momentum: 0.000000
2023-10-16 19:38:47,354 epoch 2 - iter 65/136 - loss 0.21017221 - time (sec): 6.83 - samples/sec: 3618.48 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:38:48,776 epoch 2 - iter 78/136 - loss 0.19876748 - time (sec): 8.25 - samples/sec: 3574.30 - lr: 0.000047 - momentum: 0.000000
2023-10-16 19:38:50,182 epoch 2 - iter 91/136 - loss 0.19154341 - time (sec): 9.66 - samples/sec: 3605.84 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:38:51,554 epoch 2 - iter 104/136 - loss 0.18885502 - time (sec): 11.03 - samples/sec: 3630.47 - lr: 0.000046 - momentum: 0.000000
2023-10-16 19:38:52,936 epoch 2 - iter 117/136 - loss 0.18113803 - time (sec): 12.41 - samples/sec: 3630.23 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:38:54,386 epoch 2 - iter 130/136 - loss 0.17639998 - time (sec): 13.86 - samples/sec: 3592.81 - lr: 0.000045 - momentum: 0.000000
2023-10-16 19:38:55,007 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:55,007 EPOCH 2 done: loss 0.1744 - lr: 0.000045
2023-10-16 19:38:56,455 DEV : loss 0.12787802517414093 - f1-score (micro avg) 0.709
2023-10-16 19:38:56,462 saving best model
2023-10-16 19:38:56,980 ----------------------------------------------------------------------------------------------------
2023-10-16 19:38:58,375 epoch 3 - iter 13/136 - loss 0.10130091 - time (sec): 1.39 - samples/sec: 3458.50 - lr: 0.000044 - momentum: 0.000000
2023-10-16 19:38:59,508 epoch 3 - iter 26/136 - loss 0.09604338 - time (sec): 2.52 - samples/sec: 3682.54 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:39:00,997 epoch 3 - iter 39/136 - loss 0.10268479 - time (sec): 4.01 - samples/sec: 3713.63 - lr: 0.000043 - momentum: 0.000000
2023-10-16 19:39:02,517 epoch 3 - iter 52/136 - loss 0.10309567 - time (sec): 5.53 - samples/sec: 3542.78 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:39:03,822 epoch 3 - iter 65/136 - loss 0.09646645 - time (sec): 6.84 - samples/sec: 3534.59 - lr: 0.000042 - momentum: 0.000000
2023-10-16 19:39:05,342 epoch 3 - iter 78/136 - loss 0.09885337 - time (sec): 8.36 - samples/sec: 3539.54 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:39:06,891 epoch 3 - iter 91/136 - loss 0.09499064 - time (sec): 9.91 - samples/sec: 3475.54 - lr: 0.000041 - momentum: 0.000000
2023-10-16 19:39:08,202 epoch 3 - iter 104/136 - loss 0.09164222 - time (sec): 11.22 - samples/sec: 3466.50 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:39:09,616 epoch 3 - iter 117/136 - loss 0.09539987 - time (sec): 12.63 - samples/sec: 3460.71 - lr: 0.000040 - momentum: 0.000000
2023-10-16 19:39:11,090 epoch 3 - iter 130/136 - loss 0.09368696 - time (sec): 14.11 - samples/sec: 3501.20 - lr: 0.000039 - momentum: 0.000000
2023-10-16 19:39:11,799 ----------------------------------------------------------------------------------------------------
2023-10-16 19:39:11,799 EPOCH 3 done: loss 0.0936 - lr: 0.000039
2023-10-16 19:39:13,630 DEV : loss 0.10517842322587967 - f1-score (micro avg) 0.7648
2023-10-16 19:39:13,634 saving best model
2023-10-16 19:39:14,303 ----------------------------------------------------------------------------------------------------
2023-10-16 19:39:15,796 epoch 4 - iter 13/136 - loss 0.06608433 - time (sec): 1.49 - samples/sec: 3677.44 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:39:17,085 epoch 4 - iter 26/136 - loss 0.05710389 - time (sec): 2.78 - samples/sec: 3836.83 - lr: 0.000038 - momentum: 0.000000
2023-10-16 19:39:18,540 epoch 4 - iter 39/136 - loss 0.05467172 - time (sec): 4.23 - samples/sec: 3719.82 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:39:20,049 epoch 4 - iter 52/136 - loss 0.05580848 - time (sec): 5.74 - samples/sec: 3608.35 - lr: 0.000037 - momentum: 0.000000
2023-10-16 19:39:21,509 epoch 4 - iter 65/136 - loss 0.05256870 - time (sec): 7.20 - samples/sec: 3558.38 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:39:22,925 epoch 4 - iter 78/136 - loss 0.05445472 - time (sec): 8.62 - samples/sec: 3545.68 - lr: 0.000036 - momentum: 0.000000
2023-10-16 19:39:24,676 epoch 4 - iter 91/136 - loss 0.05306817 - time (sec): 10.37 - samples/sec: 3511.57 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:39:25,972 epoch 4 - iter 104/136 - loss 0.05204892 - time (sec): 11.66 - samples/sec: 3509.49 - lr: 0.000035 - momentum: 0.000000
2023-10-16 19:39:27,239 epoch 4 - iter 117/136 - loss 0.04955030 - time (sec): 12.93 - samples/sec: 3519.63 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:39:28,688 epoch 4 - iter 130/136 - loss 0.05090082 - time (sec): 14.38 - samples/sec: 3482.74 - lr: 0.000034 - momentum: 0.000000
2023-10-16 19:39:29,277 ----------------------------------------------------------------------------------------------------
2023-10-16 19:39:29,277 EPOCH 4 done: loss 0.0503 - lr: 0.000034
2023-10-16 19:39:30,742 DEV : loss 0.11859514564275742 - f1-score (micro avg) 0.7751
2023-10-16 19:39:30,746 saving best model
2023-10-16 19:39:31,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:39:32,725 epoch 5 - iter 13/136 - loss 0.05288570 - time (sec): 1.44 - samples/sec: 3341.66 - lr: 0.000033 - momentum: 0.000000
2023-10-16 19:39:34,235 epoch 5 - iter 26/136 - loss 0.03888893 - time (sec): 2.95 - samples/sec: 3360.97 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:39:35,703 epoch 5 - iter 39/136 - loss 0.03461814 - time (sec): 4.42 - samples/sec: 3491.55 - lr: 0.000032 - momentum: 0.000000
2023-10-16 19:39:37,270 epoch 5 - iter 52/136 - loss 0.03465037 - time (sec): 5.99 - samples/sec: 3451.16 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:39:38,489 epoch 5 - iter 65/136 - loss 0.03740903 - time (sec): 7.21 - samples/sec: 3564.80 - lr: 0.000031 - momentum: 0.000000
2023-10-16 19:39:39,872 epoch 5 - iter 78/136 - loss 0.03524836 - time (sec): 8.59 - samples/sec: 3525.38 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:39:41,208 epoch 5 - iter 91/136 - loss 0.03585208 - time (sec): 9.92 - samples/sec: 3537.93 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:39:42,760 epoch 5 - iter 104/136 - loss 0.03429363 - time (sec): 11.48 - samples/sec: 3529.25 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:39:44,131 epoch 5 - iter 117/136 - loss 0.03278488 - time (sec): 12.85 - samples/sec: 3543.05 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:39:45,521 epoch 5 - iter 130/136 - loss 0.03204268 - time (sec): 14.24 - samples/sec: 3549.25 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:39:45,959 ----------------------------------------------------------------------------------------------------
2023-10-16 19:39:45,959 EPOCH 5 done: loss 0.0324 - lr: 0.000028
2023-10-16 19:39:47,752 DEV : loss 0.12475510686635971 - f1-score (micro avg) 0.8214
2023-10-16 19:39:47,756 saving best model
2023-10-16 19:39:48,262 ----------------------------------------------------------------------------------------------------
2023-10-16 19:39:49,651 epoch 6 - iter 13/136 - loss 0.01579763 - time (sec): 1.39 - samples/sec: 3361.70 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:39:50,851 epoch 6 - iter 26/136 - loss 0.02031476 - time (sec): 2.59 - samples/sec: 3594.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:39:52,223 epoch 6 - iter 39/136 - loss 0.02508221 - time (sec): 3.96 - samples/sec: 3588.18 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:39:53,795 epoch 6 - iter 52/136 - loss 0.02385107 - time (sec): 5.53 - samples/sec: 3554.23 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:39:55,038 epoch 6 - iter 65/136 - loss 0.02550333 - time (sec): 6.77 - samples/sec: 3691.72 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:39:56,455 epoch 6 - iter 78/136 - loss 0.02488529 - time (sec): 8.19 - samples/sec: 3573.47 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:39:57,884 epoch 6 - iter 91/136 - loss 0.02306815 - time (sec): 9.62 - samples/sec: 3522.48 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:39:59,503 epoch 6 - iter 104/136 - loss 0.02319266 - time (sec): 11.24 - samples/sec: 3504.24 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:40:00,962 epoch 6 - iter 117/136 - loss 0.02205655 - time (sec): 12.70 - samples/sec: 3499.80 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:40:02,319 epoch 6 - iter 130/136 - loss 0.02214229 - time (sec): 14.06 - samples/sec: 3534.47 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:40:02,991 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:02,992 EPOCH 6 done: loss 0.0226 - lr: 0.000023
2023-10-16 19:40:04,414 DEV : loss 0.1283072680234909 - f1-score (micro avg) 0.8
2023-10-16 19:40:04,418 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:05,810 epoch 7 - iter 13/136 - loss 0.02517646 - time (sec): 1.39 - samples/sec: 4173.17 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:40:07,200 epoch 7 - iter 26/136 - loss 0.01976096 - time (sec): 2.78 - samples/sec: 3751.25 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:40:08,558 epoch 7 - iter 39/136 - loss 0.01871697 - time (sec): 4.14 - samples/sec: 3810.94 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:40:09,860 epoch 7 - iter 52/136 - loss 0.01928305 - time (sec): 5.44 - samples/sec: 3807.28 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:40:11,172 epoch 7 - iter 65/136 - loss 0.01756808 - time (sec): 6.75 - samples/sec: 3750.80 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:40:12,582 epoch 7 - iter 78/136 - loss 0.01776526 - time (sec): 8.16 - samples/sec: 3725.04 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:40:14,129 epoch 7 - iter 91/136 - loss 0.01667503 - time (sec): 9.71 - samples/sec: 3720.31 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:40:15,364 epoch 7 - iter 104/136 - loss 0.01663346 - time (sec): 10.94 - samples/sec: 3705.00 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:40:16,770 epoch 7 - iter 117/136 - loss 0.01670474 - time (sec): 12.35 - samples/sec: 3681.72 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:40:18,168 epoch 7 - iter 130/136 - loss 0.01720601 - time (sec): 13.75 - samples/sec: 3642.58 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:40:18,768 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:18,768 EPOCH 7 done: loss 0.0174 - lr: 0.000017
2023-10-16 19:40:20,403 DEV : loss 0.14328120648860931 - f1-score (micro avg) 0.7899
2023-10-16 19:40:20,407 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:21,913 epoch 8 - iter 13/136 - loss 0.00578552 - time (sec): 1.50 - samples/sec: 3549.97 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:40:23,532 epoch 8 - iter 26/136 - loss 0.00802889 - time (sec): 3.12 - samples/sec: 3436.77 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:40:24,845 epoch 8 - iter 39/136 - loss 0.00991238 - time (sec): 4.44 - samples/sec: 3553.05 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:40:26,046 epoch 8 - iter 52/136 - loss 0.00922150 - time (sec): 5.64 - samples/sec: 3555.06 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:40:27,545 epoch 8 - iter 65/136 - loss 0.01032178 - time (sec): 7.14 - samples/sec: 3626.78 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:40:28,943 epoch 8 - iter 78/136 - loss 0.01028200 - time (sec): 8.54 - samples/sec: 3589.22 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:40:30,503 epoch 8 - iter 91/136 - loss 0.01074870 - time (sec): 10.09 - samples/sec: 3592.90 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:40:32,046 epoch 8 - iter 104/136 - loss 0.00989424 - time (sec): 11.64 - samples/sec: 3558.63 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:40:33,390 epoch 8 - iter 117/136 - loss 0.01118879 - time (sec): 12.98 - samples/sec: 3511.74 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:40:34,696 epoch 8 - iter 130/136 - loss 0.01176432 - time (sec): 14.29 - samples/sec: 3480.28 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:40:35,376 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:35,376 EPOCH 8 done: loss 0.0117 - lr: 0.000012
2023-10-16 19:40:36,803 DEV : loss 0.15624405443668365 - f1-score (micro avg) 0.8133
2023-10-16 19:40:36,807 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:38,141 epoch 9 - iter 13/136 - loss 0.00928382 - time (sec): 1.33 - samples/sec: 3710.24 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:40:39,597 epoch 9 - iter 26/136 - loss 0.01339997 - time (sec): 2.79 - samples/sec: 3706.34 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:40:41,087 epoch 9 - iter 39/136 - loss 0.01199908 - time (sec): 4.28 - samples/sec: 3708.38 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:40:42,388 epoch 9 - iter 52/136 - loss 0.01266300 - time (sec): 5.58 - samples/sec: 3762.51 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:40:43,771 epoch 9 - iter 65/136 - loss 0.01242176 - time (sec): 6.96 - samples/sec: 3686.38 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:40:45,109 epoch 9 - iter 78/136 - loss 0.01236743 - time (sec): 8.30 - samples/sec: 3581.29 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:40:46,647 epoch 9 - iter 91/136 - loss 0.01228619 - time (sec): 9.84 - samples/sec: 3554.07 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:40:47,911 epoch 9 - iter 104/136 - loss 0.01090571 - time (sec): 11.10 - samples/sec: 3580.06 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:40:49,656 epoch 9 - iter 117/136 - loss 0.01047892 - time (sec): 12.85 - samples/sec: 3517.41 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:40:51,014 epoch 9 - iter 130/136 - loss 0.00993445 - time (sec): 14.21 - samples/sec: 3532.53 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:40:51,564 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:51,564 EPOCH 9 done: loss 0.0099 - lr: 0.000006
2023-10-16 19:40:53,004 DEV : loss 0.15900768339633942 - f1-score (micro avg) 0.8088
2023-10-16 19:40:53,008 ----------------------------------------------------------------------------------------------------
2023-10-16 19:40:55,022 epoch 10 - iter 13/136 - loss 0.00709288 - time (sec): 2.01 - samples/sec: 2608.60 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:40:56,538 epoch 10 - iter 26/136 - loss 0.00518710 - time (sec): 3.53 - samples/sec: 2992.37 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:40:57,869 epoch 10 - iter 39/136 - loss 0.00781429 - time (sec): 4.86 - samples/sec: 3046.30 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:40:59,220 epoch 10 - iter 52/136 - loss 0.00899035 - time (sec): 6.21 - samples/sec: 3172.27 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:41:00,601 epoch 10 - iter 65/136 - loss 0.00817988 - time (sec): 7.59 - samples/sec: 3212.55 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:41:02,099 epoch 10 - iter 78/136 - loss 0.00777940 - time (sec): 9.09 - samples/sec: 3226.67 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:41:03,454 epoch 10 - iter 91/136 - loss 0.00719324 - time (sec): 10.44 - samples/sec: 3245.39 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:41:04,870 epoch 10 - iter 104/136 - loss 0.00693415 - time (sec): 11.86 - samples/sec: 3312.65 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:41:06,420 epoch 10 - iter 117/136 - loss 0.00767387 - time (sec): 13.41 - samples/sec: 3353.14 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:41:07,740 epoch 10 - iter 130/136 - loss 0.00777585 - time (sec): 14.73 - samples/sec: 3398.50 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:41:08,252 ----------------------------------------------------------------------------------------------------
2023-10-16 19:41:08,252 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-16 19:41:09,680 DEV : loss 0.16889716684818268 - f1-score (micro avg) 0.8103
2023-10-16 19:41:10,080 ----------------------------------------------------------------------------------------------------
2023-10-16 19:41:10,081 Loading model from best epoch ...
2023-10-16 19:41:11,797 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 19:41:13,818
Results:
- F-score (micro) 0.7764
- F-score (macro) 0.7367
- Accuracy 0.651
By class:
              precision    recall  f1-score   support

         LOC     0.7971    0.8942    0.8429       312
         PER     0.6439    0.8606    0.7366       208
         ORG     0.5366    0.4000    0.4583        55
   HumanProd     0.9091    0.9091    0.9091        22

   micro avg     0.7236    0.8375    0.7764       597
   macro avg     0.7217    0.7660    0.7367       597
weighted avg     0.7239    0.8375    0.7729       597
2023-10-16 19:41:13,818 ----------------------------------------------------------------------------------------------------