2023-10-25 21:24:56,312 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,313 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(64001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-25 21:24:56,313 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Train: 1085 sentences
2023-10-25 21:24:56,314 (train_with_dev=False, train_with_test=False)
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Training Params:
2023-10-25 21:24:56,314 - learning_rate: "3e-05"
2023-10-25 21:24:56,314 - mini_batch_size: "4"
2023-10-25 21:24:56,314 - max_epochs: "10"
2023-10-25 21:24:56,314 - shuffle: "True"
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Plugins:
2023-10-25 21:24:56,314 - TensorboardLogger
2023-10-25 21:24:56,314 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:24:56,314 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Computation:
2023-10-25 21:24:56,314 - compute on device: cuda:0
2023-10-25 21:24:56,314 - embedding storage: none
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 ----------------------------------------------------------------------------------------------------
2023-10-25 21:24:56,314 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:24:57,790 epoch 1 - iter 27/272 - loss 2.95478253 - time (sec): 1.47 - samples/sec: 3722.53 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:24:59,293 epoch 1 - iter 54/272 - loss 2.36462168 - time (sec): 2.98 - samples/sec: 3644.52 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:25:00,680 epoch 1 - iter 81/272 - loss 1.87243562 - time (sec): 4.36 - samples/sec: 3545.80 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:25:02,180 epoch 1 - iter 108/272 - loss 1.49436340 - time (sec): 5.86 - samples/sec: 3560.63 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:25:03,621 epoch 1 - iter 135/272 - loss 1.28098397 - time (sec): 7.31 - samples/sec: 3486.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:25:05,100 epoch 1 - iter 162/272 - loss 1.10859938 - time (sec): 8.78 - samples/sec: 3539.33 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:25:06,640 epoch 1 - iter 189/272 - loss 0.99184857 - time (sec): 10.33 - samples/sec: 3494.32 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:08,116 epoch 1 - iter 216/272 - loss 0.89139646 - time (sec): 11.80 - samples/sec: 3510.77 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:09,621 epoch 1 - iter 243/272 - loss 0.82167559 - time (sec): 13.31 - samples/sec: 3471.31 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:11,185 epoch 1 - iter 270/272 - loss 0.76133078 - time (sec): 14.87 - samples/sec: 3480.77 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:25:11,284 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:11,285 EPOCH 1 done: loss 0.7584 - lr: 0.000030
2023-10-25 21:25:12,457 DEV : loss 0.14451949298381805 - f1-score (micro avg) 0.6804
2023-10-25 21:25:12,463 saving best model
2023-10-25 21:25:12,965 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:14,476 epoch 2 - iter 27/272 - loss 0.10550064 - time (sec): 1.51 - samples/sec: 3122.04 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:25:16,011 epoch 2 - iter 54/272 - loss 0.11014367 - time (sec): 3.04 - samples/sec: 3476.83 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:25:17,480 epoch 2 - iter 81/272 - loss 0.13176071 - time (sec): 4.51 - samples/sec: 3324.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:25:18,967 epoch 2 - iter 108/272 - loss 0.13399368 - time (sec): 6.00 - samples/sec: 3535.90 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:25:20,463 epoch 2 - iter 135/272 - loss 0.12917079 - time (sec): 7.50 - samples/sec: 3535.18 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:25:21,972 epoch 2 - iter 162/272 - loss 0.13057126 - time (sec): 9.01 - samples/sec: 3483.46 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:25:23,459 epoch 2 - iter 189/272 - loss 0.13256525 - time (sec): 10.49 - samples/sec: 3536.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:25:24,953 epoch 2 - iter 216/272 - loss 0.12890332 - time (sec): 11.99 - samples/sec: 3606.43 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:26,348 epoch 2 - iter 243/272 - loss 0.12876898 - time (sec): 13.38 - samples/sec: 3557.32 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:27,704 epoch 2 - iter 270/272 - loss 0.12992069 - time (sec): 14.74 - samples/sec: 3518.57 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:25:27,801 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:27,801 EPOCH 2 done: loss 0.1300 - lr: 0.000027
2023-10-25 21:25:29,026 DEV : loss 0.10889776796102524 - f1-score (micro avg) 0.7868
2023-10-25 21:25:29,033 saving best model
2023-10-25 21:25:29,757 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:31,157 epoch 3 - iter 27/272 - loss 0.09712327 - time (sec): 1.40 - samples/sec: 3564.88 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:25:32,529 epoch 3 - iter 54/272 - loss 0.08242375 - time (sec): 2.77 - samples/sec: 3892.24 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:25:33,921 epoch 3 - iter 81/272 - loss 0.07445624 - time (sec): 4.16 - samples/sec: 3744.19 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:25:35,305 epoch 3 - iter 108/272 - loss 0.07196821 - time (sec): 5.55 - samples/sec: 3795.36 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:25:36,802 epoch 3 - iter 135/272 - loss 0.07055271 - time (sec): 7.04 - samples/sec: 3797.39 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:25:38,252 epoch 3 - iter 162/272 - loss 0.06865725 - time (sec): 8.49 - samples/sec: 3704.75 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:25:39,756 epoch 3 - iter 189/272 - loss 0.07384271 - time (sec): 10.00 - samples/sec: 3626.70 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:41,233 epoch 3 - iter 216/272 - loss 0.07124307 - time (sec): 11.47 - samples/sec: 3613.06 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:42,692 epoch 3 - iter 243/272 - loss 0.07419483 - time (sec): 12.93 - samples/sec: 3580.42 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:25:44,265 epoch 3 - iter 270/272 - loss 0.07300144 - time (sec): 14.51 - samples/sec: 3570.19 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:25:44,377 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:44,377 EPOCH 3 done: loss 0.0727 - lr: 0.000023
2023-10-25 21:25:45,552 DEV : loss 0.12385641783475876 - f1-score (micro avg) 0.789
2023-10-25 21:25:45,558 saving best model
2023-10-25 21:25:46,276 ----------------------------------------------------------------------------------------------------
2023-10-25 21:25:47,852 epoch 4 - iter 27/272 - loss 0.03286519 - time (sec): 1.57 - samples/sec: 3279.89 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:25:49,219 epoch 4 - iter 54/272 - loss 0.04511852 - time (sec): 2.94 - samples/sec: 3410.45 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:25:50,715 epoch 4 - iter 81/272 - loss 0.04150325 - time (sec): 4.44 - samples/sec: 3644.86 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:25:52,114 epoch 4 - iter 108/272 - loss 0.04650275 - time (sec): 5.84 - samples/sec: 3609.30 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:25:53,527 epoch 4 - iter 135/272 - loss 0.04301268 - time (sec): 7.25 - samples/sec: 3632.60 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:25:54,928 epoch 4 - iter 162/272 - loss 0.04424032 - time (sec): 8.65 - samples/sec: 3684.89 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:56,365 epoch 4 - iter 189/272 - loss 0.04374274 - time (sec): 10.09 - samples/sec: 3666.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:57,803 epoch 4 - iter 216/272 - loss 0.04162559 - time (sec): 11.52 - samples/sec: 3659.98 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:25:59,204 epoch 4 - iter 243/272 - loss 0.04400279 - time (sec): 12.93 - samples/sec: 3652.24 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:26:00,599 epoch 4 - iter 270/272 - loss 0.04402940 - time (sec): 14.32 - samples/sec: 3618.64 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:26:00,697 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:00,698 EPOCH 4 done: loss 0.0439 - lr: 0.000020
2023-10-25 21:26:01,832 DEV : loss 0.13214778900146484 - f1-score (micro avg) 0.8022
2023-10-25 21:26:01,838 saving best model
2023-10-25 21:26:02,546 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:04,042 epoch 5 - iter 27/272 - loss 0.01630458 - time (sec): 1.49 - samples/sec: 3393.60 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:26:05,505 epoch 5 - iter 54/272 - loss 0.03437239 - time (sec): 2.96 - samples/sec: 3323.21 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:26:07,020 epoch 5 - iter 81/272 - loss 0.03191448 - time (sec): 4.47 - samples/sec: 3306.77 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:26:08,585 epoch 5 - iter 108/272 - loss 0.02979947 - time (sec): 6.04 - samples/sec: 3311.14 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:26:10,374 epoch 5 - iter 135/272 - loss 0.02958984 - time (sec): 7.82 - samples/sec: 3099.23 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:26:11,974 epoch 5 - iter 162/272 - loss 0.03062299 - time (sec): 9.42 - samples/sec: 3172.34 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:26:13,507 epoch 5 - iter 189/272 - loss 0.03100487 - time (sec): 10.96 - samples/sec: 3208.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:26:14,971 epoch 5 - iter 216/272 - loss 0.02954207 - time (sec): 12.42 - samples/sec: 3235.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:26:16,462 epoch 5 - iter 243/272 - loss 0.03056086 - time (sec): 13.91 - samples/sec: 3321.39 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:26:17,920 epoch 5 - iter 270/272 - loss 0.03180794 - time (sec): 15.37 - samples/sec: 3371.03 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:26:18,023 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:18,023 EPOCH 5 done: loss 0.0320 - lr: 0.000017
2023-10-25 21:26:19,214 DEV : loss 0.1537138819694519 - f1-score (micro avg) 0.7964
2023-10-25 21:26:19,220 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:20,792 epoch 6 - iter 27/272 - loss 0.02666730 - time (sec): 1.57 - samples/sec: 3471.94 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:26:22,333 epoch 6 - iter 54/272 - loss 0.01962865 - time (sec): 3.11 - samples/sec: 3446.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:26:23,857 epoch 6 - iter 81/272 - loss 0.01821303 - time (sec): 4.64 - samples/sec: 3337.98 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:26:25,433 epoch 6 - iter 108/272 - loss 0.02318386 - time (sec): 6.21 - samples/sec: 3341.21 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:26:26,946 epoch 6 - iter 135/272 - loss 0.02155185 - time (sec): 7.72 - samples/sec: 3320.13 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:26:28,462 epoch 6 - iter 162/272 - loss 0.02185104 - time (sec): 9.24 - samples/sec: 3366.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:26:29,970 epoch 6 - iter 189/272 - loss 0.02306003 - time (sec): 10.75 - samples/sec: 3410.78 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:26:31,474 epoch 6 - iter 216/272 - loss 0.02366588 - time (sec): 12.25 - samples/sec: 3337.30 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:26:32,976 epoch 6 - iter 243/272 - loss 0.02325498 - time (sec): 13.75 - samples/sec: 3382.56 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:26:34,454 epoch 6 - iter 270/272 - loss 0.02340385 - time (sec): 15.23 - samples/sec: 3385.88 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:26:34,566 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:34,566 EPOCH 6 done: loss 0.0232 - lr: 0.000013
2023-10-25 21:26:35,844 DEV : loss 0.14856071770191193 - f1-score (micro avg) 0.8118
2023-10-25 21:26:35,850 saving best model
2023-10-25 21:26:36,568 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:38,086 epoch 7 - iter 27/272 - loss 0.01152928 - time (sec): 1.51 - samples/sec: 3665.41 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:26:39,572 epoch 7 - iter 54/272 - loss 0.01071992 - time (sec): 3.00 - samples/sec: 3522.80 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:26:41,063 epoch 7 - iter 81/272 - loss 0.01059353 - time (sec): 4.49 - samples/sec: 3519.69 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:26:42,575 epoch 7 - iter 108/272 - loss 0.01206115 - time (sec): 6.00 - samples/sec: 3579.45 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:26:44,092 epoch 7 - iter 135/272 - loss 0.01324880 - time (sec): 7.52 - samples/sec: 3495.96 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:26:45,553 epoch 7 - iter 162/272 - loss 0.01362028 - time (sec): 8.98 - samples/sec: 3493.65 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:26:47,112 epoch 7 - iter 189/272 - loss 0.01636107 - time (sec): 10.54 - samples/sec: 3516.51 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:26:48,600 epoch 7 - iter 216/272 - loss 0.01678638 - time (sec): 12.03 - samples/sec: 3514.09 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:26:50,082 epoch 7 - iter 243/272 - loss 0.01800984 - time (sec): 13.51 - samples/sec: 3469.39 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:26:51,588 epoch 7 - iter 270/272 - loss 0.01706655 - time (sec): 15.02 - samples/sec: 3438.63 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:26:51,705 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:51,705 EPOCH 7 done: loss 0.0170 - lr: 0.000010
2023-10-25 21:26:52,958 DEV : loss 0.15049101412296295 - f1-score (micro avg) 0.817
2023-10-25 21:26:52,964 saving best model
2023-10-25 21:26:53,672 ----------------------------------------------------------------------------------------------------
2023-10-25 21:26:55,233 epoch 8 - iter 27/272 - loss 0.01955441 - time (sec): 1.56 - samples/sec: 3900.71 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:26:56,670 epoch 8 - iter 54/272 - loss 0.02251273 - time (sec): 3.00 - samples/sec: 3684.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:26:58,230 epoch 8 - iter 81/272 - loss 0.01870268 - time (sec): 4.56 - samples/sec: 3653.96 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:26:59,730 epoch 8 - iter 108/272 - loss 0.01802886 - time (sec): 6.06 - samples/sec: 3597.69 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:27:01,270 epoch 8 - iter 135/272 - loss 0.01675014 - time (sec): 7.60 - samples/sec: 3588.73 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:27:02,817 epoch 8 - iter 162/272 - loss 0.01636457 - time (sec): 9.14 - samples/sec: 3512.11 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:27:04,347 epoch 8 - iter 189/272 - loss 0.01653653 - time (sec): 10.67 - samples/sec: 3493.60 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:27:05,891 epoch 8 - iter 216/272 - loss 0.01494514 - time (sec): 12.22 - samples/sec: 3482.53 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:27:07,413 epoch 8 - iter 243/272 - loss 0.01358305 - time (sec): 13.74 - samples/sec: 3451.57 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:27:08,867 epoch 8 - iter 270/272 - loss 0.01344559 - time (sec): 15.19 - samples/sec: 3413.90 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:27:08,973 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:08,973 EPOCH 8 done: loss 0.0134 - lr: 0.000007
2023-10-25 21:27:10,570 DEV : loss 0.16718466579914093 - f1-score (micro avg) 0.8244
2023-10-25 21:27:10,576 saving best model
2023-10-25 21:27:11,271 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:12,843 epoch 9 - iter 27/272 - loss 0.00164870 - time (sec): 1.57 - samples/sec: 3747.02 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:27:14,311 epoch 9 - iter 54/272 - loss 0.00851133 - time (sec): 3.04 - samples/sec: 3601.08 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:27:15,778 epoch 9 - iter 81/272 - loss 0.00761523 - time (sec): 4.51 - samples/sec: 3449.84 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:27:17,240 epoch 9 - iter 108/272 - loss 0.00949545 - time (sec): 5.97 - samples/sec: 3405.25 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:27:18,722 epoch 9 - iter 135/272 - loss 0.00962367 - time (sec): 7.45 - samples/sec: 3446.74 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:27:20,186 epoch 9 - iter 162/272 - loss 0.00970501 - time (sec): 8.91 - samples/sec: 3481.51 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:27:21,676 epoch 9 - iter 189/272 - loss 0.00915997 - time (sec): 10.40 - samples/sec: 3534.96 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:27:23,133 epoch 9 - iter 216/272 - loss 0.00823915 - time (sec): 11.86 - samples/sec: 3493.39 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:27:24,602 epoch 9 - iter 243/272 - loss 0.00817118 - time (sec): 13.33 - samples/sec: 3507.63 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:27:26,104 epoch 9 - iter 270/272 - loss 0.01010203 - time (sec): 14.83 - samples/sec: 3495.59 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:27:26,198 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:26,199 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-25 21:27:27,425 DEV : loss 0.16382966935634613 - f1-score (micro avg) 0.8183
2023-10-25 21:27:27,431 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:28,874 epoch 10 - iter 27/272 - loss 0.00996204 - time (sec): 1.44 - samples/sec: 3784.50 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:27:30,250 epoch 10 - iter 54/272 - loss 0.00822023 - time (sec): 2.82 - samples/sec: 3628.66 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:27:31,708 epoch 10 - iter 81/272 - loss 0.00718997 - time (sec): 4.28 - samples/sec: 3668.97 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:27:33,179 epoch 10 - iter 108/272 - loss 0.00636107 - time (sec): 5.75 - samples/sec: 3707.51 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:27:34,708 epoch 10 - iter 135/272 - loss 0.00714120 - time (sec): 7.28 - samples/sec: 3587.09 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:27:36,272 epoch 10 - iter 162/272 - loss 0.00626853 - time (sec): 8.84 - samples/sec: 3523.31 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:27:37,846 epoch 10 - iter 189/272 - loss 0.00615876 - time (sec): 10.41 - samples/sec: 3525.63 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:27:39,413 epoch 10 - iter 216/272 - loss 0.00562173 - time (sec): 11.98 - samples/sec: 3477.07 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:27:40,880 epoch 10 - iter 243/272 - loss 0.00638001 - time (sec): 13.45 - samples/sec: 3458.88 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:27:42,267 epoch 10 - iter 270/272 - loss 0.00607942 - time (sec): 14.83 - samples/sec: 3488.34 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:27:42,364 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:42,364 EPOCH 10 done: loss 0.0061 - lr: 0.000000
2023-10-25 21:27:43,597 DEV : loss 0.16699165105819702 - f1-score (micro avg) 0.8281
2023-10-25 21:27:43,604 saving best model
2023-10-25 21:27:44,815 ----------------------------------------------------------------------------------------------------
2023-10-25 21:27:44,817 Loading model from best epoch ...
2023-10-25 21:27:46,707 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 21:27:48,940
Results:
- F-score (micro) 0.7769
- F-score (macro) 0.7273
- Accuracy 0.6538
By class:
              precision    recall  f1-score   support

         LOC     0.8173    0.8462    0.8315       312
         PER     0.6923    0.8654    0.7692       208
         ORG     0.4643    0.4727    0.4685        55
   HumanProd     0.7500    0.9545    0.8400        22

   micro avg     0.7361    0.8224    0.7769       597
   macro avg     0.6810    0.7847    0.7273       597
weighted avg     0.7388    0.8224    0.7767       597
2023-10-25 21:27:48,940 ----------------------------------------------------------------------------------------------------