2023-10-25 20:59:44,781 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,782 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(64001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-25 20:59:44,782 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Train: 1085 sentences
2023-10-25 20:59:44,783 (train_with_dev=False, train_with_test=False)
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Training Params:
2023-10-25 20:59:44,783 - learning_rate: "5e-05"
2023-10-25 20:59:44,783 - mini_batch_size: "8"
2023-10-25 20:59:44,783 - max_epochs: "10"
2023-10-25 20:59:44,783 - shuffle: "True"
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Plugins:
2023-10-25 20:59:44,783 - TensorboardLogger
2023-10-25 20:59:44,783 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
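The LinearScheduler plugin above ramps the learning rate up over the first 10% of training and then decays it linearly to zero. With 1085 training sentences and mini_batch_size 8 there are 136 iterations per epoch, so 1360 total steps and a 136-step warmup. The following is a minimal pure-Python sketch of that schedule (a hypothetical helper, not Flair's actual implementation); it reproduces the lr values printed in the iteration lines below:

```python
def linear_warmup_lr(step, total_steps=1360, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0.

    Sketch of the schedule implied by the logged lr values; assumes
    136 iters/epoch x 10 epochs = 1360 steps and warmup_fraction 0.1.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 136 steps = 1 epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_warmup_lr(13))    # ~4.8e-06: log shows lr 0.000004 at epoch 1, iter 13
print(linear_warmup_lr(130))   # ~4.8e-05: log shows lr 0.000047 at epoch 1, iter 130
print(linear_warmup_lr(1360))  # 0.0: log shows lr 0.000000 at the end of epoch 10
```

The peak 5e-05 is reached right after epoch 1, which matches the lr 0.000050 logged at the start of epoch 2.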
2023-10-25 20:59:44,783 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 20:59:44,783 - metric: "('micro avg', 'f1-score')"
2023-10-25 20:59:44,783 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,783 Computation:
2023-10-25 20:59:44,784 - compute on device: cuda:0
2023-10-25 20:59:44,784 - embedding storage: none
2023-10-25 20:59:44,784 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,784 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 20:59:44,784 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,784 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:44,784 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 20:59:45,775 epoch 1 - iter 13/136 - loss 2.85947296 - time (sec): 0.99 - samples/sec: 5097.94 - lr: 0.000004 - momentum: 0.000000
2023-10-25 20:59:46,798 epoch 1 - iter 26/136 - loss 2.26347229 - time (sec): 2.01 - samples/sec: 5059.75 - lr: 0.000009 - momentum: 0.000000
2023-10-25 20:59:47,774 epoch 1 - iter 39/136 - loss 1.70067118 - time (sec): 2.99 - samples/sec: 5065.52 - lr: 0.000014 - momentum: 0.000000
2023-10-25 20:59:48,854 epoch 1 - iter 52/136 - loss 1.34707525 - time (sec): 4.07 - samples/sec: 5189.88 - lr: 0.000019 - momentum: 0.000000
2023-10-25 20:59:49,781 epoch 1 - iter 65/136 - loss 1.18558706 - time (sec): 5.00 - samples/sec: 5134.55 - lr: 0.000024 - momentum: 0.000000
2023-10-25 20:59:50,747 epoch 1 - iter 78/136 - loss 1.05425464 - time (sec): 5.96 - samples/sec: 5054.21 - lr: 0.000028 - momentum: 0.000000
2023-10-25 20:59:51,963 epoch 1 - iter 91/136 - loss 0.91480738 - time (sec): 7.18 - samples/sec: 5050.33 - lr: 0.000033 - momentum: 0.000000
2023-10-25 20:59:52,946 epoch 1 - iter 104/136 - loss 0.83931390 - time (sec): 8.16 - samples/sec: 5003.54 - lr: 0.000038 - momentum: 0.000000
2023-10-25 20:59:53,951 epoch 1 - iter 117/136 - loss 0.77070604 - time (sec): 9.17 - samples/sec: 4961.74 - lr: 0.000043 - momentum: 0.000000
2023-10-25 20:59:54,977 epoch 1 - iter 130/136 - loss 0.72306304 - time (sec): 10.19 - samples/sec: 4933.66 - lr: 0.000047 - momentum: 0.000000
2023-10-25 20:59:55,375 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:55,376 EPOCH 1 done: loss 0.7034 - lr: 0.000047
2023-10-25 20:59:56,927 DEV : loss 0.15223081409931183 - f1-score (micro avg) 0.6703
2023-10-25 20:59:56,935 saving best model
2023-10-25 20:59:57,569 ----------------------------------------------------------------------------------------------------
2023-10-25 20:59:58,582 epoch 2 - iter 13/136 - loss 0.16703540 - time (sec): 1.01 - samples/sec: 4671.90 - lr: 0.000050 - momentum: 0.000000
2023-10-25 20:59:59,547 epoch 2 - iter 26/136 - loss 0.15937506 - time (sec): 1.98 - samples/sec: 4707.65 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:00:00,594 epoch 2 - iter 39/136 - loss 0.15134578 - time (sec): 3.02 - samples/sec: 4812.24 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:00:01,675 epoch 2 - iter 52/136 - loss 0.14389325 - time (sec): 4.10 - samples/sec: 4706.84 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:00:02,708 epoch 2 - iter 65/136 - loss 0.14979586 - time (sec): 5.14 - samples/sec: 4760.65 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:00:03,789 epoch 2 - iter 78/136 - loss 0.13989186 - time (sec): 6.22 - samples/sec: 4835.41 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:00:04,794 epoch 2 - iter 91/136 - loss 0.14397482 - time (sec): 7.22 - samples/sec: 4808.49 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:00:05,746 epoch 2 - iter 104/136 - loss 0.14130762 - time (sec): 8.17 - samples/sec: 4834.70 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:00:06,868 epoch 2 - iter 117/136 - loss 0.13753071 - time (sec): 9.30 - samples/sec: 4842.68 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:00:07,855 epoch 2 - iter 130/136 - loss 0.13327994 - time (sec): 10.28 - samples/sec: 4878.03 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:00:08,266 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:08,267 EPOCH 2 done: loss 0.1319 - lr: 0.000045
2023-10-25 21:00:09,575 DEV : loss 0.1049661636352539 - f1-score (micro avg) 0.7519
2023-10-25 21:00:09,581 saving best model
2023-10-25 21:00:10,357 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:11,351 epoch 3 - iter 13/136 - loss 0.06158613 - time (sec): 0.99 - samples/sec: 5276.59 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:00:12,274 epoch 3 - iter 26/136 - loss 0.06262839 - time (sec): 1.92 - samples/sec: 5255.14 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:00:13,281 epoch 3 - iter 39/136 - loss 0.06000293 - time (sec): 2.92 - samples/sec: 5250.88 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:00:14,303 epoch 3 - iter 52/136 - loss 0.06495053 - time (sec): 3.94 - samples/sec: 5114.57 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:00:15,213 epoch 3 - iter 65/136 - loss 0.06661245 - time (sec): 4.85 - samples/sec: 5054.42 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:00:16,327 epoch 3 - iter 78/136 - loss 0.06974909 - time (sec): 5.97 - samples/sec: 4928.34 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:00:17,283 epoch 3 - iter 91/136 - loss 0.06900942 - time (sec): 6.92 - samples/sec: 5001.33 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:00:18,327 epoch 3 - iter 104/136 - loss 0.06816376 - time (sec): 7.97 - samples/sec: 4941.16 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:00:19,396 epoch 3 - iter 117/136 - loss 0.06920871 - time (sec): 9.04 - samples/sec: 4931.17 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:00:20,371 epoch 3 - iter 130/136 - loss 0.06823123 - time (sec): 10.01 - samples/sec: 4946.54 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:00:20,863 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:20,863 EPOCH 3 done: loss 0.0670 - lr: 0.000039
2023-10-25 21:00:22,080 DEV : loss 0.10162875056266785 - f1-score (micro avg) 0.7889
2023-10-25 21:00:22,089 saving best model
2023-10-25 21:00:22,834 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:24,300 epoch 4 - iter 13/136 - loss 0.04730517 - time (sec): 1.46 - samples/sec: 3970.30 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:00:25,192 epoch 4 - iter 26/136 - loss 0.04042726 - time (sec): 2.35 - samples/sec: 4381.47 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:00:26,282 epoch 4 - iter 39/136 - loss 0.03644611 - time (sec): 3.44 - samples/sec: 4303.84 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:00:27,369 epoch 4 - iter 52/136 - loss 0.03410621 - time (sec): 4.53 - samples/sec: 4514.71 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:00:28,493 epoch 4 - iter 65/136 - loss 0.04025331 - time (sec): 5.65 - samples/sec: 4490.11 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:00:29,464 epoch 4 - iter 78/136 - loss 0.04364137 - time (sec): 6.63 - samples/sec: 4521.96 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:00:30,468 epoch 4 - iter 91/136 - loss 0.04264546 - time (sec): 7.63 - samples/sec: 4533.25 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:00:31,368 epoch 4 - iter 104/136 - loss 0.04203329 - time (sec): 8.53 - samples/sec: 4600.66 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:00:32,340 epoch 4 - iter 117/136 - loss 0.04133705 - time (sec): 9.50 - samples/sec: 4692.10 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:00:33,290 epoch 4 - iter 130/136 - loss 0.04059355 - time (sec): 10.45 - samples/sec: 4737.24 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:00:33,721 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:33,721 EPOCH 4 done: loss 0.0405 - lr: 0.000034
2023-10-25 21:00:34,927 DEV : loss 0.10846184939146042 - f1-score (micro avg) 0.8308
2023-10-25 21:00:34,936 saving best model
2023-10-25 21:00:35,682 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:36,629 epoch 5 - iter 13/136 - loss 0.03141050 - time (sec): 0.94 - samples/sec: 4559.48 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:00:37,751 epoch 5 - iter 26/136 - loss 0.02872201 - time (sec): 2.07 - samples/sec: 4918.32 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:00:38,592 epoch 5 - iter 39/136 - loss 0.03507623 - time (sec): 2.91 - samples/sec: 4748.68 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:00:39,604 epoch 5 - iter 52/136 - loss 0.03164434 - time (sec): 3.92 - samples/sec: 4779.15 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:00:40,516 epoch 5 - iter 65/136 - loss 0.02918988 - time (sec): 4.83 - samples/sec: 4808.93 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:00:41,612 epoch 5 - iter 78/136 - loss 0.03150872 - time (sec): 5.93 - samples/sec: 4825.89 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:00:42,606 epoch 5 - iter 91/136 - loss 0.02890380 - time (sec): 6.92 - samples/sec: 4803.11 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:00:43,731 epoch 5 - iter 104/136 - loss 0.02786063 - time (sec): 8.05 - samples/sec: 4789.05 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:00:44,673 epoch 5 - iter 117/136 - loss 0.02639590 - time (sec): 8.99 - samples/sec: 4800.15 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:00:45,741 epoch 5 - iter 130/136 - loss 0.02732309 - time (sec): 10.06 - samples/sec: 4884.75 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:00:46,285 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:46,285 EPOCH 5 done: loss 0.0269 - lr: 0.000028
2023-10-25 21:00:47,478 DEV : loss 0.12008755654096603 - f1-score (micro avg) 0.8229
2023-10-25 21:00:47,486 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:48,513 epoch 6 - iter 13/136 - loss 0.01660326 - time (sec): 1.02 - samples/sec: 5215.58 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:00:49,506 epoch 6 - iter 26/136 - loss 0.01853080 - time (sec): 2.02 - samples/sec: 4998.73 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:00:51,193 epoch 6 - iter 39/136 - loss 0.02629362 - time (sec): 3.71 - samples/sec: 4163.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:00:52,218 epoch 6 - iter 52/136 - loss 0.02289507 - time (sec): 4.73 - samples/sec: 4335.97 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:00:53,321 epoch 6 - iter 65/136 - loss 0.02171427 - time (sec): 5.83 - samples/sec: 4421.36 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:00:54,469 epoch 6 - iter 78/136 - loss 0.02066805 - time (sec): 6.98 - samples/sec: 4438.53 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:00:55,500 epoch 6 - iter 91/136 - loss 0.01979943 - time (sec): 8.01 - samples/sec: 4490.87 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:00:56,454 epoch 6 - iter 104/136 - loss 0.01868719 - time (sec): 8.97 - samples/sec: 4542.88 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:00:57,486 epoch 6 - iter 117/136 - loss 0.01750765 - time (sec): 10.00 - samples/sec: 4564.53 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:00:58,488 epoch 6 - iter 130/136 - loss 0.01957900 - time (sec): 11.00 - samples/sec: 4564.31 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:00:58,902 ----------------------------------------------------------------------------------------------------
2023-10-25 21:00:58,902 EPOCH 6 done: loss 0.0193 - lr: 0.000023
2023-10-25 21:01:00,093 DEV : loss 0.1586846262216568 - f1-score (micro avg) 0.8037
2023-10-25 21:01:00,100 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:01,149 epoch 7 - iter 13/136 - loss 0.01331116 - time (sec): 1.05 - samples/sec: 4710.62 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:01:02,159 epoch 7 - iter 26/136 - loss 0.01370313 - time (sec): 2.06 - samples/sec: 4703.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:01:03,239 epoch 7 - iter 39/136 - loss 0.01517733 - time (sec): 3.14 - samples/sec: 4943.03 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:01:04,235 epoch 7 - iter 52/136 - loss 0.01384388 - time (sec): 4.13 - samples/sec: 5033.02 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:01:05,261 epoch 7 - iter 65/136 - loss 0.01349590 - time (sec): 5.16 - samples/sec: 5124.49 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:01:06,278 epoch 7 - iter 78/136 - loss 0.01274396 - time (sec): 6.18 - samples/sec: 5199.20 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:01:07,152 epoch 7 - iter 91/136 - loss 0.01291160 - time (sec): 7.05 - samples/sec: 5120.28 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:01:08,303 epoch 7 - iter 104/136 - loss 0.01227420 - time (sec): 8.20 - samples/sec: 5082.76 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:01:09,258 epoch 7 - iter 117/136 - loss 0.01318525 - time (sec): 9.16 - samples/sec: 5063.74 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:01:10,116 epoch 7 - iter 130/136 - loss 0.01416944 - time (sec): 10.01 - samples/sec: 4972.56 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:01:10,563 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:10,563 EPOCH 7 done: loss 0.0141 - lr: 0.000017
2023-10-25 21:01:11,820 DEV : loss 0.16495871543884277 - f1-score (micro avg) 0.8148
2023-10-25 21:01:11,826 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:12,857 epoch 8 - iter 13/136 - loss 0.01122739 - time (sec): 1.03 - samples/sec: 5265.27 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:01:14,332 epoch 8 - iter 26/136 - loss 0.00858612 - time (sec): 2.50 - samples/sec: 4312.20 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:01:15,304 epoch 8 - iter 39/136 - loss 0.00691838 - time (sec): 3.48 - samples/sec: 4508.95 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:01:16,236 epoch 8 - iter 52/136 - loss 0.00755005 - time (sec): 4.41 - samples/sec: 4682.13 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:01:17,275 epoch 8 - iter 65/136 - loss 0.00851409 - time (sec): 5.45 - samples/sec: 4800.93 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:01:18,133 epoch 8 - iter 78/136 - loss 0.01000584 - time (sec): 6.31 - samples/sec: 4757.37 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:01:19,143 epoch 8 - iter 91/136 - loss 0.00949333 - time (sec): 7.32 - samples/sec: 4784.99 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:01:20,118 epoch 8 - iter 104/136 - loss 0.00987032 - time (sec): 8.29 - samples/sec: 4794.40 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:01:21,124 epoch 8 - iter 117/136 - loss 0.01035730 - time (sec): 9.30 - samples/sec: 4836.36 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:01:22,127 epoch 8 - iter 130/136 - loss 0.00957887 - time (sec): 10.30 - samples/sec: 4838.75 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:01:22,545 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:22,546 EPOCH 8 done: loss 0.0102 - lr: 0.000012
2023-10-25 21:01:23,746 DEV : loss 0.16520875692367554 - f1-score (micro avg) 0.837
2023-10-25 21:01:23,753 saving best model
2023-10-25 21:01:24,481 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:25,536 epoch 9 - iter 13/136 - loss 0.00127382 - time (sec): 1.05 - samples/sec: 5511.06 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:01:26,507 epoch 9 - iter 26/136 - loss 0.00294119 - time (sec): 2.02 - samples/sec: 4875.77 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:01:27,476 epoch 9 - iter 39/136 - loss 0.00448983 - time (sec): 2.99 - samples/sec: 5024.62 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:01:28,555 epoch 9 - iter 52/136 - loss 0.00746805 - time (sec): 4.07 - samples/sec: 5126.83 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:01:29,495 epoch 9 - iter 65/136 - loss 0.00753869 - time (sec): 5.01 - samples/sec: 5092.24 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:01:30,570 epoch 9 - iter 78/136 - loss 0.00640345 - time (sec): 6.09 - samples/sec: 5044.18 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:01:31,486 epoch 9 - iter 91/136 - loss 0.00658349 - time (sec): 7.00 - samples/sec: 4964.34 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:01:32,359 epoch 9 - iter 104/136 - loss 0.00725150 - time (sec): 7.88 - samples/sec: 4989.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:01:33,337 epoch 9 - iter 117/136 - loss 0.00741016 - time (sec): 8.85 - samples/sec: 5028.89 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:01:34,317 epoch 9 - iter 130/136 - loss 0.00699976 - time (sec): 9.83 - samples/sec: 5058.75 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:01:34,750 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:34,750 EPOCH 9 done: loss 0.0071 - lr: 0.000006
2023-10-25 21:01:35,934 DEV : loss 0.1780930459499359 - f1-score (micro avg) 0.8272
2023-10-25 21:01:35,941 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:37,319 epoch 10 - iter 13/136 - loss 0.01811245 - time (sec): 1.38 - samples/sec: 3085.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:01:38,256 epoch 10 - iter 26/136 - loss 0.00912043 - time (sec): 2.31 - samples/sec: 3837.60 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:01:39,208 epoch 10 - iter 39/136 - loss 0.00728411 - time (sec): 3.26 - samples/sec: 4173.53 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:01:40,251 epoch 10 - iter 52/136 - loss 0.00678921 - time (sec): 4.31 - samples/sec: 4258.00 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:01:41,434 epoch 10 - iter 65/136 - loss 0.00635061 - time (sec): 5.49 - samples/sec: 4445.58 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:01:42,447 epoch 10 - iter 78/136 - loss 0.00610468 - time (sec): 6.50 - samples/sec: 4563.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:01:43,338 epoch 10 - iter 91/136 - loss 0.00568238 - time (sec): 7.39 - samples/sec: 4581.05 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:01:44,373 epoch 10 - iter 104/136 - loss 0.00490935 - time (sec): 8.43 - samples/sec: 4696.12 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:01:45,286 epoch 10 - iter 117/136 - loss 0.00489042 - time (sec): 9.34 - samples/sec: 4728.55 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:01:46,404 epoch 10 - iter 130/136 - loss 0.00561185 - time (sec): 10.46 - samples/sec: 4765.39 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:01:46,871 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:46,872 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-25 21:01:48,026 DEV : loss 0.17910274863243103 - f1-score (micro avg) 0.8407
2023-10-25 21:01:48,032 saving best model
2023-10-25 21:01:49,290 ----------------------------------------------------------------------------------------------------
2023-10-25 21:01:49,292 Loading model from best epoch ...
2023-10-25 21:01:51,250 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
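The 17-tag dictionary above follows the BIOES scheme over four entity types: S- marks a single-token entity, B-/I-/E- mark the begin/inside/end of a multi-token one, and O is outside any entity. A minimal decoder sketch (hypothetical helper, not Flair's span extraction code) shows how such a tag sequence maps back to entity spans:

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((name, i, i + 1))
            start, label = None, None
        elif prefix == "B":                    # open a multi-token entity
            start, label = i, name
        elif prefix == "E" and label == name:  # close the open entity
            spans.append((label, start, i + 1))
            start, label = None, None
        elif prefix != "I":                    # "O" or malformed sequence: reset
            start, label = None, None
    return spans

print(bioes_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]))
# → [('LOC', 1, 2), ('PER', 2, 5)]
```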
2023-10-25 21:01:53,287
Results:
- F-score (micro) 0.7833
- F-score (macro) 0.7357
- Accuracy 0.6618
By class:
                precision    recall  f1-score   support

         LOC       0.8060    0.8654    0.8346       312
         PER       0.7047    0.8606    0.7749       208
         ORG       0.4912    0.5091    0.5000        55
   HumanProd       0.7692    0.9091    0.8333        22

   micro avg       0.7396    0.8325    0.7833       597
   macro avg       0.6928    0.7860    0.7357       597
weighted avg       0.7403    0.8325    0.7829       597
2023-10-25 21:01:53,288 ----------------------------------------------------------------------------------------------------
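As a sanity check, the reported averages follow directly from the per-class rows in the table above. Micro avg F1 is the harmonic mean of the pooled micro precision and recall, macro avg F1 is the unweighted mean of the per-class F1 scores, and weighted avg F1 weights each class by its support (a quick pure-Python verification, using only the numbers printed in the log):

```python
# Per-class (precision, recall, f1, support) as reported in the final table.
per_class = {
    "LOC":       (0.8060, 0.8654, 0.8346, 312),
    "PER":       (0.7047, 0.8606, 0.7749, 208),
    "ORG":       (0.4912, 0.5091, 0.5000, 55),
    "HumanProd": (0.7692, 0.9091, 0.8333, 22),
}

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Micro avg: F1 from the pooled micro precision/recall row.
micro_f1 = f1(0.7396, 0.8325)

# Macro avg: unweighted mean of per-class F1.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)

# Weighted avg: per-class F1 weighted by support.
total = sum(v[3] for v in per_class.values())
weighted_f1 = sum(v[2] * v[3] for v in per_class.values()) / total

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# → 0.7833 0.7357 0.7829, matching the logged F-scores
```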