2023-10-19 12:00:56,175 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Train: 20847 sentences
2023-10-19 12:00:56,176 (train_with_dev=False, train_with_test=False)
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Training Params:
2023-10-19 12:00:56,176 - learning_rate: "3e-05"
2023-10-19 12:00:56,176 - mini_batch_size: "8"
2023-10-19 12:00:56,176 - max_epochs: "10"
2023-10-19 12:00:56,176 - shuffle: "True"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Plugins:
2023-10-19 12:00:56,176 - TensorboardLogger
2023-10-19 12:00:56,176 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
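The lr values logged after each batch below follow the LinearScheduler plugin: linear warmup over warmup_fraction 0.1 of the 10 × 2606 = 26060 total batch steps up to the peak learning rate 3e-05, then linear decay to zero. A minimal pure-Python sketch of that schedule (the function name is ours, and per-batch stepping in the style of transformers-like linear schedules is an assumption):

```python
def linear_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 2606 of 26060 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 2606  # 10 epochs x 2606 iterations per epoch
print(round(linear_lr(260, total), 6))    # -> 3e-06, first logged lr of epoch 1
print(round(linear_lr(2606, total), 6))   # -> 3e-05, peak at the end of warmup
print(round(linear_lr(total, total), 6))  # -> 0.0, fully decayed at the last step
```

Rounded to six decimals, this reproduces the lr column below: ≈3e-06 at iter 260 of epoch 1, the 3e-05 peak around the end of epoch 1, and 0.000000 by the end of epoch 10.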
2023-10-19 12:00:56,176 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 12:00:56,176 - metric: "('micro avg', 'f1-score')"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Computation:
2023-10-19 12:00:56,176 - compute on device: cuda:0
2023-10-19 12:00:56,176 - embedding storage: none
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,177 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,177 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 12:01:02,384 epoch 1 - iter 260/2606 - loss 3.40538242 - time (sec): 6.21 - samples/sec: 6104.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:01:08,549 epoch 1 - iter 520/2606 - loss 2.91311251 - time (sec): 12.37 - samples/sec: 6019.15 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:01:14,651 epoch 1 - iter 780/2606 - loss 2.35015293 - time (sec): 18.47 - samples/sec: 5836.80 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:01:20,859 epoch 1 - iter 1040/2606 - loss 1.89197470 - time (sec): 24.68 - samples/sec: 5868.35 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:01:26,933 epoch 1 - iter 1300/2606 - loss 1.63043529 - time (sec): 30.76 - samples/sec: 5863.50 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:01:33,771 epoch 1 - iter 1560/2606 - loss 1.43993948 - time (sec): 37.59 - samples/sec: 5810.02 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:01:39,944 epoch 1 - iter 1820/2606 - loss 1.31531984 - time (sec): 43.77 - samples/sec: 5792.69 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:01:46,212 epoch 1 - iter 2080/2606 - loss 1.21703578 - time (sec): 50.03 - samples/sec: 5792.61 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:01:52,532 epoch 1 - iter 2340/2606 - loss 1.12526280 - time (sec): 56.36 - samples/sec: 5834.62 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:01:59,056 epoch 1 - iter 2600/2606 - loss 1.05323112 - time (sec): 62.88 - samples/sec: 5831.22 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:01:59,199 ----------------------------------------------------------------------------------------------------
2023-10-19 12:01:59,199 EPOCH 1 done: loss 1.0518 - lr: 0.000030
2023-10-19 12:02:01,460 DEV : loss 0.1518947333097458 - f1-score (micro avg) 0.0
2023-10-19 12:02:01,483 ----------------------------------------------------------------------------------------------------
2023-10-19 12:02:07,653 epoch 2 - iter 260/2606 - loss 0.37279967 - time (sec): 6.17 - samples/sec: 6011.87 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:02:13,871 epoch 2 - iter 520/2606 - loss 0.38452370 - time (sec): 12.39 - samples/sec: 6141.84 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:02:20,069 epoch 2 - iter 780/2606 - loss 0.37622170 - time (sec): 18.58 - samples/sec: 6041.12 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:02:26,334 epoch 2 - iter 1040/2606 - loss 0.37748264 - time (sec): 24.85 - samples/sec: 6025.06 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:02:32,514 epoch 2 - iter 1300/2606 - loss 0.37993952 - time (sec): 31.03 - samples/sec: 5930.88 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:02:38,934 epoch 2 - iter 1560/2606 - loss 0.37734025 - time (sec): 37.45 - samples/sec: 5880.78 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:02:45,210 epoch 2 - iter 1820/2606 - loss 0.37244096 - time (sec): 43.73 - samples/sec: 5901.29 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:02:51,273 epoch 2 - iter 2080/2606 - loss 0.36740892 - time (sec): 49.79 - samples/sec: 5875.85 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:02:57,379 epoch 2 - iter 2340/2606 - loss 0.36606462 - time (sec): 55.90 - samples/sec: 5917.14 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:03:03,492 epoch 2 - iter 2600/2606 - loss 0.36137302 - time (sec): 62.01 - samples/sec: 5913.12 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:03:03,629 ----------------------------------------------------------------------------------------------------
2023-10-19 12:03:03,629 EPOCH 2 done: loss 0.3615 - lr: 0.000027
2023-10-19 12:03:08,748 DEV : loss 0.13267949223518372 - f1-score (micro avg) 0.1889
2023-10-19 12:03:08,771 saving best model
2023-10-19 12:03:08,800 ----------------------------------------------------------------------------------------------------
2023-10-19 12:03:14,910 epoch 3 - iter 260/2606 - loss 0.28770950 - time (sec): 6.11 - samples/sec: 5796.09 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:03:21,125 epoch 3 - iter 520/2606 - loss 0.30357365 - time (sec): 12.32 - samples/sec: 5936.08 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:03:27,126 epoch 3 - iter 780/2606 - loss 0.30509680 - time (sec): 18.33 - samples/sec: 5717.51 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:03:33,461 epoch 3 - iter 1040/2606 - loss 0.30592267 - time (sec): 24.66 - samples/sec: 5757.21 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:03:39,669 epoch 3 - iter 1300/2606 - loss 0.30579785 - time (sec): 30.87 - samples/sec: 5804.13 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:03:46,033 epoch 3 - iter 1560/2606 - loss 0.30511395 - time (sec): 37.23 - samples/sec: 5874.19 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:03:52,317 epoch 3 - iter 1820/2606 - loss 0.30304733 - time (sec): 43.52 - samples/sec: 5874.77 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:03:58,592 epoch 3 - iter 2080/2606 - loss 0.29992523 - time (sec): 49.79 - samples/sec: 5900.19 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:04:04,651 epoch 3 - iter 2340/2606 - loss 0.30166392 - time (sec): 55.85 - samples/sec: 5904.47 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:04:10,664 epoch 3 - iter 2600/2606 - loss 0.30048573 - time (sec): 61.86 - samples/sec: 5929.69 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:04:10,799 ----------------------------------------------------------------------------------------------------
2023-10-19 12:04:10,799 EPOCH 3 done: loss 0.3004 - lr: 0.000023
2023-10-19 12:04:15,259 DEV : loss 0.13826555013656616 - f1-score (micro avg) 0.2585
2023-10-19 12:04:15,282 saving best model
2023-10-19 12:04:15,315 ----------------------------------------------------------------------------------------------------
2023-10-19 12:04:22,111 epoch 4 - iter 260/2606 - loss 0.28814534 - time (sec): 6.80 - samples/sec: 5601.40 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:04:28,248 epoch 4 - iter 520/2606 - loss 0.27039875 - time (sec): 12.93 - samples/sec: 6000.16 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:04:34,408 epoch 4 - iter 780/2606 - loss 0.28075865 - time (sec): 19.09 - samples/sec: 5982.04 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:04:40,146 epoch 4 - iter 1040/2606 - loss 0.28581406 - time (sec): 24.83 - samples/sec: 6012.40 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:04:46,193 epoch 4 - iter 1300/2606 - loss 0.27788397 - time (sec): 30.88 - samples/sec: 6003.77 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:04:52,239 epoch 4 - iter 1560/2606 - loss 0.27453240 - time (sec): 36.92 - samples/sec: 5967.64 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:04:58,501 epoch 4 - iter 1820/2606 - loss 0.27244670 - time (sec): 43.19 - samples/sec: 5995.58 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:05:04,502 epoch 4 - iter 2080/2606 - loss 0.27111110 - time (sec): 49.19 - samples/sec: 5966.95 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:05:10,738 epoch 4 - iter 2340/2606 - loss 0.27083600 - time (sec): 55.42 - samples/sec: 5926.47 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:05:17,021 epoch 4 - iter 2600/2606 - loss 0.26888443 - time (sec): 61.71 - samples/sec: 5938.26 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:05:17,179 ----------------------------------------------------------------------------------------------------
2023-10-19 12:05:17,179 EPOCH 4 done: loss 0.2689 - lr: 0.000020
2023-10-19 12:05:21,652 DEV : loss 0.13496357202529907 - f1-score (micro avg) 0.2655
2023-10-19 12:05:21,676 saving best model
2023-10-19 12:05:21,711 ----------------------------------------------------------------------------------------------------
2023-10-19 12:05:27,994 epoch 5 - iter 260/2606 - loss 0.22499671 - time (sec): 6.28 - samples/sec: 5546.13 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:05:34,267 epoch 5 - iter 520/2606 - loss 0.25029462 - time (sec): 12.56 - samples/sec: 5624.23 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:05:40,390 epoch 5 - iter 780/2606 - loss 0.25249412 - time (sec): 18.68 - samples/sec: 5775.91 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:05:46,511 epoch 5 - iter 1040/2606 - loss 0.25195886 - time (sec): 24.80 - samples/sec: 5880.79 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:05:53,289 epoch 5 - iter 1300/2606 - loss 0.25517590 - time (sec): 31.58 - samples/sec: 5710.43 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:05:59,390 epoch 5 - iter 1560/2606 - loss 0.25285506 - time (sec): 37.68 - samples/sec: 5773.47 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:06:05,529 epoch 5 - iter 1820/2606 - loss 0.25402582 - time (sec): 43.82 - samples/sec: 5841.14 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:06:11,704 epoch 5 - iter 2080/2606 - loss 0.25162475 - time (sec): 49.99 - samples/sec: 5865.79 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:06:17,867 epoch 5 - iter 2340/2606 - loss 0.24897904 - time (sec): 56.15 - samples/sec: 5869.14 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:06:24,059 epoch 5 - iter 2600/2606 - loss 0.24992965 - time (sec): 62.35 - samples/sec: 5878.66 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:06:24,206 ----------------------------------------------------------------------------------------------------
2023-10-19 12:06:24,206 EPOCH 5 done: loss 0.2498 - lr: 0.000017
2023-10-19 12:06:28,655 DEV : loss 0.14713987708091736 - f1-score (micro avg) 0.2713
2023-10-19 12:06:28,681 saving best model
2023-10-19 12:06:28,720 ----------------------------------------------------------------------------------------------------
2023-10-19 12:06:34,853 epoch 6 - iter 260/2606 - loss 0.23364615 - time (sec): 6.13 - samples/sec: 5906.43 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:06:41,043 epoch 6 - iter 520/2606 - loss 0.24077423 - time (sec): 12.32 - samples/sec: 6051.37 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:06:47,054 epoch 6 - iter 780/2606 - loss 0.24195749 - time (sec): 18.33 - samples/sec: 6040.91 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:06:53,466 epoch 6 - iter 1040/2606 - loss 0.23662083 - time (sec): 24.74 - samples/sec: 6123.91 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:06:59,567 epoch 6 - iter 1300/2606 - loss 0.23644139 - time (sec): 30.85 - samples/sec: 6098.02 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:07:05,490 epoch 6 - iter 1560/2606 - loss 0.23948595 - time (sec): 36.77 - samples/sec: 6003.25 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:07:11,716 epoch 6 - iter 1820/2606 - loss 0.23331354 - time (sec): 43.00 - samples/sec: 5972.05 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:07:18,061 epoch 6 - iter 2080/2606 - loss 0.23556838 - time (sec): 49.34 - samples/sec: 5939.27 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:07:24,339 epoch 6 - iter 2340/2606 - loss 0.23590165 - time (sec): 55.62 - samples/sec: 5954.14 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:07:30,952 epoch 6 - iter 2600/2606 - loss 0.23617984 - time (sec): 62.23 - samples/sec: 5887.93 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:07:31,090 ----------------------------------------------------------------------------------------------------
2023-10-19 12:07:31,090 EPOCH 6 done: loss 0.2362 - lr: 0.000013
2023-10-19 12:07:35,613 DEV : loss 0.145651176571846 - f1-score (micro avg) 0.2658
2023-10-19 12:07:35,637 ----------------------------------------------------------------------------------------------------
2023-10-19 12:07:41,815 epoch 7 - iter 260/2606 - loss 0.23295970 - time (sec): 6.18 - samples/sec: 5910.42 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:07:47,962 epoch 7 - iter 520/2606 - loss 0.21670273 - time (sec): 12.33 - samples/sec: 5896.94 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:07:54,256 epoch 7 - iter 780/2606 - loss 0.22310923 - time (sec): 18.62 - samples/sec: 5836.72 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:08:00,533 epoch 7 - iter 1040/2606 - loss 0.22735368 - time (sec): 24.90 - samples/sec: 5904.01 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:08:06,697 epoch 7 - iter 1300/2606 - loss 0.23071935 - time (sec): 31.06 - samples/sec: 5845.74 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:08:12,828 epoch 7 - iter 1560/2606 - loss 0.22661985 - time (sec): 37.19 - samples/sec: 5873.54 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:08:18,938 epoch 7 - iter 1820/2606 - loss 0.22653157 - time (sec): 43.30 - samples/sec: 5880.75 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:08:25,175 epoch 7 - iter 2080/2606 - loss 0.22487565 - time (sec): 49.54 - samples/sec: 5887.39 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:08:31,637 epoch 7 - iter 2340/2606 - loss 0.22556286 - time (sec): 56.00 - samples/sec: 5850.29 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:08:38,149 epoch 7 - iter 2600/2606 - loss 0.22373051 - time (sec): 62.51 - samples/sec: 5860.52 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:08:38,301 ----------------------------------------------------------------------------------------------------
2023-10-19 12:08:38,302 EPOCH 7 done: loss 0.2239 - lr: 0.000010
2023-10-19 12:08:43,560 DEV : loss 0.1493058055639267 - f1-score (micro avg) 0.2694
2023-10-19 12:08:43,584 ----------------------------------------------------------------------------------------------------
2023-10-19 12:08:49,877 epoch 8 - iter 260/2606 - loss 0.21748453 - time (sec): 6.29 - samples/sec: 6011.39 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:08:56,114 epoch 8 - iter 520/2606 - loss 0.21796397 - time (sec): 12.53 - samples/sec: 6077.56 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:09:02,336 epoch 8 - iter 780/2606 - loss 0.22363675 - time (sec): 18.75 - samples/sec: 6025.50 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:09:08,641 epoch 8 - iter 1040/2606 - loss 0.22466731 - time (sec): 25.06 - samples/sec: 5859.93 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:09:14,751 epoch 8 - iter 1300/2606 - loss 0.21805738 - time (sec): 31.17 - samples/sec: 5936.89 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:09:20,786 epoch 8 - iter 1560/2606 - loss 0.21732187 - time (sec): 37.20 - samples/sec: 5898.27 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:09:27,121 epoch 8 - iter 1820/2606 - loss 0.21905354 - time (sec): 43.54 - samples/sec: 5893.52 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:09:33,241 epoch 8 - iter 2080/2606 - loss 0.21894204 - time (sec): 49.66 - samples/sec: 5892.24 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:09:39,459 epoch 8 - iter 2340/2606 - loss 0.21934014 - time (sec): 55.87 - samples/sec: 5895.25 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:09:45,540 epoch 8 - iter 2600/2606 - loss 0.21776469 - time (sec): 61.96 - samples/sec: 5911.41 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:09:45,701 ----------------------------------------------------------------------------------------------------
2023-10-19 12:09:45,702 EPOCH 8 done: loss 0.2179 - lr: 0.000007
2023-10-19 12:09:51,132 DEV : loss 0.1476389616727829 - f1-score (micro avg) 0.2709
2023-10-19 12:09:51,156 ----------------------------------------------------------------------------------------------------
2023-10-19 12:09:57,451 epoch 9 - iter 260/2606 - loss 0.21031450 - time (sec): 6.29 - samples/sec: 5576.49 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:10:03,553 epoch 9 - iter 520/2606 - loss 0.20122891 - time (sec): 12.40 - samples/sec: 5721.65 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:10:09,434 epoch 9 - iter 780/2606 - loss 0.19566717 - time (sec): 18.28 - samples/sec: 5970.33 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:10:15,594 epoch 9 - iter 1040/2606 - loss 0.20342649 - time (sec): 24.44 - samples/sec: 5996.37 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:10:21,820 epoch 9 - iter 1300/2606 - loss 0.20271150 - time (sec): 30.66 - samples/sec: 6007.21 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:10:28,209 epoch 9 - iter 1560/2606 - loss 0.20653268 - time (sec): 37.05 - samples/sec: 5971.23 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:10:34,354 epoch 9 - iter 1820/2606 - loss 0.20768132 - time (sec): 43.20 - samples/sec: 5941.15 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:10:40,736 epoch 9 - iter 2080/2606 - loss 0.20750751 - time (sec): 49.58 - samples/sec: 5932.55 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:10:46,708 epoch 9 - iter 2340/2606 - loss 0.20839454 - time (sec): 55.55 - samples/sec: 5946.27 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:10:52,840 epoch 9 - iter 2600/2606 - loss 0.20899152 - time (sec): 61.68 - samples/sec: 5940.26 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:10:52,990 ----------------------------------------------------------------------------------------------------
2023-10-19 12:10:52,991 EPOCH 9 done: loss 0.2090 - lr: 0.000003
2023-10-19 12:10:58,209 DEV : loss 0.1496579349040985 - f1-score (micro avg) 0.2732
2023-10-19 12:10:58,234 saving best model
2023-10-19 12:10:58,266 ----------------------------------------------------------------------------------------------------
2023-10-19 12:11:04,334 epoch 10 - iter 260/2606 - loss 0.22099529 - time (sec): 6.07 - samples/sec: 5358.54 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:11:10,601 epoch 10 - iter 520/2606 - loss 0.20573417 - time (sec): 12.33 - samples/sec: 5725.47 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:11:16,687 epoch 10 - iter 780/2606 - loss 0.20760408 - time (sec): 18.42 - samples/sec: 5702.75 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:11:23,120 epoch 10 - iter 1040/2606 - loss 0.21595404 - time (sec): 24.85 - samples/sec: 5730.46 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:11:29,235 epoch 10 - iter 1300/2606 - loss 0.21080473 - time (sec): 30.97 - samples/sec: 5804.93 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:11:35,569 epoch 10 - iter 1560/2606 - loss 0.21062924 - time (sec): 37.30 - samples/sec: 5854.00 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:11:41,642 epoch 10 - iter 1820/2606 - loss 0.20879304 - time (sec): 43.38 - samples/sec: 5815.43 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:11:47,915 epoch 10 - iter 2080/2606 - loss 0.20787473 - time (sec): 49.65 - samples/sec: 5856.60 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:11:54,226 epoch 10 - iter 2340/2606 - loss 0.20759076 - time (sec): 55.96 - samples/sec: 5892.12 - lr: 0.000000 - momentum: 0.000000
2023-10-19 12:12:00,576 epoch 10 - iter 2600/2606 - loss 0.20787899 - time (sec): 62.31 - samples/sec: 5888.75 - lr: 0.000000 - momentum: 0.000000
2023-10-19 12:12:00,722 ----------------------------------------------------------------------------------------------------
2023-10-19 12:12:00,722 EPOCH 10 done: loss 0.2078 - lr: 0.000000
2023-10-19 12:12:05,984 DEV : loss 0.1529059261083603 - f1-score (micro avg) 0.2686
2023-10-19 12:12:06,040 ----------------------------------------------------------------------------------------------------
2023-10-19 12:12:06,041 Loading model from best epoch ...
2023-10-19 12:12:06,121 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
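The 17-tag dictionary above is the BIOES encoding of the corpus's four entity types plus the outside tag, in the order S, B, E, I per type. It can be reproduced with a short comprehension (variable names are ours):

```python
# Rebuild the tag dictionary logged above: "O" plus S/B/E/I per entity type.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in "SBEI"]
print(len(tags))  # -> 17, matching the logged dictionary size
```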
2023-10-19 12:12:12,588
Results:
- F-score (micro) 0.3045
- F-score (macro) 0.1659
- Accuracy 0.1814
By class:
              precision    recall  f1-score   support

         LOC     0.4606    0.4811    0.4706      1214
         PER     0.1468    0.1510    0.1489       808
         ORG     0.0556    0.0368    0.0443       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3082    0.3008    0.3045      2390
   macro avg     0.1657    0.1672    0.1659      2390
weighted avg     0.2918    0.3008    0.2959      2390
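As a sanity check, the micro and macro F-scores in the table can be recomputed from the per-class precision/recall values (the helper and variable names are ours):

```python
# Recompute the averaged F-scores from the per-class values logged above.
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

per_class = {  # class: (precision, recall) from the table
    "LOC":       (0.4606, 0.4811),
    "PER":       (0.1468, 0.1510),
    "ORG":       (0.0556, 0.0368),
    "HumanProd": (0.0000, 0.0000),
}

macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)
micro_f1 = f1(0.3082, 0.3008)  # micro-avg precision/recall from the table

# Both match the logged F-scores: 0.3045 micro, 0.1659 macro.
print(round(micro_f1, 4), round(macro_f1, 4))
```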
2023-10-19 12:12:12,588 ----------------------------------------------------------------------------------------------------