2023-10-19 12:00:56,175 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Train: 20847 sentences
2023-10-19 12:00:56,176 (train_with_dev=False, train_with_test=False)
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Training Params:
2023-10-19 12:00:56,176 - learning_rate: "3e-05"
2023-10-19 12:00:56,176 - mini_batch_size: "8"
2023-10-19 12:00:56,176 - max_epochs: "10"
2023-10-19 12:00:56,176 - shuffle: "True"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Plugins:
2023-10-19 12:00:56,176 - TensorboardLogger
2023-10-19 12:00:56,176 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 12:00:56,176 - metric: "('micro avg', 'f1-score')"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Computation:
2023-10-19 12:00:56,176 - compute on device: cuda:0
2023-10-19 12:00:56,176 - embedding storage: none
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,176 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-19 12:00:56,176 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,177 ----------------------------------------------------------------------------------------------------
2023-10-19 12:00:56,177 Logging anything other than scalars to TensorBoard is currently not supported.
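The lr column in the iteration lines below follows the LinearScheduler plugin listed above: linear warmup over the first warmup_fraction = 0.1 of all batch steps (10 epochs × 2606 iterations = 26060 steps), then linear decay to zero. A minimal sketch of that schedule, assuming the standard warmup-then-decay formula; the helper `linear_lr` is illustrative, not Flair's API:

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_fraction: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # warmup phase: 0 -> base_lr
    # decay phase: base_lr -> 0 over the remaining steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL = 10 * 2606  # 10 epochs x 2606 iterations per epoch
BASE = 3e-05

print(round(linear_lr(260, TOTAL, BASE), 6))    # 3e-06 (logged as lr: 0.000003)
print(round(linear_lr(2606, TOTAL, BASE), 6))   # 3e-05 (end of warmup, lr: 0.000030)
print(round(linear_lr(26060, TOTAL, BASE), 6))  # 0.0   (lr: 0.000000 in the last epoch)
```

The rounded values match the logged lr column, e.g. 0.000029 at epoch 2, iter 520 (global step 3126).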
2023-10-19 12:01:02,384 epoch 1 - iter 260/2606 - loss 3.40538242 - time (sec): 6.21 - samples/sec: 6104.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:01:08,549 epoch 1 - iter 520/2606 - loss 2.91311251 - time (sec): 12.37 - samples/sec: 6019.15 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:01:14,651 epoch 1 - iter 780/2606 - loss 2.35015293 - time (sec): 18.47 - samples/sec: 5836.80 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:01:20,859 epoch 1 - iter 1040/2606 - loss 1.89197470 - time (sec): 24.68 - samples/sec: 5868.35 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:01:26,933 epoch 1 - iter 1300/2606 - loss 1.63043529 - time (sec): 30.76 - samples/sec: 5863.50 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:01:33,771 epoch 1 - iter 1560/2606 - loss 1.43993948 - time (sec): 37.59 - samples/sec: 5810.02 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:01:39,944 epoch 1 - iter 1820/2606 - loss 1.31531984 - time (sec): 43.77 - samples/sec: 5792.69 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:01:46,212 epoch 1 - iter 2080/2606 - loss 1.21703578 - time (sec): 50.03 - samples/sec: 5792.61 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:01:52,532 epoch 1 - iter 2340/2606 - loss 1.12526280 - time (sec): 56.36 - samples/sec: 5834.62 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:01:59,056 epoch 1 - iter 2600/2606 - loss 1.05323112 - time (sec): 62.88 - samples/sec: 5831.22 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:01:59,199 ----------------------------------------------------------------------------------------------------
2023-10-19 12:01:59,199 EPOCH 1 done: loss 1.0518 - lr: 0.000030
2023-10-19 12:02:01,460 DEV : loss 0.1518947333097458 - f1-score (micro avg) 0.0
2023-10-19 12:02:01,483 ----------------------------------------------------------------------------------------------------
2023-10-19 12:02:07,653 epoch 2 - iter 260/2606 - loss 0.37279967 - time (sec): 6.17 - samples/sec: 6011.87 - lr: 0.000030 - momentum: 0.000000
2023-10-19 12:02:13,871 epoch 2 - iter 520/2606 - loss 0.38452370 - time (sec): 12.39 - samples/sec: 6141.84 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:02:20,069 epoch 2 - iter 780/2606 - loss 0.37622170 - time (sec): 18.58 - samples/sec: 6041.12 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:02:26,334 epoch 2 - iter 1040/2606 - loss 0.37748264 - time (sec): 24.85 - samples/sec: 6025.06 - lr: 0.000029 - momentum: 0.000000
2023-10-19 12:02:32,514 epoch 2 - iter 1300/2606 - loss 0.37993952 - time (sec): 31.03 - samples/sec: 5930.88 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:02:38,934 epoch 2 - iter 1560/2606 - loss 0.37734025 - time (sec): 37.45 - samples/sec: 5880.78 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:02:45,210 epoch 2 - iter 1820/2606 - loss 0.37244096 - time (sec): 43.73 - samples/sec: 5901.29 - lr: 0.000028 - momentum: 0.000000
2023-10-19 12:02:51,273 epoch 2 - iter 2080/2606 - loss 0.36740892 - time (sec): 49.79 - samples/sec: 5875.85 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:02:57,379 epoch 2 - iter 2340/2606 - loss 0.36606462 - time (sec): 55.90 - samples/sec: 5917.14 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:03:03,492 epoch 2 - iter 2600/2606 - loss 0.36137302 - time (sec): 62.01 - samples/sec: 5913.12 - lr: 0.000027 - momentum: 0.000000
2023-10-19 12:03:03,629 ----------------------------------------------------------------------------------------------------
2023-10-19 12:03:03,629 EPOCH 2 done: loss 0.3615 - lr: 0.000027
2023-10-19 12:03:08,748 DEV : loss 0.13267949223518372 - f1-score (micro avg) 0.1889
2023-10-19 12:03:08,771 saving best model
2023-10-19 12:03:08,800 ----------------------------------------------------------------------------------------------------
2023-10-19 12:03:14,910 epoch 3 - iter 260/2606 - loss 0.28770950 - time (sec): 6.11 - samples/sec: 5796.09 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:03:21,125 epoch 3 - iter 520/2606 - loss 0.30357365 - time (sec): 12.32 - samples/sec: 5936.08 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:03:27,126 epoch 3 - iter 780/2606 - loss 0.30509680 - time (sec): 18.33 - samples/sec: 5717.51 - lr: 0.000026 - momentum: 0.000000
2023-10-19 12:03:33,461 epoch 3 - iter 1040/2606 - loss 0.30592267 - time (sec): 24.66 - samples/sec: 5757.21 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:03:39,669 epoch 3 - iter 1300/2606 - loss 0.30579785 - time (sec): 30.87 - samples/sec: 5804.13 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:03:46,033 epoch 3 - iter 1560/2606 - loss 0.30511395 - time (sec): 37.23 - samples/sec: 5874.19 - lr: 0.000025 - momentum: 0.000000
2023-10-19 12:03:52,317 epoch 3 - iter 1820/2606 - loss 0.30304733 - time (sec): 43.52 - samples/sec: 5874.77 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:03:58,592 epoch 3 - iter 2080/2606 - loss 0.29992523 - time (sec): 49.79 - samples/sec: 5900.19 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:04:04,651 epoch 3 - iter 2340/2606 - loss 0.30166392 - time (sec): 55.85 - samples/sec: 5904.47 - lr: 0.000024 - momentum: 0.000000
2023-10-19 12:04:10,664 epoch 3 - iter 2600/2606 - loss 0.30048573 - time (sec): 61.86 - samples/sec: 5929.69 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:04:10,799 ----------------------------------------------------------------------------------------------------
2023-10-19 12:04:10,799 EPOCH 3 done: loss 0.3004 - lr: 0.000023
2023-10-19 12:04:15,259 DEV : loss 0.13826555013656616 - f1-score (micro avg) 0.2585
2023-10-19 12:04:15,282 saving best model
2023-10-19 12:04:15,315 ----------------------------------------------------------------------------------------------------
2023-10-19 12:04:22,111 epoch 4 - iter 260/2606 - loss 0.28814534 - time (sec): 6.80 - samples/sec: 5601.40 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:04:28,248 epoch 4 - iter 520/2606 - loss 0.27039875 - time (sec): 12.93 - samples/sec: 6000.16 - lr: 0.000023 - momentum: 0.000000
2023-10-19 12:04:34,408 epoch 4 - iter 780/2606 - loss 0.28075865 - time (sec): 19.09 - samples/sec: 5982.04 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:04:40,146 epoch 4 - iter 1040/2606 - loss 0.28581406 - time (sec): 24.83 - samples/sec: 6012.40 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:04:46,193 epoch 4 - iter 1300/2606 - loss 0.27788397 - time (sec): 30.88 - samples/sec: 6003.77 - lr: 0.000022 - momentum: 0.000000
2023-10-19 12:04:52,239 epoch 4 - iter 1560/2606 - loss 0.27453240 - time (sec): 36.92 - samples/sec: 5967.64 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:04:58,501 epoch 4 - iter 1820/2606 - loss 0.27244670 - time (sec): 43.19 - samples/sec: 5995.58 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:05:04,502 epoch 4 - iter 2080/2606 - loss 0.27111110 - time (sec): 49.19 - samples/sec: 5966.95 - lr: 0.000021 - momentum: 0.000000
2023-10-19 12:05:10,738 epoch 4 - iter 2340/2606 - loss 0.27083600 - time (sec): 55.42 - samples/sec: 5926.47 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:05:17,021 epoch 4 - iter 2600/2606 - loss 0.26888443 - time (sec): 61.71 - samples/sec: 5938.26 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:05:17,179 ----------------------------------------------------------------------------------------------------
2023-10-19 12:05:17,179 EPOCH 4 done: loss 0.2689 - lr: 0.000020
2023-10-19 12:05:21,652 DEV : loss 0.13496357202529907 - f1-score (micro avg) 0.2655
2023-10-19 12:05:21,676 saving best model
2023-10-19 12:05:21,711 ----------------------------------------------------------------------------------------------------
2023-10-19 12:05:27,994 epoch 5 - iter 260/2606 - loss 0.22499671 - time (sec): 6.28 - samples/sec: 5546.13 - lr: 0.000020 - momentum: 0.000000
2023-10-19 12:05:34,267 epoch 5 - iter 520/2606 - loss 0.25029462 - time (sec): 12.56 - samples/sec: 5624.23 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:05:40,390 epoch 5 - iter 780/2606 - loss 0.25249412 - time (sec): 18.68 - samples/sec: 5775.91 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:05:46,511 epoch 5 - iter 1040/2606 - loss 0.25195886 - time (sec): 24.80 - samples/sec: 5880.79 - lr: 0.000019 - momentum: 0.000000
2023-10-19 12:05:53,289 epoch 5 - iter 1300/2606 - loss 0.25517590 - time (sec): 31.58 - samples/sec: 5710.43 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:05:59,390 epoch 5 - iter 1560/2606 - loss 0.25285506 - time (sec): 37.68 - samples/sec: 5773.47 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:06:05,529 epoch 5 - iter 1820/2606 - loss 0.25402582 - time (sec): 43.82 - samples/sec: 5841.14 - lr: 0.000018 - momentum: 0.000000
2023-10-19 12:06:11,704 epoch 5 - iter 2080/2606 - loss 0.25162475 - time (sec): 49.99 - samples/sec: 5865.79 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:06:17,867 epoch 5 - iter 2340/2606 - loss 0.24897904 - time (sec): 56.15 - samples/sec: 5869.14 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:06:24,059 epoch 5 - iter 2600/2606 - loss 0.24992965 - time (sec): 62.35 - samples/sec: 5878.66 - lr: 0.000017 - momentum: 0.000000
2023-10-19 12:06:24,206 ----------------------------------------------------------------------------------------------------
2023-10-19 12:06:24,206 EPOCH 5 done: loss 0.2498 - lr: 0.000017
2023-10-19 12:06:28,655 DEV : loss 0.14713987708091736 - f1-score (micro avg) 0.2713
2023-10-19 12:06:28,681 saving best model
2023-10-19 12:06:28,720 ----------------------------------------------------------------------------------------------------
2023-10-19 12:06:34,853 epoch 6 - iter 260/2606 - loss 0.23364615 - time (sec): 6.13 - samples/sec: 5906.43 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:06:41,043 epoch 6 - iter 520/2606 - loss 0.24077423 - time (sec): 12.32 - samples/sec: 6051.37 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:06:47,054 epoch 6 - iter 780/2606 - loss 0.24195749 - time (sec): 18.33 - samples/sec: 6040.91 - lr: 0.000016 - momentum: 0.000000
2023-10-19 12:06:53,466 epoch 6 - iter 1040/2606 - loss 0.23662083 - time (sec): 24.74 - samples/sec: 6123.91 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:06:59,567 epoch 6 - iter 1300/2606 - loss 0.23644139 - time (sec): 30.85 - samples/sec: 6098.02 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:07:05,490 epoch 6 - iter 1560/2606 - loss 0.23948595 - time (sec): 36.77 - samples/sec: 6003.25 - lr: 0.000015 - momentum: 0.000000
2023-10-19 12:07:11,716 epoch 6 - iter 1820/2606 - loss 0.23331354 - time (sec): 43.00 - samples/sec: 5972.05 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:07:18,061 epoch 6 - iter 2080/2606 - loss 0.23556838 - time (sec): 49.34 - samples/sec: 5939.27 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:07:24,339 epoch 6 - iter 2340/2606 - loss 0.23590165 - time (sec): 55.62 - samples/sec: 5954.14 - lr: 0.000014 - momentum: 0.000000
2023-10-19 12:07:30,952 epoch 6 - iter 2600/2606 - loss 0.23617984 - time (sec): 62.23 - samples/sec: 5887.93 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:07:31,090 ----------------------------------------------------------------------------------------------------
2023-10-19 12:07:31,090 EPOCH 6 done: loss 0.2362 - lr: 0.000013
2023-10-19 12:07:35,613 DEV : loss 0.145651176571846 - f1-score (micro avg) 0.2658
2023-10-19 12:07:35,637 ----------------------------------------------------------------------------------------------------
2023-10-19 12:07:41,815 epoch 7 - iter 260/2606 - loss 0.23295970 - time (sec): 6.18 - samples/sec: 5910.42 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:07:47,962 epoch 7 - iter 520/2606 - loss 0.21670273 - time (sec): 12.33 - samples/sec: 5896.94 - lr: 0.000013 - momentum: 0.000000
2023-10-19 12:07:54,256 epoch 7 - iter 780/2606 - loss 0.22310923 - time (sec): 18.62 - samples/sec: 5836.72 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:08:00,533 epoch 7 - iter 1040/2606 - loss 0.22735368 - time (sec): 24.90 - samples/sec: 5904.01 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:08:06,697 epoch 7 - iter 1300/2606 - loss 0.23071935 - time (sec): 31.06 - samples/sec: 5845.74 - lr: 0.000012 - momentum: 0.000000
2023-10-19 12:08:12,828 epoch 7 - iter 1560/2606 - loss 0.22661985 - time (sec): 37.19 - samples/sec: 5873.54 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:08:18,938 epoch 7 - iter 1820/2606 - loss 0.22653157 - time (sec): 43.30 - samples/sec: 5880.75 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:08:25,175 epoch 7 - iter 2080/2606 - loss 0.22487565 - time (sec): 49.54 - samples/sec: 5887.39 - lr: 0.000011 - momentum: 0.000000
2023-10-19 12:08:31,637 epoch 7 - iter 2340/2606 - loss 0.22556286 - time (sec): 56.00 - samples/sec: 5850.29 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:08:38,149 epoch 7 - iter 2600/2606 - loss 0.22373051 - time (sec): 62.51 - samples/sec: 5860.52 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:08:38,301 ----------------------------------------------------------------------------------------------------
2023-10-19 12:08:38,302 EPOCH 7 done: loss 0.2239 - lr: 0.000010
2023-10-19 12:08:43,560 DEV : loss 0.1493058055639267 - f1-score (micro avg) 0.2694
2023-10-19 12:08:43,584 ----------------------------------------------------------------------------------------------------
2023-10-19 12:08:49,877 epoch 8 - iter 260/2606 - loss 0.21748453 - time (sec): 6.29 - samples/sec: 6011.39 - lr: 0.000010 - momentum: 0.000000
2023-10-19 12:08:56,114 epoch 8 - iter 520/2606 - loss 0.21796397 - time (sec): 12.53 - samples/sec: 6077.56 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:09:02,336 epoch 8 - iter 780/2606 - loss 0.22363675 - time (sec): 18.75 - samples/sec: 6025.50 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:09:08,641 epoch 8 - iter 1040/2606 - loss 0.22466731 - time (sec): 25.06 - samples/sec: 5859.93 - lr: 0.000009 - momentum: 0.000000
2023-10-19 12:09:14,751 epoch 8 - iter 1300/2606 - loss 0.21805738 - time (sec): 31.17 - samples/sec: 5936.89 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:09:20,786 epoch 8 - iter 1560/2606 - loss 0.21732187 - time (sec): 37.20 - samples/sec: 5898.27 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:09:27,121 epoch 8 - iter 1820/2606 - loss 0.21905354 - time (sec): 43.54 - samples/sec: 5893.52 - lr: 0.000008 - momentum: 0.000000
2023-10-19 12:09:33,241 epoch 8 - iter 2080/2606 - loss 0.21894204 - time (sec): 49.66 - samples/sec: 5892.24 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:09:39,459 epoch 8 - iter 2340/2606 - loss 0.21934014 - time (sec): 55.87 - samples/sec: 5895.25 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:09:45,540 epoch 8 - iter 2600/2606 - loss 0.21776469 - time (sec): 61.96 - samples/sec: 5911.41 - lr: 0.000007 - momentum: 0.000000
2023-10-19 12:09:45,701 ----------------------------------------------------------------------------------------------------
2023-10-19 12:09:45,702 EPOCH 8 done: loss 0.2179 - lr: 0.000007
2023-10-19 12:09:51,132 DEV : loss 0.1476389616727829 - f1-score (micro avg) 0.2709
2023-10-19 12:09:51,156 ----------------------------------------------------------------------------------------------------
2023-10-19 12:09:57,451 epoch 9 - iter 260/2606 - loss 0.21031450 - time (sec): 6.29 - samples/sec: 5576.49 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:10:03,553 epoch 9 - iter 520/2606 - loss 0.20122891 - time (sec): 12.40 - samples/sec: 5721.65 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:10:09,434 epoch 9 - iter 780/2606 - loss 0.19566717 - time (sec): 18.28 - samples/sec: 5970.33 - lr: 0.000006 - momentum: 0.000000
2023-10-19 12:10:15,594 epoch 9 - iter 1040/2606 - loss 0.20342649 - time (sec): 24.44 - samples/sec: 5996.37 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:10:21,820 epoch 9 - iter 1300/2606 - loss 0.20271150 - time (sec): 30.66 - samples/sec: 6007.21 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:10:28,209 epoch 9 - iter 1560/2606 - loss 0.20653268 - time (sec): 37.05 - samples/sec: 5971.23 - lr: 0.000005 - momentum: 0.000000
2023-10-19 12:10:34,354 epoch 9 - iter 1820/2606 - loss 0.20768132 - time (sec): 43.20 - samples/sec: 5941.15 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:10:40,736 epoch 9 - iter 2080/2606 - loss 0.20750751 - time (sec): 49.58 - samples/sec: 5932.55 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:10:46,708 epoch 9 - iter 2340/2606 - loss 0.20839454 - time (sec): 55.55 - samples/sec: 5946.27 - lr: 0.000004 - momentum: 0.000000
2023-10-19 12:10:52,840 epoch 9 - iter 2600/2606 - loss 0.20899152 - time (sec): 61.68 - samples/sec: 5940.26 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:10:52,990 ----------------------------------------------------------------------------------------------------
2023-10-19 12:10:52,991 EPOCH 9 done: loss 0.2090 - lr: 0.000003
2023-10-19 12:10:58,209 DEV : loss 0.1496579349040985 - f1-score (micro avg) 0.2732
2023-10-19 12:10:58,234 saving best model
2023-10-19 12:10:58,266 ----------------------------------------------------------------------------------------------------
2023-10-19 12:11:04,334 epoch 10 - iter 260/2606 - loss 0.22099529 - time (sec): 6.07 - samples/sec: 5358.54 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:11:10,601 epoch 10 - iter 520/2606 - loss 0.20573417 - time (sec): 12.33 - samples/sec: 5725.47 - lr: 0.000003 - momentum: 0.000000
2023-10-19 12:11:16,687 epoch 10 - iter 780/2606 - loss 0.20760408 - time (sec): 18.42 - samples/sec: 5702.75 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:11:23,120 epoch 10 - iter 1040/2606 - loss 0.21595404 - time (sec): 24.85 - samples/sec: 5730.46 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:11:29,235 epoch 10 - iter 1300/2606 - loss 0.21080473 - time (sec): 30.97 - samples/sec: 5804.93 - lr: 0.000002 - momentum: 0.000000
2023-10-19 12:11:35,569 epoch 10 - iter 1560/2606 - loss 0.21062924 - time (sec): 37.30 - samples/sec: 5854.00 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:11:41,642 epoch 10 - iter 1820/2606 - loss 0.20879304 - time (sec): 43.38 - samples/sec: 5815.43 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:11:47,915 epoch 10 - iter 2080/2606 - loss 0.20787473 - time (sec): 49.65 - samples/sec: 5856.60 - lr: 0.000001 - momentum: 0.000000
2023-10-19 12:11:54,226 epoch 10 - iter 2340/2606 - loss 0.20759076 - time (sec): 55.96 - samples/sec: 5892.12 - lr: 0.000000 - momentum: 0.000000
2023-10-19 12:12:00,576 epoch 10 - iter 2600/2606 - loss 0.20787899 - time (sec): 62.31 - samples/sec: 5888.75 - lr: 0.000000 - momentum: 0.000000
2023-10-19 12:12:00,722 ----------------------------------------------------------------------------------------------------
2023-10-19 12:12:00,722 EPOCH 10 done: loss 0.2078 - lr: 0.000000
2023-10-19 12:12:05,984 DEV : loss 0.1529059261083603 - f1-score (micro avg) 0.2686
2023-10-19 12:12:06,040 ----------------------------------------------------------------------------------------------------
2023-10-19 12:12:06,041 Loading model from best epoch ...
2023-10-19 12:12:06,121 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
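The 17-tag dictionary above is the BIOES-style expansion of the four entity types (LOC, PER, ORG, HumanProd) plus the outside tag O, which also explains out_features=17 in the tagger's final linear layer. A quick sketch of that expansion; the helper `bioes_tags` is illustrative, not Flair code:

```python
def bioes_tags(entity_types):
    """Expand entity types into a BIOES tag set, starting with the outside tag O."""
    tags = ["O"]
    for etype in entity_types:
        # S = single-token span, B/E = span begin/end, I = inside
        tags += [f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I")]
    return tags

tags = bioes_tags(["LOC", "PER", "ORG", "HumanProd"])
print(len(tags))  # 17 -- 4 types x 4 prefixes + O
```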
2023-10-19 12:12:12,588
Results:
- F-score (micro) 0.3045
- F-score (macro) 0.1659
- Accuracy 0.1814

By class:
              precision    recall  f1-score   support

         LOC     0.4606    0.4811    0.4706      1214
         PER     0.1468    0.1510    0.1489       808
         ORG     0.0556    0.0368    0.0443       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3082    0.3008    0.3045      2390
   macro avg     0.1657    0.1672    0.1659      2390
weighted avg     0.2918    0.3008    0.2959      2390
2023-10-19 12:12:12,588 ----------------------------------------------------------------------------------------------------
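As a consistency check on the test results: the macro average is the unweighted mean over the four classes, while the weighted average weighs each class by its support. A short sketch recomputing both from the per-class rows (values copied from the table; the tiny discrepancies come from the 4-digit rounding in the log):

```python
# (precision, recall, f1-score, support) per class, as logged in the table
rows = {
    "LOC":       (0.4606, 0.4811, 0.4706, 1214),
    "PER":       (0.1468, 0.1510, 0.1489, 808),
    "ORG":       (0.0556, 0.0368, 0.0443, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

total_support = sum(s for *_, s in rows.values())
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)       # unweighted mean
weighted_f1 = sum(f1 * s for _, _, f1, s in rows.values()) / total_support

print(total_support)          # 2390, matching the support column of the averages
print(round(macro_f1, 3))     # ~0.166 (logged macro avg f1-score: 0.1659)
print(round(weighted_f1, 4))  # ~0.2959 (logged weighted avg f1-score: 0.2959)
```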