2023-10-18 23:02:51,911 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,911 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 23:02:51,911 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,911 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 23:02:51,911 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,911 Train: 5777 sentences
2023-10-18 23:02:51,911 (train_with_dev=False, train_with_test=False)
2023-10-18 23:02:51,911 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,912 Training Params:
2023-10-18 23:02:51,912 - learning_rate: "3e-05"
2023-10-18 23:02:51,912 - mini_batch_size: "4"
2023-10-18 23:02:51,912 - max_epochs: "10"
2023-10-18 23:02:51,912 - shuffle: "True"
2023-10-18 23:02:51,912 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,912 Plugins:
2023-10-18 23:02:51,912 - TensorboardLogger
2023-10-18 23:02:51,912 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 23:02:51,912 ----------------------------------------------------------------------------------------------------
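The `lr:` values printed in the per-iteration lines below are produced by the LinearScheduler plugin (warmup_fraction 0.1). A minimal sketch of that schedule, assuming HF-style linear warmup/decay semantics and the step counts implied by this log (1445 batches/epoch × 10 epochs); `lr_at` is a hypothetical helper, not part of Flair:

```python
# Linear LR schedule with warmup, reproducing the logged "lr:" values.
# Assumptions: peak lr 3e-05, 1445 steps/epoch x 10 epochs = 14450 total
# steps, warmup_fraction 0.1 -> 1445 warmup steps (one full epoch).

PEAK_LR = 3e-05
TOTAL_STEPS = 1445 * 10
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 1445

def lr_at(step: int) -> float:
    """LR after `step` optimizer steps: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

This matches the log to the printed precision: epoch 1, iter 144 (step 144) gives ~3e-06 ("lr: 0.000003"), the end of epoch 1 reaches the 3e-05 peak, and the schedule decays to 0 by the end of epoch 10.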
2023-10-18 23:02:51,912 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 23:02:51,912 - metric: "('micro avg', 'f1-score')"
2023-10-18 23:02:51,912 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,912 Computation:
2023-10-18 23:02:51,912 - compute on device: cuda:0
2023-10-18 23:02:51,912 - embedding storage: none
2023-10-18 23:02:51,912 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,912 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 23:02:51,912 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,912 ----------------------------------------------------------------------------------------------------
2023-10-18 23:02:51,912 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 23:02:54,363 epoch 1 - iter 144/1445 - loss 2.40688323 - time (sec): 2.45 - samples/sec: 7209.07 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:02:56,803 epoch 1 - iter 288/1445 - loss 2.16702649 - time (sec): 4.89 - samples/sec: 6895.86 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:02:59,348 epoch 1 - iter 432/1445 - loss 1.77638178 - time (sec): 7.44 - samples/sec: 7054.30 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:03:01,885 epoch 1 - iter 576/1445 - loss 1.45489838 - time (sec): 9.97 - samples/sec: 7082.26 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:03:04,390 epoch 1 - iter 720/1445 - loss 1.22567144 - time (sec): 12.48 - samples/sec: 7150.33 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:03:06,774 epoch 1 - iter 864/1445 - loss 1.09340287 - time (sec): 14.86 - samples/sec: 7075.00 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:03:09,242 epoch 1 - iter 1008/1445 - loss 0.98083253 - time (sec): 17.33 - samples/sec: 7107.21 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:03:11,737 epoch 1 - iter 1152/1445 - loss 0.88743135 - time (sec): 19.82 - samples/sec: 7134.54 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:03:14,166 epoch 1 - iter 1296/1445 - loss 0.81960748 - time (sec): 22.25 - samples/sec: 7143.58 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:03:16,649 epoch 1 - iter 1440/1445 - loss 0.76764023 - time (sec): 24.74 - samples/sec: 7096.78 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:03:16,740 ----------------------------------------------------------------------------------------------------
2023-10-18 23:03:16,741 EPOCH 1 done: loss 0.7656 - lr: 0.000030
2023-10-18 23:03:18,374 DEV : loss 0.3041687309741974 - f1-score (micro avg)  0.0
2023-10-18 23:03:18,389 ----------------------------------------------------------------------------------------------------
2023-10-18 23:03:20,568 epoch 2 - iter 144/1445 - loss 0.29359387 - time (sec): 2.18 - samples/sec: 7939.30 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:03:23,066 epoch 2 - iter 288/1445 - loss 0.26273077 - time (sec): 4.68 - samples/sec: 7619.34 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:03:25,423 epoch 2 - iter 432/1445 - loss 0.25450001 - time (sec): 7.03 - samples/sec: 7358.15 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:03:27,866 epoch 2 - iter 576/1445 - loss 0.23603380 - time (sec): 9.48 - samples/sec: 7401.14 - lr: 0.000029 - momentum: 0.000000
2023-10-18 23:03:30,244 epoch 2 - iter 720/1445 - loss 0.23205726 - time (sec): 11.85 - samples/sec: 7331.02 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:03:32,580 epoch 2 - iter 864/1445 - loss 0.22948130 - time (sec): 14.19 - samples/sec: 7294.39 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:03:35,001 epoch 2 - iter 1008/1445 - loss 0.22745968 - time (sec): 16.61 - samples/sec: 7298.25 - lr: 0.000028 - momentum: 0.000000
2023-10-18 23:03:37,433 epoch 2 - iter 1152/1445 - loss 0.22343112 - time (sec): 19.04 - samples/sec: 7384.91 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:03:39,840 epoch 2 - iter 1296/1445 - loss 0.21942496 - time (sec): 21.45 - samples/sec: 7382.71 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:03:42,223 epoch 2 - iter 1440/1445 - loss 0.21609980 - time (sec): 23.83 - samples/sec: 7368.06 - lr: 0.000027 - momentum: 0.000000
2023-10-18 23:03:42,316 ----------------------------------------------------------------------------------------------------
2023-10-18 23:03:42,316 EPOCH 2 done: loss 0.2158 - lr: 0.000027
2023-10-18 23:03:44,052 DEV : loss 0.2594398558139801 - f1-score (micro avg)  0.1437
2023-10-18 23:03:44,067 saving best model
2023-10-18 23:03:44,097 ----------------------------------------------------------------------------------------------------
2023-10-18 23:03:46,272 epoch 3 - iter 144/1445 - loss 0.19007076 - time (sec): 2.17 - samples/sec: 8535.60 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:03:48,606 epoch 3 - iter 288/1445 - loss 0.19580306 - time (sec): 4.51 - samples/sec: 7965.31 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:03:51,099 epoch 3 - iter 432/1445 - loss 0.18890403 - time (sec): 7.00 - samples/sec: 7744.07 - lr: 0.000026 - momentum: 0.000000
2023-10-18 23:03:53,495 epoch 3 - iter 576/1445 - loss 0.18383737 - time (sec): 9.40 - samples/sec: 7642.59 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:03:55,636 epoch 3 - iter 720/1445 - loss 0.18205510 - time (sec): 11.54 - samples/sec: 7660.26 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:03:57,909 epoch 3 - iter 864/1445 - loss 0.18445224 - time (sec): 13.81 - samples/sec: 7655.00 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:04:00,233 epoch 3 - iter 1008/1445 - loss 0.18375521 - time (sec): 16.14 - samples/sec: 7582.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:04:02,677 epoch 3 - iter 1152/1445 - loss 0.18536328 - time (sec): 18.58 - samples/sec: 7578.63 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:04:04,892 epoch 3 - iter 1296/1445 - loss 0.18246559 - time (sec): 20.79 - samples/sec: 7605.74 - lr: 0.000024 - momentum: 0.000000
2023-10-18 23:04:06,849 epoch 3 - iter 1440/1445 - loss 0.18300658 - time (sec): 22.75 - samples/sec: 7722.94 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:04:06,917 ----------------------------------------------------------------------------------------------------
2023-10-18 23:04:06,918 EPOCH 3 done: loss 0.1829 - lr: 0.000023
2023-10-18 23:04:08,669 DEV : loss 0.22454091906547546 - f1-score (micro avg)  0.3787
2023-10-18 23:04:08,684 saving best model
2023-10-18 23:04:08,719 ----------------------------------------------------------------------------------------------------
2023-10-18 23:04:11,090 epoch 4 - iter 144/1445 - loss 0.20598709 - time (sec): 2.37 - samples/sec: 7208.81 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:04:13,606 epoch 4 - iter 288/1445 - loss 0.17152491 - time (sec): 4.89 - samples/sec: 7101.77 - lr: 0.000023 - momentum: 0.000000
2023-10-18 23:04:16,081 epoch 4 - iter 432/1445 - loss 0.17562171 - time (sec): 7.36 - samples/sec: 6963.09 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:04:18,526 epoch 4 - iter 576/1445 - loss 0.17237639 - time (sec): 9.81 - samples/sec: 7124.89 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:04:21,021 epoch 4 - iter 720/1445 - loss 0.17537045 - time (sec): 12.30 - samples/sec: 7209.82 - lr: 0.000022 - momentum: 0.000000
2023-10-18 23:04:23,452 epoch 4 - iter 864/1445 - loss 0.17043947 - time (sec): 14.73 - samples/sec: 7280.67 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:04:25,775 epoch 4 - iter 1008/1445 - loss 0.16954881 - time (sec): 17.05 - samples/sec: 7261.99 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:04:28,151 epoch 4 - iter 1152/1445 - loss 0.16995500 - time (sec): 19.43 - samples/sec: 7272.41 - lr: 0.000021 - momentum: 0.000000
2023-10-18 23:04:30,641 epoch 4 - iter 1296/1445 - loss 0.17077308 - time (sec): 21.92 - samples/sec: 7270.35 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:04:32,994 epoch 4 - iter 1440/1445 - loss 0.16925599 - time (sec): 24.27 - samples/sec: 7241.80 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:04:33,079 ----------------------------------------------------------------------------------------------------
2023-10-18 23:04:33,079 EPOCH 4 done: loss 0.1692 - lr: 0.000020
2023-10-18 23:04:35,223 DEV : loss 0.2026454210281372 - f1-score (micro avg)  0.46
2023-10-18 23:04:35,238 saving best model
2023-10-18 23:04:35,274 ----------------------------------------------------------------------------------------------------
2023-10-18 23:04:37,628 epoch 5 - iter 144/1445 - loss 0.17612465 - time (sec): 2.35 - samples/sec: 7145.57 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:04:40,062 epoch 5 - iter 288/1445 - loss 0.16936594 - time (sec): 4.79 - samples/sec: 7039.13 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:04:42,527 epoch 5 - iter 432/1445 - loss 0.16852297 - time (sec): 7.25 - samples/sec: 6918.14 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:04:44,864 epoch 5 - iter 576/1445 - loss 0.16172731 - time (sec): 9.59 - samples/sec: 7053.17 - lr: 0.000019 - momentum: 0.000000
2023-10-18 23:04:47,339 epoch 5 - iter 720/1445 - loss 0.16454915 - time (sec): 12.06 - samples/sec: 7202.33 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:04:49,783 epoch 5 - iter 864/1445 - loss 0.16040247 - time (sec): 14.51 - samples/sec: 7242.88 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:04:52,149 epoch 5 - iter 1008/1445 - loss 0.15645471 - time (sec): 16.87 - samples/sec: 7246.49 - lr: 0.000018 - momentum: 0.000000
2023-10-18 23:04:54,559 epoch 5 - iter 1152/1445 - loss 0.15744076 - time (sec): 19.28 - samples/sec: 7229.50 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:04:56,966 epoch 5 - iter 1296/1445 - loss 0.16037484 - time (sec): 21.69 - samples/sec: 7223.75 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:04:59,458 epoch 5 - iter 1440/1445 - loss 0.15697416 - time (sec): 24.18 - samples/sec: 7265.65 - lr: 0.000017 - momentum: 0.000000
2023-10-18 23:04:59,533 ----------------------------------------------------------------------------------------------------
2023-10-18 23:04:59,533 EPOCH 5 done: loss 0.1573 - lr: 0.000017
2023-10-18 23:05:01,293 DEV : loss 0.20206767320632935 - f1-score (micro avg)  0.449
2023-10-18 23:05:01,308 ----------------------------------------------------------------------------------------------------
2023-10-18 23:05:03,768 epoch 6 - iter 144/1445 - loss 0.13278482 - time (sec): 2.46 - samples/sec: 7256.63 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:05:06,110 epoch 6 - iter 288/1445 - loss 0.14204724 - time (sec): 4.80 - samples/sec: 7521.54 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:05:08,561 epoch 6 - iter 432/1445 - loss 0.14541895 - time (sec): 7.25 - samples/sec: 7530.38 - lr: 0.000016 - momentum: 0.000000
2023-10-18 23:05:10,938 epoch 6 - iter 576/1445 - loss 0.14575876 - time (sec): 9.63 - samples/sec: 7468.77 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:05:13,326 epoch 6 - iter 720/1445 - loss 0.14699381 - time (sec): 12.02 - samples/sec: 7430.17 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:05:15,911 epoch 6 - iter 864/1445 - loss 0.14715400 - time (sec): 14.60 - samples/sec: 7375.03 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:05:18,364 epoch 6 - iter 1008/1445 - loss 0.15188207 - time (sec): 17.06 - samples/sec: 7337.49 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:05:20,770 epoch 6 - iter 1152/1445 - loss 0.15209714 - time (sec): 19.46 - samples/sec: 7267.03 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:05:23,165 epoch 6 - iter 1296/1445 - loss 0.15034065 - time (sec): 21.86 - samples/sec: 7243.64 - lr: 0.000014 - momentum: 0.000000
2023-10-18 23:05:25,550 epoch 6 - iter 1440/1445 - loss 0.14924088 - time (sec): 24.24 - samples/sec: 7244.99 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:05:25,629 ----------------------------------------------------------------------------------------------------
2023-10-18 23:05:25,629 EPOCH 6 done: loss 0.1493 - lr: 0.000013
2023-10-18 23:05:27,400 DEV : loss 0.1950778216123581 - f1-score (micro avg)  0.4819
2023-10-18 23:05:27,415 saving best model
2023-10-18 23:05:27,450 ----------------------------------------------------------------------------------------------------
2023-10-18 23:05:29,821 epoch 7 - iter 144/1445 - loss 0.12999105 - time (sec): 2.37 - samples/sec: 7432.83 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:05:32,202 epoch 7 - iter 288/1445 - loss 0.13963385 - time (sec): 4.75 - samples/sec: 7084.75 - lr: 0.000013 - momentum: 0.000000
2023-10-18 23:05:34,673 epoch 7 - iter 432/1445 - loss 0.13721728 - time (sec): 7.22 - samples/sec: 7229.55 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:05:37,036 epoch 7 - iter 576/1445 - loss 0.13774287 - time (sec): 9.58 - samples/sec: 7191.97 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:05:39,502 epoch 7 - iter 720/1445 - loss 0.14017485 - time (sec): 12.05 - samples/sec: 7212.06 - lr: 0.000012 - momentum: 0.000000
2023-10-18 23:05:41,952 epoch 7 - iter 864/1445 - loss 0.14321206 - time (sec): 14.50 - samples/sec: 7239.14 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:05:44,360 epoch 7 - iter 1008/1445 - loss 0.14478599 - time (sec): 16.91 - samples/sec: 7226.20 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:05:47,222 epoch 7 - iter 1152/1445 - loss 0.14420379 - time (sec): 19.77 - samples/sec: 7150.02 - lr: 0.000011 - momentum: 0.000000
2023-10-18 23:05:49,644 epoch 7 - iter 1296/1445 - loss 0.14518711 - time (sec): 22.19 - samples/sec: 7128.53 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:05:52,025 epoch 7 - iter 1440/1445 - loss 0.14457363 - time (sec): 24.57 - samples/sec: 7153.41 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:05:52,100 ----------------------------------------------------------------------------------------------------
2023-10-18 23:05:52,100 EPOCH 7 done: loss 0.1445 - lr: 0.000010
2023-10-18 23:05:53,868 DEV : loss 0.19153057038784027 - f1-score (micro avg)  0.4971
2023-10-18 23:05:53,883 saving best model
2023-10-18 23:05:53,917 ----------------------------------------------------------------------------------------------------
2023-10-18 23:05:56,285 epoch 8 - iter 144/1445 - loss 0.13969214 - time (sec): 2.37 - samples/sec: 7535.20 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:05:58,624 epoch 8 - iter 288/1445 - loss 0.13329125 - time (sec): 4.71 - samples/sec: 7374.16 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:06:01,150 epoch 8 - iter 432/1445 - loss 0.13744054 - time (sec): 7.23 - samples/sec: 7339.68 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:06:03,593 epoch 8 - iter 576/1445 - loss 0.13815295 - time (sec): 9.68 - samples/sec: 7379.36 - lr: 0.000009 - momentum: 0.000000
2023-10-18 23:06:05,998 epoch 8 - iter 720/1445 - loss 0.13947189 - time (sec): 12.08 - samples/sec: 7351.63 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:06:08,453 epoch 8 - iter 864/1445 - loss 0.14220794 - time (sec): 14.54 - samples/sec: 7331.00 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:06:10,928 epoch 8 - iter 1008/1445 - loss 0.14298530 - time (sec): 17.01 - samples/sec: 7349.14 - lr: 0.000008 - momentum: 0.000000
2023-10-18 23:06:13,291 epoch 8 - iter 1152/1445 - loss 0.14128849 - time (sec): 19.37 - samples/sec: 7316.98 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:06:15,657 epoch 8 - iter 1296/1445 - loss 0.14243983 - time (sec): 21.74 - samples/sec: 7310.36 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:06:17,972 epoch 8 - iter 1440/1445 - loss 0.14092288 - time (sec): 24.05 - samples/sec: 7302.66 - lr: 0.000007 - momentum: 0.000000
2023-10-18 23:06:18,051 ----------------------------------------------------------------------------------------------------
2023-10-18 23:06:18,051 EPOCH 8 done: loss 0.1409 - lr: 0.000007
2023-10-18 23:06:19,855 DEV : loss 0.1951456367969513 - f1-score (micro avg)  0.4929
2023-10-18 23:06:19,870 ----------------------------------------------------------------------------------------------------
2023-10-18 23:06:22,350 epoch 9 - iter 144/1445 - loss 0.12208432 - time (sec): 2.48 - samples/sec: 7527.27 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:06:24,790 epoch 9 - iter 288/1445 - loss 0.13529299 - time (sec): 4.92 - samples/sec: 7518.92 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:06:27,182 epoch 9 - iter 432/1445 - loss 0.13047517 - time (sec): 7.31 - samples/sec: 7515.75 - lr: 0.000006 - momentum: 0.000000
2023-10-18 23:06:29,562 epoch 9 - iter 576/1445 - loss 0.13510776 - time (sec): 9.69 - samples/sec: 7562.06 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:06:31,921 epoch 9 - iter 720/1445 - loss 0.13884068 - time (sec): 12.05 - samples/sec: 7427.40 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:06:34,296 epoch 9 - iter 864/1445 - loss 0.13749162 - time (sec): 14.43 - samples/sec: 7385.23 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:06:36,610 epoch 9 - iter 1008/1445 - loss 0.13635465 - time (sec): 16.74 - samples/sec: 7386.18 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:06:39,038 epoch 9 - iter 1152/1445 - loss 0.13400290 - time (sec): 19.17 - samples/sec: 7419.26 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:06:41,451 epoch 9 - iter 1296/1445 - loss 0.13501240 - time (sec): 21.58 - samples/sec: 7385.44 - lr: 0.000004 - momentum: 0.000000
2023-10-18 23:06:43,816 epoch 9 - iter 1440/1445 - loss 0.13689770 - time (sec): 23.95 - samples/sec: 7342.02 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:06:43,893 ----------------------------------------------------------------------------------------------------
2023-10-18 23:06:43,893 EPOCH 9 done: loss 0.1370 - lr: 0.000003
2023-10-18 23:06:45,669 DEV : loss 0.19276286661624908 - f1-score (micro avg)  0.4977
2023-10-18 23:06:45,685 saving best model
2023-10-18 23:06:45,722 ----------------------------------------------------------------------------------------------------
2023-10-18 23:06:48,248 epoch 10 - iter 144/1445 - loss 0.14416059 - time (sec): 2.52 - samples/sec: 7055.02 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:06:51,014 epoch 10 - iter 288/1445 - loss 0.13432047 - time (sec): 5.29 - samples/sec: 6703.34 - lr: 0.000003 - momentum: 0.000000
2023-10-18 23:06:53,457 epoch 10 - iter 432/1445 - loss 0.13281270 - time (sec): 7.73 - samples/sec: 6825.91 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:06:55,815 epoch 10 - iter 576/1445 - loss 0.13398842 - time (sec): 10.09 - samples/sec: 6869.55 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:06:58,284 epoch 10 - iter 720/1445 - loss 0.14258203 - time (sec): 12.56 - samples/sec: 7008.31 - lr: 0.000002 - momentum: 0.000000
2023-10-18 23:07:00,861 epoch 10 - iter 864/1445 - loss 0.13774110 - time (sec): 15.14 - samples/sec: 7045.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:07:03,452 epoch 10 - iter 1008/1445 - loss 0.13798366 - time (sec): 17.73 - samples/sec: 7057.23 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:07:05,814 epoch 10 - iter 1152/1445 - loss 0.13688199 - time (sec): 20.09 - samples/sec: 7010.06 - lr: 0.000001 - momentum: 0.000000
2023-10-18 23:07:08,163 epoch 10 - iter 1296/1445 - loss 0.13490938 - time (sec): 22.44 - samples/sec: 7042.78 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:07:10,518 epoch 10 - iter 1440/1445 - loss 0.13401226 - time (sec): 24.79 - samples/sec: 7089.35 - lr: 0.000000 - momentum: 0.000000
2023-10-18 23:07:10,593 ----------------------------------------------------------------------------------------------------
2023-10-18 23:07:10,593 EPOCH 10 done: loss 0.1340 - lr: 0.000000
2023-10-18 23:07:12,377 DEV : loss 0.19287335872650146 - f1-score (micro avg)  0.4962
2023-10-18 23:07:12,421 ----------------------------------------------------------------------------------------------------
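Each epoch above ends with a fixed-shape summary pair ("EPOCH N done" and "DEV :"). A small parsing sketch for pulling those numbers out of a log like this one; the helper names and regexes are mine, not part of Flair:

```python
import re

# Matches lines like "EPOCH 10 done: loss 0.1340 - lr: 0.000000"
EPOCH_RE = re.compile(r"EPOCH (\d+) done: loss ([\d.]+)")
# Matches lines like "DEV : loss 0.1928... - f1-score (micro avg)  0.4962"
DEV_RE = re.compile(r"DEV : loss ([\d.]+) - f1-score \(micro avg\)\s+([\d.]+)")

def parse_epoch_summary(line: str):
    """Return (epoch, train_loss), or None if the line is not a summary."""
    m = EPOCH_RE.search(line)
    return (int(m.group(1)), float(m.group(2))) if m else None

def parse_dev_line(line: str):
    """Return (dev_loss, micro_f1), or None."""
    m = DEV_RE.search(line)
    return (float(m.group(1)), float(m.group(2))) if m else None
```

Applied over the full log, these pairs reproduce the training curve (dev micro-F1 rising from 0.0 at epoch 1 to 0.4977 at the best epoch, 9).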
2023-10-18 23:07:12,421 Loading model from best epoch ...
2023-10-18 23:07:12,502 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 23:07:13,863 
Results:
- F-score (micro) 0.5149
- F-score (macro) 0.3551
- Accuracy 0.358

By class:
              precision    recall  f1-score   support

         LOC     0.6145    0.6092    0.6118       458
         PER     0.5159    0.4046    0.4535       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.5697    0.4698    0.5149      1009
   macro avg     0.3768    0.3379    0.3551      1009
weighted avg     0.5254    0.4698    0.4944      1009
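As a sanity check, the aggregate rows in the report above follow from the per-class numbers under the standard definitions: micro F1 is the harmonic mean of micro precision and recall, macro F1 is the unweighted mean of per-class F1, and weighted F1 is the support-weighted mean. A sketch using only values printed in this log:

```python
# Per-class (precision, recall, f1, support) from the report above.
classes = {
    "LOC": (0.6145, 0.6092, 0.6118, 458),
    "PER": (0.5159, 0.4046, 0.4535, 482),
    "ORG": (0.0000, 0.0000, 0.0000, 69),
}
total_support = sum(s for *_, s in classes.values())  # 1009

# micro avg: harmonic mean of the printed micro precision and recall
micro_p, micro_r = 0.5697, 0.4698
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # ~0.5149

# macro avg: unweighted mean of per-class F1
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)

# weighted avg: support-weighted mean of per-class F1 (the report shows
# 0.4944; recomputing from 4-decimal inputs lands within rounding of that)
weighted_f1 = sum(f1 * s for _, _, f1, s in classes.values()) / total_support
```

The zero row for ORG (support 69, no correct predictions) is what drags the macro average down to 0.3551 while the micro average stays at 0.5149.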
2023-10-18 23:07:13,864 ----------------------------------------------------------------------------------------------------