2023-10-20 00:15:41,859 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,859 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 00:15:41,859 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,859 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:15:41,859 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,859 Train: 1085 sentences
2023-10-20 00:15:41,859 (train_with_dev=False, train_with_test=False)
2023-10-20 00:15:41,859 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,859 Training Params:
2023-10-20 00:15:41,859 - learning_rate: "5e-05"
2023-10-20 00:15:41,860 - mini_batch_size: "8"
2023-10-20 00:15:41,860 - max_epochs: "10"
2023-10-20 00:15:41,860 - shuffle: "True"
2023-10-20 00:15:41,860 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,860 Plugins:
2023-10-20 00:15:41,860 - TensorboardLogger
2023-10-20 00:15:41,860 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:15:41,860 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,860 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:15:41,860 - metric: "('micro avg', 'f1-score')"
2023-10-20 00:15:41,860 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,860 Computation:
2023-10-20 00:15:41,860 - compute on device: cuda:0
2023-10-20 00:15:41,860 - embedding storage: none
2023-10-20 00:15:41,860 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,860 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-20 00:15:41,860 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,860 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:41,860 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:15:42,195 epoch 1 - iter 13/136 - loss 2.71580342 - time (sec): 0.33 - samples/sec: 14153.70 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:42,545 epoch 1 - iter 26/136 - loss 2.75774241 - time (sec): 0.68 - samples/sec: 15045.68 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:42,867 epoch 1 - iter 39/136 - loss 2.72096802 - time (sec): 1.01 - samples/sec: 13856.46 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:43,191 epoch 1 - iter 52/136 - loss 2.61686082 - time (sec): 1.33 - samples/sec: 13726.73 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:15:43,554 epoch 1 - iter 65/136 - loss 2.46982325 - time (sec): 1.69 - samples/sec: 13890.17 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:43,910 epoch 1 - iter 78/136 - loss 2.33117949 - time (sec): 2.05 - samples/sec: 13978.28 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:15:44,261 epoch 1 - iter 91/136 - loss 2.15087490 - time (sec): 2.40 - samples/sec: 14298.59 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:15:44,613 epoch 1 - iter 104/136 - loss 2.00246764 - time (sec): 2.75 - samples/sec: 14355.78 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:15:44,981 epoch 1 - iter 117/136 - loss 1.82360781 - time (sec): 3.12 - samples/sec: 14661.62 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:15:45,335 epoch 1 - iter 130/136 - loss 1.72004646 - time (sec): 3.47 - samples/sec: 14430.62 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:15:45,481 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:45,481 EPOCH 1 done: loss 1.6896 - lr: 0.000047
2023-10-20 00:15:45,742 DEV : loss 0.5129944086074829 - f1-score (micro avg) 0.0
2023-10-20 00:15:45,746 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:46,093 epoch 2 - iter 13/136 - loss 0.62034732 - time (sec): 0.35 - samples/sec: 13208.37 - lr: 0.000050 - momentum: 0.000000
2023-10-20 00:15:46,463 epoch 2 - iter 26/136 - loss 0.57324884 - time (sec): 0.72 - samples/sec: 13480.12 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:15:46,840 epoch 2 - iter 39/136 - loss 0.58051663 - time (sec): 1.09 - samples/sec: 14015.23 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:15:47,193 epoch 2 - iter 52/136 - loss 0.59433839 - time (sec): 1.45 - samples/sec: 13933.89 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:15:47,548 epoch 2 - iter 65/136 - loss 0.58965203 - time (sec): 1.80 - samples/sec: 14035.15 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:15:47,920 epoch 2 - iter 78/136 - loss 0.57088650 - time (sec): 2.17 - samples/sec: 13849.79 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:15:48,258 epoch 2 - iter 91/136 - loss 0.56744435 - time (sec): 2.51 - samples/sec: 13913.96 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:15:48,628 epoch 2 - iter 104/136 - loss 0.57337062 - time (sec): 2.88 - samples/sec: 14004.86 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:15:48,981 epoch 2 - iter 117/136 - loss 0.57817989 - time (sec): 3.23 - samples/sec: 13742.13 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:15:49,331 epoch 2 - iter 130/136 - loss 0.56972448 - time (sec): 3.58 - samples/sec: 13867.30 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:15:49,487 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:49,487 EPOCH 2 done: loss 0.5703 - lr: 0.000045
2023-10-20 00:15:50,240 DEV : loss 0.40280210971832275 - f1-score (micro avg) 0.0
2023-10-20 00:15:50,245 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:50,751 epoch 3 - iter 13/136 - loss 0.47143283 - time (sec): 0.51 - samples/sec: 9211.51 - lr: 0.000044 - momentum: 0.000000
2023-10-20 00:15:51,094 epoch 3 - iter 26/136 - loss 0.51209350 - time (sec): 0.85 - samples/sec: 10166.85 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:15:51,459 epoch 3 - iter 39/136 - loss 0.52369734 - time (sec): 1.21 - samples/sec: 10933.95 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:15:51,820 epoch 3 - iter 52/136 - loss 0.49813373 - time (sec): 1.57 - samples/sec: 11544.73 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:15:52,178 epoch 3 - iter 65/136 - loss 0.49048450 - time (sec): 1.93 - samples/sec: 12257.12 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:15:52,542 epoch 3 - iter 78/136 - loss 0.50132646 - time (sec): 2.30 - samples/sec: 12446.64 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:15:52,900 epoch 3 - iter 91/136 - loss 0.49578107 - time (sec): 2.65 - samples/sec: 12747.96 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:15:53,277 epoch 3 - iter 104/136 - loss 0.49932953 - time (sec): 3.03 - samples/sec: 13238.80 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:15:53,637 epoch 3 - iter 117/136 - loss 0.48966975 - time (sec): 3.39 - samples/sec: 13250.97 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:15:54,003 epoch 3 - iter 130/136 - loss 0.48182243 - time (sec): 3.76 - samples/sec: 13366.32 - lr: 0.000039 - momentum: 0.000000
2023-10-20 00:15:54,151 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:54,151 EPOCH 3 done: loss 0.4773 - lr: 0.000039
2023-10-20 00:15:54,913 DEV : loss 0.3341137170791626 - f1-score (micro avg) 0.0449
2023-10-20 00:15:54,917 saving best model
2023-10-20 00:15:54,944 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:55,298 epoch 4 - iter 13/136 - loss 0.43682280 - time (sec): 0.35 - samples/sec: 14480.67 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:15:55,664 epoch 4 - iter 26/136 - loss 0.43805806 - time (sec): 0.72 - samples/sec: 14581.64 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:15:56,021 epoch 4 - iter 39/136 - loss 0.42569577 - time (sec): 1.08 - samples/sec: 14557.46 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:15:56,366 epoch 4 - iter 52/136 - loss 0.43024297 - time (sec): 1.42 - samples/sec: 14120.12 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:15:56,709 epoch 4 - iter 65/136 - loss 0.43240328 - time (sec): 1.76 - samples/sec: 13951.12 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:15:57,056 epoch 4 - iter 78/136 - loss 0.43473880 - time (sec): 2.11 - samples/sec: 13893.05 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:15:57,424 epoch 4 - iter 91/136 - loss 0.43156765 - time (sec): 2.48 - samples/sec: 14057.94 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:15:57,800 epoch 4 - iter 104/136 - loss 0.41744288 - time (sec): 2.85 - samples/sec: 14170.57 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:15:58,153 epoch 4 - iter 117/136 - loss 0.42078699 - time (sec): 3.21 - samples/sec: 14016.33 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:15:58,517 epoch 4 - iter 130/136 - loss 0.42729131 - time (sec): 3.57 - samples/sec: 14153.82 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:15:58,665 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:58,665 EPOCH 4 done: loss 0.4280 - lr: 0.000034
2023-10-20 00:15:59,439 DEV : loss 0.3015088438987732 - f1-score (micro avg) 0.1556
2023-10-20 00:15:59,443 saving best model
2023-10-20 00:15:59,475 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:59,818 epoch 5 - iter 13/136 - loss 0.40628820 - time (sec): 0.34 - samples/sec: 11995.51 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:16:00,142 epoch 5 - iter 26/136 - loss 0.40792736 - time (sec): 0.67 - samples/sec: 13550.05 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:16:00,489 epoch 5 - iter 39/136 - loss 0.39472020 - time (sec): 1.01 - samples/sec: 13879.54 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:16:00,852 epoch 5 - iter 52/136 - loss 0.38363745 - time (sec): 1.38 - samples/sec: 13812.24 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:16:01,220 epoch 5 - iter 65/136 - loss 0.39560354 - time (sec): 1.74 - samples/sec: 14214.90 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:16:01,560 epoch 5 - iter 78/136 - loss 0.40319316 - time (sec): 2.08 - samples/sec: 13951.22 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:16:01,906 epoch 5 - iter 91/136 - loss 0.40240499 - time (sec): 2.43 - samples/sec: 13899.76 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:16:02,283 epoch 5 - iter 104/136 - loss 0.40383466 - time (sec): 2.81 - samples/sec: 14115.06 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:16:02,783 epoch 5 - iter 117/136 - loss 0.39938236 - time (sec): 3.31 - samples/sec: 13521.46 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:16:03,127 epoch 5 - iter 130/136 - loss 0.39994900 - time (sec): 3.65 - samples/sec: 13480.71 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:16:03,293 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:03,294 EPOCH 5 done: loss 0.3989 - lr: 0.000028
2023-10-20 00:16:04,057 DEV : loss 0.2784372568130493 - f1-score (micro avg) 0.2439
2023-10-20 00:16:04,061 saving best model
2023-10-20 00:16:04,091 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:04,438 epoch 6 - iter 13/136 - loss 0.35323433 - time (sec): 0.35 - samples/sec: 14856.59 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:16:04,787 epoch 6 - iter 26/136 - loss 0.33177870 - time (sec): 0.70 - samples/sec: 14858.13 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:16:05,146 epoch 6 - iter 39/136 - loss 0.32621623 - time (sec): 1.05 - samples/sec: 14212.70 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:16:05,502 epoch 6 - iter 52/136 - loss 0.34518206 - time (sec): 1.41 - samples/sec: 14138.18 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:16:05,881 epoch 6 - iter 65/136 - loss 0.36510659 - time (sec): 1.79 - samples/sec: 14228.33 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:16:06,219 epoch 6 - iter 78/136 - loss 0.37685407 - time (sec): 2.13 - samples/sec: 14271.61 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:16:06,558 epoch 6 - iter 91/136 - loss 0.37415174 - time (sec): 2.47 - samples/sec: 14208.04 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:16:06,906 epoch 6 - iter 104/136 - loss 0.37156632 - time (sec): 2.81 - samples/sec: 14184.70 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:16:07,250 epoch 6 - iter 117/136 - loss 0.37053677 - time (sec): 3.16 - samples/sec: 14284.81 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:16:07,585 epoch 6 - iter 130/136 - loss 0.37004141 - time (sec): 3.49 - samples/sec: 14164.21 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:16:07,756 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:07,757 EPOCH 6 done: loss 0.3718 - lr: 0.000023
2023-10-20 00:16:08,520 DEV : loss 0.26717308163642883 - f1-score (micro avg) 0.3377
2023-10-20 00:16:08,525 saving best model
2023-10-20 00:16:08,559 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:08,894 epoch 7 - iter 13/136 - loss 0.40120032 - time (sec): 0.33 - samples/sec: 15234.80 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:16:09,240 epoch 7 - iter 26/136 - loss 0.39554959 - time (sec): 0.68 - samples/sec: 15805.35 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:16:09,585 epoch 7 - iter 39/136 - loss 0.39304469 - time (sec): 1.03 - samples/sec: 14369.40 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:16:09,936 epoch 7 - iter 52/136 - loss 0.38423271 - time (sec): 1.38 - samples/sec: 14393.90 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:16:10,304 epoch 7 - iter 65/136 - loss 0.36837505 - time (sec): 1.74 - samples/sec: 14450.82 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:16:10,639 epoch 7 - iter 78/136 - loss 0.35903008 - time (sec): 2.08 - samples/sec: 14344.88 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:16:11,016 epoch 7 - iter 91/136 - loss 0.35381799 - time (sec): 2.46 - samples/sec: 14332.63 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:16:11,379 epoch 7 - iter 104/136 - loss 0.35303774 - time (sec): 2.82 - samples/sec: 14264.87 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:16:11,722 epoch 7 - iter 117/136 - loss 0.35243192 - time (sec): 3.16 - samples/sec: 14170.51 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:16:12,058 epoch 7 - iter 130/136 - loss 0.35436940 - time (sec): 3.50 - samples/sec: 14177.06 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:16:12,214 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:12,214 EPOCH 7 done: loss 0.3525 - lr: 0.000017
2023-10-20 00:16:12,988 DEV : loss 0.26430419087409973 - f1-score (micro avg) 0.3866
2023-10-20 00:16:12,991 saving best model
2023-10-20 00:16:13,028 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:13,367 epoch 8 - iter 13/136 - loss 0.29813625 - time (sec): 0.34 - samples/sec: 17220.95 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:16:13,673 epoch 8 - iter 26/136 - loss 0.33255910 - time (sec): 0.64 - samples/sec: 16743.16 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:16:13,978 epoch 8 - iter 39/136 - loss 0.34317481 - time (sec): 0.95 - samples/sec: 15949.58 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:16:14,490 epoch 8 - iter 52/136 - loss 0.32334500 - time (sec): 1.46 - samples/sec: 14093.23 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:16:14,843 epoch 8 - iter 65/136 - loss 0.33185318 - time (sec): 1.82 - samples/sec: 14103.91 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:16:15,198 epoch 8 - iter 78/136 - loss 0.32576278 - time (sec): 2.17 - samples/sec: 13984.00 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:16:15,575 epoch 8 - iter 91/136 - loss 0.33042413 - time (sec): 2.55 - samples/sec: 13702.12 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:16:15,960 epoch 8 - iter 104/136 - loss 0.33147055 - time (sec): 2.93 - samples/sec: 13641.42 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:16:16,334 epoch 8 - iter 117/136 - loss 0.33563937 - time (sec): 3.31 - samples/sec: 13536.66 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:16:16,693 epoch 8 - iter 130/136 - loss 0.33446959 - time (sec): 3.66 - samples/sec: 13400.45 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:16:16,873 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:16,873 EPOCH 8 done: loss 0.3366 - lr: 0.000012
2023-10-20 00:16:17,644 DEV : loss 0.2584059238433838 - f1-score (micro avg) 0.4194
2023-10-20 00:16:17,647 saving best model
2023-10-20 00:16:17,678 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:18,033 epoch 9 - iter 13/136 - loss 0.30167206 - time (sec): 0.35 - samples/sec: 14174.49 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:16:18,401 epoch 9 - iter 26/136 - loss 0.34416720 - time (sec): 0.72 - samples/sec: 14024.17 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:16:18,758 epoch 9 - iter 39/136 - loss 0.36372136 - time (sec): 1.08 - samples/sec: 13615.57 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:16:19,167 epoch 9 - iter 52/136 - loss 0.34342203 - time (sec): 1.49 - samples/sec: 13961.96 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:16:19,530 epoch 9 - iter 65/136 - loss 0.33424330 - time (sec): 1.85 - samples/sec: 14116.64 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:16:19,889 epoch 9 - iter 78/136 - loss 0.33573336 - time (sec): 2.21 - samples/sec: 14013.44 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:16:20,253 epoch 9 - iter 91/136 - loss 0.33573487 - time (sec): 2.58 - samples/sec: 13753.97 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:16:20,607 epoch 9 - iter 104/136 - loss 0.33651787 - time (sec): 2.93 - samples/sec: 13738.42 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:16:20,953 epoch 9 - iter 117/136 - loss 0.33685306 - time (sec): 3.27 - samples/sec: 13701.75 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:16:21,269 epoch 9 - iter 130/136 - loss 0.33220709 - time (sec): 3.59 - samples/sec: 13754.95 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:16:21,432 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:21,432 EPOCH 9 done: loss 0.3339 - lr: 0.000006
2023-10-20 00:16:22,195 DEV : loss 0.2596355974674225 - f1-score (micro avg) 0.4057
2023-10-20 00:16:22,199 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:22,529 epoch 10 - iter 13/136 - loss 0.38669202 - time (sec): 0.33 - samples/sec: 14013.53 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:16:22,883 epoch 10 - iter 26/136 - loss 0.32356639 - time (sec): 0.68 - samples/sec: 14649.18 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:16:23,221 epoch 10 - iter 39/136 - loss 0.30991686 - time (sec): 1.02 - samples/sec: 14213.28 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:16:23,594 epoch 10 - iter 52/136 - loss 0.32955241 - time (sec): 1.39 - samples/sec: 14244.46 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:16:23,947 epoch 10 - iter 65/136 - loss 0.31295652 - time (sec): 1.75 - samples/sec: 14096.10 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:16:24,285 epoch 10 - iter 78/136 - loss 0.32045448 - time (sec): 2.09 - samples/sec: 14141.19 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:16:24,642 epoch 10 - iter 91/136 - loss 0.32789206 - time (sec): 2.44 - samples/sec: 14172.52 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:16:24,980 epoch 10 - iter 104/136 - loss 0.33709961 - time (sec): 2.78 - samples/sec: 13970.42 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:16:25,320 epoch 10 - iter 117/136 - loss 0.32235981 - time (sec): 3.12 - samples/sec: 14493.77 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:16:25,634 epoch 10 - iter 130/136 - loss 0.32526182 - time (sec): 3.43 - samples/sec: 14489.48 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:16:25,946 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:25,946 EPOCH 10 done: loss 0.3257 - lr: 0.000000
2023-10-20 00:16:26,717 DEV : loss 0.25762829184532166 - f1-score (micro avg) 0.4177
2023-10-20 00:16:26,746 ----------------------------------------------------------------------------------------------------
2023-10-20 00:16:26,747 Loading model from best epoch ...
2023-10-20 00:16:26,821 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-20 00:16:27,643
Results:
- F-score (micro) 0.3368
- F-score (macro) 0.1713
- Accuracy 0.2119
By class:
              precision    recall  f1-score   support

         LOC     0.4799    0.4583    0.4689       312
         PER     0.2000    0.2356    0.2163       208
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3536    0.3216    0.3368       597
   macro avg     0.1700    0.1735    0.1713       597
weighted avg     0.3205    0.3216    0.3204       597
2023-10-20 00:16:27,643 ----------------------------------------------------------------------------------------------------