|
2023-10-19 23:59:35,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 128) |
|
(position_embeddings): Embedding(512, 128) |
|
(token_type_embeddings): Embedding(2, 128) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-1): 2 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=128, out_features=128, bias=True) |
|
(key): Linear(in_features=128, out_features=128, bias=True) |
|
(value): Linear(in_features=128, out_features=128, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=128, out_features=512, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=512, out_features=128, bias=True) |
|
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=128, out_features=128, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=128, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 Train: 1166 sentences |
|
2023-10-19 23:59:35,157 (train_with_dev=False, train_with_test=False) |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 Training Params: |
|
2023-10-19 23:59:35,157 - learning_rate: "3e-05" |
|
2023-10-19 23:59:35,157 - mini_batch_size: "8" |
|
2023-10-19 23:59:35,157 - max_epochs: "10" |
|
2023-10-19 23:59:35,157 - shuffle: "True" |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 Plugins: |
|
2023-10-19 23:59:35,157 - TensorboardLogger |
|
2023-10-19 23:59:35,157 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-19 23:59:35,157 - metric: "('micro avg', 'f1-score')" |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,157 Computation: |
|
2023-10-19 23:59:35,157 - compute on device: cuda:0 |
|
2023-10-19 23:59:35,157 - embedding storage: none |
|
2023-10-19 23:59:35,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,158 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-19 23:59:35,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:35,158 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-19 23:59:35,476 epoch 1 - iter 14/146 - loss 3.18295949 - time (sec): 0.32 - samples/sec: 12068.60 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-19 23:59:35,779 epoch 1 - iter 28/146 - loss 3.18089286 - time (sec): 0.62 - samples/sec: 12118.44 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-19 23:59:36,085 epoch 1 - iter 42/146 - loss 3.14660440 - time (sec): 0.93 - samples/sec: 12242.69 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-19 23:59:36,432 epoch 1 - iter 56/146 - loss 3.13843486 - time (sec): 1.27 - samples/sec: 12087.70 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-19 23:59:36,809 epoch 1 - iter 70/146 - loss 3.02837731 - time (sec): 1.65 - samples/sec: 12115.45 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-19 23:59:37,161 epoch 1 - iter 84/146 - loss 2.92849125 - time (sec): 2.00 - samples/sec: 11960.64 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:59:37,521 epoch 1 - iter 98/146 - loss 2.78120716 - time (sec): 2.36 - samples/sec: 12020.12 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:59:37,894 epoch 1 - iter 112/146 - loss 2.62445154 - time (sec): 2.74 - samples/sec: 12187.09 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:59:38,265 epoch 1 - iter 126/146 - loss 2.48234548 - time (sec): 3.11 - samples/sec: 12084.25 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:59:38,653 epoch 1 - iter 140/146 - loss 2.32927838 - time (sec): 3.49 - samples/sec: 12165.04 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:59:38,809 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:38,809 EPOCH 1 done: loss 2.2660 - lr: 0.000029 |
|
2023-10-19 23:59:39,074 DEV : loss 0.5118019580841064 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:59:39,078 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:39,463 epoch 2 - iter 14/146 - loss 1.25242238 - time (sec): 0.38 - samples/sec: 12421.92 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-19 23:59:39,839 epoch 2 - iter 28/146 - loss 1.07356757 - time (sec): 0.76 - samples/sec: 12143.27 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:59:40,197 epoch 2 - iter 42/146 - loss 0.95455788 - time (sec): 1.12 - samples/sec: 11891.73 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:59:40,545 epoch 2 - iter 56/146 - loss 0.91349265 - time (sec): 1.47 - samples/sec: 11596.85 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-19 23:59:40,917 epoch 2 - iter 70/146 - loss 0.86781807 - time (sec): 1.84 - samples/sec: 11466.12 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:59:41,263 epoch 2 - iter 84/146 - loss 0.83753148 - time (sec): 2.19 - samples/sec: 11414.05 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:59:41,614 epoch 2 - iter 98/146 - loss 0.81276469 - time (sec): 2.54 - samples/sec: 11316.83 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-19 23:59:42,009 epoch 2 - iter 112/146 - loss 0.78058945 - time (sec): 2.93 - samples/sec: 11521.19 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:59:42,400 epoch 2 - iter 126/146 - loss 0.75445669 - time (sec): 3.32 - samples/sec: 11782.39 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:59:42,759 epoch 2 - iter 140/146 - loss 0.76111334 - time (sec): 3.68 - samples/sec: 11642.35 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-19 23:59:42,893 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:42,893 EPOCH 2 done: loss 0.7533 - lr: 0.000027 |
|
2023-10-19 23:59:43,532 DEV : loss 0.4549146592617035 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:59:43,535 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:43,881 epoch 3 - iter 14/146 - loss 0.52019195 - time (sec): 0.35 - samples/sec: 11650.89 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:59:44,397 epoch 3 - iter 28/146 - loss 0.55554037 - time (sec): 0.86 - samples/sec: 9948.50 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:59:44,750 epoch 3 - iter 42/146 - loss 0.60565982 - time (sec): 1.21 - samples/sec: 10668.62 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-19 23:59:45,138 epoch 3 - iter 56/146 - loss 0.68493437 - time (sec): 1.60 - samples/sec: 10873.09 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:59:45,503 epoch 3 - iter 70/146 - loss 0.67322012 - time (sec): 1.97 - samples/sec: 10774.08 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:59:45,907 epoch 3 - iter 84/146 - loss 0.66209201 - time (sec): 2.37 - samples/sec: 11106.25 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-19 23:59:46,247 epoch 3 - iter 98/146 - loss 0.65709488 - time (sec): 2.71 - samples/sec: 11125.35 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:59:46,605 epoch 3 - iter 112/146 - loss 0.63909005 - time (sec): 3.07 - samples/sec: 11234.49 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:59:46,952 epoch 3 - iter 126/146 - loss 0.62972925 - time (sec): 3.42 - samples/sec: 11133.50 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:59:47,321 epoch 3 - iter 140/146 - loss 0.61992130 - time (sec): 3.79 - samples/sec: 11290.42 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-19 23:59:47,477 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:47,478 EPOCH 3 done: loss 0.6170 - lr: 0.000024 |
|
2023-10-19 23:59:48,116 DEV : loss 0.3993528187274933 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:59:48,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:48,473 epoch 4 - iter 14/146 - loss 0.49195585 - time (sec): 0.35 - samples/sec: 10485.17 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:59:48,847 epoch 4 - iter 28/146 - loss 0.52096420 - time (sec): 0.73 - samples/sec: 10514.36 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-19 23:59:49,211 epoch 4 - iter 42/146 - loss 0.50110568 - time (sec): 1.09 - samples/sec: 11336.25 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:59:49,583 epoch 4 - iter 56/146 - loss 0.51380249 - time (sec): 1.46 - samples/sec: 11278.82 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:59:49,934 epoch 4 - iter 70/146 - loss 0.51458773 - time (sec): 1.81 - samples/sec: 11324.99 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-19 23:59:50,283 epoch 4 - iter 84/146 - loss 0.51597388 - time (sec): 2.16 - samples/sec: 11327.14 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:59:50,662 epoch 4 - iter 98/146 - loss 0.55252997 - time (sec): 2.54 - samples/sec: 11496.30 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:59:51,031 epoch 4 - iter 112/146 - loss 0.53615248 - time (sec): 2.91 - samples/sec: 11591.09 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:59:51,392 epoch 4 - iter 126/146 - loss 0.53462161 - time (sec): 3.27 - samples/sec: 11526.68 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-19 23:59:51,766 epoch 4 - iter 140/146 - loss 0.53229005 - time (sec): 3.65 - samples/sec: 11638.70 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:59:51,922 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:51,923 EPOCH 4 done: loss 0.5321 - lr: 0.000020 |
|
2023-10-19 23:59:52,563 DEV : loss 0.35903558135032654 - f1-score (micro avg) 0.0 |
|
2023-10-19 23:59:52,567 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:52,950 epoch 5 - iter 14/146 - loss 0.48076598 - time (sec): 0.38 - samples/sec: 13506.37 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-19 23:59:53,338 epoch 5 - iter 28/146 - loss 0.56335970 - time (sec): 0.77 - samples/sec: 12416.48 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:59:53,693 epoch 5 - iter 42/146 - loss 0.53527449 - time (sec): 1.13 - samples/sec: 11906.75 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:59:54,053 epoch 5 - iter 56/146 - loss 0.52682013 - time (sec): 1.48 - samples/sec: 11577.85 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-19 23:59:54,452 epoch 5 - iter 70/146 - loss 0.51918424 - time (sec): 1.88 - samples/sec: 11681.15 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:59:54,817 epoch 5 - iter 84/146 - loss 0.50445202 - time (sec): 2.25 - samples/sec: 11688.31 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:59:55,176 epoch 5 - iter 98/146 - loss 0.50158859 - time (sec): 2.61 - samples/sec: 11454.05 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:59:55,540 epoch 5 - iter 112/146 - loss 0.50004276 - time (sec): 2.97 - samples/sec: 11596.90 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-19 23:59:55,879 epoch 5 - iter 126/146 - loss 0.51064784 - time (sec): 3.31 - samples/sec: 11503.43 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:59:56,276 epoch 5 - iter 140/146 - loss 0.49969320 - time (sec): 3.71 - samples/sec: 11598.76 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-19 23:59:56,428 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:56,428 EPOCH 5 done: loss 0.4946 - lr: 0.000017 |
|
2023-10-19 23:59:57,062 DEV : loss 0.34515950083732605 - f1-score (micro avg) 0.0081 |
|
2023-10-19 23:59:57,066 saving best model |
|
2023-10-19 23:59:57,093 ---------------------------------------------------------------------------------------------------- |
|
2023-10-19 23:59:57,478 epoch 6 - iter 14/146 - loss 0.50665335 - time (sec): 0.38 - samples/sec: 11411.49 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:59:57,841 epoch 6 - iter 28/146 - loss 0.46417149 - time (sec): 0.75 - samples/sec: 10997.53 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:59:58,217 epoch 6 - iter 42/146 - loss 0.47608930 - time (sec): 1.12 - samples/sec: 10911.57 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-19 23:59:58,582 epoch 6 - iter 56/146 - loss 0.48079412 - time (sec): 1.49 - samples/sec: 11158.65 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:59:58,951 epoch 6 - iter 70/146 - loss 0.46065126 - time (sec): 1.86 - samples/sec: 11488.54 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:59:59,328 epoch 6 - iter 84/146 - loss 0.44948389 - time (sec): 2.23 - samples/sec: 11435.45 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-19 23:59:59,718 epoch 6 - iter 98/146 - loss 0.44371072 - time (sec): 2.62 - samples/sec: 11532.27 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-20 00:00:00,073 epoch 6 - iter 112/146 - loss 0.44471092 - time (sec): 2.98 - samples/sec: 11667.98 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:00:00,428 epoch 6 - iter 126/146 - loss 0.44516141 - time (sec): 3.33 - samples/sec: 11597.63 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:00:00,799 epoch 6 - iter 140/146 - loss 0.45814073 - time (sec): 3.71 - samples/sec: 11613.79 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-20 00:00:00,948 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:00,948 EPOCH 6 done: loss 0.4585 - lr: 0.000014 |
|
2023-10-20 00:00:01,595 DEV : loss 0.34095141291618347 - f1-score (micro avg) 0.0154 |
|
2023-10-20 00:00:01,600 saving best model |
|
2023-10-20 00:00:01,642 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:02,017 epoch 7 - iter 14/146 - loss 0.37536862 - time (sec): 0.37 - samples/sec: 14301.95 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-20 00:00:02,380 epoch 7 - iter 28/146 - loss 0.43351013 - time (sec): 0.74 - samples/sec: 12252.57 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-20 00:00:02,752 epoch 7 - iter 42/146 - loss 0.45088688 - time (sec): 1.11 - samples/sec: 11466.86 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:00:03,117 epoch 7 - iter 56/146 - loss 0.42924375 - time (sec): 1.47 - samples/sec: 11822.14 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:00:03,427 epoch 7 - iter 70/146 - loss 0.43297036 - time (sec): 1.78 - samples/sec: 11809.32 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:00:03,696 epoch 7 - iter 84/146 - loss 0.43010334 - time (sec): 2.05 - samples/sec: 12083.12 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-20 00:00:04,108 epoch 7 - iter 98/146 - loss 0.44555768 - time (sec): 2.47 - samples/sec: 12339.04 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:00:04,448 epoch 7 - iter 112/146 - loss 0.44808496 - time (sec): 2.81 - samples/sec: 12301.85 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:00:04,967 epoch 7 - iter 126/146 - loss 0.45269904 - time (sec): 3.32 - samples/sec: 11712.90 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-20 00:00:05,313 epoch 7 - iter 140/146 - loss 0.44901216 - time (sec): 3.67 - samples/sec: 11580.63 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-20 00:00:05,473 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:05,473 EPOCH 7 done: loss 0.4437 - lr: 0.000010 |
|
2023-10-20 00:00:06,110 DEV : loss 0.32878896594047546 - f1-score (micro avg) 0.0495 |
|
2023-10-20 00:00:06,114 saving best model |
|
2023-10-20 00:00:06,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:06,534 epoch 8 - iter 14/146 - loss 0.39681875 - time (sec): 0.38 - samples/sec: 11339.31 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-20 00:00:06,942 epoch 8 - iter 28/146 - loss 0.41183417 - time (sec): 0.78 - samples/sec: 11136.80 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:00:07,341 epoch 8 - iter 42/146 - loss 0.37624176 - time (sec): 1.18 - samples/sec: 12052.62 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:00:07,687 epoch 8 - iter 56/146 - loss 0.40502606 - time (sec): 1.53 - samples/sec: 11829.40 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:00:08,032 epoch 8 - iter 70/146 - loss 0.40968586 - time (sec): 1.87 - samples/sec: 11413.53 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-20 00:00:08,402 epoch 8 - iter 84/146 - loss 0.41848763 - time (sec): 2.24 - samples/sec: 11259.67 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:00:08,753 epoch 8 - iter 98/146 - loss 0.42305015 - time (sec): 2.59 - samples/sec: 11220.23 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:00:09,114 epoch 8 - iter 112/146 - loss 0.42016543 - time (sec): 2.96 - samples/sec: 11178.15 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-20 00:00:09,488 epoch 8 - iter 126/146 - loss 0.42102476 - time (sec): 3.33 - samples/sec: 11191.99 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-20 00:00:09,901 epoch 8 - iter 140/146 - loss 0.43500062 - time (sec): 3.74 - samples/sec: 11474.44 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-20 00:00:10,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:10,065 EPOCH 8 done: loss 0.4372 - lr: 0.000007 |
|
2023-10-20 00:00:10,698 DEV : loss 0.3299644887447357 - f1-score (micro avg) 0.0741 |
|
2023-10-20 00:00:10,702 saving best model |
|
2023-10-20 00:00:10,737 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:11,089 epoch 9 - iter 14/146 - loss 0.43566357 - time (sec): 0.35 - samples/sec: 11233.15 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:00:11,458 epoch 9 - iter 28/146 - loss 0.40848825 - time (sec): 0.72 - samples/sec: 11203.25 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:00:11,837 epoch 9 - iter 42/146 - loss 0.40732076 - time (sec): 1.10 - samples/sec: 11414.38 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:00:12,194 epoch 9 - iter 56/146 - loss 0.39577925 - time (sec): 1.46 - samples/sec: 11311.79 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-20 00:00:12,528 epoch 9 - iter 70/146 - loss 0.40666537 - time (sec): 1.79 - samples/sec: 11346.74 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:00:12,891 epoch 9 - iter 84/146 - loss 0.41239649 - time (sec): 2.15 - samples/sec: 11391.10 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:00:13,234 epoch 9 - iter 98/146 - loss 0.41115755 - time (sec): 2.50 - samples/sec: 11527.62 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-20 00:00:13,627 epoch 9 - iter 112/146 - loss 0.40314181 - time (sec): 2.89 - samples/sec: 11627.06 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:00:14,025 epoch 9 - iter 126/146 - loss 0.42051838 - time (sec): 3.29 - samples/sec: 11689.03 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:00:14,386 epoch 9 - iter 140/146 - loss 0.42508445 - time (sec): 3.65 - samples/sec: 11694.75 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-20 00:00:14,537 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:14,537 EPOCH 9 done: loss 0.4241 - lr: 0.000004 |
|
2023-10-20 00:00:15,175 DEV : loss 0.32740116119384766 - f1-score (micro avg) 0.0936 |
|
2023-10-20 00:00:15,179 saving best model |
|
2023-10-20 00:00:15,213 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:15,569 epoch 10 - iter 14/146 - loss 0.38655630 - time (sec): 0.35 - samples/sec: 13761.43 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:00:15,986 epoch 10 - iter 28/146 - loss 0.41125034 - time (sec): 0.77 - samples/sec: 13487.87 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:00:16,338 epoch 10 - iter 42/146 - loss 0.42992884 - time (sec): 1.12 - samples/sec: 12256.74 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-20 00:00:16,705 epoch 10 - iter 56/146 - loss 0.41655376 - time (sec): 1.49 - samples/sec: 12220.10 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:00:17,068 epoch 10 - iter 70/146 - loss 0.42169088 - time (sec): 1.85 - samples/sec: 11762.54 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:00:17,434 epoch 10 - iter 84/146 - loss 0.41552101 - time (sec): 2.22 - samples/sec: 11605.28 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-20 00:00:17,786 epoch 10 - iter 98/146 - loss 0.41674133 - time (sec): 2.57 - samples/sec: 11569.70 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:00:18,145 epoch 10 - iter 112/146 - loss 0.42416179 - time (sec): 2.93 - samples/sec: 11546.08 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:00:18,512 epoch 10 - iter 126/146 - loss 0.42713975 - time (sec): 3.30 - samples/sec: 11437.32 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-20 00:00:18,886 epoch 10 - iter 140/146 - loss 0.42392021 - time (sec): 3.67 - samples/sec: 11405.36 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-20 00:00:19,064 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:19,064 EPOCH 10 done: loss 0.4288 - lr: 0.000000 |
|
2023-10-20 00:00:19,699 DEV : loss 0.32682910561561584 - f1-score (micro avg) 0.1056 |
|
2023-10-20 00:00:19,702 saving best model |
|
2023-10-20 00:00:19,763 ---------------------------------------------------------------------------------------------------- |
|
2023-10-20 00:00:19,763 Loading model from best epoch ... |
|
2023-10-20 00:00:19,838 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-20 00:00:20,720 |
|
Results: |
|
- F-score (micro) 0.1894 |
|
- F-score (macro) 0.0812 |
|
- Accuracy 0.1071 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

         PER     0.2813    0.2644    0.2726       348 |
|
         LOC     0.1739    0.0307    0.0521       261 |
|
         ORG     0.0000    0.0000    0.0000        52 |
|
   HumanProd     0.0000    0.0000    0.0000        22 |
|

   micro avg     0.2681    0.1464    0.1894       683 |
|
   macro avg     0.1138    0.0738    0.0812       683 |
|
weighted avg     0.2098    0.1464    0.1588       683 |
|
|
|
2023-10-20 00:00:20,720 ---------------------------------------------------------------------------------------------------- |
|
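The averaged rows in the "By class" report can be re-derived from the per-class rows. The sketch below is not part of the training run or of Flair. It is a small consistency check with the numbers copied by hand from the log above, and the names `per_class` and `f1` are illustrative. It assumes the usual definitions: micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean of the per-class F1 scores, and weighted F1 as their support-weighted mean.

```python
# Per-class rows copied from the "By class" report in the log:
# label -> (precision, recall, f1, support)
per_class = {
    "PER": (0.2813, 0.2644, 0.2726, 348),
    "LOC": (0.1739, 0.0307, 0.0521, 261),
    "ORG": (0.0000, 0.0000, 0.0000, 52),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 from the micro-averaged precision and recall reported in the log.
micro_f1 = f1(0.2681, 0.1464)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 scores weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can only be expected to agree with the logged averages up to rounding.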
|