2023-10-18 14:47:21,880 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,881 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:47:21,881 ----------------------------------------------------------------------------------------------------
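From the shapes in the repr above one can tally the model's size by hand; a minimal sketch (Linear layers carry a bias and LayerNorms a weight and bias, as `bias=True` and `elementwise_affine=True` indicate; embedding tables have no bias):

```python
# Tally trainable parameters from the shapes printed in the SequenceTagger repr.

def linear(n_in, n_out):   # weight matrix + bias vector
    return n_in * n_out + n_out

def layer_norm(n):         # weight + bias
    return 2 * n

embeddings = 32001 * 128 + 512 * 128 + 2 * 128 + layer_norm(128)

per_layer = (
    3 * linear(128, 128)   # query, key, value
    + linear(128, 128)     # attention output dense
    + layer_norm(128)      # attention output LayerNorm
    + linear(128, 512)     # intermediate dense
    + linear(512, 128)     # output dense
    + layer_norm(128)      # output LayerNorm
)

pooler = linear(128, 128)
tagger_head = linear(128, 25)  # the (linear) tagging layer

total = embeddings + 2 * per_layer + pooler + tagger_head
print(total)  # 4578457 -- roughly 4.6M parameters, consistent with "bert-tiny"
```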
2023-10-18 14:47:21,881 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:47:21,881 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,881 Train: 1100 sentences
2023-10-18 14:47:21,881 (train_with_dev=False, train_with_test=False)
2023-10-18 14:47:21,881 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,881 Training Params:
2023-10-18 14:47:21,881 - learning_rate: "3e-05"
2023-10-18 14:47:21,881 - mini_batch_size: "8"
2023-10-18 14:47:21,881 - max_epochs: "10"
2023-10-18 14:47:21,881 - shuffle: "True"
2023-10-18 14:47:21,881 ----------------------------------------------------------------------------------------------------
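The 138 iterations per epoch seen in the progress lines below follow directly from these parameters: 1100 training sentences at mini_batch_size 8, with one final partial batch.

```python
import math

train_sentences = 1100
mini_batch_size = 8

iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
print(iters_per_epoch)  # 138, matching the "iter .../138" progress lines
```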
2023-10-18 14:47:21,881 Plugins:
2023-10-18 14:47:21,881 - TensorboardLogger
2023-10-18 14:47:21,881 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:47:21,881 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,881 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:47:21,881 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:47:21,881 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,882 Computation:
2023-10-18 14:47:21,882 - compute on device: cuda:0
2023-10-18 14:47:21,882 - embedding storage: none
2023-10-18 14:47:21,882 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,882 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 14:47:21,882 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,882 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:21,882 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:47:22,151 epoch 1 - iter 13/138 - loss 3.62072927 - time (sec): 0.27 - samples/sec: 8476.18 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:47:22,434 epoch 1 - iter 26/138 - loss 3.60750674 - time (sec): 0.55 - samples/sec: 7801.18 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:22,728 epoch 1 - iter 39/138 - loss 3.60747891 - time (sec): 0.85 - samples/sec: 7892.50 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:47:23,014 epoch 1 - iter 52/138 - loss 3.54157952 - time (sec): 1.13 - samples/sec: 7737.79 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:47:23,304 epoch 1 - iter 65/138 - loss 3.48190713 - time (sec): 1.42 - samples/sec: 7627.98 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:47:23,586 epoch 1 - iter 78/138 - loss 3.37752540 - time (sec): 1.70 - samples/sec: 7711.28 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:47:23,872 epoch 1 - iter 91/138 - loss 3.25545181 - time (sec): 1.99 - samples/sec: 7688.55 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:47:24,175 epoch 1 - iter 104/138 - loss 3.11336421 - time (sec): 2.29 - samples/sec: 7759.88 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:47:24,475 epoch 1 - iter 117/138 - loss 2.99799804 - time (sec): 2.59 - samples/sec: 7640.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:47:24,773 epoch 1 - iter 130/138 - loss 2.88206104 - time (sec): 2.89 - samples/sec: 7506.05 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:47:24,944 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:24,944 EPOCH 1 done: loss 2.8146 - lr: 0.000028
2023-10-18 14:47:25,203 DEV : loss 0.9443248510360718 - f1-score (micro avg) 0.0
2023-10-18 14:47:25,209 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:25,521 epoch 2 - iter 13/138 - loss 1.18225113 - time (sec): 0.31 - samples/sec: 7836.63 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:47:25,802 epoch 2 - iter 26/138 - loss 1.12231070 - time (sec): 0.59 - samples/sec: 7584.97 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:47:26,089 epoch 2 - iter 39/138 - loss 1.11542829 - time (sec): 0.88 - samples/sec: 7487.45 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:47:26,372 epoch 2 - iter 52/138 - loss 1.10165347 - time (sec): 1.16 - samples/sec: 7552.08 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:47:26,661 epoch 2 - iter 65/138 - loss 1.08326273 - time (sec): 1.45 - samples/sec: 7576.21 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:47:26,964 epoch 2 - iter 78/138 - loss 1.06800328 - time (sec): 1.75 - samples/sec: 7585.72 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:47:27,243 epoch 2 - iter 91/138 - loss 1.04900022 - time (sec): 2.03 - samples/sec: 7520.53 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:47:27,536 epoch 2 - iter 104/138 - loss 1.04011956 - time (sec): 2.33 - samples/sec: 7429.04 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:47:27,842 epoch 2 - iter 117/138 - loss 1.04597540 - time (sec): 2.63 - samples/sec: 7431.41 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:47:28,144 epoch 2 - iter 130/138 - loss 1.02664976 - time (sec): 2.93 - samples/sec: 7410.49 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:47:28,309 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:28,309 EPOCH 2 done: loss 1.0274 - lr: 0.000027
2023-10-18 14:47:28,666 DEV : loss 0.7758221626281738 - f1-score (micro avg) 0.0
2023-10-18 14:47:28,672 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:28,947 epoch 3 - iter 13/138 - loss 0.87554221 - time (sec): 0.27 - samples/sec: 7353.98 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:47:29,220 epoch 3 - iter 26/138 - loss 0.83815919 - time (sec): 0.55 - samples/sec: 7528.81 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:47:29,502 epoch 3 - iter 39/138 - loss 0.84854457 - time (sec): 0.83 - samples/sec: 7733.60 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:47:29,789 epoch 3 - iter 52/138 - loss 0.82203884 - time (sec): 1.12 - samples/sec: 7973.23 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:47:30,066 epoch 3 - iter 65/138 - loss 0.81519984 - time (sec): 1.39 - samples/sec: 7916.83 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:47:30,325 epoch 3 - iter 78/138 - loss 0.80371547 - time (sec): 1.65 - samples/sec: 7924.96 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:47:30,597 epoch 3 - iter 91/138 - loss 0.80307735 - time (sec): 1.92 - samples/sec: 7754.83 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:47:30,874 epoch 3 - iter 104/138 - loss 0.79904671 - time (sec): 2.20 - samples/sec: 7720.20 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:47:31,164 epoch 3 - iter 117/138 - loss 0.79567570 - time (sec): 2.49 - samples/sec: 7754.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:47:31,459 epoch 3 - iter 130/138 - loss 0.78976716 - time (sec): 2.79 - samples/sec: 7740.19 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:47:31,617 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:31,617 EPOCH 3 done: loss 0.7852 - lr: 0.000024
2023-10-18 14:47:31,971 DEV : loss 0.602449893951416 - f1-score (micro avg) 0.0591
2023-10-18 14:47:31,975 saving best model
2023-10-18 14:47:32,010 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:32,294 epoch 4 - iter 13/138 - loss 0.70493658 - time (sec): 0.28 - samples/sec: 7607.85 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:47:32,580 epoch 4 - iter 26/138 - loss 0.62818695 - time (sec): 0.57 - samples/sec: 7911.87 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:47:32,855 epoch 4 - iter 39/138 - loss 0.63384713 - time (sec): 0.84 - samples/sec: 7604.17 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:47:33,129 epoch 4 - iter 52/138 - loss 0.63226467 - time (sec): 1.12 - samples/sec: 7670.54 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:47:33,436 epoch 4 - iter 65/138 - loss 0.64867891 - time (sec): 1.43 - samples/sec: 7647.32 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:47:33,731 epoch 4 - iter 78/138 - loss 0.64779888 - time (sec): 1.72 - samples/sec: 7544.82 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:47:34,015 epoch 4 - iter 91/138 - loss 0.65100493 - time (sec): 2.01 - samples/sec: 7478.97 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:47:34,281 epoch 4 - iter 104/138 - loss 0.65789591 - time (sec): 2.27 - samples/sec: 7444.40 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:47:34,583 epoch 4 - iter 117/138 - loss 0.68086311 - time (sec): 2.57 - samples/sec: 7486.57 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:47:34,866 epoch 4 - iter 130/138 - loss 0.67265209 - time (sec): 2.86 - samples/sec: 7562.93 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:47:35,025 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:35,025 EPOCH 4 done: loss 0.6677 - lr: 0.000020
2023-10-18 14:47:35,506 DEV : loss 0.5561308264732361 - f1-score (micro avg) 0.124
2023-10-18 14:47:35,511 saving best model
2023-10-18 14:47:35,547 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:35,820 epoch 5 - iter 13/138 - loss 0.58767317 - time (sec): 0.27 - samples/sec: 7741.88 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:47:36,089 epoch 5 - iter 26/138 - loss 0.60257740 - time (sec): 0.54 - samples/sec: 7564.31 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:47:36,373 epoch 5 - iter 39/138 - loss 0.60854760 - time (sec): 0.83 - samples/sec: 7677.93 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:47:36,653 epoch 5 - iter 52/138 - loss 0.58967732 - time (sec): 1.11 - samples/sec: 7848.06 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:47:36,920 epoch 5 - iter 65/138 - loss 0.60331736 - time (sec): 1.37 - samples/sec: 7675.18 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:47:37,190 epoch 5 - iter 78/138 - loss 0.60221453 - time (sec): 1.64 - samples/sec: 7722.38 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:47:37,465 epoch 5 - iter 91/138 - loss 0.60472936 - time (sec): 1.92 - samples/sec: 7795.23 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:47:37,742 epoch 5 - iter 104/138 - loss 0.58574649 - time (sec): 2.20 - samples/sec: 7735.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:47:38,015 epoch 5 - iter 117/138 - loss 0.59225654 - time (sec): 2.47 - samples/sec: 7777.80 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:47:38,299 epoch 5 - iter 130/138 - loss 0.60404824 - time (sec): 2.75 - samples/sec: 7819.70 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:47:38,486 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:38,486 EPOCH 5 done: loss 0.6001 - lr: 0.000017
2023-10-18 14:47:38,860 DEV : loss 0.4696231186389923 - f1-score (micro avg) 0.3133
2023-10-18 14:47:38,864 saving best model
2023-10-18 14:47:38,906 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:39,212 epoch 6 - iter 13/138 - loss 0.57806662 - time (sec): 0.31 - samples/sec: 7291.95 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:47:39,508 epoch 6 - iter 26/138 - loss 0.55452295 - time (sec): 0.60 - samples/sec: 6886.60 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:47:39,805 epoch 6 - iter 39/138 - loss 0.55261324 - time (sec): 0.90 - samples/sec: 6942.35 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:47:40,093 epoch 6 - iter 52/138 - loss 0.55599866 - time (sec): 1.19 - samples/sec: 6926.28 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:47:40,392 epoch 6 - iter 65/138 - loss 0.55024638 - time (sec): 1.49 - samples/sec: 7117.29 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:47:40,684 epoch 6 - iter 78/138 - loss 0.55597302 - time (sec): 1.78 - samples/sec: 7187.50 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:47:40,990 epoch 6 - iter 91/138 - loss 0.55233598 - time (sec): 2.08 - samples/sec: 7212.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:47:41,285 epoch 6 - iter 104/138 - loss 0.55675965 - time (sec): 2.38 - samples/sec: 7201.98 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:47:41,596 epoch 6 - iter 117/138 - loss 0.55680457 - time (sec): 2.69 - samples/sec: 7208.19 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:47:41,896 epoch 6 - iter 130/138 - loss 0.56548752 - time (sec): 2.99 - samples/sec: 7217.83 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:47:42,088 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:42,088 EPOCH 6 done: loss 0.5577 - lr: 0.000014
2023-10-18 14:47:42,458 DEV : loss 0.4384450912475586 - f1-score (micro avg) 0.3716
2023-10-18 14:47:42,462 saving best model
2023-10-18 14:47:42,496 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:42,771 epoch 7 - iter 13/138 - loss 0.57364294 - time (sec): 0.27 - samples/sec: 8001.92 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:47:43,049 epoch 7 - iter 26/138 - loss 0.56147572 - time (sec): 0.55 - samples/sec: 7845.95 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:47:43,342 epoch 7 - iter 39/138 - loss 0.54627425 - time (sec): 0.85 - samples/sec: 7653.87 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:47:43,635 epoch 7 - iter 52/138 - loss 0.54367634 - time (sec): 1.14 - samples/sec: 7689.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:47:43,937 epoch 7 - iter 65/138 - loss 0.53733964 - time (sec): 1.44 - samples/sec: 7607.53 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:47:44,223 epoch 7 - iter 78/138 - loss 0.52203462 - time (sec): 1.73 - samples/sec: 7614.66 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:47:44,488 epoch 7 - iter 91/138 - loss 0.51401142 - time (sec): 1.99 - samples/sec: 7597.04 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:47:44,761 epoch 7 - iter 104/138 - loss 0.51251141 - time (sec): 2.27 - samples/sec: 7578.80 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:47:45,052 epoch 7 - iter 117/138 - loss 0.51458274 - time (sec): 2.56 - samples/sec: 7621.65 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:47:45,330 epoch 7 - iter 130/138 - loss 0.51828471 - time (sec): 2.83 - samples/sec: 7676.78 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:47:45,485 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:45,485 EPOCH 7 done: loss 0.5179 - lr: 0.000010
2023-10-18 14:47:45,850 DEV : loss 0.4271453320980072 - f1-score (micro avg) 0.3772
2023-10-18 14:47:45,854 saving best model
2023-10-18 14:47:45,889 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:46,159 epoch 8 - iter 13/138 - loss 0.54169464 - time (sec): 0.27 - samples/sec: 7167.85 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:47:46,435 epoch 8 - iter 26/138 - loss 0.51094577 - time (sec): 0.55 - samples/sec: 7456.03 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:47:46,717 epoch 8 - iter 39/138 - loss 0.52429887 - time (sec): 0.83 - samples/sec: 7394.47 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:47:47,004 epoch 8 - iter 52/138 - loss 0.51964712 - time (sec): 1.11 - samples/sec: 7471.34 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:47:47,298 epoch 8 - iter 65/138 - loss 0.51295695 - time (sec): 1.41 - samples/sec: 7679.22 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:47:47,601 epoch 8 - iter 78/138 - loss 0.52326153 - time (sec): 1.71 - samples/sec: 7663.41 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:47:47,875 epoch 8 - iter 91/138 - loss 0.50864867 - time (sec): 1.99 - samples/sec: 7749.05 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:47:48,177 epoch 8 - iter 104/138 - loss 0.50328546 - time (sec): 2.29 - samples/sec: 7641.40 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:47:48,453 epoch 8 - iter 117/138 - loss 0.51140037 - time (sec): 2.56 - samples/sec: 7607.12 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:47:48,734 epoch 8 - iter 130/138 - loss 0.50456445 - time (sec): 2.84 - samples/sec: 7563.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:47:48,909 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:48,909 EPOCH 8 done: loss 0.5021 - lr: 0.000007
2023-10-18 14:47:49,300 DEV : loss 0.40488675236701965 - f1-score (micro avg) 0.4012
2023-10-18 14:47:49,305 saving best model
2023-10-18 14:47:49,346 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:49,668 epoch 9 - iter 13/138 - loss 0.60467250 - time (sec): 0.32 - samples/sec: 6626.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:47:49,962 epoch 9 - iter 26/138 - loss 0.52526270 - time (sec): 0.61 - samples/sec: 6986.69 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:47:50,251 epoch 9 - iter 39/138 - loss 0.53482102 - time (sec): 0.90 - samples/sec: 7238.65 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:47:50,556 epoch 9 - iter 52/138 - loss 0.52194638 - time (sec): 1.21 - samples/sec: 7124.27 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:50,845 epoch 9 - iter 65/138 - loss 0.52036899 - time (sec): 1.50 - samples/sec: 7182.20 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:51,139 epoch 9 - iter 78/138 - loss 0.50383512 - time (sec): 1.79 - samples/sec: 7047.84 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:51,434 epoch 9 - iter 91/138 - loss 0.48411989 - time (sec): 2.09 - samples/sec: 7185.75 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:47:51,710 epoch 9 - iter 104/138 - loss 0.47992387 - time (sec): 2.36 - samples/sec: 7202.00 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:47:51,988 epoch 9 - iter 117/138 - loss 0.48349693 - time (sec): 2.64 - samples/sec: 7318.21 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:47:52,283 epoch 9 - iter 130/138 - loss 0.48756545 - time (sec): 2.94 - samples/sec: 7396.96 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:47:52,454 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:52,454 EPOCH 9 done: loss 0.4890 - lr: 0.000004
2023-10-18 14:47:52,826 DEV : loss 0.39584705233573914 - f1-score (micro avg) 0.4097
2023-10-18 14:47:52,830 saving best model
2023-10-18 14:47:52,863 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:53,157 epoch 10 - iter 13/138 - loss 0.47388399 - time (sec): 0.29 - samples/sec: 7846.83 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:47:53,441 epoch 10 - iter 26/138 - loss 0.46298069 - time (sec): 0.58 - samples/sec: 7521.81 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:47:53,715 epoch 10 - iter 39/138 - loss 0.46883862 - time (sec): 0.85 - samples/sec: 7551.92 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:47:53,997 epoch 10 - iter 52/138 - loss 0.46166111 - time (sec): 1.13 - samples/sec: 7800.84 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:47:54,289 epoch 10 - iter 65/138 - loss 0.46831864 - time (sec): 1.42 - samples/sec: 7742.62 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:47:54,578 epoch 10 - iter 78/138 - loss 0.47272619 - time (sec): 1.71 - samples/sec: 7730.96 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:47:54,881 epoch 10 - iter 91/138 - loss 0.47091297 - time (sec): 2.02 - samples/sec: 7630.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:47:55,165 epoch 10 - iter 104/138 - loss 0.48530636 - time (sec): 2.30 - samples/sec: 7582.44 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:47:55,450 epoch 10 - iter 117/138 - loss 0.48410095 - time (sec): 2.59 - samples/sec: 7584.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:47:55,719 epoch 10 - iter 130/138 - loss 0.48003303 - time (sec): 2.86 - samples/sec: 7528.44 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:47:55,887 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:55,887 EPOCH 10 done: loss 0.4833 - lr: 0.000000
2023-10-18 14:47:56,260 DEV : loss 0.3930450677871704 - f1-score (micro avg) 0.4139
2023-10-18 14:47:56,264 saving best model
2023-10-18 14:47:56,329 ----------------------------------------------------------------------------------------------------
2023-10-18 14:47:56,329 Loading model from best epoch ...
2023-10-18 14:47:56,402 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
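The 25-tag dictionary is the BIOES scheme over the six AJMC entity types plus the O tag; reproducing the count and ordering:

```python
# BIOES tagging: Single, Begin, End, Inside markers per entity type, plus O.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
prefixes = ["S", "B", "E", "I"]

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
print(len(tags))  # 25, matching the SequenceTagger dictionary above
```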
2023-10-18 14:47:56,697 
Results:
- F-score (micro) 0.4462
- F-score (macro) 0.2348
- Accuracy 0.2935

By class:
              precision    recall  f1-score   support

       scope     0.6164    0.5568    0.5851       176
        work     0.3696    0.4595    0.4096        74
        pers     0.7647    0.1016    0.1793       128
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5410    0.3796    0.4462       382
   macro avg     0.3501    0.2236    0.2348       382
weighted avg     0.6118    0.3796    0.4090       382
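The summary scores can be cross-checked against the per-class table: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 here is the unweighted mean of the per-class F1 scores.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Per-class f1-scores from the table: scope, work, pers, object, loc.
per_class_f1 = [0.5851, 0.4096, 0.1793, 0.0000, 0.0000]

micro = f1(0.5410, 0.3796)                      # from the micro avg row
macro = sum(per_class_f1) / len(per_class_f1)

print(round(micro, 4), round(macro, 4))  # 0.4462 0.2348
```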
2023-10-18 14:47:56,697 ----------------------------------------------------------------------------------------------------