2023-10-18 14:35:15,518 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,518 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 14:35:15,518 ----------------------------------------------------------------------------------------------------
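A back-of-the-envelope parameter count, reconstructed purely from the module shapes printed above (the totals are independent arithmetic, not output of the training run); it confirms the "bert-tiny" scale of roughly 4.6M parameters, almost all of them in the word embedding table:

```python
# Parameter count from the printed shapes: Embedding(32001, 128), two
# BertLayer blocks at hidden size 128 / intermediate size 512, a pooler,
# and the Linear(128 -> 25) tagging head.

def linear_params(n_in: int, n_out: int) -> int:
    # weight matrix plus bias vector
    return n_in * n_out + n_out

emb = 32001 * 128 + 512 * 128 + 2 * 128 + 2 * 128  # word + position + type + LayerNorm
per_layer = (
    3 * linear_params(128, 128)   # query / key / value
    + linear_params(128, 128)     # attention output dense
    + 2 * 128                     # attention LayerNorm (weight + bias)
    + linear_params(128, 512)     # intermediate
    + linear_params(512, 128)     # FFN output dense
    + 2 * 128                     # output LayerNorm
)
pooler = linear_params(128, 128)
tagger_head = linear_params(128, 25)

total = emb + 2 * per_layer + pooler + tagger_head
print(per_layer, total)  # 198272 4578457, i.e. ~4.6M parameters
```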
2023-10-18 14:35:15,518 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:35:15,518 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,518 Train: 1100 sentences
2023-10-18 14:35:15,518 (train_with_dev=False, train_with_test=False)
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,519 Training Params:
2023-10-18 14:35:15,519 - learning_rate: "3e-05"
2023-10-18 14:35:15,519 - mini_batch_size: "8"
2023-10-18 14:35:15,519 - max_epochs: "10"
2023-10-18 14:35:15,519 - shuffle: "True"
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,519 Plugins:
2023-10-18 14:35:15,519 - TensorboardLogger
2023-10-18 14:35:15,519 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
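The learning rates logged below are consistent with a linear warmup/decay schedule: with warmup_fraction 0.1 over 10 epochs × 138 iterations = 1380 steps, the entire first epoch ramps up to the 3e-05 peak, after which the rate decays linearly to zero. A minimal sketch of that schedule (a reconstruction from the logged values; Flair's internal step accounting may differ by a step):

```python
# Linear warmup to PEAK_LR over the first 10% of steps, then linear decay
# to zero, reconstructed from the hyperparameters printed in this log.

PEAK_LR = 3e-05
STEPS_PER_EPOCH = 138
TOTAL_STEPS = 10 * STEPS_PER_EPOCH       # 1380
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # 138 = exactly one epoch

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS                       # warmup
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)  # decay

# lr_at(13) ~= 2.8e-06, matching "lr: 0.000003" at epoch 1, iter 13/138;
# lr_at(151) ~= 3.0e-05, matching "lr: 0.000030" at epoch 2, iter 13/138.
```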
2023-10-18 14:35:15,519 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:35:15,519 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,519 Computation:
2023-10-18 14:35:15,519 - compute on device: cuda:0
2023-10-18 14:35:15,519 - embedding storage: none
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,519 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,519 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:15,519 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:35:15,840 epoch 1 - iter 13/138 - loss 4.01595459 - time (sec): 0.32 - samples/sec: 7193.68 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:35:16,145 epoch 1 - iter 26/138 - loss 3.88075465 - time (sec): 0.63 - samples/sec: 7242.11 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:16,422 epoch 1 - iter 39/138 - loss 3.95495640 - time (sec): 0.90 - samples/sec: 7132.55 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:35:16,703 epoch 1 - iter 52/138 - loss 3.88480641 - time (sec): 1.18 - samples/sec: 7139.35 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:35:16,987 epoch 1 - iter 65/138 - loss 3.79221254 - time (sec): 1.47 - samples/sec: 7394.70 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:35:17,283 epoch 1 - iter 78/138 - loss 3.66709626 - time (sec): 1.76 - samples/sec: 7437.21 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:35:17,547 epoch 1 - iter 91/138 - loss 3.55398152 - time (sec): 2.03 - samples/sec: 7455.78 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:35:17,846 epoch 1 - iter 104/138 - loss 3.40026098 - time (sec): 2.33 - samples/sec: 7525.03 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:35:18,148 epoch 1 - iter 117/138 - loss 3.26971145 - time (sec): 2.63 - samples/sec: 7416.50 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:35:18,418 epoch 1 - iter 130/138 - loss 3.13099779 - time (sec): 2.90 - samples/sec: 7399.63 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:35:18,586 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:18,586 EPOCH 1 done: loss 3.0413 - lr: 0.000028
2023-10-18 14:35:18,828 DEV : loss 0.9774365425109863 - f1-score (micro avg) 0.0
2023-10-18 14:35:18,832 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:19,112 epoch 2 - iter 13/138 - loss 1.15730821 - time (sec): 0.28 - samples/sec: 8573.70 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:35:19,386 epoch 2 - iter 26/138 - loss 1.16314490 - time (sec): 0.55 - samples/sec: 8192.55 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:35:19,660 epoch 2 - iter 39/138 - loss 1.19310893 - time (sec): 0.83 - samples/sec: 8095.16 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:35:19,932 epoch 2 - iter 52/138 - loss 1.15540864 - time (sec): 1.10 - samples/sec: 8083.74 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:35:20,209 epoch 2 - iter 65/138 - loss 1.12506005 - time (sec): 1.38 - samples/sec: 7945.74 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:35:20,485 epoch 2 - iter 78/138 - loss 1.10090968 - time (sec): 1.65 - samples/sec: 7920.27 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:35:20,767 epoch 2 - iter 91/138 - loss 1.07945188 - time (sec): 1.94 - samples/sec: 7861.44 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:35:21,049 epoch 2 - iter 104/138 - loss 1.05245790 - time (sec): 2.22 - samples/sec: 7818.57 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:35:21,332 epoch 2 - iter 117/138 - loss 1.03930511 - time (sec): 2.50 - samples/sec: 7774.42 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:35:21,614 epoch 2 - iter 130/138 - loss 1.00843280 - time (sec): 2.78 - samples/sec: 7760.72 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:35:21,778 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:21,778 EPOCH 2 done: loss 0.9997 - lr: 0.000027
2023-10-18 14:35:22,129 DEV : loss 0.8055396676063538 - f1-score (micro avg) 0.0
2023-10-18 14:35:22,133 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:22,441 epoch 3 - iter 13/138 - loss 0.84178589 - time (sec): 0.31 - samples/sec: 6880.23 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:35:22,719 epoch 3 - iter 26/138 - loss 0.82283715 - time (sec): 0.59 - samples/sec: 7510.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:35:22,995 epoch 3 - iter 39/138 - loss 0.86832670 - time (sec): 0.86 - samples/sec: 7735.16 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:35:23,238 epoch 3 - iter 52/138 - loss 0.84001981 - time (sec): 1.10 - samples/sec: 8048.98 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:35:23,528 epoch 3 - iter 65/138 - loss 0.83044980 - time (sec): 1.39 - samples/sec: 7937.35 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:35:23,806 epoch 3 - iter 78/138 - loss 0.83286497 - time (sec): 1.67 - samples/sec: 7811.60 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:35:24,096 epoch 3 - iter 91/138 - loss 0.82118883 - time (sec): 1.96 - samples/sec: 7804.36 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:35:24,386 epoch 3 - iter 104/138 - loss 0.83867439 - time (sec): 2.25 - samples/sec: 7767.94 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:35:24,681 epoch 3 - iter 117/138 - loss 0.82797548 - time (sec): 2.55 - samples/sec: 7618.84 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:35:24,969 epoch 3 - iter 130/138 - loss 0.83176552 - time (sec): 2.84 - samples/sec: 7609.66 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:35:25,136 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:25,136 EPOCH 3 done: loss 0.8219 - lr: 0.000024
2023-10-18 14:35:25,491 DEV : loss 0.6568560600280762 - f1-score (micro avg) 0.0333
2023-10-18 14:35:25,495 saving best model
2023-10-18 14:35:25,532 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:25,805 epoch 4 - iter 13/138 - loss 0.76986915 - time (sec): 0.27 - samples/sec: 7824.27 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:35:26,088 epoch 4 - iter 26/138 - loss 0.73522087 - time (sec): 0.56 - samples/sec: 7595.06 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:35:26,362 epoch 4 - iter 39/138 - loss 0.72220832 - time (sec): 0.83 - samples/sec: 7525.86 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:35:26,633 epoch 4 - iter 52/138 - loss 0.73538732 - time (sec): 1.10 - samples/sec: 7529.25 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:35:26,925 epoch 4 - iter 65/138 - loss 0.73249348 - time (sec): 1.39 - samples/sec: 7617.95 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:35:27,215 epoch 4 - iter 78/138 - loss 0.71376298 - time (sec): 1.68 - samples/sec: 7714.85 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:35:27,486 epoch 4 - iter 91/138 - loss 0.71714210 - time (sec): 1.95 - samples/sec: 7703.22 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:35:27,773 epoch 4 - iter 104/138 - loss 0.71608761 - time (sec): 2.24 - samples/sec: 7737.05 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:35:28,056 epoch 4 - iter 117/138 - loss 0.71004257 - time (sec): 2.52 - samples/sec: 7732.74 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:35:28,329 epoch 4 - iter 130/138 - loss 0.69949069 - time (sec): 2.80 - samples/sec: 7692.63 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:35:28,502 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:28,502 EPOCH 4 done: loss 0.7019 - lr: 0.000020
2023-10-18 14:35:28,989 DEV : loss 0.5695884227752686 - f1-score (micro avg) 0.0891
2023-10-18 14:35:28,994 saving best model
2023-10-18 14:35:29,028 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:29,355 epoch 5 - iter 13/138 - loss 0.55916364 - time (sec): 0.33 - samples/sec: 7910.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:35:29,637 epoch 5 - iter 26/138 - loss 0.59283853 - time (sec): 0.61 - samples/sec: 7406.15 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:35:29,915 epoch 5 - iter 39/138 - loss 0.62179245 - time (sec): 0.89 - samples/sec: 7661.13 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:35:30,204 epoch 5 - iter 52/138 - loss 0.62840361 - time (sec): 1.18 - samples/sec: 7594.36 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:35:30,500 epoch 5 - iter 65/138 - loss 0.62971796 - time (sec): 1.47 - samples/sec: 7473.78 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:35:30,799 epoch 5 - iter 78/138 - loss 0.64093939 - time (sec): 1.77 - samples/sec: 7397.37 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:35:31,097 epoch 5 - iter 91/138 - loss 0.63720772 - time (sec): 2.07 - samples/sec: 7325.59 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:35:31,407 epoch 5 - iter 104/138 - loss 0.64305218 - time (sec): 2.38 - samples/sec: 7309.73 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:35:31,690 epoch 5 - iter 117/138 - loss 0.63686579 - time (sec): 2.66 - samples/sec: 7244.05 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:35:31,961 epoch 5 - iter 130/138 - loss 0.63658731 - time (sec): 2.93 - samples/sec: 7366.49 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:35:32,126 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:32,126 EPOCH 5 done: loss 0.6334 - lr: 0.000017
2023-10-18 14:35:32,483 DEV : loss 0.507025957107544 - f1-score (micro avg) 0.2395
2023-10-18 14:35:32,487 saving best model
2023-10-18 14:35:32,522 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:32,813 epoch 6 - iter 13/138 - loss 0.67434508 - time (sec): 0.29 - samples/sec: 6528.87 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:35:33,105 epoch 6 - iter 26/138 - loss 0.61013499 - time (sec): 0.58 - samples/sec: 6942.40 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:35:33,389 epoch 6 - iter 39/138 - loss 0.62858647 - time (sec): 0.87 - samples/sec: 7278.48 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:35:33,670 epoch 6 - iter 52/138 - loss 0.62072615 - time (sec): 1.15 - samples/sec: 7236.34 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:35:33,960 epoch 6 - iter 65/138 - loss 0.59699510 - time (sec): 1.44 - samples/sec: 7325.37 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:35:34,240 epoch 6 - iter 78/138 - loss 0.59352031 - time (sec): 1.72 - samples/sec: 7552.55 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:35:34,529 epoch 6 - iter 91/138 - loss 0.59368413 - time (sec): 2.01 - samples/sec: 7500.00 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:35:34,817 epoch 6 - iter 104/138 - loss 0.59137734 - time (sec): 2.29 - samples/sec: 7497.31 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:35:35,096 epoch 6 - iter 117/138 - loss 0.58759101 - time (sec): 2.57 - samples/sec: 7599.02 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:35:35,340 epoch 6 - iter 130/138 - loss 0.58846987 - time (sec): 2.82 - samples/sec: 7690.98 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:35:35,480 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:35,480 EPOCH 6 done: loss 0.5841 - lr: 0.000014
2023-10-18 14:35:35,846 DEV : loss 0.45362991094589233 - f1-score (micro avg) 0.3401
2023-10-18 14:35:35,850 saving best model
2023-10-18 14:35:35,883 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:36,193 epoch 7 - iter 13/138 - loss 0.62411415 - time (sec): 0.31 - samples/sec: 6353.96 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:35:36,475 epoch 7 - iter 26/138 - loss 0.58965052 - time (sec): 0.59 - samples/sec: 6928.87 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:35:36,749 epoch 7 - iter 39/138 - loss 0.57222957 - time (sec): 0.87 - samples/sec: 7067.79 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:35:37,021 epoch 7 - iter 52/138 - loss 0.57042084 - time (sec): 1.14 - samples/sec: 7104.24 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:35:37,306 epoch 7 - iter 65/138 - loss 0.56592963 - time (sec): 1.42 - samples/sec: 7184.14 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:35:37,602 epoch 7 - iter 78/138 - loss 0.55236421 - time (sec): 1.72 - samples/sec: 7293.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:35:37,897 epoch 7 - iter 91/138 - loss 0.56072911 - time (sec): 2.01 - samples/sec: 7373.09 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:35:38,164 epoch 7 - iter 104/138 - loss 0.56517878 - time (sec): 2.28 - samples/sec: 7481.10 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:35:38,457 epoch 7 - iter 117/138 - loss 0.55982223 - time (sec): 2.57 - samples/sec: 7516.86 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:35:38,747 epoch 7 - iter 130/138 - loss 0.55016150 - time (sec): 2.86 - samples/sec: 7576.66 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:35:38,909 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:38,909 EPOCH 7 done: loss 0.5501 - lr: 0.000010
2023-10-18 14:35:39,275 DEV : loss 0.43054693937301636 - f1-score (micro avg) 0.3779
2023-10-18 14:35:39,279 saving best model
2023-10-18 14:35:39,315 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:39,587 epoch 8 - iter 13/138 - loss 0.56052524 - time (sec): 0.27 - samples/sec: 8169.85 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:35:39,860 epoch 8 - iter 26/138 - loss 0.55021934 - time (sec): 0.54 - samples/sec: 7462.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:35:40,130 epoch 8 - iter 39/138 - loss 0.54002147 - time (sec): 0.81 - samples/sec: 7706.34 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:35:40,411 epoch 8 - iter 52/138 - loss 0.54809034 - time (sec): 1.10 - samples/sec: 7781.05 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:35:40,677 epoch 8 - iter 65/138 - loss 0.53544888 - time (sec): 1.36 - samples/sec: 7865.37 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:35:40,950 epoch 8 - iter 78/138 - loss 0.52007725 - time (sec): 1.63 - samples/sec: 7910.92 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:35:41,245 epoch 8 - iter 91/138 - loss 0.52160921 - time (sec): 1.93 - samples/sec: 7850.41 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:35:41,528 epoch 8 - iter 104/138 - loss 0.53397728 - time (sec): 2.21 - samples/sec: 7711.48 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:35:41,830 epoch 8 - iter 117/138 - loss 0.52398750 - time (sec): 2.51 - samples/sec: 7641.37 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:35:42,109 epoch 8 - iter 130/138 - loss 0.53283523 - time (sec): 2.79 - samples/sec: 7670.34 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:35:42,273 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:42,273 EPOCH 8 done: loss 0.5313 - lr: 0.000007
2023-10-18 14:35:42,636 DEV : loss 0.4122130572795868 - f1-score (micro avg) 0.3949
2023-10-18 14:35:42,640 saving best model
2023-10-18 14:35:42,674 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:42,949 epoch 9 - iter 13/138 - loss 0.49872909 - time (sec): 0.28 - samples/sec: 7981.18 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:35:43,223 epoch 9 - iter 26/138 - loss 0.49738613 - time (sec): 0.55 - samples/sec: 8114.88 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:35:43,508 epoch 9 - iter 39/138 - loss 0.49603066 - time (sec): 0.83 - samples/sec: 7881.35 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:35:43,777 epoch 9 - iter 52/138 - loss 0.53067782 - time (sec): 1.10 - samples/sec: 7814.16 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:44,052 epoch 9 - iter 65/138 - loss 0.53559716 - time (sec): 1.38 - samples/sec: 7701.01 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:44,335 epoch 9 - iter 78/138 - loss 0.53184310 - time (sec): 1.66 - samples/sec: 7719.63 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:44,606 epoch 9 - iter 91/138 - loss 0.52806519 - time (sec): 1.93 - samples/sec: 7660.29 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:35:44,917 epoch 9 - iter 104/138 - loss 0.51389363 - time (sec): 2.24 - samples/sec: 7668.77 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:35:45,205 epoch 9 - iter 117/138 - loss 0.51109485 - time (sec): 2.53 - samples/sec: 7685.58 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:35:45,487 epoch 9 - iter 130/138 - loss 0.50684517 - time (sec): 2.81 - samples/sec: 7698.36 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:35:45,650 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:45,650 EPOCH 9 done: loss 0.5059 - lr: 0.000004
2023-10-18 14:35:46,013 DEV : loss 0.40028560161590576 - f1-score (micro avg) 0.408
2023-10-18 14:35:46,017 saving best model
2023-10-18 14:35:46,051 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:46,328 epoch 10 - iter 13/138 - loss 0.42730878 - time (sec): 0.28 - samples/sec: 7852.59 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:35:46,598 epoch 10 - iter 26/138 - loss 0.50137455 - time (sec): 0.55 - samples/sec: 7854.15 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:35:46,869 epoch 10 - iter 39/138 - loss 0.49459383 - time (sec): 0.82 - samples/sec: 7773.05 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:35:47,136 epoch 10 - iter 52/138 - loss 0.48378783 - time (sec): 1.08 - samples/sec: 7752.45 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:35:47,417 epoch 10 - iter 65/138 - loss 0.49450996 - time (sec): 1.37 - samples/sec: 7712.77 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:35:47,702 epoch 10 - iter 78/138 - loss 0.48763963 - time (sec): 1.65 - samples/sec: 7785.01 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:35:47,988 epoch 10 - iter 91/138 - loss 0.51161544 - time (sec): 1.94 - samples/sec: 7865.03 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:35:48,271 epoch 10 - iter 104/138 - loss 0.50184091 - time (sec): 2.22 - samples/sec: 7812.54 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:35:48,545 epoch 10 - iter 117/138 - loss 0.50548111 - time (sec): 2.49 - samples/sec: 7852.47 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:35:48,846 epoch 10 - iter 130/138 - loss 0.50036723 - time (sec): 2.79 - samples/sec: 7696.25 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:35:49,028 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:49,028 EPOCH 10 done: loss 0.4986 - lr: 0.000000
2023-10-18 14:35:49,403 DEV : loss 0.39673560857772827 - f1-score (micro avg) 0.4147
2023-10-18 14:35:49,407 saving best model
2023-10-18 14:35:49,467 ----------------------------------------------------------------------------------------------------
2023-10-18 14:35:49,468 Loading model from best epoch ...
2023-10-18 14:35:49,548 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
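The 25 tags above follow the BIOES scheme (S-ingle-token entity, B-egin, I-nside, E-nd, O-utside) over the six labels scope, pers, work, loc, object, and date. A minimal decoder (a hypothetical helper, not Flair's own code) showing how such a tag sequence maps back to entity spans:

```python
# Decode a BIOES tag sequence into (start, end, label) entity spans.
# Indices are inclusive token positions.

def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":                    # entity opens here
            start = i
        elif prefix == "E" and start is not None:  # entity closes here
            spans.append((start, i, label))
            start = None
    return spans

print(bioes_to_spans(["O", "S-pers", "B-work", "I-work", "E-work"]))
# [(1, 1, 'pers'), (2, 4, 'work')]
```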
2023-10-18 14:35:49,848
Results:
- F-score (micro) 0.4535
- F-score (macro) 0.242
- Accuracy 0.3013
By class:
              precision    recall  f1-score   support

       scope     0.5808    0.5511    0.5656       176
        work     0.4730    0.4730    0.4730        74
        pers     1.0000    0.0938    0.1714       128
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5692    0.3770    0.4535       382
   macro avg     0.4108    0.2236    0.2420       382
weighted avg     0.6943    0.3770    0.4097       382
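Sanity check on the table: the micro F-score is the harmonic mean of the micro-averaged precision and recall (the tiny residual below comes from the table values being rounded to four digits):

```python
# Micro-averaged precision and recall as reported in the table above.
p, r = 0.5692, 0.3770
f1 = 2 * p * r / (p + r)  # harmonic mean
print(f1)  # ~0.4536, matching the reported "F-score (micro) 0.4535" up to rounding
```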
2023-10-18 14:35:49,849 ----------------------------------------------------------------------------------------------------