2023-10-20 00:26:59,401 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,401 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
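The module shapes printed above are enough to estimate the model's size. The following sketch recomputes the parameter count from the layer dimensions shown in the log (hidden size 128, 2 layers, intermediate size 512, vocabulary 32001, 17 output tags); the helper function and totals are illustrative arithmetic, not Flair or PyTorch code.

```python
# Rough parameter count for the printed bert-tiny backbone plus the
# SequenceTagger's 17-tag linear head. Dimensions are copied from the
# module tree in the log; the breakdown itself is an illustrative sketch.

H, I, V, POS, TYPES, LAYERS, TAGS = 128, 512, 32001, 512, 2, 2, 17

def linear(n_in, n_out):
    # weight matrix + bias vector
    return n_in * n_out + n_out

# BertEmbeddings: three embedding tables + LayerNorm (weight and bias)
embeddings = V * H + POS * H + TYPES * H + 2 * H

# One BertLayer: self-attention, attention output, intermediate, output
per_layer = (
    3 * linear(H, H)         # query, key, value
    + linear(H, H) + 2 * H   # BertSelfOutput dense + LayerNorm
    + linear(H, I)           # BertIntermediate dense
    + linear(I, H) + 2 * H   # BertOutput dense + LayerNorm
)

pooler = linear(H, H)        # BertPooler dense
head = linear(H, TAGS)       # SequenceTagger's (linear) layer

total = embeddings + LAYERS * per_layer + pooler + head
print(total)  # 4577425, i.e. roughly 4.6M parameters
```

The bulk of the budget sits in the 32001 x 128 word-embedding table; the two transformer layers together contribute well under half a million parameters.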
2023-10-20 00:26:59,401 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Train: 1085 sentences
2023-10-20 00:26:59,402 (train_with_dev=False, train_with_test=False)
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Training Params:
2023-10-20 00:26:59,402 - learning_rate: "5e-05"
2023-10-20 00:26:59,402 - mini_batch_size: "8"
2023-10-20 00:26:59,402 - max_epochs: "10"
2023-10-20 00:26:59,402 - shuffle: "True"
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Plugins:
2023-10-20 00:26:59,402 - TensorboardLogger
2023-10-20 00:26:59,402 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:26:59,402 - metric: "('micro avg', 'f1-score')"
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Computation:
2023-10-20 00:26:59,402 - compute on device: cuda:0
2023-10-20 00:26:59,402 - embedding storage: none
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
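The LinearScheduler with `warmup_fraction: 0.1` explains the lr column in the iteration logs below: the learning rate climbs linearly to the 5e-05 peak over the first 10% of steps, then decays linearly to zero. This is an illustrative reimplementation of that shape using the run's own numbers (136 iterations/epoch, 10 epochs), not Flair's scheduler code; the logged values differ by small rounding/step-offset amounts.

```python
# Sketch of the linear warmup-then-decay schedule implied by the log.
PEAK_LR = 5e-05
STEPS_PER_EPOCH = 136
TOTAL_STEPS = STEPS_PER_EPOCH * 10       # 1360 optimizer steps
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # 136 (warmup_fraction: 0.1)

def lr_at(step):
    if step < WARMUP_STEPS:
        # linear warmup from 0 to the peak
        return PEAK_LR * step / WARMUP_STEPS
    # linear decay from the peak down to 0 at the last step
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# epoch 1, iter 13 -> step 13: still warming up (~5e-06, log shows 0.000004)
# epoch 2, iter 13 -> step 149: just past the peak (~5e-05, log shows 0.000050)
print(f"{lr_at(13):.6f} {lr_at(149):.6f} {lr_at(TOTAL_STEPS):.6f}")
```

This also explains why the best dev scores arrive late: the schedule only reaches its peak at the end of epoch 1 and spends the remaining epochs annealing.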
2023-10-20 00:26:59,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:59,402 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:26:59,746 epoch 1 - iter 13/136 - loss 2.99174193 - time (sec): 0.34 - samples/sec: 15290.18 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:27:00,082 epoch 1 - iter 26/136 - loss 2.96413466 - time (sec): 0.68 - samples/sec: 14525.00 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:27:00,448 epoch 1 - iter 39/136 - loss 2.84053462 - time (sec): 1.05 - samples/sec: 13972.13 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:27:00,831 epoch 1 - iter 52/136 - loss 2.76852975 - time (sec): 1.43 - samples/sec: 13566.94 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:27:01,184 epoch 1 - iter 65/136 - loss 2.62124449 - time (sec): 1.78 - samples/sec: 13686.57 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:27:01,539 epoch 1 - iter 78/136 - loss 2.45694965 - time (sec): 2.14 - samples/sec: 13863.28 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:27:01,909 epoch 1 - iter 91/136 - loss 2.29121118 - time (sec): 2.51 - samples/sec: 14173.74 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:27:02,238 epoch 1 - iter 104/136 - loss 2.18437061 - time (sec): 2.84 - samples/sec: 13903.44 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:27:02,598 epoch 1 - iter 117/136 - loss 2.04645061 - time (sec): 3.20 - samples/sec: 13827.93 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:27:02,955 epoch 1 - iter 130/136 - loss 1.89109901 - time (sec): 3.55 - samples/sec: 13960.34 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:27:03,099 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:03,100 EPOCH 1 done: loss 1.8348 - lr: 0.000047
2023-10-20 00:27:03,368 DEV : loss 0.5015563368797302 - f1-score (micro avg) 0.0
2023-10-20 00:27:03,372 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:03,729 epoch 2 - iter 13/136 - loss 0.60857951 - time (sec): 0.36 - samples/sec: 16459.68 - lr: 0.000050 - momentum: 0.000000
2023-10-20 00:27:04,077 epoch 2 - iter 26/136 - loss 0.63793114 - time (sec): 0.70 - samples/sec: 14440.02 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:27:04,439 epoch 2 - iter 39/136 - loss 0.68791160 - time (sec): 1.07 - samples/sec: 13666.75 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:27:04,793 epoch 2 - iter 52/136 - loss 0.63859874 - time (sec): 1.42 - samples/sec: 13906.09 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:27:05,149 epoch 2 - iter 65/136 - loss 0.61668645 - time (sec): 1.78 - samples/sec: 14049.74 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:27:05,498 epoch 2 - iter 78/136 - loss 0.60331772 - time (sec): 2.13 - samples/sec: 13974.49 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:27:05,859 epoch 2 - iter 91/136 - loss 0.61615866 - time (sec): 2.49 - samples/sec: 14136.37 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:27:06,221 epoch 2 - iter 104/136 - loss 0.60741981 - time (sec): 2.85 - samples/sec: 13954.59 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:27:06,586 epoch 2 - iter 117/136 - loss 0.60342841 - time (sec): 3.21 - samples/sec: 14014.59 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:27:06,936 epoch 2 - iter 130/136 - loss 0.59755673 - time (sec): 3.56 - samples/sec: 13976.85 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:27:07,089 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:07,089 EPOCH 2 done: loss 0.5906 - lr: 0.000045
2023-10-20 00:27:08,025 DEV : loss 0.39024820923805237 - f1-score (micro avg) 0.0071
2023-10-20 00:27:08,029 saving best model
2023-10-20 00:27:08,054 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:08,423 epoch 3 - iter 13/136 - loss 0.44911705 - time (sec): 0.37 - samples/sec: 14818.97 - lr: 0.000044 - momentum: 0.000000
2023-10-20 00:27:08,766 epoch 3 - iter 26/136 - loss 0.50074753 - time (sec): 0.71 - samples/sec: 14126.48 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:27:09,115 epoch 3 - iter 39/136 - loss 0.50283672 - time (sec): 1.06 - samples/sec: 13887.55 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:27:09,457 epoch 3 - iter 52/136 - loss 0.47622121 - time (sec): 1.40 - samples/sec: 13509.51 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:27:09,803 epoch 3 - iter 65/136 - loss 0.48752604 - time (sec): 1.75 - samples/sec: 13728.02 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:27:10,172 epoch 3 - iter 78/136 - loss 0.48621133 - time (sec): 2.12 - samples/sec: 14277.44 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:27:10,541 epoch 3 - iter 91/136 - loss 0.46968554 - time (sec): 2.49 - samples/sec: 14755.84 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:27:10,883 epoch 3 - iter 104/136 - loss 0.46674456 - time (sec): 2.83 - samples/sec: 14647.74 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:27:11,219 epoch 3 - iter 117/136 - loss 0.46645110 - time (sec): 3.16 - samples/sec: 14437.25 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:27:11,564 epoch 3 - iter 130/136 - loss 0.46824379 - time (sec): 3.51 - samples/sec: 14300.80 - lr: 0.000039 - momentum: 0.000000
2023-10-20 00:27:11,715 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:11,715 EPOCH 3 done: loss 0.4653 - lr: 0.000039
2023-10-20 00:27:12,469 DEV : loss 0.3429010510444641 - f1-score (micro avg) 0.0399
2023-10-20 00:27:12,473 saving best model
2023-10-20 00:27:12,505 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:12,846 epoch 4 - iter 13/136 - loss 0.45667035 - time (sec): 0.34 - samples/sec: 12986.45 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:27:13,200 epoch 4 - iter 26/136 - loss 0.40057927 - time (sec): 0.69 - samples/sec: 13731.49 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:27:13,555 epoch 4 - iter 39/136 - loss 0.40562505 - time (sec): 1.05 - samples/sec: 13990.65 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:27:13,940 epoch 4 - iter 52/136 - loss 0.40273949 - time (sec): 1.43 - samples/sec: 14383.51 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:27:14,280 epoch 4 - iter 65/136 - loss 0.40903860 - time (sec): 1.77 - samples/sec: 14456.07 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:27:14,636 epoch 4 - iter 78/136 - loss 0.40452734 - time (sec): 2.13 - samples/sec: 14290.94 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:27:14,984 epoch 4 - iter 91/136 - loss 0.41457293 - time (sec): 2.48 - samples/sec: 14381.24 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:27:15,333 epoch 4 - iter 104/136 - loss 0.41519681 - time (sec): 2.83 - samples/sec: 14305.51 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:27:15,674 epoch 4 - iter 117/136 - loss 0.42095346 - time (sec): 3.17 - samples/sec: 14035.12 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:27:16,039 epoch 4 - iter 130/136 - loss 0.41867387 - time (sec): 3.53 - samples/sec: 14113.12 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:27:16,203 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:16,203 EPOCH 4 done: loss 0.4188 - lr: 0.000034
2023-10-20 00:27:16,953 DEV : loss 0.32259702682495117 - f1-score (micro avg) 0.0985
2023-10-20 00:27:16,957 saving best model
2023-10-20 00:27:16,987 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:17,345 epoch 5 - iter 13/136 - loss 0.40845983 - time (sec): 0.36 - samples/sec: 14213.54 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:27:17,703 epoch 5 - iter 26/136 - loss 0.37364817 - time (sec): 0.72 - samples/sec: 13529.45 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:27:18,046 epoch 5 - iter 39/136 - loss 0.39076166 - time (sec): 1.06 - samples/sec: 13658.06 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:27:18,401 epoch 5 - iter 52/136 - loss 0.39259624 - time (sec): 1.41 - samples/sec: 13777.80 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:27:18,757 epoch 5 - iter 65/136 - loss 0.39848236 - time (sec): 1.77 - samples/sec: 14109.05 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:27:19,107 epoch 5 - iter 78/136 - loss 0.39252052 - time (sec): 2.12 - samples/sec: 13939.66 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:27:19,458 epoch 5 - iter 91/136 - loss 0.38452188 - time (sec): 2.47 - samples/sec: 14039.84 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:27:19,812 epoch 5 - iter 104/136 - loss 0.37804562 - time (sec): 2.82 - samples/sec: 14043.23 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:27:20,320 epoch 5 - iter 117/136 - loss 0.38959493 - time (sec): 3.33 - samples/sec: 13289.05 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:27:20,686 epoch 5 - iter 130/136 - loss 0.38953709 - time (sec): 3.70 - samples/sec: 13406.52 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:27:20,849 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:20,850 EPOCH 5 done: loss 0.3876 - lr: 0.000028
2023-10-20 00:27:21,634 DEV : loss 0.28990423679351807 - f1-score (micro avg) 0.2169
2023-10-20 00:27:21,638 saving best model
2023-10-20 00:27:21,673 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:22,002 epoch 6 - iter 13/136 - loss 0.43011177 - time (sec): 0.33 - samples/sec: 13614.39 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:27:22,366 epoch 6 - iter 26/136 - loss 0.42935702 - time (sec): 0.69 - samples/sec: 13490.60 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:27:22,730 epoch 6 - iter 39/136 - loss 0.38279064 - time (sec): 1.06 - samples/sec: 14158.37 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:27:23,084 epoch 6 - iter 52/136 - loss 0.38870660 - time (sec): 1.41 - samples/sec: 14264.05 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:27:23,445 epoch 6 - iter 65/136 - loss 0.38485490 - time (sec): 1.77 - samples/sec: 14103.90 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:27:23,786 epoch 6 - iter 78/136 - loss 0.37129942 - time (sec): 2.11 - samples/sec: 14357.55 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:27:24,144 epoch 6 - iter 91/136 - loss 0.36016383 - time (sec): 2.47 - samples/sec: 14405.05 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:27:24,497 epoch 6 - iter 104/136 - loss 0.36838511 - time (sec): 2.82 - samples/sec: 14182.08 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:27:24,832 epoch 6 - iter 117/136 - loss 0.36617739 - time (sec): 3.16 - samples/sec: 14156.54 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:27:25,174 epoch 6 - iter 130/136 - loss 0.36864604 - time (sec): 3.50 - samples/sec: 14217.13 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:27:25,337 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:25,337 EPOCH 6 done: loss 0.3678 - lr: 0.000023
2023-10-20 00:27:26,103 DEV : loss 0.2851005494594574 - f1-score (micro avg) 0.2696
2023-10-20 00:27:26,107 saving best model
2023-10-20 00:27:26,137 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:26,513 epoch 7 - iter 13/136 - loss 0.41319372 - time (sec): 0.37 - samples/sec: 14234.49 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:27:26,864 epoch 7 - iter 26/136 - loss 0.38918016 - time (sec): 0.73 - samples/sec: 14613.85 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:27:27,220 epoch 7 - iter 39/136 - loss 0.37782537 - time (sec): 1.08 - samples/sec: 14896.89 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:27:27,573 epoch 7 - iter 52/136 - loss 0.35882322 - time (sec): 1.43 - samples/sec: 14644.40 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:27:27,919 epoch 7 - iter 65/136 - loss 0.35751503 - time (sec): 1.78 - samples/sec: 14157.69 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:27:28,279 epoch 7 - iter 78/136 - loss 0.34501957 - time (sec): 2.14 - samples/sec: 14333.12 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:27:28,640 epoch 7 - iter 91/136 - loss 0.34745316 - time (sec): 2.50 - samples/sec: 14243.65 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:27:28,993 epoch 7 - iter 104/136 - loss 0.34338987 - time (sec): 2.85 - samples/sec: 14195.63 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:27:29,322 epoch 7 - iter 117/136 - loss 0.34562531 - time (sec): 3.18 - samples/sec: 14058.38 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:27:29,686 epoch 7 - iter 130/136 - loss 0.34425524 - time (sec): 3.55 - samples/sec: 14108.36 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:27:29,840 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:29,841 EPOCH 7 done: loss 0.3456 - lr: 0.000017
2023-10-20 00:27:30,605 DEV : loss 0.27339091897010803 - f1-score (micro avg) 0.39
2023-10-20 00:27:30,609 saving best model
2023-10-20 00:27:30,639 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:31,001 epoch 8 - iter 13/136 - loss 0.26407522 - time (sec): 0.36 - samples/sec: 13674.20 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:27:31,521 epoch 8 - iter 26/136 - loss 0.31575481 - time (sec): 0.88 - samples/sec: 11408.81 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:27:31,884 epoch 8 - iter 39/136 - loss 0.33893523 - time (sec): 1.24 - samples/sec: 13306.81 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:27:32,233 epoch 8 - iter 52/136 - loss 0.33397543 - time (sec): 1.59 - samples/sec: 12976.12 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:27:32,585 epoch 8 - iter 65/136 - loss 0.34781438 - time (sec): 1.95 - samples/sec: 13406.32 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:27:32,938 epoch 8 - iter 78/136 - loss 0.35349291 - time (sec): 2.30 - samples/sec: 13593.25 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:27:33,320 epoch 8 - iter 91/136 - loss 0.34378778 - time (sec): 2.68 - samples/sec: 13781.68 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:27:33,656 epoch 8 - iter 104/136 - loss 0.34014186 - time (sec): 3.02 - samples/sec: 13587.33 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:27:34,006 epoch 8 - iter 117/136 - loss 0.34142943 - time (sec): 3.37 - samples/sec: 13532.46 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:27:34,342 epoch 8 - iter 130/136 - loss 0.34943882 - time (sec): 3.70 - samples/sec: 13511.12 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:27:34,492 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:34,492 EPOCH 8 done: loss 0.3475 - lr: 0.000012
2023-10-20 00:27:35,269 DEV : loss 0.27384471893310547 - f1-score (micro avg) 0.3753
2023-10-20 00:27:35,273 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:35,609 epoch 9 - iter 13/136 - loss 0.36272695 - time (sec): 0.34 - samples/sec: 14199.66 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:27:36,002 epoch 9 - iter 26/136 - loss 0.34985916 - time (sec): 0.73 - samples/sec: 14210.84 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:27:36,364 epoch 9 - iter 39/136 - loss 0.37371902 - time (sec): 1.09 - samples/sec: 14131.43 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:27:36,735 epoch 9 - iter 52/136 - loss 0.34718522 - time (sec): 1.46 - samples/sec: 14590.93 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:27:37,091 epoch 9 - iter 65/136 - loss 0.34096769 - time (sec): 1.82 - samples/sec: 14320.87 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:27:37,429 epoch 9 - iter 78/136 - loss 0.34713841 - time (sec): 2.16 - samples/sec: 13946.64 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:27:37,798 epoch 9 - iter 91/136 - loss 0.34101774 - time (sec): 2.52 - samples/sec: 14043.44 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:27:38,156 epoch 9 - iter 104/136 - loss 0.33583877 - time (sec): 2.88 - samples/sec: 14060.14 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:27:38,511 epoch 9 - iter 117/136 - loss 0.33839451 - time (sec): 3.24 - samples/sec: 14213.40 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:27:38,847 epoch 9 - iter 130/136 - loss 0.33797165 - time (sec): 3.57 - samples/sec: 14106.59 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:27:39,008 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:39,009 EPOCH 9 done: loss 0.3374 - lr: 0.000006
2023-10-20 00:27:39,775 DEV : loss 0.27189964056015015 - f1-score (micro avg) 0.4053
2023-10-20 00:27:39,779 saving best model
2023-10-20 00:27:39,814 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:40,179 epoch 10 - iter 13/136 - loss 0.36346730 - time (sec): 0.37 - samples/sec: 15436.52 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:27:40,529 epoch 10 - iter 26/136 - loss 0.32886607 - time (sec): 0.71 - samples/sec: 14312.16 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:27:40,872 epoch 10 - iter 39/136 - loss 0.32462501 - time (sec): 1.06 - samples/sec: 14506.08 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:27:41,225 epoch 10 - iter 52/136 - loss 0.32252604 - time (sec): 1.41 - samples/sec: 14447.57 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:27:41,594 epoch 10 - iter 65/136 - loss 0.31124105 - time (sec): 1.78 - samples/sec: 14289.21 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:27:41,892 epoch 10 - iter 78/136 - loss 0.31555264 - time (sec): 2.08 - samples/sec: 14366.24 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:27:42,210 epoch 10 - iter 91/136 - loss 0.31866405 - time (sec): 2.40 - samples/sec: 14494.69 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:27:42,523 epoch 10 - iter 104/136 - loss 0.32646837 - time (sec): 2.71 - samples/sec: 14646.78 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:27:42,856 epoch 10 - iter 117/136 - loss 0.32086288 - time (sec): 3.04 - samples/sec: 14683.40 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:27:43,385 epoch 10 - iter 130/136 - loss 0.32547930 - time (sec): 3.57 - samples/sec: 13985.99 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:27:43,536 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:43,536 EPOCH 10 done: loss 0.3254 - lr: 0.000000
2023-10-20 00:27:44,298 DEV : loss 0.2700875699520111 - f1-score (micro avg) 0.4053
2023-10-20 00:27:44,328 ----------------------------------------------------------------------------------------------------
2023-10-20 00:27:44,328 Loading model from best epoch ...
2023-10-20 00:27:44,402 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
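The 17-tag dictionary above uses the BIOES scheme: S- marks a single-token entity, B-/I-/E- mark the begin, inside, and end of multi-token entities, and O is outside. A minimal decoder for such sequences can be sketched as follows; this is illustrative only, and Flair's own decoding additionally handles malformed tag sequences.

```python
# Minimal BIOES span decoder for tag sequences like those this model emits
# (O plus S-/B-/I-/E- prefixed LOC, PER, ORG, HumanProd labels).

def decode_bioes(tags):
    """Return (label, start, end_exclusive) spans from a BIOES tag sequence."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                    # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
        # "I" tags just continue an open span
    return spans

tags = ["O", "B-PER", "E-PER", "O", "S-LOC", "B-ORG", "I-ORG", "E-ORG"]
print(decode_bioes(tags))  # [('PER', 1, 3), ('LOC', 4, 5), ('ORG', 5, 8)]
```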
2023-10-20 00:27:45,224
Results:
- F-score (micro) 0.3025
- F-score (macro) 0.1634
- Accuracy 0.1879
By class:
              precision    recall  f1-score   support

         PER     0.1911    0.2692    0.2236       208
         LOC     0.5856    0.3397    0.4300       312
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3418    0.2714    0.3025       597
   macro avg     0.1942    0.1522    0.1634       597
weighted avg     0.3727    0.2714    0.3026       597
2023-10-20 00:27:45,225 ----------------------------------------------------------------------------------------------------
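The aggregate rows of the final report follow directly from the per-class numbers. The sketch below recomputes macro F1 (unweighted mean of class F1) and micro F1 (harmonic mean of the pooled precision and recall) from values copied verbatim out of the table; tiny discrepancies against the printed 0.3025 are rounding in the four-digit table entries.

```python
# Cross-check of the final evaluation report using the table's own numbers.
per_class_f1 = {"PER": 0.2236, "LOC": 0.4300, "ORG": 0.0, "HumanProd": 0.0}

# macro F1: plain average over classes, so the two zero-F1 classes
# (ORG, HumanProd) drag it far below the micro score
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# micro F1: harmonic mean of the pooled precision and recall
micro_p, micro_r = 0.3418, 0.2714
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4), round(micro_f1, 4))  # ~0.1634 and ~0.3025
```

The gap between micro (0.3025) and macro (0.1634) F1 is the signature of class imbalance here: the model learned PER and LOC to some degree but predicted nothing correct for the 55 ORG and 22 HumanProd test mentions.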