2023-10-20 00:10:01,865 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,865 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 Train: 1085 sentences
2023-10-20 00:10:01,866 (train_with_dev=False, train_with_test=False)
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 Training Params:
2023-10-20 00:10:01,866 - learning_rate: "5e-05"
2023-10-20 00:10:01,866 - mini_batch_size: "4"
2023-10-20 00:10:01,866 - max_epochs: "10"
2023-10-20 00:10:01,866 - shuffle: "True"
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 Plugins:
2023-10-20 00:10:01,866 - TensorboardLogger
2023-10-20 00:10:01,866 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:10:01,866 - metric: "('micro avg', 'f1-score')"
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 Computation:
2023-10-20 00:10:01,866 - compute on device: cuda:0
2023-10-20 00:10:01,866 - embedding storage: none
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,866 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:01,867 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:10:02,379 epoch 1 - iter 27/272 - loss 3.27370034 - time (sec): 0.51 - samples/sec: 10928.06 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:10:02,881 epoch 1 - iter 54/272 - loss 3.17903863 - time (sec): 1.01 - samples/sec: 10349.91 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:10:03,381 epoch 1 - iter 81/272 - loss 3.03675988 - time (sec): 1.51 - samples/sec: 10667.36 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:10:03,827 epoch 1 - iter 108/272 - loss 2.83129903 - time (sec): 1.96 - samples/sec: 10660.83 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:10:04,268 epoch 1 - iter 135/272 - loss 2.62136473 - time (sec): 2.40 - samples/sec: 10574.98 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:10:04,747 epoch 1 - iter 162/272 - loss 2.33357166 - time (sec): 2.88 - samples/sec: 10810.83 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:10:05,198 epoch 1 - iter 189/272 - loss 2.15035600 - time (sec): 3.33 - samples/sec: 10730.77 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:10:05,694 epoch 1 - iter 216/272 - loss 1.94404664 - time (sec): 3.83 - samples/sec: 10777.68 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:10:06,158 epoch 1 - iter 243/272 - loss 1.80696099 - time (sec): 4.29 - samples/sec: 10804.88 - lr: 0.000044 - momentum: 0.000000
2023-10-20 00:10:06,681 epoch 1 - iter 270/272 - loss 1.68559515 - time (sec): 4.81 - samples/sec: 10764.05 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:10:06,710 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:06,710 EPOCH 1 done: loss 1.6836 - lr: 0.000049
2023-10-20 00:10:06,979 DEV : loss 0.47485587000846863 - f1-score (micro avg) 0.0
2023-10-20 00:10:06,983 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:07,504 epoch 2 - iter 27/272 - loss 0.71854839 - time (sec): 0.52 - samples/sec: 10563.71 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:10:08,029 epoch 2 - iter 54/272 - loss 0.69126901 - time (sec): 1.05 - samples/sec: 10961.74 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:10:08,524 epoch 2 - iter 81/272 - loss 0.69336256 - time (sec): 1.54 - samples/sec: 10565.69 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:10:09,042 epoch 2 - iter 108/272 - loss 0.68028598 - time (sec): 2.06 - samples/sec: 10578.73 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:10:09,569 epoch 2 - iter 135/272 - loss 0.65069861 - time (sec): 2.59 - samples/sec: 10767.83 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:10:10,032 epoch 2 - iter 162/272 - loss 0.62575079 - time (sec): 3.05 - samples/sec: 10542.90 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:10:10,535 epoch 2 - iter 189/272 - loss 0.60004979 - time (sec): 3.55 - samples/sec: 10434.51 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:10:11,037 epoch 2 - iter 216/272 - loss 0.59524114 - time (sec): 4.05 - samples/sec: 10334.91 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:10:11,546 epoch 2 - iter 243/272 - loss 0.58323312 - time (sec): 4.56 - samples/sec: 10241.49 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:10:12,008 epoch 2 - iter 270/272 - loss 0.56554405 - time (sec): 5.03 - samples/sec: 10283.98 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:10:12,042 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:12,042 EPOCH 2 done: loss 0.5664 - lr: 0.000045
2023-10-20 00:10:12,800 DEV : loss 0.37471088767051697 - f1-score (micro avg) 0.0
2023-10-20 00:10:12,805 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:13,321 epoch 3 - iter 27/272 - loss 0.50465870 - time (sec): 0.52 - samples/sec: 9483.24 - lr: 0.000044 - momentum: 0.000000
2023-10-20 00:10:13,863 epoch 3 - iter 54/272 - loss 0.48630798 - time (sec): 1.06 - samples/sec: 9472.20 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:10:14,373 epoch 3 - iter 81/272 - loss 0.46931093 - time (sec): 1.57 - samples/sec: 10238.28 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:10:14,834 epoch 3 - iter 108/272 - loss 0.45588380 - time (sec): 2.03 - samples/sec: 10557.30 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:10:15,459 epoch 3 - iter 135/272 - loss 0.45732524 - time (sec): 2.65 - samples/sec: 9906.58 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:10:15,898 epoch 3 - iter 162/272 - loss 0.45347777 - time (sec): 3.09 - samples/sec: 10045.11 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:10:16,342 epoch 3 - iter 189/272 - loss 0.45441373 - time (sec): 3.54 - samples/sec: 9992.82 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:10:16,797 epoch 3 - iter 216/272 - loss 0.45094356 - time (sec): 3.99 - samples/sec: 10134.42 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:10:17,287 epoch 3 - iter 243/272 - loss 0.44316849 - time (sec): 4.48 - samples/sec: 10179.48 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:10:17,774 epoch 3 - iter 270/272 - loss 0.43558974 - time (sec): 4.97 - samples/sec: 10432.51 - lr: 0.000039 - momentum: 0.000000
2023-10-20 00:10:17,801 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:17,801 EPOCH 3 done: loss 0.4352 - lr: 0.000039
2023-10-20 00:10:18,558 DEV : loss 0.2934219241142273 - f1-score (micro avg) 0.2097
2023-10-20 00:10:18,562 saving best model
2023-10-20 00:10:18,589 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:19,077 epoch 4 - iter 27/272 - loss 0.34274907 - time (sec): 0.49 - samples/sec: 10600.34 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:10:19,548 epoch 4 - iter 54/272 - loss 0.38585840 - time (sec): 0.96 - samples/sec: 10938.94 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:10:20,038 epoch 4 - iter 81/272 - loss 0.41785414 - time (sec): 1.45 - samples/sec: 11044.43 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:10:20,485 epoch 4 - iter 108/272 - loss 0.40897222 - time (sec): 1.90 - samples/sec: 11022.39 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:10:20,950 epoch 4 - iter 135/272 - loss 0.38196432 - time (sec): 2.36 - samples/sec: 11118.53 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:10:21,402 epoch 4 - iter 162/272 - loss 0.37036420 - time (sec): 2.81 - samples/sec: 10991.66 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:10:21,870 epoch 4 - iter 189/272 - loss 0.38038372 - time (sec): 3.28 - samples/sec: 11153.47 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:10:22,302 epoch 4 - iter 216/272 - loss 0.37697104 - time (sec): 3.71 - samples/sec: 11153.61 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:10:22,766 epoch 4 - iter 243/272 - loss 0.37353457 - time (sec): 4.18 - samples/sec: 11184.82 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:10:23,219 epoch 4 - iter 270/272 - loss 0.38186740 - time (sec): 4.63 - samples/sec: 11185.00 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:10:23,245 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:23,245 EPOCH 4 done: loss 0.3816 - lr: 0.000033
2023-10-20 00:10:24,005 DEV : loss 0.2845022976398468 - f1-score (micro avg) 0.3411
2023-10-20 00:10:24,009 saving best model
2023-10-20 00:10:24,041 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:24,551 epoch 5 - iter 27/272 - loss 0.31416516 - time (sec): 0.51 - samples/sec: 11673.42 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:10:25,066 epoch 5 - iter 54/272 - loss 0.34632047 - time (sec): 1.02 - samples/sec: 11559.14 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:10:25,589 epoch 5 - iter 81/272 - loss 0.34140409 - time (sec): 1.55 - samples/sec: 11346.81 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:10:26,078 epoch 5 - iter 108/272 - loss 0.34931667 - time (sec): 2.04 - samples/sec: 10850.34 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:10:26,573 epoch 5 - iter 135/272 - loss 0.36838290 - time (sec): 2.53 - samples/sec: 10738.64 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:10:27,076 epoch 5 - iter 162/272 - loss 0.35263557 - time (sec): 3.03 - samples/sec: 10855.75 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:10:27,592 epoch 5 - iter 189/272 - loss 0.34759117 - time (sec): 3.55 - samples/sec: 10714.03 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:10:28,066 epoch 5 - iter 216/272 - loss 0.34635807 - time (sec): 4.02 - samples/sec: 10493.00 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:10:28,552 epoch 5 - iter 243/272 - loss 0.34919162 - time (sec): 4.51 - samples/sec: 10458.26 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:10:29,031 epoch 5 - iter 270/272 - loss 0.34703800 - time (sec): 4.99 - samples/sec: 10371.11 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:10:29,062 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:29,062 EPOCH 5 done: loss 0.3459 - lr: 0.000028
2023-10-20 00:10:29,813 DEV : loss 0.2660656273365021 - f1-score (micro avg) 0.4177
2023-10-20 00:10:29,817 saving best model
2023-10-20 00:10:29,850 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:30,362 epoch 6 - iter 27/272 - loss 0.33120098 - time (sec): 0.51 - samples/sec: 9847.18 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:10:30,859 epoch 6 - iter 54/272 - loss 0.30382872 - time (sec): 1.01 - samples/sec: 10381.06 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:10:31,358 epoch 6 - iter 81/272 - loss 0.30565634 - time (sec): 1.51 - samples/sec: 10442.74 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:10:31,882 epoch 6 - iter 108/272 - loss 0.33247609 - time (sec): 2.03 - samples/sec: 10860.62 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:10:32,392 epoch 6 - iter 135/272 - loss 0.32788335 - time (sec): 2.54 - samples/sec: 10729.03 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:10:32,846 epoch 6 - iter 162/272 - loss 0.32789529 - time (sec): 3.00 - samples/sec: 10587.69 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:10:33,367 epoch 6 - iter 189/272 - loss 0.33278159 - time (sec): 3.52 - samples/sec: 10501.36 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:10:33,869 epoch 6 - iter 216/272 - loss 0.32579663 - time (sec): 4.02 - samples/sec: 10524.59 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:10:34,364 epoch 6 - iter 243/272 - loss 0.32305445 - time (sec): 4.51 - samples/sec: 10491.65 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:10:34,831 epoch 6 - iter 270/272 - loss 0.32336884 - time (sec): 4.98 - samples/sec: 10419.07 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:10:34,857 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:34,857 EPOCH 6 done: loss 0.3229 - lr: 0.000022
2023-10-20 00:10:35,629 DEV : loss 0.25167223811149597 - f1-score (micro avg) 0.4637
2023-10-20 00:10:35,632 saving best model
2023-10-20 00:10:35,666 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:36,187 epoch 7 - iter 27/272 - loss 0.40516393 - time (sec): 0.52 - samples/sec: 10828.35 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:10:36,726 epoch 7 - iter 54/272 - loss 0.33504991 - time (sec): 1.06 - samples/sec: 10997.35 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:10:37,225 epoch 7 - iter 81/272 - loss 0.31111585 - time (sec): 1.56 - samples/sec: 10971.57 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:10:37,689 epoch 7 - iter 108/272 - loss 0.31209626 - time (sec): 2.02 - samples/sec: 10531.08 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:10:38,190 epoch 7 - iter 135/272 - loss 0.30132911 - time (sec): 2.52 - samples/sec: 10468.80 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:10:38,689 epoch 7 - iter 162/272 - loss 0.30901869 - time (sec): 3.02 - samples/sec: 10352.48 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:10:39,212 epoch 7 - iter 189/272 - loss 0.31738761 - time (sec): 3.55 - samples/sec: 10349.55 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:10:39,725 epoch 7 - iter 216/272 - loss 0.31026399 - time (sec): 4.06 - samples/sec: 10154.23 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:10:40,258 epoch 7 - iter 243/272 - loss 0.31390048 - time (sec): 4.59 - samples/sec: 10250.46 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:10:40,763 epoch 7 - iter 270/272 - loss 0.31141067 - time (sec): 5.10 - samples/sec: 10155.93 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:10:40,797 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:40,797 EPOCH 7 done: loss 0.3113 - lr: 0.000017
2023-10-20 00:10:41,584 DEV : loss 0.24986842274665833 - f1-score (micro avg) 0.4673
2023-10-20 00:10:41,588 saving best model
2023-10-20 00:10:41,621 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:42,151 epoch 8 - iter 27/272 - loss 0.21596682 - time (sec): 0.53 - samples/sec: 10324.40 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:10:42,704 epoch 8 - iter 54/272 - loss 0.25341462 - time (sec): 1.08 - samples/sec: 9899.98 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:10:43,230 epoch 8 - iter 81/272 - loss 0.27916670 - time (sec): 1.61 - samples/sec: 9787.12 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:10:43,790 epoch 8 - iter 108/272 - loss 0.31225710 - time (sec): 2.17 - samples/sec: 9838.49 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:10:44,441 epoch 8 - iter 135/272 - loss 0.30190443 - time (sec): 2.82 - samples/sec: 9574.76 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:10:44,980 epoch 8 - iter 162/272 - loss 0.29499440 - time (sec): 3.36 - samples/sec: 9532.85 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:10:45,536 epoch 8 - iter 189/272 - loss 0.29255034 - time (sec): 3.91 - samples/sec: 9549.95 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:10:46,020 epoch 8 - iter 216/272 - loss 0.29419704 - time (sec): 4.40 - samples/sec: 9476.05 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:10:46,489 epoch 8 - iter 243/272 - loss 0.29641175 - time (sec): 4.87 - samples/sec: 9471.44 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:10:47,006 epoch 8 - iter 270/272 - loss 0.29820061 - time (sec): 5.38 - samples/sec: 9579.52 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:10:47,044 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:47,045 EPOCH 8 done: loss 0.2978 - lr: 0.000011
2023-10-20 00:10:47,812 DEV : loss 0.2413243055343628 - f1-score (micro avg) 0.4778
2023-10-20 00:10:47,816 saving best model
2023-10-20 00:10:47,848 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:48,335 epoch 9 - iter 27/272 - loss 0.30738887 - time (sec): 0.49 - samples/sec: 10001.71 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:10:48,837 epoch 9 - iter 54/272 - loss 0.28206474 - time (sec): 0.99 - samples/sec: 9975.60 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:10:49,367 epoch 9 - iter 81/272 - loss 0.28407417 - time (sec): 1.52 - samples/sec: 10336.58 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:10:49,854 epoch 9 - iter 108/272 - loss 0.30604991 - time (sec): 2.01 - samples/sec: 10319.44 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:10:50,353 epoch 9 - iter 135/272 - loss 0.29423229 - time (sec): 2.50 - samples/sec: 10262.21 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:10:50,847 epoch 9 - iter 162/272 - loss 0.30151775 - time (sec): 3.00 - samples/sec: 10223.80 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:10:51,348 epoch 9 - iter 189/272 - loss 0.29419587 - time (sec): 3.50 - samples/sec: 10128.05 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:10:51,848 epoch 9 - iter 216/272 - loss 0.29606309 - time (sec): 4.00 - samples/sec: 10152.35 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:10:52,373 epoch 9 - iter 243/272 - loss 0.29534319 - time (sec): 4.52 - samples/sec: 10314.68 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:10:52,882 epoch 9 - iter 270/272 - loss 0.29063737 - time (sec): 5.03 - samples/sec: 10307.02 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:10:52,911 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:52,911 EPOCH 9 done: loss 0.2906 - lr: 0.000006
2023-10-20 00:10:53,695 DEV : loss 0.2407713383436203 - f1-score (micro avg) 0.4847
2023-10-20 00:10:53,699 saving best model
2023-10-20 00:10:53,730 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:54,262 epoch 10 - iter 27/272 - loss 0.26292708 - time (sec): 0.53 - samples/sec: 10922.93 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:10:54,737 epoch 10 - iter 54/272 - loss 0.30143339 - time (sec): 1.01 - samples/sec: 9895.19 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:10:55,241 epoch 10 - iter 81/272 - loss 0.28174577 - time (sec): 1.51 - samples/sec: 9911.78 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:10:55,788 epoch 10 - iter 108/272 - loss 0.29077417 - time (sec): 2.06 - samples/sec: 9840.15 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:10:56,379 epoch 10 - iter 135/272 - loss 0.29242246 - time (sec): 2.65 - samples/sec: 9882.17 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:10:56,922 epoch 10 - iter 162/272 - loss 0.29184925 - time (sec): 3.19 - samples/sec: 9737.74 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:10:57,415 epoch 10 - iter 189/272 - loss 0.28160903 - time (sec): 3.68 - samples/sec: 9886.93 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:10:57,922 epoch 10 - iter 216/272 - loss 0.29090535 - time (sec): 4.19 - samples/sec: 10019.03 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:10:58,409 epoch 10 - iter 243/272 - loss 0.28667582 - time (sec): 4.68 - samples/sec: 9938.18 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:10:58,920 epoch 10 - iter 270/272 - loss 0.28590550 - time (sec): 5.19 - samples/sec: 9980.62 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:10:58,949 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:58,949 EPOCH 10 done: loss 0.2864 - lr: 0.000000
2023-10-20 00:10:59,722 DEV : loss 0.2398698478937149 - f1-score (micro avg) 0.4836
2023-10-20 00:10:59,754 ----------------------------------------------------------------------------------------------------
2023-10-20 00:10:59,755 Loading model from best epoch ...
2023-10-20 00:10:59,829 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-20 00:11:00,637
Results:
- F-score (micro) 0.4062
- F-score (macro) 0.2043
- Accuracy 0.2646
By class:
              precision    recall  f1-score   support

         LOC     0.5027    0.5962    0.5455       312
         PER     0.2461    0.3029    0.2716       208
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3959    0.4171    0.4062       597
   macro avg     0.1872    0.2248    0.2043       597
weighted avg     0.3485    0.4171    0.3797       597
2023-10-20 00:11:00,637 ----------------------------------------------------------------------------------------------------
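
The summary rows of the report above can be re-derived from the per-class rows. The following is a minimal sketch, not part of the original log: it assumes the standard definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, weighted average as the support-weighted mean). The variable names are illustrative, and the inputs are the four-decimal values printed in the log, so results agree only up to rounding.

```python
# Per-class rows copied from the "By class" report: (precision, recall, f1, support).
report = {
    "LOC": (0.5027, 0.5962, 0.5455, 312),
    "PER": (0.2461, 0.3029, 0.2716, 208),
    "ORG": (0.0000, 0.0000, 0.0000, 55),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.3959, 0.4171


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1 is computed from the pooled precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

# Macro F1 is the unweighted mean of the per-class F1 scores,
# so the two classes with F1 = 0 pull it down sharply.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted F1 scales each class by its support (number of gold entities).
total_support = sum(row[3] for row in report.values())
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

print(f"micro F1    = {micro_f1:.4f}")
print(f"macro F1    = {macro_f1:.4f}")
print(f"weighted F1 = {weighted_f1:.4f}")
print(f"support     = {total_support}")
```

The gap between the micro score (0.4062) and the macro score (0.2043) follows directly from ORG and HumanProd both scoring 0.0: those two classes count for half of the macro average but only 77 of the 597 entities in the micro average.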