2023-10-20 00:00:24,420 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,420 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 00:00:24,420 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,420 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-20 00:00:24,420 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,420 Train: 1166 sentences
2023-10-20 00:00:24,420 (train_with_dev=False, train_with_test=False)
2023-10-20 00:00:24,420 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,420 Training Params:
2023-10-20 00:00:24,420 - learning_rate: "5e-05"
2023-10-20 00:00:24,420 - mini_batch_size: "8"
2023-10-20 00:00:24,421 - max_epochs: "10"
2023-10-20 00:00:24,421 - shuffle: "True"
2023-10-20 00:00:24,421 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,421 Plugins:
2023-10-20 00:00:24,421 - TensorboardLogger
2023-10-20 00:00:24,421 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:00:24,421 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,421 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:00:24,421 - metric: "('micro avg', 'f1-score')"
2023-10-20 00:00:24,421 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,421 Computation:
2023-10-20 00:00:24,421 - compute on device: cuda:0
2023-10-20 00:00:24,421 - embedding storage: none
2023-10-20 00:00:24,421 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,421 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
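
Note: the parameters, plugins and base path above correspond to Flair's fine-tuning entry point. A hedged sketch of the call follows; the warmup argument and the TensorBoard wiring are assumptions, since the actual training script is not part of this log.

# Sketch only: maps the logged hyperparameters onto ModelTrainer.fine_tune().
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-5,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,  # assumption: matches the LinearScheduler warmup_fraction logged above
    # The TensorboardLogger plugin listed above would be attached separately;
    # its exact construction is not visible in this log and is omitted here.
)
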
2023-10-20 00:00:24,421 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,421 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:24,421 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:00:24,767 epoch 1 - iter 14/146 - loss 3.17723201 - time (sec): 0.35 - samples/sec: 11122.26 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:00:25,120 epoch 1 - iter 28/146 - loss 3.15612779 - time (sec): 0.70 - samples/sec: 10763.94 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:00:25,478 epoch 1 - iter 42/146 - loss 3.09001890 - time (sec): 1.06 - samples/sec: 10740.23 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:00:25,849 epoch 1 - iter 56/146 - loss 3.04725899 - time (sec): 1.43 - samples/sec: 10784.62 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:00:26,225 epoch 1 - iter 70/146 - loss 2.87972314 - time (sec): 1.80 - samples/sec: 11088.76 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:00:26,593 epoch 1 - iter 84/146 - loss 2.72017656 - time (sec): 2.17 - samples/sec: 11030.59 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:00:26,980 epoch 1 - iter 98/146 - loss 2.49850046 - time (sec): 2.56 - samples/sec: 11104.17 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:00:27,515 epoch 1 - iter 112/146 - loss 2.28917480 - time (sec): 3.09 - samples/sec: 10779.71 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:00:27,902 epoch 1 - iter 126/146 - loss 2.12831603 - time (sec): 3.48 - samples/sec: 10785.86 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:00:28,284 epoch 1 - iter 140/146 - loss 1.97863993 - time (sec): 3.86 - samples/sec: 11006.51 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:00:28,429 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:28,429 EPOCH 1 done: loss 1.9241 - lr: 0.000048
2023-10-20 00:00:28,690 DEV : loss 0.4782404601573944 - f1-score (micro avg) 0.0
2023-10-20 00:00:28,694 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:29,098 epoch 2 - iter 14/146 - loss 1.09887195 - time (sec): 0.40 - samples/sec: 11822.61 - lr: 0.000050 - momentum: 0.000000
2023-10-20 00:00:29,466 epoch 2 - iter 28/146 - loss 0.95519720 - time (sec): 0.77 - samples/sec: 11969.87 - lr: 0.000049 - momentum: 0.000000
2023-10-20 00:00:29,835 epoch 2 - iter 42/146 - loss 0.84853598 - time (sec): 1.14 - samples/sec: 11667.50 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:00:30,190 epoch 2 - iter 56/146 - loss 0.81643196 - time (sec): 1.50 - samples/sec: 11366.36 - lr: 0.000048 - momentum: 0.000000
2023-10-20 00:00:30,567 epoch 2 - iter 70/146 - loss 0.77926945 - time (sec): 1.87 - samples/sec: 11260.00 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:00:30,917 epoch 2 - iter 84/146 - loss 0.75411002 - time (sec): 2.22 - samples/sec: 11220.62 - lr: 0.000047 - momentum: 0.000000
2023-10-20 00:00:31,280 epoch 2 - iter 98/146 - loss 0.73229273 - time (sec): 2.59 - samples/sec: 11100.80 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:00:31,664 epoch 2 - iter 112/146 - loss 0.70389120 - time (sec): 2.97 - samples/sec: 11371.26 - lr: 0.000046 - momentum: 0.000000
2023-10-20 00:00:32,049 epoch 2 - iter 126/146 - loss 0.67720702 - time (sec): 3.36 - samples/sec: 11665.57 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:00:32,398 epoch 2 - iter 140/146 - loss 0.68241789 - time (sec): 3.70 - samples/sec: 11569.27 - lr: 0.000045 - momentum: 0.000000
2023-10-20 00:00:32,541 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:32,541 EPOCH 2 done: loss 0.6755 - lr: 0.000045
2023-10-20 00:00:33,184 DEV : loss 0.3925292193889618 - f1-score (micro avg) 0.0
2023-10-20 00:00:33,189 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:33,561 epoch 3 - iter 14/146 - loss 0.45005562 - time (sec): 0.37 - samples/sec: 10815.64 - lr: 0.000044 - momentum: 0.000000
2023-10-20 00:00:33,943 epoch 3 - iter 28/146 - loss 0.47075459 - time (sec): 0.75 - samples/sec: 11363.18 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:00:34,329 epoch 3 - iter 42/146 - loss 0.50697891 - time (sec): 1.14 - samples/sec: 11363.12 - lr: 0.000043 - momentum: 0.000000
2023-10-20 00:00:34,723 epoch 3 - iter 56/146 - loss 0.55832426 - time (sec): 1.53 - samples/sec: 11356.86 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:00:35,241 epoch 3 - iter 70/146 - loss 0.55201558 - time (sec): 2.05 - samples/sec: 10326.26 - lr: 0.000042 - momentum: 0.000000
2023-10-20 00:00:35,622 epoch 3 - iter 84/146 - loss 0.54282315 - time (sec): 2.43 - samples/sec: 10823.47 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:00:35,961 epoch 3 - iter 98/146 - loss 0.54065809 - time (sec): 2.77 - samples/sec: 10884.07 - lr: 0.000041 - momentum: 0.000000
2023-10-20 00:00:36,326 epoch 3 - iter 112/146 - loss 0.52730552 - time (sec): 3.14 - samples/sec: 10993.96 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:00:36,680 epoch 3 - iter 126/146 - loss 0.52185578 - time (sec): 3.49 - samples/sec: 10896.07 - lr: 0.000040 - momentum: 0.000000
2023-10-20 00:00:37,046 epoch 3 - iter 140/146 - loss 0.51713714 - time (sec): 3.86 - samples/sec: 11080.53 - lr: 0.000039 - momentum: 0.000000
2023-10-20 00:00:37,194 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:37,194 EPOCH 3 done: loss 0.5156 - lr: 0.000039
2023-10-20 00:00:37,825 DEV : loss 0.36764416098594666 - f1-score (micro avg) 0.0082
2023-10-20 00:00:37,829 saving best model
2023-10-20 00:00:37,856 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:38,213 epoch 4 - iter 14/146 - loss 0.42998140 - time (sec): 0.36 - samples/sec: 10386.28 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:00:38,576 epoch 4 - iter 28/146 - loss 0.44282841 - time (sec): 0.72 - samples/sec: 10623.64 - lr: 0.000038 - momentum: 0.000000
2023-10-20 00:00:38,971 epoch 4 - iter 42/146 - loss 0.42573623 - time (sec): 1.11 - samples/sec: 11103.16 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:00:39,329 epoch 4 - iter 56/146 - loss 0.43880130 - time (sec): 1.47 - samples/sec: 11205.95 - lr: 0.000037 - momentum: 0.000000
2023-10-20 00:00:39,684 epoch 4 - iter 70/146 - loss 0.43866120 - time (sec): 1.83 - samples/sec: 11241.65 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:00:40,035 epoch 4 - iter 84/146 - loss 0.44082921 - time (sec): 2.18 - samples/sec: 11247.76 - lr: 0.000036 - momentum: 0.000000
2023-10-20 00:00:40,399 epoch 4 - iter 98/146 - loss 0.47160037 - time (sec): 2.54 - samples/sec: 11494.57 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:00:40,756 epoch 4 - iter 112/146 - loss 0.45736533 - time (sec): 2.90 - samples/sec: 11635.12 - lr: 0.000035 - momentum: 0.000000
2023-10-20 00:00:41,115 epoch 4 - iter 126/146 - loss 0.45742994 - time (sec): 3.26 - samples/sec: 11572.35 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:00:41,489 epoch 4 - iter 140/146 - loss 0.45572260 - time (sec): 3.63 - samples/sec: 11683.19 - lr: 0.000034 - momentum: 0.000000
2023-10-20 00:00:41,652 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:41,652 EPOCH 4 done: loss 0.4557 - lr: 0.000034
2023-10-20 00:00:42,292 DEV : loss 0.3331185281276703 - f1-score (micro avg) 0.0491
2023-10-20 00:00:42,296 saving best model
2023-10-20 00:00:42,336 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:42,741 epoch 5 - iter 14/146 - loss 0.41530283 - time (sec): 0.40 - samples/sec: 12759.36 - lr: 0.000033 - momentum: 0.000000
2023-10-20 00:00:43,124 epoch 5 - iter 28/146 - loss 0.48554595 - time (sec): 0.79 - samples/sec: 12148.86 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:00:43,492 epoch 5 - iter 42/146 - loss 0.45840546 - time (sec): 1.16 - samples/sec: 11590.19 - lr: 0.000032 - momentum: 0.000000
2023-10-20 00:00:43,866 epoch 5 - iter 56/146 - loss 0.45258262 - time (sec): 1.53 - samples/sec: 11242.28 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:00:44,256 epoch 5 - iter 70/146 - loss 0.44561905 - time (sec): 1.92 - samples/sec: 11460.68 - lr: 0.000031 - momentum: 0.000000
2023-10-20 00:00:44,623 epoch 5 - iter 84/146 - loss 0.43129736 - time (sec): 2.29 - samples/sec: 11495.36 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:00:44,976 epoch 5 - iter 98/146 - loss 0.42846010 - time (sec): 2.64 - samples/sec: 11316.63 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:00:45,358 epoch 5 - iter 112/146 - loss 0.42713898 - time (sec): 3.02 - samples/sec: 11406.47 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:00:45,717 epoch 5 - iter 126/146 - loss 0.43711608 - time (sec): 3.38 - samples/sec: 11265.73 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:00:46,111 epoch 5 - iter 140/146 - loss 0.42715479 - time (sec): 3.77 - samples/sec: 11395.93 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:00:46,261 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:46,261 EPOCH 5 done: loss 0.4227 - lr: 0.000028
2023-10-20 00:00:46,901 DEV : loss 0.32058432698249817 - f1-score (micro avg) 0.1371
2023-10-20 00:00:46,905 saving best model
2023-10-20 00:00:46,939 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:47,319 epoch 6 - iter 14/146 - loss 0.43001814 - time (sec): 0.38 - samples/sec: 11530.08 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:00:47,649 epoch 6 - iter 28/146 - loss 0.39433315 - time (sec): 0.71 - samples/sec: 11589.67 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:00:47,963 epoch 6 - iter 42/146 - loss 0.40571073 - time (sec): 1.02 - samples/sec: 11968.50 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:00:48,306 epoch 6 - iter 56/146 - loss 0.40796916 - time (sec): 1.37 - samples/sec: 12155.22 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:00:48,675 epoch 6 - iter 70/146 - loss 0.39108897 - time (sec): 1.74 - samples/sec: 12294.12 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:00:49,047 epoch 6 - iter 84/146 - loss 0.38197870 - time (sec): 2.11 - samples/sec: 12125.77 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:00:49,433 epoch 6 - iter 98/146 - loss 0.37692165 - time (sec): 2.49 - samples/sec: 12136.60 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:00:49,796 epoch 6 - iter 112/146 - loss 0.37784940 - time (sec): 2.86 - samples/sec: 12169.87 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:00:50,147 epoch 6 - iter 126/146 - loss 0.37856587 - time (sec): 3.21 - samples/sec: 12056.87 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:00:50,536 epoch 6 - iter 140/146 - loss 0.38924008 - time (sec): 3.60 - samples/sec: 11965.06 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:00:50,687 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:50,687 EPOCH 6 done: loss 0.3894 - lr: 0.000023
2023-10-20 00:00:51,326 DEV : loss 0.3271176218986511 - f1-score (micro avg) 0.1433
2023-10-20 00:00:51,330 saving best model
2023-10-20 00:00:51,363 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:51,750 epoch 7 - iter 14/146 - loss 0.31479723 - time (sec): 0.39 - samples/sec: 13842.19 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:00:52,094 epoch 7 - iter 28/146 - loss 0.37446333 - time (sec): 0.73 - samples/sec: 12377.93 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:00:52,444 epoch 7 - iter 42/146 - loss 0.38718033 - time (sec): 1.08 - samples/sec: 11771.73 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:00:52,802 epoch 7 - iter 56/146 - loss 0.36802516 - time (sec): 1.44 - samples/sec: 12119.42 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:00:53,167 epoch 7 - iter 70/146 - loss 0.37056283 - time (sec): 1.80 - samples/sec: 11682.57 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:00:53,522 epoch 7 - iter 84/146 - loss 0.36763286 - time (sec): 2.16 - samples/sec: 11496.29 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:00:53,944 epoch 7 - iter 98/146 - loss 0.37991055 - time (sec): 2.58 - samples/sec: 11791.09 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:00:54,307 epoch 7 - iter 112/146 - loss 0.38168082 - time (sec): 2.94 - samples/sec: 11725.81 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:00:54,667 epoch 7 - iter 126/146 - loss 0.38505741 - time (sec): 3.30 - samples/sec: 11786.73 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:00:55,014 epoch 7 - iter 140/146 - loss 0.38238690 - time (sec): 3.65 - samples/sec: 11643.60 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:00:55,168 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:55,168 EPOCH 7 done: loss 0.3772 - lr: 0.000017
2023-10-20 00:00:55,969 DEV : loss 0.30373382568359375 - f1-score (micro avg) 0.2063
2023-10-20 00:00:55,973 saving best model
2023-10-20 00:00:56,008 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:56,366 epoch 8 - iter 14/146 - loss 0.33318853 - time (sec): 0.36 - samples/sec: 11928.89 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:00:56,737 epoch 8 - iter 28/146 - loss 0.34962528 - time (sec): 0.73 - samples/sec: 11985.49 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:00:57,126 epoch 8 - iter 42/146 - loss 0.31483010 - time (sec): 1.12 - samples/sec: 12767.48 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:00:57,469 epoch 8 - iter 56/146 - loss 0.34048941 - time (sec): 1.46 - samples/sec: 12385.45 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:00:57,798 epoch 8 - iter 70/146 - loss 0.34548513 - time (sec): 1.79 - samples/sec: 11953.56 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:00:58,153 epoch 8 - iter 84/146 - loss 0.35334959 - time (sec): 2.14 - samples/sec: 11784.25 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:00:58,518 epoch 8 - iter 98/146 - loss 0.35855547 - time (sec): 2.51 - samples/sec: 11605.02 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:00:58,892 epoch 8 - iter 112/146 - loss 0.35659111 - time (sec): 2.88 - samples/sec: 11459.13 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:00:59,262 epoch 8 - iter 126/146 - loss 0.35769873 - time (sec): 3.25 - samples/sec: 11455.20 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:00:59,666 epoch 8 - iter 140/146 - loss 0.36940728 - time (sec): 3.66 - samples/sec: 11741.94 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:00:59,815 ----------------------------------------------------------------------------------------------------
2023-10-20 00:00:59,815 EPOCH 8 done: loss 0.3708 - lr: 0.000012
2023-10-20 00:01:00,455 DEV : loss 0.30823713541030884 - f1-score (micro avg) 0.2164
2023-10-20 00:01:00,459 saving best model
2023-10-20 00:01:00,492 ----------------------------------------------------------------------------------------------------
2023-10-20 00:01:00,845 epoch 9 - iter 14/146 - loss 0.37337414 - time (sec): 0.35 - samples/sec: 11208.87 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:01:01,215 epoch 9 - iter 28/146 - loss 0.34486223 - time (sec): 0.72 - samples/sec: 11175.85 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:01:01,583 epoch 9 - iter 42/146 - loss 0.34714814 - time (sec): 1.09 - samples/sec: 11513.31 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:01:01,964 epoch 9 - iter 56/146 - loss 0.33548603 - time (sec): 1.47 - samples/sec: 11196.69 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:01:02,313 epoch 9 - iter 70/146 - loss 0.34515425 - time (sec): 1.82 - samples/sec: 11158.20 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:01:02,677 epoch 9 - iter 84/146 - loss 0.35039747 - time (sec): 2.18 - samples/sec: 11232.36 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:01:03,030 epoch 9 - iter 98/146 - loss 0.34912574 - time (sec): 2.54 - samples/sec: 11341.25 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:01:03,430 epoch 9 - iter 112/146 - loss 0.34284354 - time (sec): 2.94 - samples/sec: 11437.46 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:01:03,805 epoch 9 - iter 126/146 - loss 0.35573528 - time (sec): 3.31 - samples/sec: 11600.31 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:01:04,162 epoch 9 - iter 140/146 - loss 0.35953636 - time (sec): 3.67 - samples/sec: 11628.28 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:01:04,307 ----------------------------------------------------------------------------------------------------
2023-10-20 00:01:04,307 EPOCH 9 done: loss 0.3588 - lr: 0.000006
2023-10-20 00:01:04,967 DEV : loss 0.30730772018432617 - f1-score (micro avg) 0.2005
2023-10-20 00:01:04,971 ----------------------------------------------------------------------------------------------------
2023-10-20 00:01:05,337 epoch 10 - iter 14/146 - loss 0.32241848 - time (sec): 0.37 - samples/sec: 13356.22 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:01:05,746 epoch 10 - iter 28/146 - loss 0.34594006 - time (sec): 0.77 - samples/sec: 13440.68 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:01:06,077 epoch 10 - iter 42/146 - loss 0.36323167 - time (sec): 1.11 - samples/sec: 12461.47 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:01:06,449 epoch 10 - iter 56/146 - loss 0.35186528 - time (sec): 1.48 - samples/sec: 12336.13 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:01:06,816 epoch 10 - iter 70/146 - loss 0.35596110 - time (sec): 1.84 - samples/sec: 11822.64 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:01:07,203 epoch 10 - iter 84/146 - loss 0.34984725 - time (sec): 2.23 - samples/sec: 11545.29 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:01:07,577 epoch 10 - iter 98/146 - loss 0.35158385 - time (sec): 2.61 - samples/sec: 11423.74 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:01:07,952 epoch 10 - iter 112/146 - loss 0.35863875 - time (sec): 2.98 - samples/sec: 11356.89 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:01:08,342 epoch 10 - iter 126/146 - loss 0.36121485 - time (sec): 3.37 - samples/sec: 11193.43 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:01:08,742 epoch 10 - iter 140/146 - loss 0.35905657 - time (sec): 3.77 - samples/sec: 11105.24 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:01:08,922 ----------------------------------------------------------------------------------------------------
2023-10-20 00:01:08,922 EPOCH 10 done: loss 0.3633 - lr: 0.000000
2023-10-20 00:01:09,559 DEV : loss 0.30496951937675476 - f1-score (micro avg) 0.2047
2023-10-20 00:01:09,591 ----------------------------------------------------------------------------------------------------
2023-10-20 00:01:09,591 Loading model from best epoch ...
2023-10-20 00:01:09,665 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
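
Note: after the best epoch is restored, the saved checkpoint can be used for inference in the usual Flair way; the path and the example sentence below are illustrative only.

# Sketch only: loading the saved checkpoint and tagging a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

# Illustrative path; in this run best-model.pt is saved under the base path logged above.
tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("Helsingin Sanomat kertoo uutisia Suomesta.")  # illustrative Finnish example
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)
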
2023-10-20 00:01:10,560
Results:
- F-score (micro) 0.3197
- F-score (macro) 0.1637
- Accuracy 0.1977
By class:
              precision    recall  f1-score   support

         PER     0.3933    0.4023    0.3977       348
         LOC     0.2700    0.2452    0.2570       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3440    0.2987    0.3197       683
   macro avg     0.1658    0.1619    0.1637       683
weighted avg     0.3036    0.2987    0.3009       683
2023-10-20 00:01:10,560 ----------------------------------------------------------------------------------------------------