2023-10-18 21:00:15,480 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Train: 7936 sentences
2023-10-18 21:00:15,481 (train_with_dev=False, train_with_test=False)
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Training Params:
2023-10-18 21:00:15,481  - learning_rate: "3e-05"
2023-10-18 21:00:15,481  - mini_batch_size: "4"
2023-10-18 21:00:15,481  - max_epochs: "10"
2023-10-18 21:00:15,481  - shuffle: "True"
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Plugins:
2023-10-18 21:00:15,481  - TensorboardLogger
2023-10-18 21:00:15,481  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:00:15,481  - metric: "('micro avg', 'f1-score')"
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Computation:
2023-10-18 21:00:15,481  - compute on device: cuda:0
2023-10-18 21:00:15,481  - embedding storage: none
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,481 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 21:00:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,482 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:15,482 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 21:00:18,567 epoch 1 - iter 198/1984 - loss 3.01630689 - time (sec): 3.09 - samples/sec: 5456.01 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:00:21,616 epoch 1 - iter 396/1984 - loss 2.63781109 - time (sec): 6.13 - samples/sec: 5404.52 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:00:24,699 epoch 1 - iter 594/1984 - loss 2.12857008 - time (sec): 9.22 - samples/sec: 5476.75 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:00:27,704 epoch 1 - iter 792/1984 - loss 1.76891056 - time (sec): 12.22 - samples/sec: 5401.68 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:00:30,728 epoch 1 - iter 990/1984 - loss 1.51801413 - time (sec): 15.25 - samples/sec: 5381.65 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:00:33,749 epoch 1 - iter 1188/1984 - loss 1.33670417 - time (sec): 18.27 - samples/sec: 5394.02 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:00:36,758 epoch 1 - iter 1386/1984 - loss 1.20433738 - time (sec): 21.28 - samples/sec: 5393.46 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:00:39,929 epoch 1 - iter 1584/1984 - loss 1.09790683 - time (sec): 24.45 - samples/sec: 5364.53 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:00:42,973 epoch 1 - iter 1782/1984 - loss 1.01379422 - time (sec): 27.49 - samples/sec: 5367.94 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:00:46,000 epoch 1 - iter 1980/1984 - loss 0.94834740 - time (sec): 30.52 - samples/sec: 5364.02 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:00:46,059 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:46,060 EPOCH 1 done: loss 0.9476 - lr: 0.000030
2023-10-18 21:00:47,532 DEV : loss 0.21770097315311432 - f1-score (micro avg) 0.2567
2023-10-18 21:00:47,550 saving best model
2023-10-18 21:00:47,583 ----------------------------------------------------------------------------------------------------
2023-10-18 21:00:50,779 epoch 2 - iter 198/1984 - loss 0.30932586 - time (sec): 3.20 - samples/sec: 4982.69 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:00:53,869 epoch 2 - iter 396/1984 - loss 0.31122160 - time (sec): 6.29 - samples/sec: 5344.93 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:00:56,844 epoch 2 - iter 594/1984 - loss 0.30813418 - time (sec): 9.26 - samples/sec: 5404.28 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:00:59,899 epoch 2 - iter 792/1984 - loss 0.29740621 - time (sec): 12.32 - samples/sec: 5388.63 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:01:02,946 epoch 2 - iter 990/1984 - loss 0.29371240 - time (sec): 15.36 - samples/sec: 5349.90 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:01:05,979 epoch 2 - iter 1188/1984 - loss 0.28595077 - time (sec): 18.40 - samples/sec: 5397.68 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:01:09,049 epoch 2 - iter 1386/1984 - loss 0.28226319 - time (sec): 21.47 - samples/sec: 5390.56 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:01:12,039 epoch 2 - iter 1584/1984 - loss 0.28052267 - time (sec): 24.46 - samples/sec: 5365.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:01:14,861 epoch 2 - iter 1782/1984 - loss 0.27780729 - time (sec): 27.28 - samples/sec: 5346.28 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:01:17,720 epoch 2 - iter 1980/1984 - loss 0.27405839 - time (sec): 30.14 - samples/sec: 5433.96 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:01:17,777 ----------------------------------------------------------------------------------------------------
2023-10-18 21:01:17,777 EPOCH 2 done: loss 0.2739 - lr: 0.000027
2023-10-18 21:01:19,574 DEV : loss 0.17515970766544342 - f1-score (micro avg) 0.4234
2023-10-18 21:01:19,592 saving best model
2023-10-18 21:01:19,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:01:22,654 epoch 3 - iter 198/1984 - loss 0.24700045 - time (sec): 3.03 - samples/sec: 5439.15 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:01:25,736 epoch 3 - iter 396/1984 - loss 0.24227987 - time (sec): 6.11 - samples/sec: 5441.41 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:01:28,779 epoch 3 - iter 594/1984 - loss 0.23314960 - time (sec): 9.15 - samples/sec: 5435.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:01:32,174 epoch 3 - iter 792/1984 - loss 0.23924145 - time (sec): 12.55 - samples/sec: 5258.76 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:01:35,155 epoch 3 - iter 990/1984 - loss 0.23534332 - time (sec): 15.53 - samples/sec: 5257.15 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:01:38,046 epoch 3 - iter 1188/1984 - loss 0.23498720 - time (sec): 18.42 - samples/sec: 5335.03 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:01:41,143 epoch 3 - iter 1386/1984 - loss 0.23328378 - time (sec): 21.52 - samples/sec: 5320.00 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:01:44,007 epoch 3 - iter 1584/1984 - loss 0.22965641 - time (sec): 24.38 - samples/sec: 5419.82 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:01:47,050 epoch 3 - iter 1782/1984 - loss 0.22823569 - time (sec): 27.42 - samples/sec: 5393.41 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:01:50,131 epoch 3 - iter 1980/1984 - loss 0.22710576 - time (sec): 30.50 - samples/sec: 5367.17 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:01:50,189 ----------------------------------------------------------------------------------------------------
2023-10-18 21:01:50,189 EPOCH 3 done: loss 0.2270 - lr: 0.000023
2023-10-18 21:01:51,980 DEV : loss 0.16526508331298828 - f1-score (micro avg) 0.4635
2023-10-18 21:01:51,998 saving best model
2023-10-18 21:01:52,034 ----------------------------------------------------------------------------------------------------
2023-10-18 21:01:55,084 epoch 4 - iter 198/1984 - loss 0.19959679 - time (sec): 3.05 - samples/sec: 5568.23 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:01:58,107 epoch 4 - iter 396/1984 - loss 0.20084779 - time (sec): 6.07 - samples/sec: 5214.03 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:02:01,147 epoch 4 - iter 594/1984 - loss 0.20084184 - time (sec): 9.11 - samples/sec: 5226.82 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:02:04,118 epoch 4 - iter 792/1984 - loss 0.20056791 - time (sec): 12.08 - samples/sec: 5282.20 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:02:07,155 epoch 4 - iter 990/1984 - loss 0.20361150 - time (sec): 15.12 - samples/sec: 5345.29 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:02:10,187 epoch 4 - iter 1188/1984 - loss 0.20112782 - time (sec): 18.15 - samples/sec: 5373.01 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:02:13,230 epoch 4 - iter 1386/1984 - loss 0.20331136 - time (sec): 21.20 - samples/sec: 5352.07 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:02:16,315 epoch 4 - iter 1584/1984 - loss 0.20116034 - time (sec): 24.28 - samples/sec: 5344.62 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:02:19,409 epoch 4 - iter 1782/1984 - loss 0.20199094 - time (sec): 27.37 - samples/sec: 5328.23 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:02:22,522 epoch 4 - iter 1980/1984 - loss 0.20268267 - time (sec): 30.49 - samples/sec: 5367.05 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:02:22,585 ----------------------------------------------------------------------------------------------------
2023-10-18 21:02:22,585 EPOCH 4 done: loss 0.2029 - lr: 0.000020
2023-10-18 21:02:24,419 DEV : loss 0.1497766077518463 - f1-score (micro avg) 0.5188
2023-10-18 21:02:24,438 saving best model
2023-10-18 21:02:24,473 ----------------------------------------------------------------------------------------------------
2023-10-18 21:02:27,511 epoch 5 - iter 198/1984 - loss 0.16602097 - time (sec): 3.04 - samples/sec: 5316.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:02:30,548 epoch 5 - iter 396/1984 - loss 0.17603389 - time (sec): 6.07 - samples/sec: 5339.65 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:02:33,632 epoch 5 - iter 594/1984 - loss 0.18337612 - time (sec): 9.16 - samples/sec: 5270.35 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:02:36,694 epoch 5 - iter 792/1984 - loss 0.18104087 - time (sec): 12.22 - samples/sec: 5326.19 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:02:39,745 epoch 5 - iter 990/1984 - loss 0.18084675 - time (sec): 15.27 - samples/sec: 5312.46 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:02:42,763 epoch 5 - iter 1188/1984 - loss 0.18622428 - time (sec): 18.29 - samples/sec: 5279.30 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:02:46,034 epoch 5 - iter 1386/1984 - loss 0.18678358 - time (sec): 21.56 - samples/sec: 5249.61 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:02:49,092 epoch 5 - iter 1584/1984 - loss 0.18831101 - time (sec): 24.62 - samples/sec: 5302.59 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:02:52,132 epoch 5 - iter 1782/1984 - loss 0.18844319 - time (sec): 27.66 - samples/sec: 5333.63 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:02:55,159 epoch 5 - iter 1980/1984 - loss 0.18797599 - time (sec): 30.69 - samples/sec: 5334.01 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:02:55,221 ----------------------------------------------------------------------------------------------------
2023-10-18 21:02:55,221 EPOCH 5 done: loss 0.1878 - lr: 0.000017
2023-10-18 21:02:57,072 DEV : loss 0.1468917280435562 - f1-score (micro avg) 0.529
2023-10-18 21:02:57,090 saving best model
2023-10-18 21:02:57,126 ----------------------------------------------------------------------------------------------------
2023-10-18 21:03:00,166 epoch 6 - iter 198/1984 - loss 0.18761850 - time (sec): 3.04 - samples/sec: 5276.36 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:03:03,253 epoch 6 - iter 396/1984 - loss 0.18825065 - time (sec): 6.13 - samples/sec: 5298.16 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:03:06,340 epoch 6 - iter 594/1984 - loss 0.18134294 - time (sec): 9.21 - samples/sec: 5406.12 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:03:09,366 epoch 6 - iter 792/1984 - loss 0.17859959 - time (sec): 12.24 - samples/sec: 5356.48 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:03:12,510 epoch 6 - iter 990/1984 - loss 0.17744544 - time (sec): 15.38 - samples/sec: 5360.24 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:03:15,499 epoch 6 - iter 1188/1984 - loss 0.17789956 - time (sec): 18.37 - samples/sec: 5339.87 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:03:18,564 epoch 6 - iter 1386/1984 - loss 0.17511154 - time (sec): 21.44 - samples/sec: 5357.53 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:03:21,609 epoch 6 - iter 1584/1984 - loss 0.17280066 - time (sec): 24.48 - samples/sec: 5364.50 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:03:24,604 epoch 6 - iter 1782/1984 - loss 0.17741738 - time (sec): 27.48 - samples/sec: 5339.01 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:03:27,714 epoch 6 - iter 1980/1984 - loss 0.17636692 - time (sec): 30.59 - samples/sec: 5349.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:03:27,778 ----------------------------------------------------------------------------------------------------
2023-10-18 21:03:27,778 EPOCH 6 done: loss 0.1763 - lr: 0.000013
2023-10-18 21:03:29,615 DEV : loss 0.14805352687835693 - f1-score (micro avg) 0.5478
2023-10-18 21:03:29,633 saving best model
2023-10-18 21:03:29,668 ----------------------------------------------------------------------------------------------------
2023-10-18 21:03:32,934 epoch 7 - iter 198/1984 - loss 0.16294937 - time (sec): 3.26 - samples/sec: 5259.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:03:36,052 epoch 7 - iter 396/1984 - loss 0.15972725 - time (sec): 6.38 - samples/sec: 5393.35 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:03:39,089 epoch 7 - iter 594/1984 - loss 0.16608298 - time (sec): 9.42 - samples/sec: 5413.66 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:03:41,894 epoch 7 - iter 792/1984 - loss 0.17118937 - time (sec): 12.23 - samples/sec: 5487.04 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:03:44,824 epoch 7 - iter 990/1984 - loss 0.16957906 - time (sec): 15.16 - samples/sec: 5520.40 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:03:47,915 epoch 7 - iter 1188/1984 - loss 0.16659187 - time (sec): 18.25 - samples/sec: 5485.38 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:03:50,963 epoch 7 - iter 1386/1984 - loss 0.16428979 - time (sec): 21.29 - samples/sec: 5483.11 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:03:54,010 epoch 7 - iter 1584/1984 - loss 0.16637073 - time (sec): 24.34 - samples/sec: 5460.47 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:03:57,061 epoch 7 - iter 1782/1984 - loss 0.16747526 - time (sec): 27.39 - samples/sec: 5401.79 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:04:00,102 epoch 7 - iter 1980/1984 - loss 0.16822466 - time (sec): 30.43 - samples/sec: 5373.02 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:04:00,169 ----------------------------------------------------------------------------------------------------
2023-10-18 21:04:00,169 EPOCH 7 done: loss 0.1682 - lr: 0.000010
2023-10-18 21:04:02,013 DEV : loss 0.14143583178520203 - f1-score (micro avg) 0.5793
2023-10-18 21:04:02,032 saving best model
2023-10-18 21:04:02,068 ----------------------------------------------------------------------------------------------------
2023-10-18 21:04:05,159 epoch 8 - iter 198/1984 - loss 0.15369511 - time (sec): 3.09 - samples/sec: 5411.94 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:04:08,202 epoch 8 - iter 396/1984 - loss 0.15903392 - time (sec): 6.13 - samples/sec: 5418.10 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:04:11,203 epoch 8 - iter 594/1984 - loss 0.16036789 - time (sec): 9.13 - samples/sec: 5359.75 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:04:14,291 epoch 8 - iter 792/1984 - loss 0.16222131 - time (sec): 12.22 - samples/sec: 5325.47 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:04:17,419 epoch 8 - iter 990/1984 - loss 0.16439874 - time (sec): 15.35 - samples/sec: 5379.70 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:04:20,488 epoch 8 - iter 1188/1984 - loss 0.16575296 - time (sec): 18.42 - samples/sec: 5360.09 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:04:23,524 epoch 8 - iter 1386/1984 - loss 0.16512080 - time (sec): 21.46 - samples/sec: 5320.22 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:04:26,522 epoch 8 - iter 1584/1984 - loss 0.16294057 - time (sec): 24.45 - samples/sec: 5357.27 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:04:29,251 epoch 8 - iter 1782/1984 - loss 0.16387429 - time (sec): 27.18 - samples/sec: 5416.81 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:04:32,158 epoch 8 - iter 1980/1984 - loss 0.16221637 - time (sec): 30.09 - samples/sec: 5436.71 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:04:32,223 ----------------------------------------------------------------------------------------------------
2023-10-18 21:04:32,223 EPOCH 8 done: loss 0.1620 - lr: 0.000007
2023-10-18 21:04:34,443 DEV : loss 0.1451966017484665 - f1-score (micro avg) 0.5774
2023-10-18 21:04:34,461 ----------------------------------------------------------------------------------------------------
2023-10-18 21:04:37,365 epoch 9 - iter 198/1984 - loss 0.15397460 - time (sec): 2.90 - samples/sec: 5578.27 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:04:40,442 epoch 9 - iter 396/1984 - loss 0.15823775 - time (sec): 5.98 - samples/sec: 5461.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:04:43,496 epoch 9 - iter 594/1984 - loss 0.15730895 - time (sec): 9.03 - samples/sec: 5389.32 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:04:46,515 epoch 9 - iter 792/1984 - loss 0.15727437 - time (sec): 12.05 - samples/sec: 5347.56 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:04:49,568 epoch 9 - iter 990/1984 - loss 0.15572603 - time (sec): 15.11 - samples/sec: 5457.65 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:04:52,604 epoch 9 - iter 1188/1984 - loss 0.15640171 - time (sec): 18.14 - samples/sec: 5456.50 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:04:55,665 epoch 9 - iter 1386/1984 - loss 0.15811620 - time (sec): 21.20 - samples/sec: 5429.92 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:04:58,709 epoch 9 - iter 1584/1984 - loss 0.15794423 - time (sec): 24.25 - samples/sec: 5430.18 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:05:01,748 epoch 9 - iter 1782/1984 - loss 0.15821200 - time (sec): 27.29 - samples/sec: 5410.10 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:05:04,845 epoch 9 - iter 1980/1984 - loss 0.16022348 - time (sec): 30.38 - samples/sec: 5388.13 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:05:04,903 ----------------------------------------------------------------------------------------------------
2023-10-18 21:05:04,904 EPOCH 9 done: loss 0.1601 - lr: 0.000003
2023-10-18 21:05:06,717 DEV : loss 0.1431300789117813 - f1-score (micro avg) 0.5803
2023-10-18 21:05:06,735 saving best model
2023-10-18 21:05:06,770 ----------------------------------------------------------------------------------------------------
2023-10-18 21:05:09,750 epoch 10 - iter 198/1984 - loss 0.15494103 - time (sec): 2.98 - samples/sec: 5371.44 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:05:12,786 epoch 10 - iter 396/1984 - loss 0.15906693 - time (sec): 6.02 - samples/sec: 5461.70 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:05:15,887 epoch 10 - iter 594/1984 - loss 0.15521525 - time (sec): 9.12 - samples/sec: 5399.74 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:05:18,916 epoch 10 - iter 792/1984 - loss 0.15908706 - time (sec): 12.15 - samples/sec: 5422.73 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:05:22,124 epoch 10 - iter 990/1984 - loss 0.15901277 - time (sec): 15.35 - samples/sec: 5388.53 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:05:25,181 epoch 10 - iter 1188/1984 - loss 0.16174865 - time (sec): 18.41 - samples/sec: 5361.83 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:05:28,234 epoch 10 - iter 1386/1984 - loss 0.16037707 - time (sec): 21.46 - samples/sec: 5327.63 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:05:30,933 epoch 10 - iter 1584/1984 - loss 0.15835073 - time (sec): 24.16 - samples/sec: 5413.33 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:05:33,808 epoch 10 - iter 1782/1984 - loss 0.16011563 - time (sec): 27.04 - samples/sec: 5426.26 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:05:36,832 epoch 10 - iter 1980/1984 - loss 0.15752633 - time (sec): 30.06 - samples/sec: 5443.78 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:05:36,890 ----------------------------------------------------------------------------------------------------
2023-10-18 21:05:36,890 EPOCH 10 done: loss 0.1574 - lr: 0.000000
2023-10-18 21:05:38,730 DEV : loss 0.14386418461799622 - f1-score (micro avg) 0.5837
2023-10-18 21:05:38,748 saving best model
2023-10-18 21:05:38,811 ----------------------------------------------------------------------------------------------------
2023-10-18 21:05:38,811 Loading model from best epoch ...
2023-10-18 21:05:38,893 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:05:40,453 Results:
- F-score (micro) 0.5909
- F-score (macro) 0.417
- Accuracy 0.4597

By class:
              precision    recall  f1-score   support

         LOC     0.7192    0.6962    0.7075       655
         PER     0.3876    0.5874    0.4670       223
         ORG     0.2000    0.0472    0.0764       127

   micro avg     0.5918    0.5900    0.5909      1005
   macro avg     0.4356    0.4436    0.4170      1005
weighted avg     0.5800    0.5900    0.5744      1005

2023-10-18 21:05:40,453 ----------------------------------------------------------------------------------------------------
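Note: the micro and macro averages above follow mechanically from per-class counts. The sketch below re-derives them; the TP/predicted counts are reconstructed from the reported per-class precision, recall, and support (so they are assumptions, not values taken from the log), but they reproduce the reported scores to the printed precision.

```python
# Assumed per-class counts, back-solved from the reported precision/recall/support:
#           (true positives, predicted spans, gold spans)
counts = {
    "LOC": (456, 634, 655),
    "PER": (131, 338, 223),
    "ORG": (6, 30, 127),
}

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Per-class F1 from per-class precision (TP/pred) and recall (TP/gold).
per_class_f1 = {
    name: f1(tp / pred, tp / gold) for name, (tp, pred, gold) in counts.items()
}

# Micro average: pool the counts over all classes first, then compute P/R/F1.
tp = sum(c[0] for c in counts.values())
pred = sum(c[1] for c in counts.values())
gold = sum(c[2] for c in counts.values())
micro_f1 = f1(tp / pred, tp / gold)

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(round(micro_f1, 4), round(macro_f1, 3))  # → 0.5909 0.417
```

This also shows why micro (0.5909) and macro (0.417) diverge so sharply here: micro averaging is dominated by the frequent LOC class, while macro averaging weights the poorly recognized ORG class (F1 ≈ 0.076) equally.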
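The lr column in the log traces the LinearScheduler with warmup_fraction 0.1: it ramps from 0 to the peak 3e-05 over the first 10% of steps (exactly epoch 1, since 1984 batches/epoch × 10 epochs = 19840 steps), then decays linearly to 0. A minimal sketch of that one-cycle shape, assuming the standard linear-warmup/linear-decay formula (the step counts are taken from the log; the function name is illustrative):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay back to 0 by the final step."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1984 * 10  # batches per epoch x max_epochs
peak = 3e-05       # the configured learning_rate

# Epoch 1, iter 198 logged lr 0.000003; iter 1980 logged lr 0.000030:
print(round(linear_schedule_lr(198, total, peak), 6))   # → 3e-06
print(round(linear_schedule_lr(1980, total, peak), 6))  # → 3e-05
```

The zero momentum column is consistent with this setup: fine_tune uses AdamW, which reports no classical momentum term.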