2023-10-19 23:44:15,521 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,521 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 23:44:15,521 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,521 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:44:15,521 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,521 Train: 1166 sentences
2023-10-19 23:44:15,521 (train_with_dev=False, train_with_test=False)
2023-10-19 23:44:15,521 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,521 Training Params:
2023-10-19 23:44:15,521 - learning_rate: "3e-05"
2023-10-19 23:44:15,521 - mini_batch_size: "8"
2023-10-19 23:44:15,521 - max_epochs: "10"
2023-10-19 23:44:15,521 - shuffle: "True"
2023-10-19 23:44:15,521 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,521 Plugins:
2023-10-19 23:44:15,522 - TensorboardLogger
2023-10-19 23:44:15,522 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:44:15,522 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,522 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:44:15,522 - metric: "('micro avg', 'f1-score')"
2023-10-19 23:44:15,522 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,522 Computation:
2023-10-19 23:44:15,522 - compute on device: cuda:0
2023-10-19 23:44:15,522 - embedding storage: none
2023-10-19 23:44:15,522 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:15,522 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
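For reference, the configuration in the header above (HIPE-2022 NewsEye Finnish data, dbmdz/bert-tiny-historic-multilingual-cased embeddings with first-subtoken pooling of the last layer, no CRF, batch size 8, learning rate 3e-05, 10 epochs with linear warmup) could be reproduced with a Flair script roughly like the sketch below. The exact training script is not part of this log; the dataset and model identifiers are taken from the log, while the remaining keyword arguments follow current Flair conventions and may need adjustment for your Flair version.

# Sketch only: rebuilds the setup reported in the log header with the standard Flair API.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 NewsEye corpus, Finnish split (1166 train / 165 dev / 415 test sentences).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Historic multilingual BERT-tiny backbone; last layer only, first-subtoken pooling,
# fine-tuned end to end (matching "poolingfirst-layers-1" in the base path).
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear classification head on top of the embeddings: no RNN, no CRF
# ("crfFalse" in the base path); 17 output tags as in the model summary above.
tagger = SequenceTagger(
    hidden_size=256,  # required argument, unused when no RNN is stacked on top
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() trains with AdamW and a linear schedule with warmup, which is what
# the LinearScheduler plugin (warmup_fraction 0.1) in the log corresponds to.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=3e-05,
    mini_batch_size=8,
    max_epochs=10,
)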
"hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1" 2023-10-19 23:44:15,522 ---------------------------------------------------------------------------------------------------- 2023-10-19 23:44:15,522 ---------------------------------------------------------------------------------------------------- 2023-10-19 23:44:15,522 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-19 23:44:15,879 epoch 1 - iter 14/146 - loss 3.20793090 - time (sec): 0.36 - samples/sec: 13074.19 - lr: 0.000003 - momentum: 0.000000 2023-10-19 23:44:16,230 epoch 1 - iter 28/146 - loss 3.12574275 - time (sec): 0.71 - samples/sec: 12396.06 - lr: 0.000006 - momentum: 0.000000 2023-10-19 23:44:16,601 epoch 1 - iter 42/146 - loss 3.11771878 - time (sec): 1.08 - samples/sec: 12365.04 - lr: 0.000008 - momentum: 0.000000 2023-10-19 23:44:16,996 epoch 1 - iter 56/146 - loss 3.06935873 - time (sec): 1.47 - samples/sec: 12474.78 - lr: 0.000011 - momentum: 0.000000 2023-10-19 23:44:17,360 epoch 1 - iter 70/146 - loss 2.94558732 - time (sec): 1.84 - samples/sec: 12146.22 - lr: 0.000014 - momentum: 0.000000 2023-10-19 23:44:17,716 epoch 1 - iter 84/146 - loss 2.85761154 - time (sec): 2.19 - samples/sec: 11880.31 - lr: 0.000017 - momentum: 0.000000 2023-10-19 23:44:18,082 epoch 1 - iter 98/146 - loss 2.76266652 - time (sec): 2.56 - samples/sec: 11591.22 - lr: 0.000020 - momentum: 0.000000 2023-10-19 23:44:18,450 epoch 1 - iter 112/146 - loss 2.64147295 - time (sec): 2.93 - samples/sec: 11423.78 - lr: 0.000023 - momentum: 0.000000 2023-10-19 23:44:18,833 epoch 1 - iter 126/146 - loss 2.47431078 - time (sec): 3.31 - samples/sec: 11575.56 - lr: 0.000026 - momentum: 0.000000 2023-10-19 23:44:19,206 epoch 1 - iter 140/146 - loss 2.33923425 - time (sec): 3.68 - samples/sec: 11572.07 - lr: 0.000029 - momentum: 0.000000 2023-10-19 23:44:19,359 ---------------------------------------------------------------------------------------------------- 2023-10-19 23:44:19,359 EPOCH 1 done: loss 2.2888 - lr: 0.000029 2023-10-19 23:44:19,619 DEV : loss 0.4948265552520752 - f1-score (micro avg) 0.0 2023-10-19 23:44:19,623 ---------------------------------------------------------------------------------------------------- 2023-10-19 23:44:20,000 epoch 2 - iter 14/146 - loss 0.89608699 - time (sec): 0.38 - samples/sec: 13306.52 - lr: 0.000030 - momentum: 0.000000 2023-10-19 23:44:20,394 epoch 2 - iter 28/146 - loss 0.89649057 - time (sec): 0.77 - samples/sec: 12304.03 - lr: 0.000029 - momentum: 0.000000 2023-10-19 23:44:20,738 epoch 2 - iter 42/146 - loss 0.84222218 - time (sec): 1.11 - samples/sec: 12173.25 - lr: 0.000029 - momentum: 0.000000 2023-10-19 23:44:21,117 epoch 2 - iter 56/146 - loss 0.79613909 - time (sec): 1.49 - samples/sec: 12193.57 - lr: 0.000029 - momentum: 0.000000 2023-10-19 23:44:21,476 epoch 2 - iter 70/146 - loss 0.78551861 - time (sec): 1.85 - samples/sec: 11907.35 - lr: 0.000028 - momentum: 0.000000 2023-10-19 23:44:21,821 epoch 2 - iter 84/146 - loss 0.75375611 - time (sec): 2.20 - samples/sec: 11862.12 - lr: 0.000028 - momentum: 0.000000 2023-10-19 23:44:22,184 epoch 2 - iter 98/146 - loss 0.74151103 - time (sec): 2.56 - samples/sec: 11870.28 - lr: 0.000028 - momentum: 0.000000 2023-10-19 23:44:22,543 epoch 2 - iter 112/146 - loss 0.71853171 - time (sec): 2.92 - samples/sec: 11900.91 - lr: 0.000027 - momentum: 0.000000 2023-10-19 23:44:22,912 epoch 2 - iter 126/146 - loss 0.71739727 - time (sec): 3.29 - samples/sec: 
2023-10-19 23:44:23,285 epoch 2 - iter 140/146 - loss 0.70527078 - time (sec): 3.66 - samples/sec: 11683.78 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:44:23,449 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:23,449 EPOCH 2 done: loss 0.6978 - lr: 0.000027
2023-10-19 23:44:24,226 DEV : loss 0.4478610157966614 - f1-score (micro avg) 0.0
2023-10-19 23:44:24,230 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:24,610 epoch 3 - iter 14/146 - loss 0.52729291 - time (sec): 0.38 - samples/sec: 10801.90 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:44:24,995 epoch 3 - iter 28/146 - loss 0.76913273 - time (sec): 0.77 - samples/sec: 11142.54 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:44:25,349 epoch 3 - iter 42/146 - loss 0.70963320 - time (sec): 1.12 - samples/sec: 11012.77 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:44:25,711 epoch 3 - iter 56/146 - loss 0.68190985 - time (sec): 1.48 - samples/sec: 11349.56 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:44:26,062 epoch 3 - iter 70/146 - loss 0.67155315 - time (sec): 1.83 - samples/sec: 11191.48 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:44:26,467 epoch 3 - iter 84/146 - loss 0.67153067 - time (sec): 2.24 - samples/sec: 11401.35 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:44:26,850 epoch 3 - iter 98/146 - loss 0.64327103 - time (sec): 2.62 - samples/sec: 11614.99 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:44:27,209 epoch 3 - iter 112/146 - loss 0.63785309 - time (sec): 2.98 - samples/sec: 11429.05 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:44:27,573 epoch 3 - iter 126/146 - loss 0.63678561 - time (sec): 3.34 - samples/sec: 11301.70 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:44:27,952 epoch 3 - iter 140/146 - loss 0.62392639 - time (sec): 3.72 - samples/sec: 11373.10 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:44:28,114 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:28,114 EPOCH 3 done: loss 0.6187 - lr: 0.000024
2023-10-19 23:44:28,746 DEV : loss 0.3981289267539978 - f1-score (micro avg) 0.0
2023-10-19 23:44:28,750 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:29,120 epoch 4 - iter 14/146 - loss 0.63503612 - time (sec): 0.37 - samples/sec: 11639.73 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:44:29,488 epoch 4 - iter 28/146 - loss 0.59337698 - time (sec): 0.74 - samples/sec: 11813.68 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:44:29,835 epoch 4 - iter 42/146 - loss 0.56984192 - time (sec): 1.08 - samples/sec: 11951.58 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:44:30,208 epoch 4 - iter 56/146 - loss 0.57261996 - time (sec): 1.46 - samples/sec: 11688.45 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:44:30,561 epoch 4 - iter 70/146 - loss 0.56437890 - time (sec): 1.81 - samples/sec: 11798.31 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:44:30,939 epoch 4 - iter 84/146 - loss 0.57166166 - time (sec): 2.19 - samples/sec: 12008.91 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:44:31,320 epoch 4 - iter 98/146 - loss 0.56599506 - time (sec): 2.57 - samples/sec: 11885.50 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:44:31,677 epoch 4 - iter 112/146 - loss 0.57465624 - time (sec): 2.93 - samples/sec: 11691.02 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:44:32,081 epoch 4 - iter 126/146 - loss 0.57509407 - time (sec): 3.33 - samples/sec: 11765.31 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:44:32,441 epoch 4 - iter 140/146 - loss 0.57114258 - time (sec): 3.69 - samples/sec: 11508.61 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:44:32,609 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:32,609 EPOCH 4 done: loss 0.5647 - lr: 0.000020
2023-10-19 23:44:33,243 DEV : loss 0.3668285310268402 - f1-score (micro avg) 0.0
2023-10-19 23:44:33,247 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:33,625 epoch 5 - iter 14/146 - loss 0.55650998 - time (sec): 0.38 - samples/sec: 11263.82 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:44:33,993 epoch 5 - iter 28/146 - loss 0.53362327 - time (sec): 0.75 - samples/sec: 12001.69 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:44:34,409 epoch 5 - iter 42/146 - loss 0.58285022 - time (sec): 1.16 - samples/sec: 12431.71 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:44:34,763 epoch 5 - iter 56/146 - loss 0.58106535 - time (sec): 1.52 - samples/sec: 12046.57 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:44:35,114 epoch 5 - iter 70/146 - loss 0.56368880 - time (sec): 1.87 - samples/sec: 11846.91 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:44:35,480 epoch 5 - iter 84/146 - loss 0.53574076 - time (sec): 2.23 - samples/sec: 11697.72 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:44:35,849 epoch 5 - iter 98/146 - loss 0.53489871 - time (sec): 2.60 - samples/sec: 11695.35 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:44:36,203 epoch 5 - iter 112/146 - loss 0.53117474 - time (sec): 2.96 - samples/sec: 11663.53 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:44:36,565 epoch 5 - iter 126/146 - loss 0.51809149 - time (sec): 3.32 - samples/sec: 11629.06 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:44:36,927 epoch 5 - iter 140/146 - loss 0.51509631 - time (sec): 3.68 - samples/sec: 11624.32 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:44:37,079 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:37,079 EPOCH 5 done: loss 0.5092 - lr: 0.000017
2023-10-19 23:44:37,710 DEV : loss 0.3487622141838074 - f1-score (micro avg) 0.0083
2023-10-19 23:44:37,713 saving best model
2023-10-19 23:44:37,745 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:38,149 epoch 6 - iter 14/146 - loss 0.51717814 - time (sec): 0.40 - samples/sec: 12096.05 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:44:38,491 epoch 6 - iter 28/146 - loss 0.49475125 - time (sec): 0.75 - samples/sec: 12053.33 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:44:38,877 epoch 6 - iter 42/146 - loss 0.54127243 - time (sec): 1.13 - samples/sec: 12256.10 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:44:39,236 epoch 6 - iter 56/146 - loss 0.53929809 - time (sec): 1.49 - samples/sec: 11921.21 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:44:39,597 epoch 6 - iter 70/146 - loss 0.51898022 - time (sec): 1.85 - samples/sec: 11908.69 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:44:39,964 epoch 6 - iter 84/146 - loss 0.50390782 - time (sec): 2.22 - samples/sec: 12133.87 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:44:40,336 epoch 6 - iter 98/146 - loss 0.48555142 - time (sec): 2.59 - samples/sec: 12014.04 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:44:40,691 epoch 6 - iter 112/146 - loss 0.48469459 - time (sec): 2.95 - samples/sec: 11803.73 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:44:41,041 epoch 6 - iter 126/146 - loss 0.48715911 - time (sec): 3.30 - samples/sec: 11820.32 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:44:41,386 epoch 6 - iter 140/146 - loss 0.48139076 - time (sec): 3.64 - samples/sec: 11761.39 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:44:41,527 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:41,527 EPOCH 6 done: loss 0.4820 - lr: 0.000014
2023-10-19 23:44:42,180 DEV : loss 0.3351541757583618 - f1-score (micro avg) 0.0679
2023-10-19 23:44:42,184 saving best model
2023-10-19 23:44:42,216 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:42,570 epoch 7 - iter 14/146 - loss 0.47764801 - time (sec): 0.35 - samples/sec: 14298.75 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:44:43,096 epoch 7 - iter 28/146 - loss 0.45794388 - time (sec): 0.88 - samples/sec: 10329.59 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:44:43,445 epoch 7 - iter 42/146 - loss 0.47027425 - time (sec): 1.23 - samples/sec: 10481.69 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:44:43,823 epoch 7 - iter 56/146 - loss 0.45100771 - time (sec): 1.61 - samples/sec: 10674.83 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:44:44,195 epoch 7 - iter 70/146 - loss 0.44284275 - time (sec): 1.98 - samples/sec: 10814.55 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:44:44,563 epoch 7 - iter 84/146 - loss 0.44481757 - time (sec): 2.35 - samples/sec: 10792.93 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:44:44,920 epoch 7 - iter 98/146 - loss 0.44510379 - time (sec): 2.70 - samples/sec: 10806.74 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:44:45,284 epoch 7 - iter 112/146 - loss 0.47756519 - time (sec): 3.07 - samples/sec: 10808.58 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:44:45,677 epoch 7 - iter 126/146 - loss 0.46379674 - time (sec): 3.46 - samples/sec: 11069.61 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:44:46,026 epoch 7 - iter 140/146 - loss 0.46383134 - time (sec): 3.81 - samples/sec: 11120.16 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:44:46,184 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:46,184 EPOCH 7 done: loss 0.4605 - lr: 0.000010
2023-10-19 23:44:46,821 DEV : loss 0.32718950510025024 - f1-score (micro avg) 0.083
2023-10-19 23:44:46,824 saving best model
2023-10-19 23:44:46,857 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:47,231 epoch 8 - iter 14/146 - loss 0.40082104 - time (sec): 0.37 - samples/sec: 11297.68 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:44:47,602 epoch 8 - iter 28/146 - loss 0.43856548 - time (sec): 0.74 - samples/sec: 11082.51 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:44:47,957 epoch 8 - iter 42/146 - loss 0.42568329 - time (sec): 1.10 - samples/sec: 11375.15 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:44:48,314 epoch 8 - iter 56/146 - loss 0.41938503 - time (sec): 1.46 - samples/sec: 11501.56 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:44:48,677 epoch 8 - iter 70/146 - loss 0.42774033 - time (sec): 1.82 - samples/sec: 11521.24 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:44:49,045 epoch 8 - iter 84/146 - loss 0.42162970 - time (sec): 2.19 - samples/sec: 11441.58 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:49,406 epoch 8 - iter 98/146 - loss 0.43307071 - time (sec): 2.55 - samples/sec: 11520.70 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:49,799 epoch 8 - iter 112/146 - loss 0.45252921 - time (sec): 2.94 - samples/sec: 11621.53 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:50,174 epoch 8 - iter 126/146 - loss 0.44665343 - time (sec): 3.32 - samples/sec: 11457.78 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:44:50,568 epoch 8 - iter 140/146 - loss 0.44301821 - time (sec): 3.71 - samples/sec: 11504.61 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:44:50,729 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:50,729 EPOCH 8 done: loss 0.4432 - lr: 0.000007
2023-10-19 23:44:51,370 DEV : loss 0.32052454352378845 - f1-score (micro avg) 0.1176
2023-10-19 23:44:51,374 saving best model
2023-10-19 23:44:51,409 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:51,812 epoch 9 - iter 14/146 - loss 0.37507220 - time (sec): 0.40 - samples/sec: 11558.92 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:52,193 epoch 9 - iter 28/146 - loss 0.41563106 - time (sec): 0.78 - samples/sec: 10999.01 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:52,577 epoch 9 - iter 42/146 - loss 0.42311606 - time (sec): 1.17 - samples/sec: 11319.13 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:52,947 epoch 9 - iter 56/146 - loss 0.42789024 - time (sec): 1.54 - samples/sec: 11104.38 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:53,322 epoch 9 - iter 70/146 - loss 0.41968601 - time (sec): 1.91 - samples/sec: 10955.66 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:53,693 epoch 9 - iter 84/146 - loss 0.42445420 - time (sec): 2.28 - samples/sec: 10979.08 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:54,121 epoch 9 - iter 98/146 - loss 0.42967559 - time (sec): 2.71 - samples/sec: 11358.13 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:54,472 epoch 9 - iter 112/146 - loss 0.42924366 - time (sec): 3.06 - samples/sec: 11441.38 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:44:54,839 epoch 9 - iter 126/146 - loss 0.42551829 - time (sec): 3.43 - samples/sec: 11403.75 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:44:55,192 epoch 9 - iter 140/146 - loss 0.44062214 - time (sec): 3.78 - samples/sec: 11395.17 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:44:55,337 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:55,337 EPOCH 9 done: loss 0.4391 - lr: 0.000004
2023-10-19 23:44:55,981 DEV : loss 0.3207331597805023 - f1-score (micro avg) 0.1165
2023-10-19 23:44:55,985 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:56,358 epoch 10 - iter 14/146 - loss 0.34285569 - time (sec): 0.37 - samples/sec: 11481.65 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:56,718 epoch 10 - iter 28/146 - loss 0.37629423 - time (sec): 0.73 - samples/sec: 11923.53 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:57,097 epoch 10 - iter 42/146 - loss 0.41250358 - time (sec): 1.11 - samples/sec: 11169.51 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:57,481 epoch 10 - iter 56/146 - loss 0.42812388 - time (sec): 1.50 - samples/sec: 10918.64 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:57,837 epoch 10 - iter 70/146 - loss 0.44238178 - time (sec): 1.85 - samples/sec: 10866.33 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:58,209 epoch 10 - iter 84/146 - loss 0.43871459 - time (sec): 2.22 - samples/sec: 11070.59 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:58,579 epoch 10 - iter 98/146 - loss 0.42609753 - time (sec): 2.59 - samples/sec: 11088.13 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:58,979 epoch 10 - iter 112/146 - loss 0.42733808 - time (sec): 2.99 - samples/sec: 11307.42 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:59,336 epoch 10 - iter 126/146 - loss 0.42287074 - time (sec): 3.35 - samples/sec: 11308.63 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:59,702 epoch 10 - iter 140/146 - loss 0.43137751 - time (sec): 3.72 - samples/sec: 11490.41 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:44:59,853 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:59,853 EPOCH 10 done: loss 0.4357 - lr: 0.000000
2023-10-19 23:45:00,486 DEV : loss 0.32032662630081177 - f1-score (micro avg) 0.1222
2023-10-19 23:45:00,490 saving best model
2023-10-19 23:45:00,552 ----------------------------------------------------------------------------------------------------
2023-10-19 23:45:00,553 Loading model from best epoch ...
2023-10-19 23:45:00,621 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 23:45:01,684 Results:
- F-score (micro) 0.1326
- F-score (macro) 0.065
- Accuracy 0.0729

By class:
              precision    recall  f1-score   support

         PER     0.1821    0.1580    0.1692       348
         LOC     0.2174    0.0575    0.0909       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.1877    0.1025    0.1326       683
   macro avg     0.0999    0.0539    0.0650       683
weighted avg     0.1759    0.1025    0.1210       683
2023-10-19 23:45:01,684 ----------------------------------------------------------------------------------------------------
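The best checkpoint (dev micro F1 0.1222 at epoch 10; test micro F1 0.1326) is written to best-model.pt under the base path above. A minimal inference sketch, assuming that checkpoint path and the standard Flair prediction API; the Finnish example sentence is purely illustrative:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the fine-tuned tagger from the training base path reported in the log.
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

# Tag a sentence and print the predicted spans (PER, LOC, ORG, HumanProd),
# decoded from the BIOES tag dictionary listed above.
sentence = Sentence("Helsingin Sanomat kertoi eilen Suomen uutisia .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)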