2023-10-18 14:33:33,455 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,456 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:33:33,456 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,456 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18 14:33:33,456 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,456 Train: 1100 sentences
2023-10-18 14:33:33,456 (train_with_dev=False, train_with_test=False)
2023-10-18 14:33:33,456 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,456 Training Params:
2023-10-18 14:33:33,456 - learning_rate: "3e-05"
2023-10-18 14:33:33,456 - mini_batch_size: "4"
2023-10-18 14:33:33,456 - max_epochs: "10"
2023-10-18 14:33:33,457 - shuffle: "True"
2023-10-18 14:33:33,457 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,457 Plugins:
2023-10-18 14:33:33,457 - TensorboardLogger
2023-10-18 14:33:33,457 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:33:33,457 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,457 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:33:33,457 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:33:33,457 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,457 Computation:
2023-10-18 14:33:33,457 - compute on device: cuda:0
2023-10-18 14:33:33,457 - embedding storage: none
2023-10-18 14:33:33,457 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,457 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 14:33:33,457 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,457 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:33,457 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 14:33:34,901 epoch 1 - iter 27/275 - loss 3.95807539 - time (sec): 1.44 - samples/sec: 1696.55 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:33:35,335 epoch 1 - iter 54/275 - loss 4.02684268 - time (sec): 1.88 - samples/sec: 2500.12 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:33:35,764 epoch 1 - iter 81/275 - loss 3.93498797 - time (sec): 2.31 - samples/sec: 2931.71 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:33:36,175 epoch 1 - iter 108/275 - loss 3.82505787 - time (sec): 2.72 - samples/sec: 3244.87 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:33:36,591 epoch 1 - iter 135/275 - loss 3.62952957 - time (sec): 3.13 - samples/sec: 3618.88 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:33:37,001 epoch 1 - iter 162/275 - loss 3.41964347 - time (sec): 3.54 - samples/sec: 3831.26 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:33:37,399 epoch 1 - iter 189/275 - loss 3.22624157 - time (sec): 3.94 - samples/sec: 4001.92 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:33:37,800 epoch 1 - iter 216/275 - loss 3.00227363 - time (sec): 4.34 - samples/sec: 4182.55 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:33:38,209 epoch 1 - iter 243/275 - loss 2.81798890 - time (sec): 4.75 - samples/sec: 4237.42 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:33:38,605 epoch 1 - iter 270/275 - loss 2.66322507 - time (sec): 5.15 - samples/sec: 4342.81 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:33:38,682 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:38,682 EPOCH 1 done: loss 2.6324 - lr: 0.000029
2023-10-18 14:33:38,943 DEV : loss 0.8727257251739502 - f1-score (micro avg) 0.0
2023-10-18 14:33:38,947 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:39,348 epoch 2 - iter 27/275 - loss 0.95414304 - time (sec): 0.40 - samples/sec: 6156.96 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:33:39,760 epoch 2 - iter 54/275 - loss 1.00539324 - time (sec): 0.81 - samples/sec: 5799.67 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:33:40,168 epoch 2 - iter 81/275 - loss 1.03670034 - time (sec): 1.22 - samples/sec: 5747.24 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:33:40,578 epoch 2 - iter 108/275 - loss 1.01175848 - time (sec): 1.63 - samples/sec: 5675.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:33:40,990 epoch 2 - iter 135/275 - loss 1.00671150 - time (sec): 2.04 - samples/sec: 5516.96 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:33:41,395 epoch 2 - iter 162/275 - loss 0.97791266 - time (sec): 2.45 - samples/sec: 5565.61 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:33:41,806 epoch 2 - iter 189/275 - loss 0.95352470 - time (sec): 2.86 - samples/sec: 5565.32 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:33:42,205 epoch 2 - iter 216/275 - loss 0.93400324 - time (sec): 3.26 - samples/sec: 5490.92 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:33:42,618 epoch 2 - iter 243/275 - loss 0.90908336 - time (sec): 3.67 - samples/sec: 5528.79 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:33:43,013 epoch 2 - iter 270/275 - loss 0.89179470 - time (sec): 4.07 - samples/sec: 5493.60 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:33:43,086 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:43,086 EPOCH 2 done: loss 0.8891 - lr: 0.000027
2023-10-18 14:33:43,458 DEV : loss 0.690082848072052 - f1-score (micro avg) 0.0
2023-10-18 14:33:43,465 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:43,868 epoch 3 - iter 27/275 - loss 0.79786830 - time (sec): 0.40 - samples/sec: 5628.57 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:33:44,271 epoch 3 - iter 54/275 - loss 0.73841948 - time (sec): 0.81 - samples/sec: 5612.54 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:33:44,684 epoch 3 - iter 81/275 - loss 0.76137491 - time (sec): 1.22 - samples/sec: 5654.41 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:33:45,222 epoch 3 - iter 108/275 - loss 0.72902040 - time (sec): 1.76 - samples/sec: 5272.71 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:33:45,637 epoch 3 - iter 135/275 - loss 0.73347067 - time (sec): 2.17 - samples/sec: 5275.39 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:33:46,032 epoch 3 - iter 162/275 - loss 0.71535696 - time (sec): 2.57 - samples/sec: 5337.01 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:33:46,451 epoch 3 - iter 189/275 - loss 0.70609375 - time (sec): 2.99 - samples/sec: 5330.98 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:33:46,863 epoch 3 - iter 216/275 - loss 0.70424713 - time (sec): 3.40 - samples/sec: 5323.43 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:33:47,271 epoch 3 - iter 243/275 - loss 0.70311682 - time (sec): 3.81 - samples/sec: 5323.86 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:33:47,675 epoch 3 - iter 270/275 - loss 0.70164780 - time (sec): 4.21 - samples/sec: 5333.31 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:33:47,745 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:47,746 EPOCH 3 done: loss 0.6995 - lr: 0.000023
2023-10-18 14:33:48,107 DEV : loss 0.5440794825553894 - f1-score (micro avg) 0.1311
2023-10-18 14:33:48,111 saving best model
2023-10-18 14:33:48,147 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:48,544 epoch 4 - iter 27/275 - loss 0.68168972 - time (sec): 0.40 - samples/sec: 5570.72 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:33:48,938 epoch 4 - iter 54/275 - loss 0.64810506 - time (sec): 0.79 - samples/sec: 5518.00 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:33:49,349 epoch 4 - iter 81/275 - loss 0.61425076 - time (sec): 1.20 - samples/sec: 5321.55 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:33:49,753 epoch 4 - iter 108/275 - loss 0.63426737 - time (sec): 1.61 - samples/sec: 5357.68 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:33:50,162 epoch 4 - iter 135/275 - loss 0.61651865 - time (sec): 2.01 - samples/sec: 5551.61 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:33:50,569 epoch 4 - iter 162/275 - loss 0.60018412 - time (sec): 2.42 - samples/sec: 5531.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:33:50,985 epoch 4 - iter 189/275 - loss 0.59771216 - time (sec): 2.84 - samples/sec: 5518.53 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:33:51,401 epoch 4 - iter 216/275 - loss 0.58918031 - time (sec): 3.25 - samples/sec: 5523.57 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:33:51,811 epoch 4 - iter 243/275 - loss 0.58391359 - time (sec): 3.66 - samples/sec: 5518.71 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:33:52,214 epoch 4 - iter 270/275 - loss 0.57886356 - time (sec): 4.07 - samples/sec: 5491.47 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:33:52,293 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:52,293 EPOCH 4 done: loss 0.5788 - lr: 0.000020
2023-10-18 14:33:52,651 DEV : loss 0.446980744600296 - f1-score (micro avg) 0.3195
2023-10-18 14:33:52,655 saving best model
2023-10-18 14:33:52,692 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:53,118 epoch 5 - iter 27/275 - loss 0.47038975 - time (sec): 0.42 - samples/sec: 6169.87 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:33:53,533 epoch 5 - iter 54/275 - loss 0.49156303 - time (sec): 0.84 - samples/sec: 5549.68 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:33:53,934 epoch 5 - iter 81/275 - loss 0.50829332 - time (sec): 1.24 - samples/sec: 5562.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:33:54,345 epoch 5 - iter 108/275 - loss 0.51783415 - time (sec): 1.65 - samples/sec: 5661.14 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:33:54,754 epoch 5 - iter 135/275 - loss 0.52054288 - time (sec): 2.06 - samples/sec: 5532.59 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:33:55,152 epoch 5 - iter 162/275 - loss 0.52045580 - time (sec): 2.46 - samples/sec: 5529.64 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:33:55,550 epoch 5 - iter 189/275 - loss 0.52141928 - time (sec): 2.86 - samples/sec: 5513.83 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:33:55,986 epoch 5 - iter 216/275 - loss 0.52400933 - time (sec): 3.29 - samples/sec: 5438.15 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:33:56,388 epoch 5 - iter 243/275 - loss 0.51808409 - time (sec): 3.69 - samples/sec: 5417.92 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:33:56,801 epoch 5 - iter 270/275 - loss 0.51933400 - time (sec): 4.11 - samples/sec: 5446.28 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:33:56,874 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:56,874 EPOCH 5 done: loss 0.5164 - lr: 0.000017
2023-10-18 14:33:57,240 DEV : loss 0.3829083740711212 - f1-score (micro avg) 0.4412
2023-10-18 14:33:57,244 saving best model
2023-10-18 14:33:57,280 ----------------------------------------------------------------------------------------------------
2023-10-18 14:33:57,671 epoch 6 - iter 27/275 - loss 0.60568622 - time (sec): 0.39 - samples/sec: 5055.69 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:33:58,062 epoch 6 - iter 54/275 - loss 0.51501565 - time (sec): 0.78 - samples/sec: 5417.32 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:33:58,466 epoch 6 - iter 81/275 - loss 0.52007717 - time (sec): 1.19 - samples/sec: 5542.45 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:33:58,875 epoch 6 - iter 108/275 - loss 0.50831316 - time (sec): 1.59 - samples/sec: 5541.06 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:33:59,277 epoch 6 - iter 135/275 - loss 0.48824914 - time (sec): 2.00 - samples/sec: 5482.41 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:33:59,688 epoch 6 - iter 162/275 - loss 0.48448898 - time (sec): 2.41 - samples/sec: 5579.49 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:34:00,104 epoch 6 - iter 189/275 - loss 0.47906986 - time (sec): 2.82 - samples/sec: 5542.73 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:34:00,518 epoch 6 - iter 216/275 - loss 0.47103482 - time (sec): 3.24 - samples/sec: 5512.00 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:34:00,925 epoch 6 - iter 243/275 - loss 0.47641128 - time (sec): 3.64 - samples/sec: 5565.64 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:34:01,335 epoch 6 - iter 270/275 - loss 0.47212454 - time (sec): 4.05 - samples/sec: 5528.51 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:34:01,411 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:01,411 EPOCH 6 done: loss 0.4711 - lr: 0.000013
2023-10-18 14:34:01,779 DEV : loss 0.35361379384994507 - f1-score (micro avg) 0.5
2023-10-18 14:34:01,783 saving best model
2023-10-18 14:34:01,818 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:02,230 epoch 7 - iter 27/275 - loss 0.51653193 - time (sec): 0.41 - samples/sec: 5057.55 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:34:02,649 epoch 7 - iter 54/275 - loss 0.47829517 - time (sec): 0.83 - samples/sec: 5139.10 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:34:03,056 epoch 7 - iter 81/275 - loss 0.47507434 - time (sec): 1.24 - samples/sec: 5124.10 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:34:03,467 epoch 7 - iter 108/275 - loss 0.46524549 - time (sec): 1.65 - samples/sec: 5065.57 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:34:03,840 epoch 7 - iter 135/275 - loss 0.45425778 - time (sec): 2.02 - samples/sec: 5284.03 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:34:04,214 epoch 7 - iter 162/275 - loss 0.44426560 - time (sec): 2.40 - samples/sec: 5469.89 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:34:04,589 epoch 7 - iter 189/275 - loss 0.45075310 - time (sec): 2.77 - samples/sec: 5577.78 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:34:04,963 epoch 7 - iter 216/275 - loss 0.44430239 - time (sec): 3.14 - samples/sec: 5653.18 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:34:05,336 epoch 7 - iter 243/275 - loss 0.43835253 - time (sec): 3.52 - samples/sec: 5736.85 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:34:05,707 epoch 7 - iter 270/275 - loss 0.44103139 - time (sec): 3.89 - samples/sec: 5772.09 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:34:05,774 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:05,774 EPOCH 7 done: loss 0.4392 - lr: 0.000010
2023-10-18 14:34:06,139 DEV : loss 0.3377951979637146 - f1-score (micro avg) 0.5332
2023-10-18 14:34:06,143 saving best model
2023-10-18 14:34:06,177 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:06,542 epoch 8 - iter 27/275 - loss 0.46120490 - time (sec): 0.36 - samples/sec: 6245.19 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:34:06,916 epoch 8 - iter 54/275 - loss 0.43711143 - time (sec): 0.74 - samples/sec: 5782.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:34:07,322 epoch 8 - iter 81/275 - loss 0.42993624 - time (sec): 1.14 - samples/sec: 5625.05 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:34:07,747 epoch 8 - iter 108/275 - loss 0.43798123 - time (sec): 1.57 - samples/sec: 5618.59 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:34:08,155 epoch 8 - iter 135/275 - loss 0.41689337 - time (sec): 1.98 - samples/sec: 5691.67 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:34:08,558 epoch 8 - iter 162/275 - loss 0.40611288 - time (sec): 2.38 - samples/sec: 5626.24 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:34:08,978 epoch 8 - iter 189/275 - loss 0.41126376 - time (sec): 2.80 - samples/sec: 5564.85 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:34:09,372 epoch 8 - iter 216/275 - loss 0.41874573 - time (sec): 3.19 - samples/sec: 5530.94 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:34:09,785 epoch 8 - iter 243/275 - loss 0.42031093 - time (sec): 3.61 - samples/sec: 5527.18 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:34:10,200 epoch 8 - iter 270/275 - loss 0.41959777 - time (sec): 4.02 - samples/sec: 5558.55 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:34:10,283 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:10,284 EPOCH 8 done: loss 0.4179 - lr: 0.000007
2023-10-18 14:34:10,653 DEV : loss 0.3256723880767822 - f1-score (micro avg) 0.5433
2023-10-18 14:34:10,657 saving best model
2023-10-18 14:34:10,693 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:11,121 epoch 9 - iter 27/275 - loss 0.40590119 - time (sec): 0.43 - samples/sec: 5380.00 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:34:11,522 epoch 9 - iter 54/275 - loss 0.41779296 - time (sec): 0.83 - samples/sec: 5553.98 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:34:11,937 epoch 9 - iter 81/275 - loss 0.41452508 - time (sec): 1.24 - samples/sec: 5468.20 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:34:12,330 epoch 9 - iter 108/275 - loss 0.43171475 - time (sec): 1.64 - samples/sec: 5435.80 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:34:12,730 epoch 9 - iter 135/275 - loss 0.44049718 - time (sec): 2.04 - samples/sec: 5420.83 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:34:13,137 epoch 9 - iter 162/275 - loss 0.43920082 - time (sec): 2.44 - samples/sec: 5429.86 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:34:13,545 epoch 9 - iter 189/275 - loss 0.42865859 - time (sec): 2.85 - samples/sec: 5430.42 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:34:13,961 epoch 9 - iter 216/275 - loss 0.42145289 - time (sec): 3.27 - samples/sec: 5461.56 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:34:14,364 epoch 9 - iter 243/275 - loss 0.42176964 - time (sec): 3.67 - samples/sec: 5563.73 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:34:14,763 epoch 9 - iter 270/275 - loss 0.41965621 - time (sec): 4.07 - samples/sec: 5493.05 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:34:14,840 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:14,841 EPOCH 9 done: loss 0.4216 - lr: 0.000003
2023-10-18 14:34:15,212 DEV : loss 0.320939838886261 - f1-score (micro avg) 0.5464
2023-10-18 14:34:15,216 saving best model
2023-10-18 14:34:15,250 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:15,638 epoch 10 - iter 27/275 - loss 0.35279704 - time (sec): 0.39 - samples/sec: 5775.64 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:34:16,040 epoch 10 - iter 54/275 - loss 0.39298011 - time (sec): 0.79 - samples/sec: 5759.75 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:34:16,440 epoch 10 - iter 81/275 - loss 0.40468269 - time (sec): 1.19 - samples/sec: 5509.45 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:34:16,854 epoch 10 - iter 108/275 - loss 0.39740579 - time (sec): 1.60 - samples/sec: 5443.48 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:34:17,257 epoch 10 - iter 135/275 - loss 0.40381656 - time (sec): 2.01 - samples/sec: 5498.53 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:34:17,655 epoch 10 - iter 162/275 - loss 0.41005123 - time (sec): 2.40 - samples/sec: 5561.44 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:34:18,067 epoch 10 - iter 189/275 - loss 0.42288410 - time (sec): 2.82 - samples/sec: 5606.36 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:34:18,477 epoch 10 - iter 216/275 - loss 0.41179863 - time (sec): 3.23 - samples/sec: 5613.41 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:34:18,883 epoch 10 - iter 243/275 - loss 0.40928966 - time (sec): 3.63 - samples/sec: 5552.26 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:34:19,287 epoch 10 - iter 270/275 - loss 0.40508631 - time (sec): 4.04 - samples/sec: 5543.71 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:34:19,371 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:19,371 EPOCH 10 done: loss 0.4038 - lr: 0.000000
2023-10-18 14:34:19,737 DEV : loss 0.3195188045501709 - f1-score (micro avg) 0.5483
2023-10-18 14:34:19,741 saving best model
2023-10-18 14:34:19,802 ----------------------------------------------------------------------------------------------------
2023-10-18 14:34:19,803 Loading model from best epoch ...
2023-10-18 14:34:19,882 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:34:20,162
Results:
- F-score (micro) 0.587
- F-score (macro) 0.3506
- Accuracy 0.4269

By class:
              precision    recall  f1-score   support

       scope     0.5707    0.5966    0.5833       176
        pers     0.8452    0.5547    0.6698       128
        work     0.4651    0.5405    0.5000        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.6102    0.5654    0.5870       382
   macro avg     0.3762    0.3384    0.3506       382
weighted avg     0.6362    0.5654    0.5901       382

2023-10-18 14:34:20,162 ----------------------------------------------------------------------------------------------------