2023-10-18 16:00:35,621 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,621 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:00:35,621 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Train: 1214 sentences
2023-10-18 16:00:35,622 (train_with_dev=False, train_with_test=False)
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Training Params:
2023-10-18 16:00:35,622 - learning_rate: "3e-05"
2023-10-18 16:00:35,622 - mini_batch_size: "8"
2023-10-18 16:00:35,622 - max_epochs: "10"
2023-10-18 16:00:35,622 - shuffle: "True"
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Plugins:
2023-10-18 16:00:35,622 - TensorboardLogger
2023-10-18 16:00:35,622 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:00:35,622 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Computation:
2023-10-18 16:00:35,622 - compute on device: cuda:0
2023-10-18 16:00:35,622 - embedding storage: none
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 ----------------------------------------------------------------------------------------------------
2023-10-18 16:00:35,622 Logging anything other than scalars to TensorBoard 
is currently not supported. 2023-10-18 16:00:35,952 epoch 1 - iter 15/152 - loss 4.00789166 - time (sec): 0.33 - samples/sec: 9411.10 - lr: 0.000003 - momentum: 0.000000 2023-10-18 16:00:36,267 epoch 1 - iter 30/152 - loss 3.95672617 - time (sec): 0.64 - samples/sec: 9325.93 - lr: 0.000006 - momentum: 0.000000 2023-10-18 16:00:36,608 epoch 1 - iter 45/152 - loss 3.94624857 - time (sec): 0.99 - samples/sec: 9122.88 - lr: 0.000009 - momentum: 0.000000 2023-10-18 16:00:36,931 epoch 1 - iter 60/152 - loss 3.86046928 - time (sec): 1.31 - samples/sec: 9095.29 - lr: 0.000012 - momentum: 0.000000 2023-10-18 16:00:37,260 epoch 1 - iter 75/152 - loss 3.75215895 - time (sec): 1.64 - samples/sec: 9034.65 - lr: 0.000015 - momentum: 0.000000 2023-10-18 16:00:37,587 epoch 1 - iter 90/152 - loss 3.61600245 - time (sec): 1.96 - samples/sec: 8945.79 - lr: 0.000018 - momentum: 0.000000 2023-10-18 16:00:37,921 epoch 1 - iter 105/152 - loss 3.44287038 - time (sec): 2.30 - samples/sec: 9090.25 - lr: 0.000021 - momentum: 0.000000 2023-10-18 16:00:38,242 epoch 1 - iter 120/152 - loss 3.25685765 - time (sec): 2.62 - samples/sec: 9181.53 - lr: 0.000023 - momentum: 0.000000 2023-10-18 16:00:38,576 epoch 1 - iter 135/152 - loss 3.06295056 - time (sec): 2.95 - samples/sec: 9269.59 - lr: 0.000026 - momentum: 0.000000 2023-10-18 16:00:38,898 epoch 1 - iter 150/152 - loss 2.86590136 - time (sec): 3.27 - samples/sec: 9371.93 - lr: 0.000029 - momentum: 0.000000 2023-10-18 16:00:38,939 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:38,939 EPOCH 1 done: loss 2.8469 - lr: 0.000029 2023-10-18 16:00:39,425 DEV : loss 0.8295782804489136 - f1-score (micro avg) 0.0 2023-10-18 16:00:39,431 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:39,774 epoch 2 - iter 15/152 - loss 0.99260390 - time (sec): 0.34 - samples/sec: 9536.82 - lr: 0.000030 - momentum: 0.000000 
2023-10-18 16:00:40,116 epoch 2 - iter 30/152 - loss 0.93274629 - time (sec): 0.68 - samples/sec: 9108.79 - lr: 0.000029 - momentum: 0.000000 2023-10-18 16:00:40,450 epoch 2 - iter 45/152 - loss 0.92656572 - time (sec): 1.02 - samples/sec: 9180.02 - lr: 0.000029 - momentum: 0.000000 2023-10-18 16:00:40,802 epoch 2 - iter 60/152 - loss 0.90640024 - time (sec): 1.37 - samples/sec: 9238.96 - lr: 0.000029 - momentum: 0.000000 2023-10-18 16:00:41,130 epoch 2 - iter 75/152 - loss 0.86234367 - time (sec): 1.70 - samples/sec: 9071.63 - lr: 0.000028 - momentum: 0.000000 2023-10-18 16:00:41,469 epoch 2 - iter 90/152 - loss 0.85598470 - time (sec): 2.04 - samples/sec: 8981.23 - lr: 0.000028 - momentum: 0.000000 2023-10-18 16:00:41,809 epoch 2 - iter 105/152 - loss 0.81272990 - time (sec): 2.38 - samples/sec: 9095.02 - lr: 0.000028 - momentum: 0.000000 2023-10-18 16:00:42,139 epoch 2 - iter 120/152 - loss 0.79453814 - time (sec): 2.71 - samples/sec: 9060.68 - lr: 0.000027 - momentum: 0.000000 2023-10-18 16:00:42,470 epoch 2 - iter 135/152 - loss 0.78840405 - time (sec): 3.04 - samples/sec: 9124.59 - lr: 0.000027 - momentum: 0.000000 2023-10-18 16:00:42,789 epoch 2 - iter 150/152 - loss 0.79761637 - time (sec): 3.36 - samples/sec: 9132.75 - lr: 0.000027 - momentum: 0.000000 2023-10-18 16:00:42,827 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:42,827 EPOCH 2 done: loss 0.7947 - lr: 0.000027 2023-10-18 16:00:43,325 DEV : loss 0.6788879632949829 - f1-score (micro avg) 0.0 2023-10-18 16:00:43,333 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:43,676 epoch 3 - iter 15/152 - loss 0.61489160 - time (sec): 0.34 - samples/sec: 8567.73 - lr: 0.000026 - momentum: 0.000000 2023-10-18 16:00:44,040 epoch 3 - iter 30/152 - loss 0.67262216 - time (sec): 0.71 - samples/sec: 8964.31 - lr: 0.000026 - momentum: 0.000000 2023-10-18 16:00:44,375 epoch 3 - 
iter 45/152 - loss 0.66018727 - time (sec): 1.04 - samples/sec: 9000.41 - lr: 0.000026 - momentum: 0.000000 2023-10-18 16:00:44,717 epoch 3 - iter 60/152 - loss 0.64421175 - time (sec): 1.38 - samples/sec: 9295.22 - lr: 0.000025 - momentum: 0.000000 2023-10-18 16:00:45,046 epoch 3 - iter 75/152 - loss 0.63940197 - time (sec): 1.71 - samples/sec: 9317.91 - lr: 0.000025 - momentum: 0.000000 2023-10-18 16:00:45,377 epoch 3 - iter 90/152 - loss 0.64900449 - time (sec): 2.04 - samples/sec: 9268.57 - lr: 0.000025 - momentum: 0.000000 2023-10-18 16:00:45,704 epoch 3 - iter 105/152 - loss 0.65768230 - time (sec): 2.37 - samples/sec: 9203.10 - lr: 0.000024 - momentum: 0.000000 2023-10-18 16:00:46,016 epoch 3 - iter 120/152 - loss 0.63905495 - time (sec): 2.68 - samples/sec: 9183.02 - lr: 0.000024 - momentum: 0.000000 2023-10-18 16:00:46,344 epoch 3 - iter 135/152 - loss 0.64040721 - time (sec): 3.01 - samples/sec: 9179.57 - lr: 0.000024 - momentum: 0.000000 2023-10-18 16:00:46,687 epoch 3 - iter 150/152 - loss 0.63819110 - time (sec): 3.35 - samples/sec: 9126.82 - lr: 0.000023 - momentum: 0.000000 2023-10-18 16:00:46,733 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:46,734 EPOCH 3 done: loss 0.6365 - lr: 0.000023 2023-10-18 16:00:47,248 DEV : loss 0.5025841593742371 - f1-score (micro avg) 0.0135 2023-10-18 16:00:47,253 saving best model 2023-10-18 16:00:47,286 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:47,641 epoch 4 - iter 15/152 - loss 0.57966195 - time (sec): 0.35 - samples/sec: 8685.86 - lr: 0.000023 - momentum: 0.000000 2023-10-18 16:00:47,969 epoch 4 - iter 30/152 - loss 0.62149453 - time (sec): 0.68 - samples/sec: 9045.65 - lr: 0.000023 - momentum: 0.000000 2023-10-18 16:00:48,294 epoch 4 - iter 45/152 - loss 0.59983084 - time (sec): 1.01 - samples/sec: 9258.20 - lr: 0.000022 - momentum: 0.000000 2023-10-18 16:00:48,612 
epoch 4 - iter 60/152 - loss 0.59316151 - time (sec): 1.33 - samples/sec: 9296.56 - lr: 0.000022 - momentum: 0.000000 2023-10-18 16:00:48,945 epoch 4 - iter 75/152 - loss 0.57133209 - time (sec): 1.66 - samples/sec: 9213.56 - lr: 0.000022 - momentum: 0.000000 2023-10-18 16:00:49,274 epoch 4 - iter 90/152 - loss 0.56352174 - time (sec): 1.99 - samples/sec: 9152.12 - lr: 0.000021 - momentum: 0.000000 2023-10-18 16:00:49,603 epoch 4 - iter 105/152 - loss 0.55294877 - time (sec): 2.32 - samples/sec: 9222.83 - lr: 0.000021 - momentum: 0.000000 2023-10-18 16:00:49,941 epoch 4 - iter 120/152 - loss 0.53978253 - time (sec): 2.65 - samples/sec: 9254.58 - lr: 0.000021 - momentum: 0.000000 2023-10-18 16:00:50,280 epoch 4 - iter 135/152 - loss 0.53049593 - time (sec): 2.99 - samples/sec: 9193.27 - lr: 0.000020 - momentum: 0.000000 2023-10-18 16:00:50,633 epoch 4 - iter 150/152 - loss 0.52931331 - time (sec): 3.35 - samples/sec: 9153.02 - lr: 0.000020 - momentum: 0.000000 2023-10-18 16:00:50,677 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:50,678 EPOCH 4 done: loss 0.5276 - lr: 0.000020 2023-10-18 16:00:51,183 DEV : loss 0.4168972373008728 - f1-score (micro avg) 0.1581 2023-10-18 16:00:51,189 saving best model 2023-10-18 16:00:51,224 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:51,565 epoch 5 - iter 15/152 - loss 0.53617851 - time (sec): 0.34 - samples/sec: 9260.63 - lr: 0.000020 - momentum: 0.000000 2023-10-18 16:00:51,895 epoch 5 - iter 30/152 - loss 0.53803594 - time (sec): 0.67 - samples/sec: 9563.98 - lr: 0.000019 - momentum: 0.000000 2023-10-18 16:00:52,224 epoch 5 - iter 45/152 - loss 0.49619181 - time (sec): 1.00 - samples/sec: 9636.72 - lr: 0.000019 - momentum: 0.000000 2023-10-18 16:00:52,564 epoch 5 - iter 60/152 - loss 0.47745090 - time (sec): 1.34 - samples/sec: 9360.27 - lr: 0.000019 - momentum: 0.000000 2023-10-18 
16:00:52,895 epoch 5 - iter 75/152 - loss 0.46797898 - time (sec): 1.67 - samples/sec: 9352.21 - lr: 0.000018 - momentum: 0.000000 2023-10-18 16:00:53,239 epoch 5 - iter 90/152 - loss 0.47419580 - time (sec): 2.01 - samples/sec: 9180.36 - lr: 0.000018 - momentum: 0.000000 2023-10-18 16:00:53,588 epoch 5 - iter 105/152 - loss 0.48269217 - time (sec): 2.36 - samples/sec: 9200.35 - lr: 0.000018 - momentum: 0.000000 2023-10-18 16:00:53,932 epoch 5 - iter 120/152 - loss 0.48859525 - time (sec): 2.71 - samples/sec: 9085.60 - lr: 0.000017 - momentum: 0.000000 2023-10-18 16:00:54,276 epoch 5 - iter 135/152 - loss 0.48301814 - time (sec): 3.05 - samples/sec: 9079.44 - lr: 0.000017 - momentum: 0.000000 2023-10-18 16:00:54,620 epoch 5 - iter 150/152 - loss 0.47519403 - time (sec): 3.40 - samples/sec: 9054.14 - lr: 0.000017 - momentum: 0.000000 2023-10-18 16:00:54,658 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:54,659 EPOCH 5 done: loss 0.4745 - lr: 0.000017 2023-10-18 16:00:55,162 DEV : loss 0.370134562253952 - f1-score (micro avg) 0.2869 2023-10-18 16:00:55,168 saving best model 2023-10-18 16:00:55,205 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:55,558 epoch 6 - iter 15/152 - loss 0.44026281 - time (sec): 0.35 - samples/sec: 8003.51 - lr: 0.000016 - momentum: 0.000000 2023-10-18 16:00:55,899 epoch 6 - iter 30/152 - loss 0.43017528 - time (sec): 0.69 - samples/sec: 8228.85 - lr: 0.000016 - momentum: 0.000000 2023-10-18 16:00:56,253 epoch 6 - iter 45/152 - loss 0.41910625 - time (sec): 1.05 - samples/sec: 8645.23 - lr: 0.000016 - momentum: 0.000000 2023-10-18 16:00:56,591 epoch 6 - iter 60/152 - loss 0.44785130 - time (sec): 1.39 - samples/sec: 8715.12 - lr: 0.000015 - momentum: 0.000000 2023-10-18 16:00:56,949 epoch 6 - iter 75/152 - loss 0.43373922 - time (sec): 1.74 - samples/sec: 8738.02 - lr: 0.000015 - momentum: 0.000000 
2023-10-18 16:00:57,296 epoch 6 - iter 90/152 - loss 0.43274154 - time (sec): 2.09 - samples/sec: 8720.85 - lr: 0.000015 - momentum: 0.000000 2023-10-18 16:00:57,640 epoch 6 - iter 105/152 - loss 0.43008866 - time (sec): 2.43 - samples/sec: 8755.25 - lr: 0.000014 - momentum: 0.000000 2023-10-18 16:00:57,978 epoch 6 - iter 120/152 - loss 0.41982891 - time (sec): 2.77 - samples/sec: 8749.43 - lr: 0.000014 - momentum: 0.000000 2023-10-18 16:00:58,311 epoch 6 - iter 135/152 - loss 0.42569748 - time (sec): 3.11 - samples/sec: 8745.17 - lr: 0.000014 - momentum: 0.000000 2023-10-18 16:00:58,653 epoch 6 - iter 150/152 - loss 0.41690779 - time (sec): 3.45 - samples/sec: 8875.12 - lr: 0.000013 - momentum: 0.000000 2023-10-18 16:00:58,699 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:58,699 EPOCH 6 done: loss 0.4170 - lr: 0.000013 2023-10-18 16:00:59,207 DEV : loss 0.3444247543811798 - f1-score (micro avg) 0.3721 2023-10-18 16:00:59,212 saving best model 2023-10-18 16:00:59,252 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:00:59,588 epoch 7 - iter 15/152 - loss 0.41682993 - time (sec): 0.33 - samples/sec: 8798.49 - lr: 0.000013 - momentum: 0.000000 2023-10-18 16:00:59,920 epoch 7 - iter 30/152 - loss 0.43016752 - time (sec): 0.67 - samples/sec: 9035.92 - lr: 0.000013 - momentum: 0.000000 2023-10-18 16:01:00,245 epoch 7 - iter 45/152 - loss 0.41688699 - time (sec): 0.99 - samples/sec: 9082.43 - lr: 0.000012 - momentum: 0.000000 2023-10-18 16:01:00,578 epoch 7 - iter 60/152 - loss 0.41974238 - time (sec): 1.33 - samples/sec: 9121.79 - lr: 0.000012 - momentum: 0.000000 2023-10-18 16:01:00,924 epoch 7 - iter 75/152 - loss 0.41752012 - time (sec): 1.67 - samples/sec: 9197.13 - lr: 0.000012 - momentum: 0.000000 2023-10-18 16:01:01,256 epoch 7 - iter 90/152 - loss 0.41605449 - time (sec): 2.00 - samples/sec: 9143.60 - lr: 0.000011 - momentum: 
0.000000 2023-10-18 16:01:01,585 epoch 7 - iter 105/152 - loss 0.40709899 - time (sec): 2.33 - samples/sec: 9120.16 - lr: 0.000011 - momentum: 0.000000 2023-10-18 16:01:01,941 epoch 7 - iter 120/152 - loss 0.40385903 - time (sec): 2.69 - samples/sec: 9044.78 - lr: 0.000011 - momentum: 0.000000 2023-10-18 16:01:02,291 epoch 7 - iter 135/152 - loss 0.39384345 - time (sec): 3.04 - samples/sec: 9042.18 - lr: 0.000010 - momentum: 0.000000 2023-10-18 16:01:02,649 epoch 7 - iter 150/152 - loss 0.40143075 - time (sec): 3.40 - samples/sec: 9028.61 - lr: 0.000010 - momentum: 0.000000 2023-10-18 16:01:02,693 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:01:02,693 EPOCH 7 done: loss 0.4003 - lr: 0.000010 2023-10-18 16:01:03,203 DEV : loss 0.33269548416137695 - f1-score (micro avg) 0.3965 2023-10-18 16:01:03,208 saving best model 2023-10-18 16:01:03,241 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:01:03,572 epoch 8 - iter 15/152 - loss 0.35671434 - time (sec): 0.33 - samples/sec: 8014.59 - lr: 0.000010 - momentum: 0.000000 2023-10-18 16:01:03,912 epoch 8 - iter 30/152 - loss 0.37272681 - time (sec): 0.67 - samples/sec: 8473.49 - lr: 0.000009 - momentum: 0.000000 2023-10-18 16:01:04,249 epoch 8 - iter 45/152 - loss 0.34943248 - time (sec): 1.01 - samples/sec: 8565.99 - lr: 0.000009 - momentum: 0.000000 2023-10-18 16:01:04,584 epoch 8 - iter 60/152 - loss 0.37313032 - time (sec): 1.34 - samples/sec: 8790.18 - lr: 0.000009 - momentum: 0.000000 2023-10-18 16:01:04,918 epoch 8 - iter 75/152 - loss 0.37034114 - time (sec): 1.68 - samples/sec: 9042.69 - lr: 0.000008 - momentum: 0.000000 2023-10-18 16:01:05,266 epoch 8 - iter 90/152 - loss 0.37029655 - time (sec): 2.02 - samples/sec: 8883.17 - lr: 0.000008 - momentum: 0.000000 2023-10-18 16:01:05,607 epoch 8 - iter 105/152 - loss 0.37200022 - time (sec): 2.37 - samples/sec: 8995.03 - lr: 0.000008 - 
momentum: 0.000000 2023-10-18 16:01:05,943 epoch 8 - iter 120/152 - loss 0.37318661 - time (sec): 2.70 - samples/sec: 9067.77 - lr: 0.000007 - momentum: 0.000000 2023-10-18 16:01:06,276 epoch 8 - iter 135/152 - loss 0.38006961 - time (sec): 3.03 - samples/sec: 9102.49 - lr: 0.000007 - momentum: 0.000000 2023-10-18 16:01:06,619 epoch 8 - iter 150/152 - loss 0.37960691 - time (sec): 3.38 - samples/sec: 9080.97 - lr: 0.000007 - momentum: 0.000000 2023-10-18 16:01:06,663 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:01:06,663 EPOCH 8 done: loss 0.3800 - lr: 0.000007 2023-10-18 16:01:07,167 DEV : loss 0.3233301043510437 - f1-score (micro avg) 0.4193 2023-10-18 16:01:07,173 saving best model 2023-10-18 16:01:07,206 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:01:07,545 epoch 9 - iter 15/152 - loss 0.33004570 - time (sec): 0.34 - samples/sec: 8821.56 - lr: 0.000006 - momentum: 0.000000 2023-10-18 16:01:07,900 epoch 9 - iter 30/152 - loss 0.36101004 - time (sec): 0.69 - samples/sec: 8833.59 - lr: 0.000006 - momentum: 0.000000 2023-10-18 16:01:08,234 epoch 9 - iter 45/152 - loss 0.38132630 - time (sec): 1.03 - samples/sec: 9014.43 - lr: 0.000006 - momentum: 0.000000 2023-10-18 16:01:08,558 epoch 9 - iter 60/152 - loss 0.36572414 - time (sec): 1.35 - samples/sec: 9212.04 - lr: 0.000005 - momentum: 0.000000 2023-10-18 16:01:08,894 epoch 9 - iter 75/152 - loss 0.36713005 - time (sec): 1.69 - samples/sec: 9222.39 - lr: 0.000005 - momentum: 0.000000 2023-10-18 16:01:09,205 epoch 9 - iter 90/152 - loss 0.36978585 - time (sec): 2.00 - samples/sec: 9176.61 - lr: 0.000005 - momentum: 0.000000 2023-10-18 16:01:09,547 epoch 9 - iter 105/152 - loss 0.37153446 - time (sec): 2.34 - samples/sec: 9255.70 - lr: 0.000004 - momentum: 0.000000 2023-10-18 16:01:09,860 epoch 9 - iter 120/152 - loss 0.37312119 - time (sec): 2.65 - samples/sec: 9236.49 - lr: 
0.000004 - momentum: 0.000000 2023-10-18 16:01:10,185 epoch 9 - iter 135/152 - loss 0.37347333 - time (sec): 2.98 - samples/sec: 9182.59 - lr: 0.000004 - momentum: 0.000000 2023-10-18 16:01:10,534 epoch 9 - iter 150/152 - loss 0.37053182 - time (sec): 3.33 - samples/sec: 9202.30 - lr: 0.000004 - momentum: 0.000000 2023-10-18 16:01:10,576 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:01:10,576 EPOCH 9 done: loss 0.3708 - lr: 0.000004 2023-10-18 16:01:11,096 DEV : loss 0.3178371787071228 - f1-score (micro avg) 0.4365 2023-10-18 16:01:11,101 saving best model 2023-10-18 16:01:11,135 ---------------------------------------------------------------------------------------------------- 2023-10-18 16:01:11,489 epoch 10 - iter 15/152 - loss 0.34922658 - time (sec): 0.35 - samples/sec: 8700.51 - lr: 0.000003 - momentum: 0.000000 2023-10-18 16:01:11,826 epoch 10 - iter 30/152 - loss 0.31666500 - time (sec): 0.69 - samples/sec: 9030.93 - lr: 0.000003 - momentum: 0.000000 2023-10-18 16:01:12,141 epoch 10 - iter 45/152 - loss 0.33260780 - time (sec): 1.01 - samples/sec: 9233.57 - lr: 0.000002 - momentum: 0.000000 2023-10-18 16:01:12,462 epoch 10 - iter 60/152 - loss 0.33449700 - time (sec): 1.33 - samples/sec: 9086.56 - lr: 0.000002 - momentum: 0.000000 2023-10-18 16:01:12,777 epoch 10 - iter 75/152 - loss 0.32800784 - time (sec): 1.64 - samples/sec: 9090.88 - lr: 0.000002 - momentum: 0.000000 2023-10-18 16:01:13,130 epoch 10 - iter 90/152 - loss 0.33601032 - time (sec): 2.00 - samples/sec: 9068.65 - lr: 0.000002 - momentum: 0.000000 2023-10-18 16:01:13,478 epoch 10 - iter 105/152 - loss 0.34407701 - time (sec): 2.34 - samples/sec: 9036.28 - lr: 0.000001 - momentum: 0.000000 2023-10-18 16:01:13,802 epoch 10 - iter 120/152 - loss 0.35499057 - time (sec): 2.67 - samples/sec: 9088.00 - lr: 0.000001 - momentum: 0.000000 2023-10-18 16:01:14,135 epoch 10 - iter 135/152 - loss 0.35856920 - time (sec): 3.00 - 
samples/sec: 9086.00 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:01:14,474 epoch 10 - iter 150/152 - loss 0.36415561 - time (sec): 3.34 - samples/sec: 9159.95 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:01:14,517 ----------------------------------------------------------------------------------------------------
2023-10-18 16:01:14,518 EPOCH 10 done: loss 0.3615 - lr: 0.000000
2023-10-18 16:01:15,030 DEV : loss 0.31526467204093933 - f1-score (micro avg) 0.4402
2023-10-18 16:01:15,036 saving best model
2023-10-18 16:01:15,096 ----------------------------------------------------------------------------------------------------
2023-10-18 16:01:15,097 Loading model from best epoch ...
2023-10-18 16:01:15,179 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-18 16:01:15,657 
Results:
- F-score (micro) 0.4351
- F-score (macro) 0.2662
- Accuracy 0.285

By class:
              precision    recall  f1-score   support

       scope     0.4043    0.5033    0.4484       151
        work     0.2893    0.3684    0.3241        95
        pers     0.6024    0.5208    0.5587        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4107    0.4626    0.4351       348
   macro avg     0.2592    0.2785    0.2662       348
weighted avg     0.4206    0.4626    0.4371       348

2023-10-18 16:01:15,657 ----------------------------------------------------------------------------------------------------
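
Note on the summary rows: the sketch below is not part of the logged run. It only re-derives the three aggregate F1 rows from the per-class figures printed above, to show how they relate. The helper name `f1` and the dictionary `per_class` are illustrative assumptions, and because the inputs are already rounded to four decimals, the recomputed values can differ from the logged ones in the last digit.

```python
# Per-class (precision, recall, f1, support), copied from the "By class" table.
per_class = {
    "scope": (0.4043, 0.5033, 0.4484, 151),
    "work":  (0.2893, 0.3684, 0.3241, 95),
    "pers":  (0.6024, 0.5208, 0.5587, 96),
    "loc":   (0.0000, 0.0000, 0.0000, 3),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1: harmonic mean of the pooled (micro avg) precision and recall.
micro_f1 = f1(0.4107, 0.4626)

# Macro F1: unweighted mean of the per-class F1 scores, so the two
# 3-example classes (loc, date) count as much as scope with 151 examples.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 averaged with support as the weight.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The gap between micro (0.4351) and macro (0.2662) F1 comes almost entirely from loc and date, which have 3 test examples each and score 0.0.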