2023-10-18 16:15:05,938 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,938 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 16:15:05,938 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Train: 1214 sentences
2023-10-18 16:15:05,939 (train_with_dev=False, train_with_test=False)
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Training Params:
2023-10-18 16:15:05,939  - learning_rate: "5e-05"
2023-10-18 16:15:05,939  - mini_batch_size: "8"
2023-10-18 16:15:05,939  - max_epochs: "10"
2023-10-18 16:15:05,939  - shuffle: "True"
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Plugins:
2023-10-18 16:15:05,939  - TensorboardLogger
2023-10-18 16:15:05,939  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:15:05,939  - metric: "('micro avg', 'f1-score')"
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Computation:
2023-10-18 16:15:05,939  - compute on device: cuda:0
2023-10-18 16:15:05,939  - embedding storage: none
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:05,939 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:15:06,328 epoch 1 - iter 15/152 - loss 3.70125503 - time (sec): 0.39 - samples/sec: 7461.24 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:15:06,691 epoch 1 - iter 30/152 - loss 3.68114875 - time (sec): 0.75 - samples/sec: 7931.59 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:15:07,030 epoch 1 - iter 45/152 - loss 3.58287786 - time (sec): 1.09 - samples/sec: 8195.59 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:15:07,369 epoch 1 - iter 60/152 - loss 3.47650670 - time (sec): 1.43 - samples/sec: 8416.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:15:07,709 epoch 1 - iter 75/152 - loss 3.31494161 - time (sec): 1.77 - samples/sec: 8639.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:15:08,037 epoch 1 - iter 90/152 - loss 3.10983209 - time (sec): 2.10 - samples/sec: 8825.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:15:08,378 epoch 1 - iter 105/152 - loss 2.87583911 - time (sec): 2.44 - samples/sec: 8884.15 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:15:08,704 epoch 1 - iter 120/152 - loss 2.67088923 - time (sec): 2.76 - samples/sec: 8879.90 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:15:09,014 epoch 1 - iter 135/152 - loss 2.48951072 - time (sec): 3.07 - samples/sec: 8932.12 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:15:09,368 epoch 1 - iter 150/152 - loss 2.33212454 - time (sec): 3.43 - samples/sec: 8944.88 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:15:09,410 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:09,410 EPOCH 1 done: loss 2.3180 - lr: 0.000049
2023-10-18 16:15:09,749 DEV : loss 0.7997033596038818 - f1-score (micro avg) 0.0
2023-10-18 16:15:09,755 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:10,109 epoch 2 - iter 15/152 - loss 0.81002554 - time (sec): 0.35 - samples/sec: 8826.75 - lr: 0.000049 - momentum: 0.000000
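The lr column above follows the LinearScheduler plugin declared in the header (warmup_fraction '0.1', peak learning_rate 5e-05): the rate climbs during epoch 1 (0.000005 at iter 15, 0.000049 by iter 150) and then decays linearly toward zero by epoch 10. A minimal pure-Python sketch of that schedule (not Flair's actual implementation; it assumes 10 epochs x 152 batches = 1520 total steps, as in this run) reproduces the logged values:

```python
# Sketch of a linear warmup + linear decay LR schedule, assuming
# total_steps = 10 epochs * 152 batches and warmup_fraction = 0.1.
def linear_schedule_lr(step, total_steps=10 * 152, peak_lr=5e-05, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 152 steps here
    if step < warmup_steps:
        # warmup phase: ramp linearly from 0 up to peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: ramp linearly down to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_schedule_lr(15))    # ~5e-06, matching "lr: 0.000005" at epoch 1, iter 15
print(linear_schedule_lr(150))   # ~4.9e-05, matching "lr: 0.000049" at epoch 1, iter 150
print(linear_schedule_lr(1520))  # 0.0, matching "lr: 0.000000" at the end of epoch 10
```

Because warmup_fraction is 0.1 and the run has exactly 10 epochs, warmup spans almost exactly the first epoch, which is why the peak rate is reached right at the epoch 1/epoch 2 boundary.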
2023-10-18 16:15:10,441 epoch 2 - iter 30/152 - loss 0.76938143 - time (sec): 0.69 - samples/sec: 8785.07 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:15:10,787 epoch 2 - iter 45/152 - loss 0.75447855 - time (sec): 1.03 - samples/sec: 8973.49 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:15:11,120 epoch 2 - iter 60/152 - loss 0.75947763 - time (sec): 1.36 - samples/sec: 9030.07 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:15:11,458 epoch 2 - iter 75/152 - loss 0.74812431 - time (sec): 1.70 - samples/sec: 9067.38 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:15:11,790 epoch 2 - iter 90/152 - loss 0.73785323 - time (sec): 2.03 - samples/sec: 9247.03 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:15:12,145 epoch 2 - iter 105/152 - loss 0.70566023 - time (sec): 2.39 - samples/sec: 9149.75 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:15:12,469 epoch 2 - iter 120/152 - loss 0.68726077 - time (sec): 2.71 - samples/sec: 9018.24 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:15:12,801 epoch 2 - iter 135/152 - loss 0.67861504 - time (sec): 3.05 - samples/sec: 9037.75 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:15:13,146 epoch 2 - iter 150/152 - loss 0.66359008 - time (sec): 3.39 - samples/sec: 9035.23 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:15:13,186 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:13,186 EPOCH 2 done: loss 0.6620 - lr: 0.000045
2023-10-18 16:15:13,707 DEV : loss 0.46732401847839355 - f1-score (micro avg) 0.0343
2023-10-18 16:15:13,714 saving best model
2023-10-18 16:15:13,745 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:14,039 epoch 3 - iter 15/152 - loss 0.49289613 - time (sec): 0.29 - samples/sec: 10230.71 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:15:14,373 epoch 3 - iter 30/152 - loss 0.53149270 - time (sec): 0.63 - samples/sec: 9840.78 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:15:14,704 epoch 3 - iter 45/152 - loss 0.50012512 - time (sec): 0.96 - samples/sec: 9500.08 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:15:15,040 epoch 3 - iter 60/152 - loss 0.51342831 - time (sec): 1.29 - samples/sec: 9563.89 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:15:15,373 epoch 3 - iter 75/152 - loss 0.50387466 - time (sec): 1.63 - samples/sec: 9666.29 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:15:15,712 epoch 3 - iter 90/152 - loss 0.49830093 - time (sec): 1.97 - samples/sec: 9528.37 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:15:16,062 epoch 3 - iter 105/152 - loss 0.48595812 - time (sec): 2.32 - samples/sec: 9401.56 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:15:16,416 epoch 3 - iter 120/152 - loss 0.47506590 - time (sec): 2.67 - samples/sec: 9270.09 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:15:16,911 epoch 3 - iter 135/152 - loss 0.46208534 - time (sec): 3.17 - samples/sec: 8819.55 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:15:17,263 epoch 3 - iter 150/152 - loss 0.45641935 - time (sec): 3.52 - samples/sec: 8706.75 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:15:17,306 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:17,306 EPOCH 3 done: loss 0.4552 - lr: 0.000039
2023-10-18 16:15:17,834 DEV : loss 0.3661465048789978 - f1-score (micro avg) 0.3212
2023-10-18 16:15:17,841 saving best model
2023-10-18 16:15:17,876 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:18,241 epoch 4 - iter 15/152 - loss 0.43363165 - time (sec): 0.36 - samples/sec: 9157.05 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:15:18,576 epoch 4 - iter 30/152 - loss 0.45798062 - time (sec): 0.70 - samples/sec: 8790.88 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:15:18,906 epoch 4 - iter 45/152 - loss 0.40931769 - time (sec): 1.03 - samples/sec: 9169.37 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:15:19,232 epoch 4 - iter 60/152 - loss 0.40835230 - time (sec): 1.36 - samples/sec: 9049.70 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:15:19,555 epoch 4 - iter 75/152 - loss 0.41597379 - time (sec): 1.68 - samples/sec: 9247.87 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:15:19,900 epoch 4 - iter 90/152 - loss 0.40044717 - time (sec): 2.02 - samples/sec: 9216.21 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:15:20,227 epoch 4 - iter 105/152 - loss 0.39441080 - time (sec): 2.35 - samples/sec: 9334.21 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:15:20,558 epoch 4 - iter 120/152 - loss 0.39303989 - time (sec): 2.68 - samples/sec: 9319.30 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:15:20,880 epoch 4 - iter 135/152 - loss 0.38970096 - time (sec): 3.00 - samples/sec: 9240.33 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:15:21,214 epoch 4 - iter 150/152 - loss 0.38359325 - time (sec): 3.34 - samples/sec: 9178.47 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:15:21,254 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:21,254 EPOCH 4 done: loss 0.3841 - lr: 0.000034
2023-10-18 16:15:21,761 DEV : loss 0.323783278465271 - f1-score (micro avg) 0.4063
2023-10-18 16:15:21,766 saving best model
2023-10-18 16:15:21,798 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:22,121 epoch 5 - iter 15/152 - loss 0.36984946 - time (sec): 0.32 - samples/sec: 9548.32 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:15:22,463 epoch 5 - iter 30/152 - loss 0.37561154 - time (sec): 0.66 - samples/sec: 9553.81 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:15:22,796 epoch 5 - iter 45/152 - loss 0.38421590 - time (sec): 1.00 - samples/sec: 9850.38 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:15:23,121 epoch 5 - iter 60/152 - loss 0.36743109 - time (sec): 1.32 - samples/sec: 9768.65 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:15:23,451 epoch 5 - iter 75/152 - loss 0.34611632 - time (sec): 1.65 - samples/sec: 9611.08 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:15:23,772 epoch 5 - iter 90/152 - loss 0.35660532 - time (sec): 1.97 - samples/sec: 9507.82 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:15:24,092 epoch 5 - iter 105/152 - loss 0.35440260 - time (sec): 2.29 - samples/sec: 9528.16 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:15:24,434 epoch 5 - iter 120/152 - loss 0.35111263 - time (sec): 2.64 - samples/sec: 9384.01 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:15:24,772 epoch 5 - iter 135/152 - loss 0.34517085 - time (sec): 2.97 - samples/sec: 9268.53 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:15:25,116 epoch 5 - iter 150/152 - loss 0.34456850 - time (sec): 3.32 - samples/sec: 9223.60 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:15:25,157 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:25,157 EPOCH 5 done: loss 0.3425 - lr: 0.000028
2023-10-18 16:15:25,672 DEV : loss 0.30024248361587524 - f1-score (micro avg) 0.4197
2023-10-18 16:15:25,678 saving best model
2023-10-18 16:15:25,712 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:26,030 epoch 6 - iter 15/152 - loss 0.30680307 - time (sec): 0.32 - samples/sec: 9209.99 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:15:26,330 epoch 6 - iter 30/152 - loss 0.31536530 - time (sec): 0.62 - samples/sec: 9494.58 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:15:26,630 epoch 6 - iter 45/152 - loss 0.32111647 - time (sec): 0.92 - samples/sec: 9731.65 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:15:26,976 epoch 6 - iter 60/152 - loss 0.31139374 - time (sec): 1.26 - samples/sec: 9458.63 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:15:27,333 epoch 6 - iter 75/152 - loss 0.32608309 - time (sec): 1.62 - samples/sec: 9432.18 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:15:27,664 epoch 6 - iter 90/152 - loss 0.33320171 - time (sec): 1.95 - samples/sec: 9455.81 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:15:28,001 epoch 6 - iter 105/152 - loss 0.32816785 - time (sec): 2.29 - samples/sec: 9325.92 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:15:28,339 epoch 6 - iter 120/152 - loss 0.32345935 - time (sec): 2.63 - samples/sec: 9206.42 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:15:28,667 epoch 6 - iter 135/152 - loss 0.32049895 - time (sec): 2.95 - samples/sec: 9293.55 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:15:28,997 epoch 6 - iter 150/152 - loss 0.32023113 - time (sec): 3.28 - samples/sec: 9342.58 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:15:29,037 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:29,038 EPOCH 6 done: loss 0.3219 - lr: 0.000022
2023-10-18 16:15:29,548 DEV : loss 0.28398463129997253 - f1-score (micro avg) 0.4567
2023-10-18 16:15:29,553 saving best model
2023-10-18 16:15:29,587 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:29,903 epoch 7 - iter 15/152 - loss 0.28252657 - time (sec): 0.32 - samples/sec: 9476.44 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:15:30,241 epoch 7 - iter 30/152 - loss 0.28573620 - time (sec): 0.65 - samples/sec: 9324.16 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:15:30,570 epoch 7 - iter 45/152 - loss 0.30759902 - time (sec): 0.98 - samples/sec: 9039.46 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:15:30,910 epoch 7 - iter 60/152 - loss 0.29238351 - time (sec): 1.32 - samples/sec: 8936.28 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:15:31,235 epoch 7 - iter 75/152 - loss 0.29122126 - time (sec): 1.65 - samples/sec: 9057.78 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:15:31,578 epoch 7 - iter 90/152 - loss 0.28753130 - time (sec): 1.99 - samples/sec: 9131.73 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:15:31,907 epoch 7 - iter 105/152 - loss 0.28053912 - time (sec): 2.32 - samples/sec: 9216.37 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:15:32,234 epoch 7 - iter 120/152 - loss 0.28667045 - time (sec): 2.65 - samples/sec: 9190.38 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:15:32,558 epoch 7 - iter 135/152 - loss 0.28971059 - time (sec): 2.97 - samples/sec: 9239.62 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:15:32,872 epoch 7 - iter 150/152 - loss 0.29771219 - time (sec): 3.29 - samples/sec: 9317.62 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:15:32,914 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:32,915 EPOCH 7 done: loss 0.2960 - lr: 0.000017
2023-10-18 16:15:33,438 DEV : loss 0.27674031257629395 - f1-score (micro avg) 0.4529
2023-10-18 16:15:33,444 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:33,773 epoch 8 - iter 15/152 - loss 0.36439532 - time (sec): 0.33 - samples/sec: 9732.25 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:15:34,110 epoch 8 - iter 30/152 - loss 0.32954097 - time (sec): 0.67 - samples/sec: 9636.51 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:15:34,423 epoch 8 - iter 45/152 - loss 0.28427625 - time (sec): 0.98 - samples/sec: 9698.94 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:15:34,741 epoch 8 - iter 60/152 - loss 0.27896014 - time (sec): 1.30 - samples/sec: 9434.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:15:35,080 epoch 8 - iter 75/152 - loss 0.27347096 - time (sec): 1.64 - samples/sec: 9364.17 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:15:35,408 epoch 8 - iter 90/152 - loss 0.27216251 - time (sec): 1.96 - samples/sec: 9336.29 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:15:35,756 epoch 8 - iter 105/152 - loss 0.27209154 - time (sec): 2.31 - samples/sec: 9301.03 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:15:36,078 epoch 8 - iter 120/152 - loss 0.27225921 - time (sec): 2.63 - samples/sec: 9323.79 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:15:36,402 epoch 8 - iter 135/152 - loss 0.28147704 - time (sec): 2.96 - samples/sec: 9312.46 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:15:36,728 epoch 8 - iter 150/152 - loss 0.28333049 - time (sec): 3.28 - samples/sec: 9321.71 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:15:36,773 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:36,773 EPOCH 8 done: loss 0.2844 - lr: 0.000011
2023-10-18 16:15:37,288 DEV : loss 0.27192819118499756 - f1-score (micro avg) 0.4646
2023-10-18 16:15:37,294 saving best model
2023-10-18 16:15:37,331 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:37,666 epoch 9 - iter 15/152 - loss 0.28403839 - time (sec): 0.33 - samples/sec: 9689.33 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:15:37,997 epoch 9 - iter 30/152 - loss 0.27874781 - time (sec): 0.67 - samples/sec: 9171.21 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:15:38,324 epoch 9 - iter 45/152 - loss 0.27955279 - time (sec): 0.99 - samples/sec: 9103.58 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:15:38,652 epoch 9 - iter 60/152 - loss 0.28386863 - time (sec): 1.32 - samples/sec: 9172.62 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:15:38,973 epoch 9 - iter 75/152 - loss 0.27880015 - time (sec): 1.64 - samples/sec: 9174.23 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:15:39,298 epoch 9 - iter 90/152 - loss 0.27348691 - time (sec): 1.97 - samples/sec: 9174.35 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:15:39,633 epoch 9 - iter 105/152 - loss 0.27175770 - time (sec): 2.30 - samples/sec: 9104.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:15:39,994 epoch 9 - iter 120/152 - loss 0.27214763 - time (sec): 2.66 - samples/sec: 9160.24 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:15:40,317 epoch 9 - iter 135/152 - loss 0.27296496 - time (sec): 2.99 - samples/sec: 9231.02 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:15:40,635 epoch 9 - iter 150/152 - loss 0.27582726 - time (sec): 3.30 - samples/sec: 9272.89 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:15:40,675 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:40,675 EPOCH 9 done: loss 0.2756 - lr: 0.000006
2023-10-18 16:15:41,199 DEV : loss 0.26522183418273926 - f1-score (micro avg) 0.4675
2023-10-18 16:15:41,204 saving best model
2023-10-18 16:15:41,239 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:41,573 epoch 10 - iter 15/152 - loss 0.25785621 - time (sec): 0.33 - samples/sec: 9020.05 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:15:41,907 epoch 10 - iter 30/152 - loss 0.27294342 - time (sec): 0.67 - samples/sec: 9092.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:15:42,240 epoch 10 - iter 45/152 - loss 0.27289338 - time (sec): 1.00 - samples/sec: 8965.33 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:15:42,555 epoch 10 - iter 60/152 - loss 0.27670625 - time (sec): 1.32 - samples/sec: 9078.86 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:15:42,881 epoch 10 - iter 75/152 - loss 0.27515774 - time (sec): 1.64 - samples/sec: 9242.88 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:15:43,214 epoch 10 - iter 90/152 - loss 0.27560286 - time (sec): 1.97 - samples/sec: 9277.73 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:15:43,535 epoch 10 - iter 105/152 - loss 0.27362789 - time (sec): 2.29 - samples/sec: 9284.26 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:15:43,862 epoch 10 - iter 120/152 - loss 0.27061588 - time (sec): 2.62 - samples/sec: 9283.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:15:44,190 epoch 10 - iter 135/152 - loss 0.27151096 - time (sec): 2.95 - samples/sec: 9266.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:15:44,525 epoch 10 - iter 150/152 - loss 0.27633272 - time (sec): 3.28 - samples/sec: 9280.40 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:15:44,570 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:44,570 EPOCH 10 done: loss 0.2763 - lr: 0.000000
2023-10-18 16:15:45,098 DEV : loss 0.2641502022743225 - f1-score (micro avg) 0.4707
2023-10-18 16:15:45,103 saving best model
2023-10-18 16:15:45,164 ----------------------------------------------------------------------------------------------------
2023-10-18 16:15:45,165 Loading model from best epoch ...
2023-10-18 16:15:45,240 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-18 16:15:45,708
Results:
- F-score (micro) 0.4815
- F-score (macro) 0.2998
- Accuracy 0.3305

By class:
              precision    recall  f1-score   support

       scope     0.3930    0.5232    0.4489       151
        work     0.3533    0.6211    0.4504        95
        pers     0.6064    0.5938    0.6000        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4221    0.5603    0.4815       348
   macro avg     0.2705    0.3476    0.2998       348
weighted avg     0.4343    0.5603    0.4832       348

2023-10-18 16:15:45,708 ----------------------------------------------------------------------------------------------------
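The reported F-scores are harmonic means of the corresponding precision and recall columns; for instance, the micro avg row (precision 0.4221, recall 0.5603) yields the headline micro F-score of 0.4815. A small sketch of that arithmetic:

```python
# Sanity-check that an F1 value in the table above is the harmonic mean
# of the row's precision and recall.
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.4221, 0.5603), 4))  # ~0.4815, the micro-avg F-score in the table
print(round(f1(0.6064, 0.5938), 4))  # ~0.6000, the "pers" row
```

Note that macro avg weights the 3-example loc and date classes (both at 0.0) equally with the 151-example scope class, which is why the macro F-score (0.2998) sits far below the micro F-score (0.4815).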