2023-10-18 19:16:10,441 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,441 Model: "SequenceTagger( (embeddings): TransformerWordEmbeddings( (model): BertModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(32001, 128) (position_embeddings): Embedding(512, 128) (token_type_embeddings): Embedding(2, 128) (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) (encoder): BertEncoder( (layer): ModuleList( (0-1): 2 x BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=128, out_features=128, bias=True) (key): Linear(in_features=128, out_features=128, bias=True) (value): Linear(in_features=128, out_features=128, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=128, out_features=128, bias=True) (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=128, out_features=512, bias=True) (intermediate_act_fn): GELUActivation() ) (output): BertOutput( (dense): Linear(in_features=512, out_features=128, bias=True) (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) (pooler): BertPooler( (dense): Linear(in_features=128, out_features=128, bias=True) (activation): Tanh() ) ) ) (locked_dropout): LockedDropout(p=0.5) (linear): Linear(in_features=128, out_features=21, bias=True) (loss_function): CrossEntropyLoss() )" 2023-10-18 19:16:10,441 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator 2023-10-18 
19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Train: 5901 sentences 2023-10-18 19:16:10,442 (train_with_dev=False, train_with_test=False) 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Training Params: 2023-10-18 19:16:10,442 - learning_rate: "5e-05" 2023-10-18 19:16:10,442 - mini_batch_size: "8" 2023-10-18 19:16:10,442 - max_epochs: "10" 2023-10-18 19:16:10,442 - shuffle: "True" 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Plugins: 2023-10-18 19:16:10,442 - TensorboardLogger 2023-10-18 19:16:10,442 - LinearScheduler | warmup_fraction: '0.1' 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Final evaluation on model from best epoch (best-model.pt) 2023-10-18 19:16:10,442 - metric: "('micro avg', 'f1-score')" 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Computation: 2023-10-18 19:16:10,442 - compute on device: cuda:0 2023-10-18 19:16:10,442 - embedding storage: none 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:10,442 Logging anything other than scalars to 
TensorBoard is currently not supported. 2023-10-18 19:16:12,152 epoch 1 - iter 73/738 - loss 3.07586690 - time (sec): 1.71 - samples/sec: 8803.91 - lr: 0.000005 - momentum: 0.000000 2023-10-18 19:16:14,385 epoch 1 - iter 146/738 - loss 2.77274838 - time (sec): 3.94 - samples/sec: 8359.91 - lr: 0.000010 - momentum: 0.000000 2023-10-18 19:16:16,258 epoch 1 - iter 219/738 - loss 2.37894957 - time (sec): 5.82 - samples/sec: 8573.50 - lr: 0.000015 - momentum: 0.000000 2023-10-18 19:16:18,002 epoch 1 - iter 292/738 - loss 2.02832146 - time (sec): 7.56 - samples/sec: 8683.64 - lr: 0.000020 - momentum: 0.000000 2023-10-18 19:16:19,668 epoch 1 - iter 365/738 - loss 1.77711491 - time (sec): 9.23 - samples/sec: 8794.98 - lr: 0.000025 - momentum: 0.000000 2023-10-18 19:16:21,343 epoch 1 - iter 438/738 - loss 1.59379009 - time (sec): 10.90 - samples/sec: 8915.26 - lr: 0.000030 - momentum: 0.000000 2023-10-18 19:16:23,048 epoch 1 - iter 511/738 - loss 1.46113750 - time (sec): 12.60 - samples/sec: 8930.48 - lr: 0.000035 - momentum: 0.000000 2023-10-18 19:16:24,891 epoch 1 - iter 584/738 - loss 1.33512940 - time (sec): 14.45 - samples/sec: 9118.78 - lr: 0.000039 - momentum: 0.000000 2023-10-18 19:16:26,602 epoch 1 - iter 657/738 - loss 1.24694827 - time (sec): 16.16 - samples/sec: 9194.50 - lr: 0.000044 - momentum: 0.000000 2023-10-18 19:16:28,278 epoch 1 - iter 730/738 - loss 1.18415498 - time (sec): 17.84 - samples/sec: 9146.63 - lr: 0.000049 - momentum: 0.000000 2023-10-18 19:16:28,542 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:28,542 EPOCH 1 done: loss 1.1698 - lr: 0.000049 2023-10-18 19:16:31,032 DEV : loss 0.3898785710334778 - f1-score (micro avg) 0.151 2023-10-18 19:16:31,060 saving best model 2023-10-18 19:16:31,087 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:32,754 epoch 2 - iter 73/738 - loss 0.44379610 - time (sec): 1.67 - 
samples/sec: 8957.60 - lr: 0.000049 - momentum: 0.000000 2023-10-18 19:16:34,463 epoch 2 - iter 146/738 - loss 0.45149541 - time (sec): 3.38 - samples/sec: 9164.70 - lr: 0.000049 - momentum: 0.000000 2023-10-18 19:16:36,377 epoch 2 - iter 219/738 - loss 0.44816036 - time (sec): 5.29 - samples/sec: 9320.47 - lr: 0.000048 - momentum: 0.000000 2023-10-18 19:16:38,107 epoch 2 - iter 292/738 - loss 0.44332964 - time (sec): 7.02 - samples/sec: 9149.40 - lr: 0.000048 - momentum: 0.000000 2023-10-18 19:16:39,802 epoch 2 - iter 365/738 - loss 0.44861383 - time (sec): 8.71 - samples/sec: 9231.97 - lr: 0.000047 - momentum: 0.000000 2023-10-18 19:16:41,523 epoch 2 - iter 438/738 - loss 0.44417937 - time (sec): 10.44 - samples/sec: 9186.53 - lr: 0.000047 - momentum: 0.000000 2023-10-18 19:16:43,310 epoch 2 - iter 511/738 - loss 0.43943622 - time (sec): 12.22 - samples/sec: 9272.15 - lr: 0.000046 - momentum: 0.000000 2023-10-18 19:16:45,089 epoch 2 - iter 584/738 - loss 0.43496346 - time (sec): 14.00 - samples/sec: 9296.38 - lr: 0.000046 - momentum: 0.000000 2023-10-18 19:16:47,412 epoch 2 - iter 657/738 - loss 0.42916078 - time (sec): 16.32 - samples/sec: 9104.09 - lr: 0.000045 - momentum: 0.000000 2023-10-18 19:16:49,139 epoch 2 - iter 730/738 - loss 0.42612713 - time (sec): 18.05 - samples/sec: 9107.29 - lr: 0.000045 - momentum: 0.000000 2023-10-18 19:16:49,331 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:49,331 EPOCH 2 done: loss 0.4258 - lr: 0.000045 2023-10-18 19:16:56,554 DEV : loss 0.30439963936805725 - f1-score (micro avg) 0.3835 2023-10-18 19:16:56,582 saving best model 2023-10-18 19:16:56,615 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:16:58,341 epoch 3 - iter 73/738 - loss 0.38212163 - time (sec): 1.73 - samples/sec: 9041.67 - lr: 0.000044 - momentum: 0.000000 2023-10-18 19:17:00,082 epoch 3 - iter 146/738 - loss 0.37123255 - 
time (sec): 3.47 - samples/sec: 8893.99 - lr: 0.000043 - momentum: 0.000000 2023-10-18 19:17:01,813 epoch 3 - iter 219/738 - loss 0.38122082 - time (sec): 5.20 - samples/sec: 8881.29 - lr: 0.000043 - momentum: 0.000000 2023-10-18 19:17:03,613 epoch 3 - iter 292/738 - loss 0.37054219 - time (sec): 7.00 - samples/sec: 9093.43 - lr: 0.000042 - momentum: 0.000000 2023-10-18 19:17:05,407 epoch 3 - iter 365/738 - loss 0.36975075 - time (sec): 8.79 - samples/sec: 9310.43 - lr: 0.000042 - momentum: 0.000000 2023-10-18 19:17:07,251 epoch 3 - iter 438/738 - loss 0.36348837 - time (sec): 10.64 - samples/sec: 9318.23 - lr: 0.000041 - momentum: 0.000000 2023-10-18 19:17:09,105 epoch 3 - iter 511/738 - loss 0.35957823 - time (sec): 12.49 - samples/sec: 9327.22 - lr: 0.000041 - momentum: 0.000000 2023-10-18 19:17:10,798 epoch 3 - iter 584/738 - loss 0.35718051 - time (sec): 14.18 - samples/sec: 9308.65 - lr: 0.000040 - momentum: 0.000000 2023-10-18 19:17:12,488 epoch 3 - iter 657/738 - loss 0.35703417 - time (sec): 15.87 - samples/sec: 9334.93 - lr: 0.000040 - momentum: 0.000000 2023-10-18 19:17:14,156 epoch 3 - iter 730/738 - loss 0.35392752 - time (sec): 17.54 - samples/sec: 9316.96 - lr: 0.000039 - momentum: 0.000000 2023-10-18 19:17:14,395 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:17:14,395 EPOCH 3 done: loss 0.3546 - lr: 0.000039 2023-10-18 19:17:21,627 DEV : loss 0.2697567641735077 - f1-score (micro avg) 0.4385 2023-10-18 19:17:21,654 saving best model 2023-10-18 19:17:21,689 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:17:23,414 epoch 4 - iter 73/738 - loss 0.31534660 - time (sec): 1.72 - samples/sec: 9312.37 - lr: 0.000038 - momentum: 0.000000 2023-10-18 19:17:25,190 epoch 4 - iter 146/738 - loss 0.31583539 - time (sec): 3.50 - samples/sec: 9459.36 - lr: 0.000038 - momentum: 0.000000 2023-10-18 19:17:26,931 epoch 4 - iter 219/738 - 
loss 0.31563059 - time (sec): 5.24 - samples/sec: 9568.13 - lr: 0.000037 - momentum: 0.000000 2023-10-18 19:17:28,602 epoch 4 - iter 292/738 - loss 0.32238830 - time (sec): 6.91 - samples/sec: 9481.86 - lr: 0.000037 - momentum: 0.000000 2023-10-18 19:17:30,332 epoch 4 - iter 365/738 - loss 0.32299176 - time (sec): 8.64 - samples/sec: 9408.69 - lr: 0.000036 - momentum: 0.000000 2023-10-18 19:17:32,244 epoch 4 - iter 438/738 - loss 0.32181537 - time (sec): 10.55 - samples/sec: 9483.55 - lr: 0.000036 - momentum: 0.000000 2023-10-18 19:17:33,999 epoch 4 - iter 511/738 - loss 0.31852674 - time (sec): 12.31 - samples/sec: 9396.46 - lr: 0.000035 - momentum: 0.000000 2023-10-18 19:17:35,735 epoch 4 - iter 584/738 - loss 0.31761959 - time (sec): 14.05 - samples/sec: 9372.05 - lr: 0.000035 - momentum: 0.000000 2023-10-18 19:17:37,460 epoch 4 - iter 657/738 - loss 0.31666181 - time (sec): 15.77 - samples/sec: 9381.59 - lr: 0.000034 - momentum: 0.000000 2023-10-18 19:17:39,337 epoch 4 - iter 730/738 - loss 0.31100660 - time (sec): 17.65 - samples/sec: 9344.17 - lr: 0.000033 - momentum: 0.000000 2023-10-18 19:17:39,517 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:17:39,518 EPOCH 4 done: loss 0.3105 - lr: 0.000033 2023-10-18 19:17:46,781 DEV : loss 0.24989160895347595 - f1-score (micro avg) 0.4738 2023-10-18 19:17:46,809 saving best model 2023-10-18 19:17:46,842 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:17:48,549 epoch 5 - iter 73/738 - loss 0.33889059 - time (sec): 1.71 - samples/sec: 9758.31 - lr: 0.000033 - momentum: 0.000000 2023-10-18 19:17:50,329 epoch 5 - iter 146/738 - loss 0.30240545 - time (sec): 3.49 - samples/sec: 9865.12 - lr: 0.000032 - momentum: 0.000000 2023-10-18 19:17:52,054 epoch 5 - iter 219/738 - loss 0.29939293 - time (sec): 5.21 - samples/sec: 9604.33 - lr: 0.000032 - momentum: 0.000000 2023-10-18 19:17:53,799 epoch 
5 - iter 292/738 - loss 0.29440439 - time (sec): 6.96 - samples/sec: 9486.80 - lr: 0.000031 - momentum: 0.000000 2023-10-18 19:17:55,494 epoch 5 - iter 365/738 - loss 0.29326400 - time (sec): 8.65 - samples/sec: 9425.14 - lr: 0.000031 - momentum: 0.000000 2023-10-18 19:17:57,245 epoch 5 - iter 438/738 - loss 0.29384822 - time (sec): 10.40 - samples/sec: 9429.18 - lr: 0.000030 - momentum: 0.000000 2023-10-18 19:17:58,964 epoch 5 - iter 511/738 - loss 0.29217563 - time (sec): 12.12 - samples/sec: 9447.95 - lr: 0.000030 - momentum: 0.000000 2023-10-18 19:18:00,728 epoch 5 - iter 584/738 - loss 0.28778672 - time (sec): 13.89 - samples/sec: 9410.97 - lr: 0.000029 - momentum: 0.000000 2023-10-18 19:18:02,548 epoch 5 - iter 657/738 - loss 0.28493520 - time (sec): 15.71 - samples/sec: 9352.08 - lr: 0.000028 - momentum: 0.000000 2023-10-18 19:18:04,328 epoch 5 - iter 730/738 - loss 0.28263463 - time (sec): 17.49 - samples/sec: 9403.28 - lr: 0.000028 - momentum: 0.000000 2023-10-18 19:18:04,519 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:18:04,519 EPOCH 5 done: loss 0.2819 - lr: 0.000028 2023-10-18 19:18:11,788 DEV : loss 0.24010828137397766 - f1-score (micro avg) 0.5029 2023-10-18 19:18:11,815 saving best model 2023-10-18 19:18:11,849 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:18:13,695 epoch 6 - iter 73/738 - loss 0.29763220 - time (sec): 1.85 - samples/sec: 9990.87 - lr: 0.000027 - momentum: 0.000000 2023-10-18 19:18:15,418 epoch 6 - iter 146/738 - loss 0.28553332 - time (sec): 3.57 - samples/sec: 9318.60 - lr: 0.000027 - momentum: 0.000000 2023-10-18 19:18:17,163 epoch 6 - iter 219/738 - loss 0.27904513 - time (sec): 5.31 - samples/sec: 9345.50 - lr: 0.000026 - momentum: 0.000000 2023-10-18 19:18:18,941 epoch 6 - iter 292/738 - loss 0.26679853 - time (sec): 7.09 - samples/sec: 9280.39 - lr: 0.000026 - momentum: 0.000000 2023-10-18 
19:18:21,255 epoch 6 - iter 365/738 - loss 0.26616402 - time (sec): 9.41 - samples/sec: 8894.81 - lr: 0.000025 - momentum: 0.000000 2023-10-18 19:18:23,013 epoch 6 - iter 438/738 - loss 0.26698244 - time (sec): 11.16 - samples/sec: 8904.33 - lr: 0.000025 - momentum: 0.000000 2023-10-18 19:18:24,785 epoch 6 - iter 511/738 - loss 0.26719563 - time (sec): 12.94 - samples/sec: 8828.95 - lr: 0.000024 - momentum: 0.000000 2023-10-18 19:18:26,509 epoch 6 - iter 584/738 - loss 0.26332397 - time (sec): 14.66 - samples/sec: 8892.33 - lr: 0.000023 - momentum: 0.000000 2023-10-18 19:18:28,257 epoch 6 - iter 657/738 - loss 0.26303038 - time (sec): 16.41 - samples/sec: 8972.92 - lr: 0.000023 - momentum: 0.000000 2023-10-18 19:18:29,999 epoch 6 - iter 730/738 - loss 0.26036240 - time (sec): 18.15 - samples/sec: 9064.22 - lr: 0.000022 - momentum: 0.000000 2023-10-18 19:18:30,189 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:18:30,189 EPOCH 6 done: loss 0.2592 - lr: 0.000022 2023-10-18 19:18:37,482 DEV : loss 0.23162847757339478 - f1-score (micro avg) 0.5326 2023-10-18 19:18:37,509 saving best model 2023-10-18 19:18:37,542 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:18:39,340 epoch 7 - iter 73/738 - loss 0.26477648 - time (sec): 1.80 - samples/sec: 8805.94 - lr: 0.000022 - momentum: 0.000000 2023-10-18 19:18:41,055 epoch 7 - iter 146/738 - loss 0.25666080 - time (sec): 3.51 - samples/sec: 9218.25 - lr: 0.000021 - momentum: 0.000000 2023-10-18 19:18:42,803 epoch 7 - iter 219/738 - loss 0.25357439 - time (sec): 5.26 - samples/sec: 9349.12 - lr: 0.000021 - momentum: 0.000000 2023-10-18 19:18:44,536 epoch 7 - iter 292/738 - loss 0.25298532 - time (sec): 6.99 - samples/sec: 9290.53 - lr: 0.000020 - momentum: 0.000000 2023-10-18 19:18:46,336 epoch 7 - iter 365/738 - loss 0.24890974 - time (sec): 8.79 - samples/sec: 9295.19 - lr: 0.000020 - momentum: 
0.000000 2023-10-18 19:18:48,116 epoch 7 - iter 438/738 - loss 0.24820722 - time (sec): 10.57 - samples/sec: 9226.06 - lr: 0.000019 - momentum: 0.000000 2023-10-18 19:18:49,919 epoch 7 - iter 511/738 - loss 0.25007629 - time (sec): 12.38 - samples/sec: 9260.81 - lr: 0.000018 - momentum: 0.000000 2023-10-18 19:18:51,614 epoch 7 - iter 584/738 - loss 0.25047113 - time (sec): 14.07 - samples/sec: 9270.02 - lr: 0.000018 - momentum: 0.000000 2023-10-18 19:18:53,422 epoch 7 - iter 657/738 - loss 0.24687633 - time (sec): 15.88 - samples/sec: 9360.24 - lr: 0.000017 - momentum: 0.000000 2023-10-18 19:18:55,163 epoch 7 - iter 730/738 - loss 0.24542372 - time (sec): 17.62 - samples/sec: 9350.51 - lr: 0.000017 - momentum: 0.000000 2023-10-18 19:18:55,353 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:18:55,353 EPOCH 7 done: loss 0.2448 - lr: 0.000017 2023-10-18 19:19:02,608 DEV : loss 0.22602201998233795 - f1-score (micro avg) 0.5256 2023-10-18 19:19:02,637 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:19:04,311 epoch 8 - iter 73/738 - loss 0.24423595 - time (sec): 1.67 - samples/sec: 9745.79 - lr: 0.000016 - momentum: 0.000000 2023-10-18 19:19:06,038 epoch 8 - iter 146/738 - loss 0.23121395 - time (sec): 3.40 - samples/sec: 9164.06 - lr: 0.000016 - momentum: 0.000000 2023-10-18 19:19:07,781 epoch 8 - iter 219/738 - loss 0.22578664 - time (sec): 5.14 - samples/sec: 9300.66 - lr: 0.000015 - momentum: 0.000000 2023-10-18 19:19:09,520 epoch 8 - iter 292/738 - loss 0.22931143 - time (sec): 6.88 - samples/sec: 9270.00 - lr: 0.000015 - momentum: 0.000000 2023-10-18 19:19:11,191 epoch 8 - iter 365/738 - loss 0.23131383 - time (sec): 8.55 - samples/sec: 9206.24 - lr: 0.000014 - momentum: 0.000000 2023-10-18 19:19:13,033 epoch 8 - iter 438/738 - loss 0.23177533 - time (sec): 10.39 - samples/sec: 9213.72 - lr: 0.000013 - momentum: 0.000000 2023-10-18 
19:19:14,746 epoch 8 - iter 511/738 - loss 0.23071403 - time (sec): 12.11 - samples/sec: 9212.74 - lr: 0.000013 - momentum: 0.000000 2023-10-18 19:19:16,470 epoch 8 - iter 584/738 - loss 0.22836249 - time (sec): 13.83 - samples/sec: 9299.07 - lr: 0.000012 - momentum: 0.000000 2023-10-18 19:19:18,250 epoch 8 - iter 657/738 - loss 0.22818409 - time (sec): 15.61 - samples/sec: 9358.07 - lr: 0.000012 - momentum: 0.000000 2023-10-18 19:19:20,147 epoch 8 - iter 730/738 - loss 0.23096993 - time (sec): 17.51 - samples/sec: 9427.60 - lr: 0.000011 - momentum: 0.000000 2023-10-18 19:19:20,335 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:19:20,335 EPOCH 8 done: loss 0.2311 - lr: 0.000011 2023-10-18 19:19:27,642 DEV : loss 0.22594498097896576 - f1-score (micro avg) 0.5422 2023-10-18 19:19:27,671 saving best model 2023-10-18 19:19:27,703 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:19:29,484 epoch 9 - iter 73/738 - loss 0.18455478 - time (sec): 1.78 - samples/sec: 8727.95 - lr: 0.000011 - momentum: 0.000000 2023-10-18 19:19:31,203 epoch 9 - iter 146/738 - loss 0.20931099 - time (sec): 3.50 - samples/sec: 9088.75 - lr: 0.000010 - momentum: 0.000000 2023-10-18 19:19:32,887 epoch 9 - iter 219/738 - loss 0.20975404 - time (sec): 5.18 - samples/sec: 9297.14 - lr: 0.000010 - momentum: 0.000000 2023-10-18 19:19:34,653 epoch 9 - iter 292/738 - loss 0.22411280 - time (sec): 6.95 - samples/sec: 9389.15 - lr: 0.000009 - momentum: 0.000000 2023-10-18 19:19:36,405 epoch 9 - iter 365/738 - loss 0.22562147 - time (sec): 8.70 - samples/sec: 9441.59 - lr: 0.000008 - momentum: 0.000000 2023-10-18 19:19:38,076 epoch 9 - iter 438/738 - loss 0.22642686 - time (sec): 10.37 - samples/sec: 9345.82 - lr: 0.000008 - momentum: 0.000000 2023-10-18 19:19:39,738 epoch 9 - iter 511/738 - loss 0.22750691 - time (sec): 12.03 - samples/sec: 9401.08 - lr: 0.000007 - momentum: 
0.000000 2023-10-18 19:19:41,543 epoch 9 - iter 584/738 - loss 0.22803447 - time (sec): 13.84 - samples/sec: 9450.52 - lr: 0.000007 - momentum: 0.000000 2023-10-18 19:19:43,400 epoch 9 - iter 657/738 - loss 0.22389397 - time (sec): 15.70 - samples/sec: 9480.63 - lr: 0.000006 - momentum: 0.000000 2023-10-18 19:19:45,138 epoch 9 - iter 730/738 - loss 0.22337068 - time (sec): 17.43 - samples/sec: 9453.91 - lr: 0.000006 - momentum: 0.000000 2023-10-18 19:19:45,332 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:19:45,332 EPOCH 9 done: loss 0.2240 - lr: 0.000006 2023-10-18 19:19:52,605 DEV : loss 0.22624309360980988 - f1-score (micro avg) 0.5362 2023-10-18 19:19:52,632 ---------------------------------------------------------------------------------------------------- 2023-10-18 19:19:54,345 epoch 10 - iter 73/738 - loss 0.24146970 - time (sec): 1.71 - samples/sec: 9516.11 - lr: 0.000005 - momentum: 0.000000 2023-10-18 19:19:56,147 epoch 10 - iter 146/738 - loss 0.23503637 - time (sec): 3.51 - samples/sec: 9734.76 - lr: 0.000004 - momentum: 0.000000 2023-10-18 19:19:57,897 epoch 10 - iter 219/738 - loss 0.23583201 - time (sec): 5.26 - samples/sec: 9531.90 - lr: 0.000004 - momentum: 0.000000 2023-10-18 19:20:00,098 epoch 10 - iter 292/738 - loss 0.22826275 - time (sec): 7.47 - samples/sec: 9052.37 - lr: 0.000003 - momentum: 0.000000 2023-10-18 19:20:01,827 epoch 10 - iter 365/738 - loss 0.22769440 - time (sec): 9.19 - samples/sec: 8993.55 - lr: 0.000003 - momentum: 0.000000 2023-10-18 19:20:03,517 epoch 10 - iter 438/738 - loss 0.22412354 - time (sec): 10.88 - samples/sec: 8956.87 - lr: 0.000002 - momentum: 0.000000 2023-10-18 19:20:05,191 epoch 10 - iter 511/738 - loss 0.22446585 - time (sec): 12.56 - samples/sec: 9001.05 - lr: 0.000002 - momentum: 0.000000 2023-10-18 19:20:06,948 epoch 10 - iter 584/738 - loss 0.22796532 - time (sec): 14.31 - samples/sec: 9050.23 - lr: 0.000001 - momentum: 0.000000 
2023-10-18 19:20:08,702 epoch 10 - iter 657/738 - loss 0.22207363 - time (sec): 16.07 - samples/sec: 9167.06 - lr: 0.000001 - momentum: 0.000000
2023-10-18 19:20:10,373 epoch 10 - iter 730/738 - loss 0.22025921 - time (sec): 17.74 - samples/sec: 9298.67 - lr: 0.000000 - momentum: 0.000000
2023-10-18 19:20:10,543 ----------------------------------------------------------------------------------------------------
2023-10-18 19:20:10,543 EPOCH 10 done: loss 0.2207 - lr: 0.000000
2023-10-18 19:20:17,848 DEV : loss 0.2269122153520584 - f1-score (micro avg) 0.5457
2023-10-18 19:20:17,877 saving best model
2023-10-18 19:20:17,943 ----------------------------------------------------------------------------------------------------
2023-10-18 19:20:17,943 Loading model from best epoch ...
2023-10-18 19:20:18,024 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-18 19:20:20,721
Results:
- F-score (micro) 0.5456
- F-score (macro) 0.3348
- Accuracy 0.3993

By class:
              precision    recall  f1-score   support

         loc     0.5688    0.7657    0.6528       858
        pers     0.4237    0.5121    0.4637       537
         org     0.2000    0.0530    0.0838       132
        time     0.4500    0.5000    0.4737        54
        prod     0.0000    0.0000    0.0000        61

   micro avg     0.5087    0.5883    0.5456      1642
   macro avg     0.3285    0.3662    0.3348      1642
weighted avg     0.4667    0.5883    0.5151      1642

2023-10-18 19:20:20,721 ----------------------------------------------------------------------------------------------------