2023-10-17 16:41:52,149 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,150 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:41:52,150 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,150 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 16:41:52,150 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,150 Train: 5777 sentences
2023-10-17 16:41:52,150 (train_with_dev=False, train_with_test=False)
2023-10-17 16:41:52,150 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,151 Training Params:
2023-10-17 16:41:52,151  - learning_rate: "5e-05"
2023-10-17 16:41:52,151  - mini_batch_size: "8"
2023-10-17 16:41:52,151  - max_epochs: "10"
2023-10-17 16:41:52,151  - shuffle: "True"
2023-10-17 16:41:52,151 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,151 Plugins:
2023-10-17 16:41:52,151  - TensorboardLogger
2023-10-17 16:41:52,151  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:41:52,151 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,151 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:41:52,151  - metric: "('micro avg', 'f1-score')"
2023-10-17 16:41:52,151 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,151 Computation:
2023-10-17 16:41:52,151  - compute on device: cuda:0
2023-10-17 16:41:52,151  - embedding storage: none
2023-10-17 16:41:52,151 ----------------------------------------------------------------------------------------------------
2023-10-17 16:41:52,151 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
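The parameters logged above correspond to Flair's standard fine-tuning setup: transformer word embeddings (first-subtoken pooling, last layer only) feeding a linear classifier without CRF, trained with a linear learning-rate schedule and warmup. The original hmBench training script is not part of this log; the snippet below is only a rough sketch of an equivalent run using Flair's public API with the hyperparameters logged here (the dataset class arguments and the use_rnn / reproject_embeddings flags are assumptions inferred from the model printout and the base path).

# Reproduction sketch (assumption: not the original hmBench script).
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dutch ICDAR Europeana NER corpus: 5777 train / 722 dev / 723 test sentences
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",                  # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",     # "poolingfirst"
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,                # "crfFalse"
    use_rnn=False,                # linear head directly on the embeddings, as in the printout
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
)

ModelTrainer.fine_tune applies a linear schedule with warmup by default, which corresponds to the LinearScheduler | warmup_fraction: '0.1' plugin listed above; the TensorboardLogger plugin is omitted from this sketch.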
"hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" 2023-10-17 16:41:52,151 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:41:52,151 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:41:52,151 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:41:57,409 epoch 1 - iter 72/723 - loss 2.67948568 - time (sec): 5.26 - samples/sec: 3471.16 - lr: 0.000005 - momentum: 0.000000 2023-10-17 16:42:02,493 epoch 1 - iter 144/723 - loss 1.51756626 - time (sec): 10.34 - samples/sec: 3447.49 - lr: 0.000010 - momentum: 0.000000 2023-10-17 16:42:07,700 epoch 1 - iter 216/723 - loss 1.11928765 - time (sec): 15.55 - samples/sec: 3369.59 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:42:12,879 epoch 1 - iter 288/723 - loss 0.88170380 - time (sec): 20.73 - samples/sec: 3381.28 - lr: 0.000020 - momentum: 0.000000 2023-10-17 16:42:18,297 epoch 1 - iter 360/723 - loss 0.72889095 - time (sec): 26.14 - samples/sec: 3383.00 - lr: 0.000025 - momentum: 0.000000 2023-10-17 16:42:23,461 epoch 1 - iter 432/723 - loss 0.63408298 - time (sec): 31.31 - samples/sec: 3377.27 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:42:28,807 epoch 1 - iter 504/723 - loss 0.56269758 - time (sec): 36.66 - samples/sec: 3371.19 - lr: 0.000035 - momentum: 0.000000 2023-10-17 16:42:34,061 epoch 1 - iter 576/723 - loss 0.50491815 - time (sec): 41.91 - samples/sec: 3371.37 - lr: 0.000040 - momentum: 0.000000 2023-10-17 16:42:39,152 epoch 1 - iter 648/723 - loss 0.46295023 - time (sec): 47.00 - samples/sec: 3365.62 - lr: 0.000045 - momentum: 0.000000 2023-10-17 16:42:44,485 epoch 1 - iter 720/723 - loss 0.43008419 - time (sec): 52.33 - samples/sec: 3358.72 - lr: 0.000050 - momentum: 0.000000 2023-10-17 16:42:44,669 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:42:44,669 EPOCH 1 done: loss 0.4293 - lr: 0.000050 2023-10-17 16:42:47,493 DEV : loss 0.08612097054719925 - f1-score (micro avg) 0.776 2023-10-17 16:42:47,510 saving best model 2023-10-17 16:42:47,882 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:42:52,928 epoch 2 - iter 72/723 - loss 0.12284217 - time (sec): 5.04 - samples/sec: 3280.92 - lr: 0.000049 - momentum: 0.000000 2023-10-17 16:42:57,952 epoch 2 - iter 144/723 - loss 0.10772593 - time (sec): 10.07 - samples/sec: 3328.33 - lr: 0.000049 - momentum: 0.000000 2023-10-17 16:43:03,507 epoch 2 - iter 216/723 - loss 0.09967815 - time (sec): 15.62 - samples/sec: 3264.64 - lr: 0.000048 - momentum: 0.000000 2023-10-17 16:43:08,830 epoch 2 - iter 288/723 - loss 0.09421738 - time (sec): 20.95 - samples/sec: 3291.54 - lr: 0.000048 - momentum: 0.000000 2023-10-17 16:43:13,997 epoch 2 - iter 360/723 - loss 0.09235624 - time (sec): 26.11 - samples/sec: 3291.59 - lr: 0.000047 - momentum: 0.000000 2023-10-17 16:43:19,442 epoch 2 - iter 432/723 - loss 0.08878323 - time (sec): 31.56 - samples/sec: 3332.33 - lr: 0.000047 - momentum: 0.000000 2023-10-17 16:43:24,817 epoch 2 - iter 504/723 - loss 0.08735342 - time (sec): 36.93 - samples/sec: 3331.92 - lr: 0.000046 - momentum: 0.000000 2023-10-17 16:43:29,935 epoch 2 - iter 576/723 - loss 0.08589435 - time (sec): 42.05 - samples/sec: 3329.59 - lr: 0.000046 - momentum: 0.000000 2023-10-17 16:43:35,289 epoch 2 - iter 
2023-10-17 16:43:40,514 epoch 2 - iter 720/723 - loss 0.08713654 - time (sec): 52.63 - samples/sec: 3339.30 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:43:40,687 ----------------------------------------------------------------------------------------------------
2023-10-17 16:43:40,687 EPOCH 2 done: loss 0.0871 - lr: 0.000044
2023-10-17 16:43:44,409 DEV : loss 0.08103517442941666 - f1-score (micro avg) 0.8054
2023-10-17 16:43:44,429 saving best model
2023-10-17 16:43:44,868 ----------------------------------------------------------------------------------------------------
2023-10-17 16:43:50,145 epoch 3 - iter 72/723 - loss 0.07033316 - time (sec): 5.27 - samples/sec: 3451.58 - lr: 0.000044 - momentum: 0.000000
2023-10-17 16:43:55,281 epoch 3 - iter 144/723 - loss 0.06506614 - time (sec): 10.41 - samples/sec: 3398.46 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:44:00,931 epoch 3 - iter 216/723 - loss 0.06638678 - time (sec): 16.06 - samples/sec: 3389.15 - lr: 0.000043 - momentum: 0.000000
2023-10-17 16:44:05,935 epoch 3 - iter 288/723 - loss 0.07045554 - time (sec): 21.06 - samples/sec: 3384.34 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:44:10,944 epoch 3 - iter 360/723 - loss 0.06881606 - time (sec): 26.07 - samples/sec: 3402.69 - lr: 0.000042 - momentum: 0.000000
2023-10-17 16:44:16,239 epoch 3 - iter 432/723 - loss 0.06465933 - time (sec): 31.37 - samples/sec: 3410.62 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:44:21,124 epoch 3 - iter 504/723 - loss 0.06471377 - time (sec): 36.25 - samples/sec: 3404.19 - lr: 0.000041 - momentum: 0.000000
2023-10-17 16:44:25,967 epoch 3 - iter 576/723 - loss 0.06437747 - time (sec): 41.10 - samples/sec: 3402.79 - lr: 0.000040 - momentum: 0.000000
2023-10-17 16:44:31,384 epoch 3 - iter 648/723 - loss 0.06355648 - time (sec): 46.51 - samples/sec: 3391.68 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:44:37,004 epoch 3 - iter 720/723 - loss 0.06347931 - time (sec): 52.13 - samples/sec: 3367.06 - lr: 0.000039 - momentum: 0.000000
2023-10-17 16:44:37,239 ----------------------------------------------------------------------------------------------------
2023-10-17 16:44:37,240 EPOCH 3 done: loss 0.0636 - lr: 0.000039
2023-10-17 16:44:40,614 DEV : loss 0.07145461440086365 - f1-score (micro avg) 0.8577
2023-10-17 16:44:40,640 saving best model
2023-10-17 16:44:41,173 ----------------------------------------------------------------------------------------------------
2023-10-17 16:44:46,600 epoch 4 - iter 72/723 - loss 0.03837137 - time (sec): 5.42 - samples/sec: 3284.40 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:44:51,561 epoch 4 - iter 144/723 - loss 0.04438266 - time (sec): 10.38 - samples/sec: 3356.50 - lr: 0.000038 - momentum: 0.000000
2023-10-17 16:44:56,563 epoch 4 - iter 216/723 - loss 0.03974982 - time (sec): 15.39 - samples/sec: 3354.19 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:45:01,861 epoch 4 - iter 288/723 - loss 0.04252196 - time (sec): 20.68 - samples/sec: 3321.95 - lr: 0.000037 - momentum: 0.000000
2023-10-17 16:45:07,950 epoch 4 - iter 360/723 - loss 0.04602468 - time (sec): 26.77 - samples/sec: 3239.10 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:45:13,411 epoch 4 - iter 432/723 - loss 0.04676170 - time (sec): 32.24 - samples/sec: 3256.34 - lr: 0.000036 - momentum: 0.000000
2023-10-17 16:45:18,763 epoch 4 - iter 504/723 - loss 0.04634358 - time (sec): 37.59 - samples/sec: 3265.53 - lr: 0.000035 - momentum: 0.000000
2023-10-17 16:45:24,271 epoch 4 - iter 576/723 - loss 0.04532936 - time (sec): 43.10 - samples/sec: 3258.66 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:45:29,844 epoch 4 - iter 648/723 - loss 0.04480859 - time (sec): 48.67 - samples/sec: 3247.61 - lr: 0.000034 - momentum: 0.000000
2023-10-17 16:45:35,543 epoch 4 - iter 720/723 - loss 0.04434037 - time (sec): 54.37 - samples/sec: 3232.47 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:45:35,705 ----------------------------------------------------------------------------------------------------
2023-10-17 16:45:35,706 EPOCH 4 done: loss 0.0443 - lr: 0.000033
2023-10-17 16:45:39,119 DEV : loss 0.07815779000520706 - f1-score (micro avg) 0.8501
2023-10-17 16:45:39,139 ----------------------------------------------------------------------------------------------------
2023-10-17 16:45:44,291 epoch 5 - iter 72/723 - loss 0.04048703 - time (sec): 5.15 - samples/sec: 3193.52 - lr: 0.000033 - momentum: 0.000000
2023-10-17 16:45:49,767 epoch 5 - iter 144/723 - loss 0.03381670 - time (sec): 10.63 - samples/sec: 3201.78 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:45:55,062 epoch 5 - iter 216/723 - loss 0.03374095 - time (sec): 15.92 - samples/sec: 3244.20 - lr: 0.000032 - momentum: 0.000000
2023-10-17 16:45:59,942 epoch 5 - iter 288/723 - loss 0.03223050 - time (sec): 20.80 - samples/sec: 3260.69 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:46:05,326 epoch 5 - iter 360/723 - loss 0.03202958 - time (sec): 26.19 - samples/sec: 3286.68 - lr: 0.000031 - momentum: 0.000000
2023-10-17 16:46:10,429 epoch 5 - iter 432/723 - loss 0.03258311 - time (sec): 31.29 - samples/sec: 3309.95 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:46:16,275 epoch 5 - iter 504/723 - loss 0.03307126 - time (sec): 37.13 - samples/sec: 3297.61 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:46:22,011 epoch 5 - iter 576/723 - loss 0.03267137 - time (sec): 42.87 - samples/sec: 3298.91 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:46:27,202 epoch 5 - iter 648/723 - loss 0.03133181 - time (sec): 48.06 - samples/sec: 3310.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:46:32,072 epoch 5 - iter 720/723 - loss 0.03197385 - time (sec): 52.93 - samples/sec: 3316.95 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:46:32,273 ----------------------------------------------------------------------------------------------------
2023-10-17 16:46:32,274 EPOCH 5 done: loss 0.0319 - lr: 0.000028
2023-10-17 16:46:35,912 DEV : loss 0.11172021180391312 - f1-score (micro avg) 0.8452
2023-10-17 16:46:35,934 ----------------------------------------------------------------------------------------------------
2023-10-17 16:46:41,523 epoch 6 - iter 72/723 - loss 0.03952585 - time (sec): 5.59 - samples/sec: 3291.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:46:46,742 epoch 6 - iter 144/723 - loss 0.02804880 - time (sec): 10.81 - samples/sec: 3289.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:46:51,600 epoch 6 - iter 216/723 - loss 0.02574703 - time (sec): 15.66 - samples/sec: 3376.62 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:46:56,779 epoch 6 - iter 288/723 - loss 0.02448115 - time (sec): 20.84 - samples/sec: 3372.80 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:47:02,195 epoch 6 - iter 360/723 - loss 0.02459804 - time (sec): 26.26 - samples/sec: 3368.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:47:07,405 epoch 6 - iter 432/723 - loss 0.02437003 - time (sec): 31.47 - samples/sec: 3392.17 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:47:12,596 epoch 6 - iter 504/723 - loss 0.02308042 - time (sec): 36.66 - samples/sec: 3401.30 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:47:17,452 epoch 6 - iter 576/723 - loss 0.02316493 - time (sec): 41.52 - samples/sec: 3399.71 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:47:22,336 epoch 6 - iter 648/723 - loss 0.02348809 - time (sec): 46.40 - samples/sec: 3405.70 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:47:27,338 epoch 6 - iter 720/723 - loss 0.02377909 - time (sec): 51.40 - samples/sec: 3414.88 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:47:27,650 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:27,650 EPOCH 6 done: loss 0.0237 - lr: 0.000022
2023-10-17 16:47:30,926 DEV : loss 0.12220776081085205 - f1-score (micro avg) 0.833
2023-10-17 16:47:30,943 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:36,396 epoch 7 - iter 72/723 - loss 0.01650496 - time (sec): 5.45 - samples/sec: 3376.92 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:47:41,778 epoch 7 - iter 144/723 - loss 0.01360068 - time (sec): 10.83 - samples/sec: 3342.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:47:46,812 epoch 7 - iter 216/723 - loss 0.01452251 - time (sec): 15.87 - samples/sec: 3352.48 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:47:52,178 epoch 7 - iter 288/723 - loss 0.01645838 - time (sec): 21.23 - samples/sec: 3344.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:47:57,861 epoch 7 - iter 360/723 - loss 0.01755396 - time (sec): 26.92 - samples/sec: 3306.69 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:48:02,996 epoch 7 - iter 432/723 - loss 0.01683966 - time (sec): 32.05 - samples/sec: 3308.57 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:48:08,362 epoch 7 - iter 504/723 - loss 0.01804620 - time (sec): 37.42 - samples/sec: 3283.71 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:48:13,422 epoch 7 - iter 576/723 - loss 0.01858647 - time (sec): 42.48 - samples/sec: 3299.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:48:18,612 epoch 7 - iter 648/723 - loss 0.01818199 - time (sec): 47.67 - samples/sec: 3299.05 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:48:24,115 epoch 7 - iter 720/723 - loss 0.01814035 - time (sec): 53.17 - samples/sec: 3303.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:48:24,307 ----------------------------------------------------------------------------------------------------
2023-10-17 16:48:24,307 EPOCH 7 done: loss 0.0181 - lr: 0.000017
2023-10-17 16:48:27,588 DEV : loss 0.151025652885437 - f1-score (micro avg) 0.85
2023-10-17 16:48:27,609 ----------------------------------------------------------------------------------------------------
2023-10-17 16:48:32,780 epoch 8 - iter 72/723 - loss 0.01746000 - time (sec): 5.17 - samples/sec: 3309.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:48:38,022 epoch 8 - iter 144/723 - loss 0.01354879 - time (sec): 10.41 - samples/sec: 3286.96 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:48:43,458 epoch 8 - iter 216/723 - loss 0.01322560 - time (sec): 15.85 - samples/sec: 3284.27 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:48:48,750 epoch 8 - iter 288/723 - loss 0.01336715 - time (sec): 21.14 - samples/sec: 3279.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:48:53,653 epoch 8 - iter 360/723 - loss 0.01217760 - time (sec): 26.04 - samples/sec: 3282.41 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:48:59,111 epoch 8 - iter 432/723 - loss 0.01159660 - time (sec): 31.50 - samples/sec: 3299.90 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:49:04,309 epoch 8 - iter 504/723 - loss 0.01175992 - time (sec): 36.70 - samples/sec: 3331.35 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:49:09,349 epoch 8 - iter 576/723 - loss 0.01287688 - time (sec): 41.74 - samples/sec: 3342.24 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:49:14,737 epoch 8 - iter 648/723 - loss 0.01260778 - time (sec): 47.13 - samples/sec: 3348.97 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:49:19,940 epoch 8 - iter 720/723 - loss 0.01205049 - time (sec): 52.33 - samples/sec: 3356.47 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:49:20,099 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:20,099 EPOCH 8 done: loss 0.0122 - lr: 0.000011
2023-10-17 16:49:23,733 DEV : loss 0.1497245579957962 - f1-score (micro avg) 0.8549
2023-10-17 16:49:23,750 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:29,107 epoch 9 - iter 72/723 - loss 0.00992020 - time (sec): 5.36 - samples/sec: 3378.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:49:34,981 epoch 9 - iter 144/723 - loss 0.01528820 - time (sec): 11.23 - samples/sec: 3281.60 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:49:39,992 epoch 9 - iter 216/723 - loss 0.01243284 - time (sec): 16.24 - samples/sec: 3325.54 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:49:45,206 epoch 9 - iter 288/723 - loss 0.01139342 - time (sec): 21.45 - samples/sec: 3322.35 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:49:50,460 epoch 9 - iter 360/723 - loss 0.01161949 - time (sec): 26.71 - samples/sec: 3349.14 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:49:55,643 epoch 9 - iter 432/723 - loss 0.01045204 - time (sec): 31.89 - samples/sec: 3359.81 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:50:00,599 epoch 9 - iter 504/723 - loss 0.01053498 - time (sec): 36.85 - samples/sec: 3364.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:50:05,600 epoch 9 - iter 576/723 - loss 0.00976123 - time (sec): 41.85 - samples/sec: 3373.15 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:50:10,861 epoch 9 - iter 648/723 - loss 0.00958225 - time (sec): 47.11 - samples/sec: 3361.92 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:50:16,284 epoch 9 - iter 720/723 - loss 0.00921964 - time (sec): 52.53 - samples/sec: 3341.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:50:16,459 ----------------------------------------------------------------------------------------------------
2023-10-17 16:50:16,459 EPOCH 9 done: loss 0.0092 - lr: 0.000006
2023-10-17 16:50:19,623 DEV : loss 0.14509357511997223 - f1-score (micro avg) 0.86
2023-10-17 16:50:19,640 saving best model
2023-10-17 16:50:20,094 ----------------------------------------------------------------------------------------------------
2023-10-17 16:50:25,413 epoch 10 - iter 72/723 - loss 0.00606283 - time (sec): 5.31 - samples/sec: 3415.52 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:50:30,233 epoch 10 - iter 144/723 - loss 0.00442758 - time (sec): 10.13 - samples/sec: 3442.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:50:34,936 epoch 10 - iter 216/723 - loss 0.00436506 - time (sec): 14.84 - samples/sec: 3380.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:50:40,577 epoch 10 - iter 288/723 - loss 0.00606399 - time (sec): 20.48 - samples/sec: 3309.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:50:45,896 epoch 10 - iter 360/723 - loss 0.00559111 - time (sec): 25.80 - samples/sec: 3325.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:50:51,535 epoch 10 - iter 432/723 - loss 0.00537624 - time (sec): 31.44 - samples/sec: 3271.64 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:50:56,965 epoch 10 - iter 504/723 - loss 0.00564013 - time (sec): 36.87 - samples/sec: 3274.74 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:51:02,222 epoch 10 - iter 576/723 - loss 0.00556250 - time (sec): 42.12 - samples/sec: 3287.06 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:51:07,571 epoch 10 - iter 648/723 - loss 0.00584230 - time (sec): 47.47 - samples/sec: 3304.77 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:51:13,133 epoch 10 - iter 720/723 - loss 0.00597101 - time (sec): 53.03 - samples/sec: 3312.61 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:51:13,317 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:13,317 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 16:51:16,735 DEV : loss 0.1616183966398239 - f1-score (micro avg) 0.8483
2023-10-17 16:51:17,091 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:17,092 Loading model from best epoch ...
2023-10-17 16:51:18,421 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 16:51:21,216 Results:
 - F-score (micro) 0.8458
 - F-score (macro) 0.7427
 - Accuracy 0.746

By class:
              precision    recall  f1-score   support

         PER     0.8531    0.8195    0.8360       482
         LOC     0.9417    0.8821    0.9109       458
         ORG     0.5000    0.4638    0.4812        69

   micro avg     0.8692    0.8236    0.8458      1009
   macro avg     0.7650    0.7218    0.7427      1009
weighted avg     0.8692    0.8236    0.8457      1009

2023-10-17 16:51:21,216 ----------------------------------------------------------------------------------------------------
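For reference, a minimal inference sketch (not part of the original run) showing how the best-model.pt checkpoint saved above could be loaded and applied with Flair; the Dutch example sentence is made up.

from flair.data import Sentence
from flair.models import SequenceTagger

# load the checkpoint written by the "saving best model" steps in the log above
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

# hypothetical example sentence; the model predicts BIOES tags over PER, LOC and ORG
sentence = Sentence("Vincent van Gogh werd geboren in Zundert .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)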