2023-10-17 12:57:22,959 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,960 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:57:22,960 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 Train: 14465 sentences
2023-10-17 12:57:22,961 (train_with_dev=False, train_with_test=False)
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 Training Params:
2023-10-17 12:57:22,961 - learning_rate: "5e-05"
2023-10-17 12:57:22,961 - mini_batch_size: "8"
2023-10-17 12:57:22,961 - max_epochs: "10"
2023-10-17 12:57:22,961 - shuffle: "True"
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 Plugins:
2023-10-17 12:57:22,961 - TensorboardLogger
2023-10-17 12:57:22,961 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:57:22,962 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:57:22,962 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Computation:
2023-10-17 12:57:22,962 - compute on device: cuda:0
2023-10-17 12:57:22,962 - embedding storage: none
2023-10-17 12:57:22,962 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
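Note: the configuration logged above corresponds roughly to the Flair fine-tuning sketch below. This is a minimal sketch and not the exact script behind this run: the Hugging Face model identifier, the NER_HIPE_2022 loader arguments and the hidden_size value are assumptions; what is grounded in the log are the hyperparameters (lr 5e-05, batch size 8, 10 epochs, no CRF, subtoken pooling "first", last layer only) and the head, whose Linear(768, 13) output matches the 13-tag dictionary listed at the end (O plus S/B/E/I for loc, pers and org).

```python
# Minimal sketch of a comparable Flair fine-tuning setup (assumptions noted inline).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 "letemps" French corpus (14465 train / 1392 dev / 2432 test sentences);
# loader argument names are approximate.
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

# ELECTRA-style historic multilingual encoder; the model identifier is a placeholder.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumption
    layers="-1",                # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",   # "poolingfirst" in the base path
    fine_tune=True,
)

# Plain linear head, no CRF and no RNN, matching the architecture dump above.
tagger = SequenceTagger(
    hidden_size=256,            # unused without an RNN; value assumed
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
)

trainer = ModelTrainer(tagger, corpus)
# fine_tune() in recent Flair versions uses AdamW with a linear schedule and 10% warmup,
# consistent with the "LinearScheduler | warmup_fraction: '0.1'" plugin logged above.
trainer.fine_tune(
    "hmbench-letemps/fr-hmteams/...",  # base path shortened here
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
)
```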
"hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" 2023-10-17 12:57:22,962 ---------------------------------------------------------------------------------------------------- 2023-10-17 12:57:22,962 ---------------------------------------------------------------------------------------------------- 2023-10-17 12:57:22,962 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 12:57:37,220 epoch 1 - iter 180/1809 - loss 1.84340706 - time (sec): 14.26 - samples/sec: 2648.39 - lr: 0.000005 - momentum: 0.000000 2023-10-17 12:57:50,903 epoch 1 - iter 360/1809 - loss 1.05150880 - time (sec): 27.94 - samples/sec: 2653.99 - lr: 0.000010 - momentum: 0.000000 2023-10-17 12:58:04,641 epoch 1 - iter 540/1809 - loss 0.74172731 - time (sec): 41.68 - samples/sec: 2727.46 - lr: 0.000015 - momentum: 0.000000 2023-10-17 12:58:17,857 epoch 1 - iter 720/1809 - loss 0.58928040 - time (sec): 54.89 - samples/sec: 2771.98 - lr: 0.000020 - momentum: 0.000000 2023-10-17 12:58:32,073 epoch 1 - iter 900/1809 - loss 0.49921402 - time (sec): 69.11 - samples/sec: 2758.24 - lr: 0.000025 - momentum: 0.000000 2023-10-17 12:58:45,617 epoch 1 - iter 1080/1809 - loss 0.43657646 - time (sec): 82.65 - samples/sec: 2755.99 - lr: 0.000030 - momentum: 0.000000 2023-10-17 12:58:59,176 epoch 1 - iter 1260/1809 - loss 0.38919295 - time (sec): 96.21 - samples/sec: 2764.35 - lr: 0.000035 - momentum: 0.000000 2023-10-17 12:59:12,317 epoch 1 - iter 1440/1809 - loss 0.35301689 - time (sec): 109.35 - samples/sec: 2791.91 - lr: 0.000040 - momentum: 0.000000 2023-10-17 12:59:25,395 epoch 1 - iter 1620/1809 - loss 0.32539419 - time (sec): 122.43 - samples/sec: 2797.97 - lr: 0.000045 - momentum: 0.000000 2023-10-17 12:59:39,241 epoch 1 - iter 1800/1809 - loss 0.30517417 - time (sec): 136.28 - samples/sec: 2775.15 - lr: 0.000050 - momentum: 0.000000 2023-10-17 12:59:39,970 ---------------------------------------------------------------------------------------------------- 2023-10-17 12:59:39,971 EPOCH 1 done: loss 0.3042 - lr: 0.000050 2023-10-17 12:59:45,481 DEV : loss 0.10241620242595673 - f1-score (micro avg) 0.5874 2023-10-17 12:59:45,529 saving best model 2023-10-17 12:59:46,074 ---------------------------------------------------------------------------------------------------- 2023-10-17 13:00:00,358 epoch 2 - iter 180/1809 - loss 0.08999612 - time (sec): 14.28 - samples/sec: 2712.94 - lr: 0.000049 - momentum: 0.000000 2023-10-17 13:00:14,189 epoch 2 - iter 360/1809 - loss 0.08860684 - time (sec): 28.11 - samples/sec: 2740.96 - lr: 0.000049 - momentum: 0.000000 2023-10-17 13:00:27,501 epoch 2 - iter 540/1809 - loss 0.09064747 - time (sec): 41.43 - samples/sec: 2748.69 - lr: 0.000048 - momentum: 0.000000 2023-10-17 13:00:40,822 epoch 2 - iter 720/1809 - loss 0.08986173 - time (sec): 54.75 - samples/sec: 2763.03 - lr: 0.000048 - momentum: 0.000000 2023-10-17 13:00:55,054 epoch 2 - iter 900/1809 - loss 0.08846040 - time (sec): 68.98 - samples/sec: 2726.01 - lr: 0.000047 - momentum: 0.000000 2023-10-17 13:01:08,830 epoch 2 - iter 1080/1809 - loss 0.09011922 - time (sec): 82.75 - samples/sec: 2736.28 - lr: 0.000047 - momentum: 0.000000 2023-10-17 13:01:23,268 epoch 2 - iter 1260/1809 - loss 0.08854167 - time (sec): 97.19 - samples/sec: 2728.15 - lr: 0.000046 - momentum: 0.000000 2023-10-17 13:01:36,733 epoch 2 - iter 1440/1809 - loss 0.08789492 - time (sec): 110.66 - samples/sec: 2735.03 - lr: 0.000046 - momentum: 0.000000 
2023-10-17 13:00:00,358 epoch 2 - iter 180/1809 - loss 0.08999612 - time (sec): 14.28 - samples/sec: 2712.94 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:00:14,189 epoch 2 - iter 360/1809 - loss 0.08860684 - time (sec): 28.11 - samples/sec: 2740.96 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:00:27,501 epoch 2 - iter 540/1809 - loss 0.09064747 - time (sec): 41.43 - samples/sec: 2748.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:00:40,822 epoch 2 - iter 720/1809 - loss 0.08986173 - time (sec): 54.75 - samples/sec: 2763.03 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:00:55,054 epoch 2 - iter 900/1809 - loss 0.08846040 - time (sec): 68.98 - samples/sec: 2726.01 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:01:08,830 epoch 2 - iter 1080/1809 - loss 0.09011922 - time (sec): 82.75 - samples/sec: 2736.28 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:01:23,268 epoch 2 - iter 1260/1809 - loss 0.08854167 - time (sec): 97.19 - samples/sec: 2728.15 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:01:36,733 epoch 2 - iter 1440/1809 - loss 0.08789492 - time (sec): 110.66 - samples/sec: 2735.03 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:01:50,720 epoch 2 - iter 1620/1809 - loss 0.08789901 - time (sec): 124.64 - samples/sec: 2739.18 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:02:04,249 epoch 2 - iter 1800/1809 - loss 0.08798538 - time (sec): 138.17 - samples/sec: 2737.08 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:02:04,931 ----------------------------------------------------------------------------------------------------
2023-10-17 13:02:04,931 EPOCH 2 done: loss 0.0879 - lr: 0.000044
2023-10-17 13:02:11,312 DEV : loss 0.12951047718524933 - f1-score (micro avg) 0.6379
2023-10-17 13:02:11,354 saving best model
2023-10-17 13:02:11,954 ----------------------------------------------------------------------------------------------------
2023-10-17 13:02:24,999 epoch 3 - iter 180/1809 - loss 0.06500996 - time (sec): 13.04 - samples/sec: 2812.60 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:02:38,194 epoch 3 - iter 360/1809 - loss 0.06343791 - time (sec): 26.24 - samples/sec: 2858.92 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:02:51,465 epoch 3 - iter 540/1809 - loss 0.06422061 - time (sec): 39.51 - samples/sec: 2860.97 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:03:04,781 epoch 3 - iter 720/1809 - loss 0.06581941 - time (sec): 52.83 - samples/sec: 2856.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:03:17,851 epoch 3 - iter 900/1809 - loss 0.06424567 - time (sec): 65.90 - samples/sec: 2859.91 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:03:30,944 epoch 3 - iter 1080/1809 - loss 0.06446192 - time (sec): 78.99 - samples/sec: 2881.71 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:03:45,032 epoch 3 - iter 1260/1809 - loss 0.06498763 - time (sec): 93.08 - samples/sec: 2836.00 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:03:59,459 epoch 3 - iter 1440/1809 - loss 0.06557827 - time (sec): 107.50 - samples/sec: 2808.10 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:04:13,201 epoch 3 - iter 1620/1809 - loss 0.06535863 - time (sec): 121.25 - samples/sec: 2813.72 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:04:27,131 epoch 3 - iter 1800/1809 - loss 0.06527467 - time (sec): 135.18 - samples/sec: 2798.04 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:04:27,823 ----------------------------------------------------------------------------------------------------
2023-10-17 13:04:27,824 EPOCH 3 done: loss 0.0652 - lr: 0.000039
2023-10-17 13:04:35,091 DEV : loss 0.1395214945077896 - f1-score (micro avg) 0.6286
2023-10-17 13:04:35,137 ----------------------------------------------------------------------------------------------------
2023-10-17 13:04:49,427 epoch 4 - iter 180/1809 - loss 0.04607623 - time (sec): 14.29 - samples/sec: 2694.03 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:05:02,656 epoch 4 - iter 360/1809 - loss 0.04930480 - time (sec): 27.52 - samples/sec: 2765.32 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:05:16,122 epoch 4 - iter 540/1809 - loss 0.04969251 - time (sec): 40.98 - samples/sec: 2809.11 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:05:29,916 epoch 4 - iter 720/1809 - loss 0.04875287 - time (sec): 54.78 - samples/sec: 2789.29 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:05:43,863 epoch 4 - iter 900/1809 - loss 0.05042701 - time (sec): 68.72 - samples/sec: 2772.42 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:05:57,328 epoch 4 - iter 1080/1809 - loss 0.05094849 - time (sec): 82.19 - samples/sec: 2774.64 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:06:10,958 epoch 4 - iter 1260/1809 - loss 0.05022295 - time (sec): 95.82 - samples/sec: 2775.40 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:06:25,227 epoch 4 - iter 1440/1809 - loss 0.04918643 - time (sec): 110.09 - samples/sec: 2767.92 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:06:39,187 epoch 4 - iter 1620/1809 - loss 0.04981270 - time (sec): 124.05 - samples/sec: 2756.38 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:06:52,930 epoch 4 - iter 1800/1809 - loss 0.04985632 - time (sec): 137.79 - samples/sec: 2744.58 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:06:53,562 ----------------------------------------------------------------------------------------------------
2023-10-17 13:06:53,562 EPOCH 4 done: loss 0.0498 - lr: 0.000033
2023-10-17 13:06:59,958 DEV : loss 0.18873652815818787 - f1-score (micro avg) 0.626
2023-10-17 13:07:00,003 ----------------------------------------------------------------------------------------------------
2023-10-17 13:07:13,048 epoch 5 - iter 180/1809 - loss 0.02718902 - time (sec): 13.04 - samples/sec: 2906.08 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:07:26,269 epoch 5 - iter 360/1809 - loss 0.03344726 - time (sec): 26.26 - samples/sec: 2908.88 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:07:39,729 epoch 5 - iter 540/1809 - loss 0.03538346 - time (sec): 39.72 - samples/sec: 2867.24 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:07:54,013 epoch 5 - iter 720/1809 - loss 0.03597454 - time (sec): 54.01 - samples/sec: 2815.92 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:08:08,507 epoch 5 - iter 900/1809 - loss 0.03415119 - time (sec): 68.50 - samples/sec: 2788.45 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:08:23,258 epoch 5 - iter 1080/1809 - loss 0.03530280 - time (sec): 83.25 - samples/sec: 2762.73 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:08:37,149 epoch 5 - iter 1260/1809 - loss 0.03588608 - time (sec): 97.14 - samples/sec: 2742.58 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:08:50,110 epoch 5 - iter 1440/1809 - loss 0.03547127 - time (sec): 110.10 - samples/sec: 2743.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:09:03,929 epoch 5 - iter 1620/1809 - loss 0.03474216 - time (sec): 123.92 - samples/sec: 2746.59 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:09:17,457 epoch 5 - iter 1800/1809 - loss 0.03509614 - time (sec): 137.45 - samples/sec: 2747.44 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:09:18,174 ----------------------------------------------------------------------------------------------------
2023-10-17 13:09:18,175 EPOCH 5 done: loss 0.0351 - lr: 0.000028
2023-10-17 13:09:24,764 DEV : loss 0.27836063504219055 - f1-score (micro avg) 0.6421
2023-10-17 13:09:24,810 saving best model
2023-10-17 13:09:25,455 ----------------------------------------------------------------------------------------------------
2023-10-17 13:09:38,637 epoch 6 - iter 180/1809 - loss 0.01695260 - time (sec): 13.18 - samples/sec: 2838.54 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:09:51,844 epoch 6 - iter 360/1809 - loss 0.02022091 - time (sec): 26.39 - samples/sec: 2824.39 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:10:05,970 epoch 6 - iter 540/1809 - loss 0.02183610 - time (sec): 40.51 - samples/sec: 2778.30 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:10:19,993 epoch 6 - iter 720/1809 - loss 0.02350031 - time (sec): 54.54 - samples/sec: 2766.70 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:10:34,760 epoch 6 - iter 900/1809 - loss 0.02331272 - time (sec): 69.30 - samples/sec: 2735.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:10:49,638 epoch 6 - iter 1080/1809 - loss 0.02398290 - time (sec): 84.18 - samples/sec: 2701.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:11:03,946 epoch 6 - iter 1260/1809 - loss 0.02353897 - time (sec): 98.49 - samples/sec: 2695.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:11:18,525 epoch 6 - iter 1440/1809 - loss 0.02441370 - time (sec): 113.07 - samples/sec: 2683.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:11:31,815 epoch 6 - iter 1620/1809 - loss 0.02434786 - time (sec): 126.36 - samples/sec: 2690.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:11:45,166 epoch 6 - iter 1800/1809 - loss 0.02504313 - time (sec): 139.71 - samples/sec: 2709.69 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:11:45,767 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:45,768 EPOCH 6 done: loss 0.0251 - lr: 0.000022
2023-10-17 13:11:52,868 DEV : loss 0.277566134929657 - f1-score (micro avg) 0.6404
2023-10-17 13:11:52,914 ----------------------------------------------------------------------------------------------------
2023-10-17 13:12:05,939 epoch 7 - iter 180/1809 - loss 0.01703639 - time (sec): 13.02 - samples/sec: 2899.01 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:12:19,089 epoch 7 - iter 360/1809 - loss 0.01419929 - time (sec): 26.17 - samples/sec: 2873.27 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:12:32,766 epoch 7 - iter 540/1809 - loss 0.01520086 - time (sec): 39.85 - samples/sec: 2833.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:12:46,111 epoch 7 - iter 720/1809 - loss 0.01636940 - time (sec): 53.19 - samples/sec: 2830.28 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:12:59,075 epoch 7 - iter 900/1809 - loss 0.01631353 - time (sec): 66.16 - samples/sec: 2854.63 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:13:11,932 epoch 7 - iter 1080/1809 - loss 0.01710760 - time (sec): 79.02 - samples/sec: 2863.60 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:13:25,576 epoch 7 - iter 1260/1809 - loss 0.01653770 - time (sec): 92.66 - samples/sec: 2857.00 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:13:39,118 epoch 7 - iter 1440/1809 - loss 0.01635189 - time (sec): 106.20 - samples/sec: 2850.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:13:52,416 epoch 7 - iter 1620/1809 - loss 0.01664859 - time (sec): 119.50 - samples/sec: 2856.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:14:06,491 epoch 7 - iter 1800/1809 - loss 0.01683251 - time (sec): 133.58 - samples/sec: 2830.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:14:07,203 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:07,204 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-17 13:14:13,444 DEV : loss 0.3381885588169098 - f1-score (micro avg) 0.6469
2023-10-17 13:14:13,488 saving best model
2023-10-17 13:14:14,083 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:27,601 epoch 8 - iter 180/1809 - loss 0.00784746 - time (sec): 13.52 - samples/sec: 2779.74 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:14:41,110 epoch 8 - iter 360/1809 - loss 0.00992790 - time (sec): 27.02 - samples/sec: 2761.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:14:55,394 epoch 8 - iter 540/1809 - loss 0.01010784 - time (sec): 41.31 - samples/sec: 2720.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:15:09,732 epoch 8 - iter 720/1809 - loss 0.01077941 - time (sec): 55.65 - samples/sec: 2718.87 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:15:24,528 epoch 8 - iter 900/1809 - loss 0.01023506 - time (sec): 70.44 - samples/sec: 2672.41 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:15:38,345 epoch 8 - iter 1080/1809 - loss 0.01066822 - time (sec): 84.26 - samples/sec: 2675.84 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:15:52,277 epoch 8 - iter 1260/1809 - loss 0.01068226 - time (sec): 98.19 - samples/sec: 2673.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:16:05,434 epoch 8 - iter 1440/1809 - loss 0.01011548 - time (sec): 111.35 - samples/sec: 2708.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:16:19,175 epoch 8 - iter 1620/1809 - loss 0.01069293 - time (sec): 125.09 - samples/sec: 2718.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:16:33,216 epoch 8 - iter 1800/1809 - loss 0.01129507 - time (sec): 139.13 - samples/sec: 2717.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:16:33,918 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:33,918 EPOCH 8 done: loss 0.0112 - lr: 0.000011
2023-10-17 13:16:40,422 DEV : loss 0.3760252892971039 - f1-score (micro avg) 0.6555
2023-10-17 13:16:40,466 saving best model
2023-10-17 13:16:41,076 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:55,165 epoch 9 - iter 180/1809 - loss 0.00699473 - time (sec): 14.09 - samples/sec: 2692.27 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:17:08,854 epoch 9 - iter 360/1809 - loss 0.00800011 - time (sec): 27.78 - samples/sec: 2673.76 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:17:21,927 epoch 9 - iter 540/1809 - loss 0.00784076 - time (sec): 40.85 - samples/sec: 2716.11 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:17:35,991 epoch 9 - iter 720/1809 - loss 0.00811916 - time (sec): 54.91 - samples/sec: 2707.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:17:49,737 epoch 9 - iter 900/1809 - loss 0.00811360 - time (sec): 68.66 - samples/sec: 2730.47 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:18:03,967 epoch 9 - iter 1080/1809 - loss 0.00795283 - time (sec): 82.89 - samples/sec: 2716.52 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:18:17,621 epoch 9 - iter 1260/1809 - loss 0.00770312 - time (sec): 96.54 - samples/sec: 2730.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:18:31,453 epoch 9 - iter 1440/1809 - loss 0.00743409 - time (sec): 110.38 - samples/sec: 2732.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:18:45,298 epoch 9 - iter 1620/1809 - loss 0.00731569 - time (sec): 124.22 - samples/sec: 2730.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:18:59,691 epoch 9 - iter 1800/1809 - loss 0.00705202 - time (sec): 138.61 - samples/sec: 2727.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:19:00,353 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:00,353 EPOCH 9 done: loss 0.0070 - lr: 0.000006
2023-10-17 13:19:07,562 DEV : loss 0.3952539563179016 - f1-score (micro avg) 0.6521
2023-10-17 13:19:07,604 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:21,606 epoch 10 - iter 180/1809 - loss 0.00511638 - time (sec): 14.00 - samples/sec: 2613.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:19:35,721 epoch 10 - iter 360/1809 - loss 0.00478079 - time (sec): 28.11 - samples/sec: 2677.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:19:50,649 epoch 10 - iter 540/1809 - loss 0.00426938 - time (sec): 43.04 - samples/sec: 2589.12 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:20:05,350 epoch 10 - iter 720/1809 - loss 0.00500865 - time (sec): 57.74 - samples/sec: 2598.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:20:19,311 epoch 10 - iter 900/1809 - loss 0.00515276 - time (sec): 71.70 - samples/sec: 2610.81 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:20:32,388 epoch 10 - iter 1080/1809 - loss 0.00512569 - time (sec): 84.78 - samples/sec: 2661.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:20:46,110 epoch 10 - iter 1260/1809 - loss 0.00480815 - time (sec): 98.50 - samples/sec: 2662.33 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:21:00,221 epoch 10 - iter 1440/1809 - loss 0.00509895 - time (sec): 112.61 - samples/sec: 2676.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:21:13,711 epoch 10 - iter 1620/1809 - loss 0.00479092 - time (sec): 126.10 - samples/sec: 2701.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:21:27,421 epoch 10 - iter 1800/1809 - loss 0.00470773 - time (sec): 139.82 - samples/sec: 2705.31 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:21:28,066 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:28,066 EPOCH 10 done: loss 0.0047 - lr: 0.000000
2023-10-17 13:21:34,505 DEV : loss 0.4053354859352112 - f1-score (micro avg) 0.6545
2023-10-17 13:21:35,063 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:35,065 Loading model from best epoch ...
2023-10-17 13:21:37,585 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 13:21:45,770 Results:
- F-score (micro) 0.6646
- F-score (macro) 0.5118
- Accuracy 0.5075

By class:
              precision    recall  f1-score   support

         loc     0.6667    0.7682    0.7138       591
        pers     0.5948    0.7731    0.6724       357
         org     0.1818    0.1266    0.1493        79

   micro avg     0.6167    0.7205    0.6646      1027
   macro avg     0.4811    0.5560    0.5118      1027
weighted avg     0.6044    0.7205    0.6560      1027

2023-10-17 13:21:45,770 ----------------------------------------------------------------------------------------------------
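Note: the aggregate scores in the table above can be reproduced from the per-class rows: the micro-average F1 is the harmonic mean of the pooled precision (0.6167) and recall (0.7205), the macro-average F1 is the unweighted mean of the three per-class F1 scores, and the weighted average weights them by support. A small sanity-check sketch:

```python
# Recompute the aggregate scores reported above from the per-class rows of the table.
per_class = {
    # class: (precision, recall, f1, support)
    "loc":  (0.6667, 0.7682, 0.7138, 591),
    "pers": (0.5948, 0.7731, 0.6724, 357),
    "org":  (0.1818, 0.1266, 0.1493, 79),
}

# Micro average: harmonic mean of the pooled precision/recall from the table.
micro_p, micro_r = 0.6167, 0.7205
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
print(round(micro_f1, 4))  # 0.6646

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
print(round(macro_f1, 4))  # 0.5118

# Weighted average: per-class F1 weighted by support.
total_support = sum(s for *_, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support
print(round(weighted_f1, 4))  # ~0.656, reported as 0.6560
```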