2023-10-17 12:57:22,959 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,960 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:57:22,960 ----------------------------------------------------------------------------------------------------
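The module dump above is a Flair SequenceTagger with an ELECTRA backbone and a single linear classification head (13 output tags, no CRF). A minimal construction sketch follows; the backbone identifier is inferred from the training base path logged further down, the 13-tag dictionary is copied from the end of this log, and the label type "ner" is an assumption, so treat the snippet as a reconstruction rather than the exact training script.

    from flair.data import Dictionary
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # 13 labels as printed at the end of this log (BIOES scheme over loc/pers/org).
    tag_dictionary = Dictionary(add_unk=False)
    for tag in ["O",
                "S-loc", "B-loc", "E-loc", "I-loc",
                "S-pers", "B-pers", "E-pers", "I-pers",
                "S-org", "B-org", "E-org", "I-org"]:
        tag_dictionary.add_item(tag)

    # Backbone assumed from the base path ("teams-base-historic-multilingual-discriminator");
    # last layer only and "first" subtoken pooling match the "layers-1" / "poolingfirst" suffixes.
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,                  # unused when use_rnn=False
        embeddings=embeddings,
        tag_dictionary=tag_dictionary,
        tag_type="ner",                   # assumed label type
        use_crf=False,                    # matches "crfFalse" in the base path
        use_rnn=False,
        reproject_embeddings=False,
    )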
2023-10-17 12:57:22,961 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
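The MultiCorpus above wraps a single HIPE-2022 sub-corpus (letemps, French). A hedged sketch of loading such a corpus with Flair's built-in reader is shown below; the constructor arguments are assumptions derived from the dataset path printed above, so check them against your Flair version.

    from flair.datasets import NER_HIPE_2022

    # French "letemps" configuration of HIPE-2022, per the dataset path above.
    corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
    print(corpus)  # expected: 14465 train + 1392 dev + 2432 test sentences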
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 Train: 14465 sentences
2023-10-17 12:57:22,961 (train_with_dev=False, train_with_test=False)
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 Training Params:
2023-10-17 12:57:22,961 - learning_rate: "5e-05"
2023-10-17 12:57:22,961 - mini_batch_size: "8"
2023-10-17 12:57:22,961 - max_epochs: "10"
2023-10-17 12:57:22,961 - shuffle: "True"
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,961 Plugins:
2023-10-17 12:57:22,961 - TensorboardLogger
2023-10-17 12:57:22,961 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:57:22,961 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:57:22,962 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:57:22,962 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Computation:
2023-10-17 12:57:22,962 - compute on device: cuda:0
2023-10-17 12:57:22,962 - embedding storage: none
2023-10-17 12:57:22,962 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 12:57:22,962 ----------------------------------------------------------------------------------------------------
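Taken together, the parameters, plugins, and base path logged above describe a fine-tuning run roughly equivalent to the sketch below, assuming the tagger and corpus from the earlier snippets. Flair's trainer.fine_tune applies a linear learning-rate schedule with warmup by default, which is consistent with the LinearScheduler plugin (warmup_fraction 0.1) listed here; everything else is taken from the logged values.

    from flair.trainers import ModelTrainer

    # "tagger" and "corpus" as sketched earlier in this log.
    trainer = ModelTrainer(tagger, corpus)

    trainer.fine_tune(
        "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator"
        "-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
        learning_rate=5e-05,   # logged learning_rate
        mini_batch_size=8,     # logged mini_batch_size
        max_epochs=10,         # logged max_epochs
        shuffle=True,          # logged shuffle
    )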
2023-10-17 12:57:22,962 ----------------------------------------------------------------------------------------------------
2023-10-17 12:57:22,962 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:57:37,220 epoch 1 - iter 180/1809 - loss 1.84340706 - time (sec): 14.26 - samples/sec: 2648.39 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:57:50,903 epoch 1 - iter 360/1809 - loss 1.05150880 - time (sec): 27.94 - samples/sec: 2653.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:58:04,641 epoch 1 - iter 540/1809 - loss 0.74172731 - time (sec): 41.68 - samples/sec: 2727.46 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:58:17,857 epoch 1 - iter 720/1809 - loss 0.58928040 - time (sec): 54.89 - samples/sec: 2771.98 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:58:32,073 epoch 1 - iter 900/1809 - loss 0.49921402 - time (sec): 69.11 - samples/sec: 2758.24 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:58:45,617 epoch 1 - iter 1080/1809 - loss 0.43657646 - time (sec): 82.65 - samples/sec: 2755.99 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:58:59,176 epoch 1 - iter 1260/1809 - loss 0.38919295 - time (sec): 96.21 - samples/sec: 2764.35 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:59:12,317 epoch 1 - iter 1440/1809 - loss 0.35301689 - time (sec): 109.35 - samples/sec: 2791.91 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:59:25,395 epoch 1 - iter 1620/1809 - loss 0.32539419 - time (sec): 122.43 - samples/sec: 2797.97 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:59:39,241 epoch 1 - iter 1800/1809 - loss 0.30517417 - time (sec): 136.28 - samples/sec: 2775.15 - lr: 0.000050 - momentum: 0.000000
2023-10-17 12:59:39,970 ----------------------------------------------------------------------------------------------------
2023-10-17 12:59:39,971 EPOCH 1 done: loss 0.3042 - lr: 0.000050
2023-10-17 12:59:45,481 DEV : loss 0.10241620242595673 - f1-score (micro avg) 0.5874
2023-10-17 12:59:45,529 saving best model
2023-10-17 12:59:46,074 ----------------------------------------------------------------------------------------------------
2023-10-17 13:00:00,358 epoch 2 - iter 180/1809 - loss 0.08999612 - time (sec): 14.28 - samples/sec: 2712.94 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:00:14,189 epoch 2 - iter 360/1809 - loss 0.08860684 - time (sec): 28.11 - samples/sec: 2740.96 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:00:27,501 epoch 2 - iter 540/1809 - loss 0.09064747 - time (sec): 41.43 - samples/sec: 2748.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:00:40,822 epoch 2 - iter 720/1809 - loss 0.08986173 - time (sec): 54.75 - samples/sec: 2763.03 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:00:55,054 epoch 2 - iter 900/1809 - loss 0.08846040 - time (sec): 68.98 - samples/sec: 2726.01 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:01:08,830 epoch 2 - iter 1080/1809 - loss 0.09011922 - time (sec): 82.75 - samples/sec: 2736.28 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:01:23,268 epoch 2 - iter 1260/1809 - loss 0.08854167 - time (sec): 97.19 - samples/sec: 2728.15 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:01:36,733 epoch 2 - iter 1440/1809 - loss 0.08789492 - time (sec): 110.66 - samples/sec: 2735.03 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:01:50,720 epoch 2 - iter 1620/1809 - loss 0.08789901 - time (sec): 124.64 - samples/sec: 2739.18 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:02:04,249 epoch 2 - iter 1800/1809 - loss 0.08798538 - time (sec): 138.17 - samples/sec: 2737.08 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:02:04,931 ----------------------------------------------------------------------------------------------------
2023-10-17 13:02:04,931 EPOCH 2 done: loss 0.0879 - lr: 0.000044
2023-10-17 13:02:11,312 DEV : loss 0.12951047718524933 - f1-score (micro avg) 0.6379
2023-10-17 13:02:11,354 saving best model
2023-10-17 13:02:11,954 ----------------------------------------------------------------------------------------------------
2023-10-17 13:02:24,999 epoch 3 - iter 180/1809 - loss 0.06500996 - time (sec): 13.04 - samples/sec: 2812.60 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:02:38,194 epoch 3 - iter 360/1809 - loss 0.06343791 - time (sec): 26.24 - samples/sec: 2858.92 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:02:51,465 epoch 3 - iter 540/1809 - loss 0.06422061 - time (sec): 39.51 - samples/sec: 2860.97 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:03:04,781 epoch 3 - iter 720/1809 - loss 0.06581941 - time (sec): 52.83 - samples/sec: 2856.71 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:03:17,851 epoch 3 - iter 900/1809 - loss 0.06424567 - time (sec): 65.90 - samples/sec: 2859.91 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:03:30,944 epoch 3 - iter 1080/1809 - loss 0.06446192 - time (sec): 78.99 - samples/sec: 2881.71 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:03:45,032 epoch 3 - iter 1260/1809 - loss 0.06498763 - time (sec): 93.08 - samples/sec: 2836.00 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:03:59,459 epoch 3 - iter 1440/1809 - loss 0.06557827 - time (sec): 107.50 - samples/sec: 2808.10 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:04:13,201 epoch 3 - iter 1620/1809 - loss 0.06535863 - time (sec): 121.25 - samples/sec: 2813.72 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:04:27,131 epoch 3 - iter 1800/1809 - loss 0.06527467 - time (sec): 135.18 - samples/sec: 2798.04 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:04:27,823 ----------------------------------------------------------------------------------------------------
2023-10-17 13:04:27,824 EPOCH 3 done: loss 0.0652 - lr: 0.000039
2023-10-17 13:04:35,091 DEV : loss 0.1395214945077896 - f1-score (micro avg) 0.6286
2023-10-17 13:04:35,137 ----------------------------------------------------------------------------------------------------
2023-10-17 13:04:49,427 epoch 4 - iter 180/1809 - loss 0.04607623 - time (sec): 14.29 - samples/sec: 2694.03 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:05:02,656 epoch 4 - iter 360/1809 - loss 0.04930480 - time (sec): 27.52 - samples/sec: 2765.32 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:05:16,122 epoch 4 - iter 540/1809 - loss 0.04969251 - time (sec): 40.98 - samples/sec: 2809.11 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:05:29,916 epoch 4 - iter 720/1809 - loss 0.04875287 - time (sec): 54.78 - samples/sec: 2789.29 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:05:43,863 epoch 4 - iter 900/1809 - loss 0.05042701 - time (sec): 68.72 - samples/sec: 2772.42 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:05:57,328 epoch 4 - iter 1080/1809 - loss 0.05094849 - time (sec): 82.19 - samples/sec: 2774.64 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:06:10,958 epoch 4 - iter 1260/1809 - loss 0.05022295 - time (sec): 95.82 - samples/sec: 2775.40 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:06:25,227 epoch 4 - iter 1440/1809 - loss 0.04918643 - time (sec): 110.09 - samples/sec: 2767.92 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:06:39,187 epoch 4 - iter 1620/1809 - loss 0.04981270 - time (sec): 124.05 - samples/sec: 2756.38 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:06:52,930 epoch 4 - iter 1800/1809 - loss 0.04985632 - time (sec): 137.79 - samples/sec: 2744.58 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:06:53,562 ----------------------------------------------------------------------------------------------------
2023-10-17 13:06:53,562 EPOCH 4 done: loss 0.0498 - lr: 0.000033
2023-10-17 13:06:59,958 DEV : loss 0.18873652815818787 - f1-score (micro avg) 0.626
2023-10-17 13:07:00,003 ----------------------------------------------------------------------------------------------------
2023-10-17 13:07:13,048 epoch 5 - iter 180/1809 - loss 0.02718902 - time (sec): 13.04 - samples/sec: 2906.08 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:07:26,269 epoch 5 - iter 360/1809 - loss 0.03344726 - time (sec): 26.26 - samples/sec: 2908.88 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:07:39,729 epoch 5 - iter 540/1809 - loss 0.03538346 - time (sec): 39.72 - samples/sec: 2867.24 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:07:54,013 epoch 5 - iter 720/1809 - loss 0.03597454 - time (sec): 54.01 - samples/sec: 2815.92 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:08:08,507 epoch 5 - iter 900/1809 - loss 0.03415119 - time (sec): 68.50 - samples/sec: 2788.45 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:08:23,258 epoch 5 - iter 1080/1809 - loss 0.03530280 - time (sec): 83.25 - samples/sec: 2762.73 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:08:37,149 epoch 5 - iter 1260/1809 - loss 0.03588608 - time (sec): 97.14 - samples/sec: 2742.58 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:08:50,110 epoch 5 - iter 1440/1809 - loss 0.03547127 - time (sec): 110.10 - samples/sec: 2743.26 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:09:03,929 epoch 5 - iter 1620/1809 - loss 0.03474216 - time (sec): 123.92 - samples/sec: 2746.59 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:09:17,457 epoch 5 - iter 1800/1809 - loss 0.03509614 - time (sec): 137.45 - samples/sec: 2747.44 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:09:18,174 ----------------------------------------------------------------------------------------------------
2023-10-17 13:09:18,175 EPOCH 5 done: loss 0.0351 - lr: 0.000028
2023-10-17 13:09:24,764 DEV : loss 0.27836063504219055 - f1-score (micro avg) 0.6421
2023-10-17 13:09:24,810 saving best model
2023-10-17 13:09:25,455 ----------------------------------------------------------------------------------------------------
2023-10-17 13:09:38,637 epoch 6 - iter 180/1809 - loss 0.01695260 - time (sec): 13.18 - samples/sec: 2838.54 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:09:51,844 epoch 6 - iter 360/1809 - loss 0.02022091 - time (sec): 26.39 - samples/sec: 2824.39 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:10:05,970 epoch 6 - iter 540/1809 - loss 0.02183610 - time (sec): 40.51 - samples/sec: 2778.30 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:10:19,993 epoch 6 - iter 720/1809 - loss 0.02350031 - time (sec): 54.54 - samples/sec: 2766.70 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:10:34,760 epoch 6 - iter 900/1809 - loss 0.02331272 - time (sec): 69.30 - samples/sec: 2735.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:10:49,638 epoch 6 - iter 1080/1809 - loss 0.02398290 - time (sec): 84.18 - samples/sec: 2701.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:11:03,946 epoch 6 - iter 1260/1809 - loss 0.02353897 - time (sec): 98.49 - samples/sec: 2695.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:11:18,525 epoch 6 - iter 1440/1809 - loss 0.02441370 - time (sec): 113.07 - samples/sec: 2683.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:11:31,815 epoch 6 - iter 1620/1809 - loss 0.02434786 - time (sec): 126.36 - samples/sec: 2690.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:11:45,166 epoch 6 - iter 1800/1809 - loss 0.02504313 - time (sec): 139.71 - samples/sec: 2709.69 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:11:45,767 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:45,768 EPOCH 6 done: loss 0.0251 - lr: 0.000022
2023-10-17 13:11:52,868 DEV : loss 0.277566134929657 - f1-score (micro avg) 0.6404
2023-10-17 13:11:52,914 ----------------------------------------------------------------------------------------------------
2023-10-17 13:12:05,939 epoch 7 - iter 180/1809 - loss 0.01703639 - time (sec): 13.02 - samples/sec: 2899.01 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:12:19,089 epoch 7 - iter 360/1809 - loss 0.01419929 - time (sec): 26.17 - samples/sec: 2873.27 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:12:32,766 epoch 7 - iter 540/1809 - loss 0.01520086 - time (sec): 39.85 - samples/sec: 2833.52 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:12:46,111 epoch 7 - iter 720/1809 - loss 0.01636940 - time (sec): 53.19 - samples/sec: 2830.28 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:12:59,075 epoch 7 - iter 900/1809 - loss 0.01631353 - time (sec): 66.16 - samples/sec: 2854.63 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:13:11,932 epoch 7 - iter 1080/1809 - loss 0.01710760 - time (sec): 79.02 - samples/sec: 2863.60 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:13:25,576 epoch 7 - iter 1260/1809 - loss 0.01653770 - time (sec): 92.66 - samples/sec: 2857.00 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:13:39,118 epoch 7 - iter 1440/1809 - loss 0.01635189 - time (sec): 106.20 - samples/sec: 2850.78 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:13:52,416 epoch 7 - iter 1620/1809 - loss 0.01664859 - time (sec): 119.50 - samples/sec: 2856.40 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:14:06,491 epoch 7 - iter 1800/1809 - loss 0.01683251 - time (sec): 133.58 - samples/sec: 2830.35 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:14:07,203 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:07,204 EPOCH 7 done: loss 0.0169 - lr: 0.000017
2023-10-17 13:14:13,444 DEV : loss 0.3381885588169098 - f1-score (micro avg) 0.6469
2023-10-17 13:14:13,488 saving best model
2023-10-17 13:14:14,083 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:27,601 epoch 8 - iter 180/1809 - loss 0.00784746 - time (sec): 13.52 - samples/sec: 2779.74 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:14:41,110 epoch 8 - iter 360/1809 - loss 0.00992790 - time (sec): 27.02 - samples/sec: 2761.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:14:55,394 epoch 8 - iter 540/1809 - loss 0.01010784 - time (sec): 41.31 - samples/sec: 2720.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:15:09,732 epoch 8 - iter 720/1809 - loss 0.01077941 - time (sec): 55.65 - samples/sec: 2718.87 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:15:24,528 epoch 8 - iter 900/1809 - loss 0.01023506 - time (sec): 70.44 - samples/sec: 2672.41 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:15:38,345 epoch 8 - iter 1080/1809 - loss 0.01066822 - time (sec): 84.26 - samples/sec: 2675.84 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:15:52,277 epoch 8 - iter 1260/1809 - loss 0.01068226 - time (sec): 98.19 - samples/sec: 2673.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:16:05,434 epoch 8 - iter 1440/1809 - loss 0.01011548 - time (sec): 111.35 - samples/sec: 2708.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:16:19,175 epoch 8 - iter 1620/1809 - loss 0.01069293 - time (sec): 125.09 - samples/sec: 2718.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:16:33,216 epoch 8 - iter 1800/1809 - loss 0.01129507 - time (sec): 139.13 - samples/sec: 2717.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:16:33,918 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:33,918 EPOCH 8 done: loss 0.0112 - lr: 0.000011
2023-10-17 13:16:40,422 DEV : loss 0.3760252892971039 - f1-score (micro avg) 0.6555
2023-10-17 13:16:40,466 saving best model
2023-10-17 13:16:41,076 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:55,165 epoch 9 - iter 180/1809 - loss 0.00699473 - time (sec): 14.09 - samples/sec: 2692.27 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:17:08,854 epoch 9 - iter 360/1809 - loss 0.00800011 - time (sec): 27.78 - samples/sec: 2673.76 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:17:21,927 epoch 9 - iter 540/1809 - loss 0.00784076 - time (sec): 40.85 - samples/sec: 2716.11 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:17:35,991 epoch 9 - iter 720/1809 - loss 0.00811916 - time (sec): 54.91 - samples/sec: 2707.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:17:49,737 epoch 9 - iter 900/1809 - loss 0.00811360 - time (sec): 68.66 - samples/sec: 2730.47 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:18:03,967 epoch 9 - iter 1080/1809 - loss 0.00795283 - time (sec): 82.89 - samples/sec: 2716.52 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:18:17,621 epoch 9 - iter 1260/1809 - loss 0.00770312 - time (sec): 96.54 - samples/sec: 2730.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:18:31,453 epoch 9 - iter 1440/1809 - loss 0.00743409 - time (sec): 110.38 - samples/sec: 2732.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:18:45,298 epoch 9 - iter 1620/1809 - loss 0.00731569 - time (sec): 124.22 - samples/sec: 2730.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:18:59,691 epoch 9 - iter 1800/1809 - loss 0.00705202 - time (sec): 138.61 - samples/sec: 2727.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:19:00,353 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:00,353 EPOCH 9 done: loss 0.0070 - lr: 0.000006
2023-10-17 13:19:07,562 DEV : loss 0.3952539563179016 - f1-score (micro avg) 0.6521
2023-10-17 13:19:07,604 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:21,606 epoch 10 - iter 180/1809 - loss 0.00511638 - time (sec): 14.00 - samples/sec: 2613.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:19:35,721 epoch 10 - iter 360/1809 - loss 0.00478079 - time (sec): 28.11 - samples/sec: 2677.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:19:50,649 epoch 10 - iter 540/1809 - loss 0.00426938 - time (sec): 43.04 - samples/sec: 2589.12 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:20:05,350 epoch 10 - iter 720/1809 - loss 0.00500865 - time (sec): 57.74 - samples/sec: 2598.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:20:19,311 epoch 10 - iter 900/1809 - loss 0.00515276 - time (sec): 71.70 - samples/sec: 2610.81 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:20:32,388 epoch 10 - iter 1080/1809 - loss 0.00512569 - time (sec): 84.78 - samples/sec: 2661.62 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:20:46,110 epoch 10 - iter 1260/1809 - loss 0.00480815 - time (sec): 98.50 - samples/sec: 2662.33 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:21:00,221 epoch 10 - iter 1440/1809 - loss 0.00509895 - time (sec): 112.61 - samples/sec: 2676.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:21:13,711 epoch 10 - iter 1620/1809 - loss 0.00479092 - time (sec): 126.10 - samples/sec: 2701.50 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:21:27,421 epoch 10 - iter 1800/1809 - loss 0.00470773 - time (sec): 139.82 - samples/sec: 2705.31 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:21:28,066 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:28,066 EPOCH 10 done: loss 0.0047 - lr: 0.000000
2023-10-17 13:21:34,505 DEV : loss 0.4053354859352112 - f1-score (micro avg) 0.6545
2023-10-17 13:21:35,063 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:35,065 Loading model from best epoch ...
2023-10-17 13:21:37,585 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 13:21:45,770
Results:
- F-score (micro) 0.6646
- F-score (macro) 0.5118
- Accuracy 0.5075
By class:
              precision    recall  f1-score   support

         loc     0.6667    0.7682    0.7138       591
        pers     0.5948    0.7731    0.6724       357
         org     0.1818    0.1266    0.1493        79

   micro avg     0.6167    0.7205    0.6646      1027
   macro avg     0.4811    0.5560    0.5118      1027
weighted avg     0.6044    0.7205    0.6560      1027
2023-10-17 13:21:45,770 ----------------------------------------------------------------------------------------------------
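The evaluation above was produced from best-model.pt (saved after epoch 8, dev micro F1 0.6555). Loading that checkpoint for inference follows the usual Flair pattern, as in the sketch below; the example sentence and the "ner" label type passed to get_spans are assumptions.

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint that the final evaluation was computed from.
    tagger = SequenceTagger.load("best-model.pt")

    sentence = Sentence("Le Temps est un journal publié à Genève .")
    tagger.predict(sentence)

    # Print the predicted loc / pers / org spans with confidence scores.
    for span in sentence.get_spans("ner"):
        print(span)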