2023-10-17 18:52:00,708 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,710 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:52:00,710 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,710 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-17 18:52:00,710 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,710 Train: 14465 sentences
2023-10-17 18:52:00,710 (train_with_dev=False, train_with_test=False)
2023-10-17 18:52:00,710 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,710 Training Params:
2023-10-17 18:52:00,710 - learning_rate: "3e-05"
2023-10-17 18:52:00,710 - mini_batch_size: "8"
2023-10-17 18:52:00,710 - max_epochs: "10"
2023-10-17 18:52:00,710 - shuffle: "True"
2023-10-17 18:52:00,710 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,710 Plugins:
2023-10-17 18:52:00,710 - TensorboardLogger
2023-10-17 18:52:00,711 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:52:00,711 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,711 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:52:00,711 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:52:00,711 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,711 Computation:
2023-10-17 18:52:00,711 - compute on device: cuda:0
2023-10-17 18:52:00,711 - embedding storage: none
2023-10-17 18:52:00,711 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,711 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:52:00,711 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,711 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:00,711 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:52:13,715 epoch 1 - iter 180/1809 - loss 2.28258934 - time (sec): 13.00 - samples/sec: 2789.82 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:52:26,579 epoch 1 - iter 360/1809 - loss 1.24974822 - time (sec): 25.87 - samples/sec: 2901.45 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:52:39,333 epoch 1 - iter 540/1809 - loss 0.89210105 - time (sec): 38.62 - samples/sec: 2915.35 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:52:51,661 epoch 1 - iter 720/1809 - loss 0.71541122 - time (sec): 50.95 - samples/sec: 2906.71 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:53:04,504 epoch 1 - iter 900/1809 - loss 0.59767207 - time (sec): 63.79 - samples/sec: 2913.38 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:53:17,623 epoch 1 - iter 1080/1809 - loss 0.51371059 - time (sec): 76.91 - samples/sec: 2926.72 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:53:31,474 epoch 1 - iter 1260/1809 - loss 0.45457977 - time (sec): 90.76 - samples/sec: 2909.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:53:44,938 epoch 1 - iter 1440/1809 - loss 0.41231541 - time (sec): 104.23 - samples/sec: 2889.17 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:58,080 epoch 1 - iter 1620/1809 - loss 0.37675173 - time (sec): 117.37 - samples/sec: 2890.05 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:54:11,681 epoch 1 - iter 1800/1809 - loss 0.34914029 - time (sec): 130.97 - samples/sec: 2887.66 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:54:12,286 ----------------------------------------------------------------------------------------------------
2023-10-17 18:54:12,287 EPOCH 1 done: loss 0.3480 - lr: 0.000030
2023-10-17 18:54:17,839 DEV : loss 0.10930902510881424 - f1-score (micro avg) 0.5978
2023-10-17 18:54:17,881 saving best model
2023-10-17 18:54:18,405 ----------------------------------------------------------------------------------------------------
2023-10-17 18:54:30,964 epoch 2 - iter 180/1809 - loss 0.10650053 - time (sec): 12.56 - samples/sec: 2959.31 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:54:44,030 epoch 2 - iter 360/1809 - loss 0.09585977 - time (sec): 25.62 - samples/sec: 2997.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:54:57,016 epoch 2 - iter 540/1809 - loss 0.09384622 - time (sec): 38.61 - samples/sec: 2975.16 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:55:09,680 epoch 2 - iter 720/1809 - loss 0.09091547 - time (sec): 51.27 - samples/sec: 2959.73 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:55:22,557 epoch 2 - iter 900/1809 - loss 0.09225937 - time (sec): 64.15 - samples/sec: 2959.26 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:55:35,377 epoch 2 - iter 1080/1809 - loss 0.09010366 - time (sec): 76.97 - samples/sec: 2966.15 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:55:48,422 epoch 2 - iter 1260/1809 - loss 0.08952091 - time (sec): 90.02 - samples/sec: 2971.74 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:56:01,089 epoch 2 - iter 1440/1809 - loss 0.08943201 - time (sec): 102.68 - samples/sec: 2954.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:56:14,452 epoch 2 - iter 1620/1809 - loss 0.08793406 - time (sec): 116.05 - samples/sec: 2928.46 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:56:27,318 epoch 2 - iter 1800/1809 - loss 0.08729226 - time (sec): 128.91 - samples/sec: 2933.17 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:56:27,923 ----------------------------------------------------------------------------------------------------
2023-10-17 18:56:27,924 EPOCH 2 done: loss 0.0871 - lr: 0.000027
2023-10-17 18:56:34,962 DEV : loss 0.12650814652442932 - f1-score (micro avg) 0.6702
2023-10-17 18:56:35,002 saving best model
2023-10-17 18:56:35,614 ----------------------------------------------------------------------------------------------------
2023-10-17 18:56:48,417 epoch 3 - iter 180/1809 - loss 0.06478131 - time (sec): 12.80 - samples/sec: 2971.79 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:57:01,042 epoch 3 - iter 360/1809 - loss 0.06198826 - time (sec): 25.43 - samples/sec: 2962.13 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:57:13,921 epoch 3 - iter 540/1809 - loss 0.06502424 - time (sec): 38.30 - samples/sec: 2957.71 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:57:26,694 epoch 3 - iter 720/1809 - loss 0.06650593 - time (sec): 51.08 - samples/sec: 2945.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:57:39,977 epoch 3 - iter 900/1809 - loss 0.06659994 - time (sec): 64.36 - samples/sec: 2926.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:57:52,785 epoch 3 - iter 1080/1809 - loss 0.06578702 - time (sec): 77.17 - samples/sec: 2926.83 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:58:06,364 epoch 3 - iter 1260/1809 - loss 0.06364589 - time (sec): 90.75 - samples/sec: 2931.13 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:58:20,302 epoch 3 - iter 1440/1809 - loss 0.06291425 - time (sec): 104.69 - samples/sec: 2908.03 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:58:33,778 epoch 3 - iter 1620/1809 - loss 0.06208188 - time (sec): 118.16 - samples/sec: 2889.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:58:47,159 epoch 3 - iter 1800/1809 - loss 0.06263994 - time (sec): 131.54 - samples/sec: 2877.10 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:58:47,795 ----------------------------------------------------------------------------------------------------
2023-10-17 18:58:47,795 EPOCH 3 done: loss 0.0626 - lr: 0.000023
2023-10-17 18:58:54,045 DEV : loss 0.12355589121580124 - f1-score (micro avg) 0.5872
2023-10-17 18:58:54,085 ----------------------------------------------------------------------------------------------------
2023-10-17 18:59:07,267 epoch 4 - iter 180/1809 - loss 0.03666583 - time (sec): 13.18 - samples/sec: 2968.80 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:59:20,573 epoch 4 - iter 360/1809 - loss 0.04027279 - time (sec): 26.49 - samples/sec: 2843.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:59:34,773 epoch 4 - iter 540/1809 - loss 0.04343845 - time (sec): 40.69 - samples/sec: 2790.97 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:59:48,888 epoch 4 - iter 720/1809 - loss 0.04472054 - time (sec): 54.80 - samples/sec: 2765.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:00:02,898 epoch 4 - iter 900/1809 - loss 0.04626259 - time (sec): 68.81 - samples/sec: 2763.89 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:00:16,800 epoch 4 - iter 1080/1809 - loss 0.04635190 - time (sec): 82.71 - samples/sec: 2761.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:00:29,571 epoch 4 - iter 1260/1809 - loss 0.04645411 - time (sec): 95.48 - samples/sec: 2776.58 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:00:42,905 epoch 4 - iter 1440/1809 - loss 0.04596775 - time (sec): 108.82 - samples/sec: 2796.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:00:56,051 epoch 4 - iter 1620/1809 - loss 0.04613201 - time (sec): 121.96 - samples/sec: 2806.25 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:01:08,814 epoch 4 - iter 1800/1809 - loss 0.04664147 - time (sec): 134.73 - samples/sec: 2806.50 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:01:09,439 ----------------------------------------------------------------------------------------------------
2023-10-17 19:01:09,439 EPOCH 4 done: loss 0.0466 - lr: 0.000020
2023-10-17 19:01:16,574 DEV : loss 0.20206163823604584 - f1-score (micro avg) 0.6504
2023-10-17 19:01:16,614 ----------------------------------------------------------------------------------------------------
2023-10-17 19:01:29,149 epoch 5 - iter 180/1809 - loss 0.02937075 - time (sec): 12.53 - samples/sec: 2950.05 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:01:42,017 epoch 5 - iter 360/1809 - loss 0.03242653 - time (sec): 25.40 - samples/sec: 2946.71 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:01:54,818 epoch 5 - iter 540/1809 - loss 0.03272337 - time (sec): 38.20 - samples/sec: 2947.20 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:02:07,931 epoch 5 - iter 720/1809 - loss 0.03365088 - time (sec): 51.32 - samples/sec: 2943.60 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:02:20,754 epoch 5 - iter 900/1809 - loss 0.03447276 - time (sec): 64.14 - samples/sec: 2938.08 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:02:33,631 epoch 5 - iter 1080/1809 - loss 0.03437117 - time (sec): 77.02 - samples/sec: 2950.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:02:46,395 epoch 5 - iter 1260/1809 - loss 0.03449813 - time (sec): 89.78 - samples/sec: 2945.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:02:59,134 epoch 5 - iter 1440/1809 - loss 0.03583856 - time (sec): 102.52 - samples/sec: 2950.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:03:11,969 epoch 5 - iter 1620/1809 - loss 0.03533839 - time (sec): 115.35 - samples/sec: 2943.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:03:24,864 epoch 5 - iter 1800/1809 - loss 0.03477947 - time (sec): 128.25 - samples/sec: 2946.66 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:03:25,466 ----------------------------------------------------------------------------------------------------
2023-10-17 19:03:25,466 EPOCH 5 done: loss 0.0347 - lr: 0.000017
2023-10-17 19:03:32,608 DEV : loss 0.26585522294044495 - f1-score (micro avg) 0.6403
2023-10-17 19:03:32,648 ----------------------------------------------------------------------------------------------------
2023-10-17 19:03:45,303 epoch 6 - iter 180/1809 - loss 0.02901171 - time (sec): 12.65 - samples/sec: 2971.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:03:58,369 epoch 6 - iter 360/1809 - loss 0.02498842 - time (sec): 25.72 - samples/sec: 2950.77 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:04:11,132 epoch 6 - iter 540/1809 - loss 0.02262579 - time (sec): 38.48 - samples/sec: 2953.97 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:04:24,024 epoch 6 - iter 720/1809 - loss 0.02208957 - time (sec): 51.37 - samples/sec: 2960.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:04:37,341 epoch 6 - iter 900/1809 - loss 0.02333084 - time (sec): 64.69 - samples/sec: 2953.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:04:50,111 epoch 6 - iter 1080/1809 - loss 0.02330581 - time (sec): 77.46 - samples/sec: 2952.63 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:05:02,800 epoch 6 - iter 1260/1809 - loss 0.02337206 - time (sec): 90.15 - samples/sec: 2963.00 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:05:15,888 epoch 6 - iter 1440/1809 - loss 0.02308956 - time (sec): 103.24 - samples/sec: 2952.06 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:05:28,581 epoch 6 - iter 1620/1809 - loss 0.02310559 - time (sec): 115.93 - samples/sec: 2947.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:05:41,069 epoch 6 - iter 1800/1809 - loss 0.02341078 - time (sec): 128.42 - samples/sec: 2943.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:05:41,734 ----------------------------------------------------------------------------------------------------
2023-10-17 19:05:41,734 EPOCH 6 done: loss 0.0233 - lr: 0.000013
2023-10-17 19:05:48,183 DEV : loss 0.3086238503456116 - f1-score (micro avg) 0.6503
2023-10-17 19:05:48,225 ----------------------------------------------------------------------------------------------------
2023-10-17 19:06:01,564 epoch 7 - iter 180/1809 - loss 0.01807542 - time (sec): 13.34 - samples/sec: 2810.00 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:06:15,381 epoch 7 - iter 360/1809 - loss 0.01576264 - time (sec): 27.15 - samples/sec: 2773.21 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:06:27,999 epoch 7 - iter 540/1809 - loss 0.01466161 - time (sec): 39.77 - samples/sec: 2799.51 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:06:40,910 epoch 7 - iter 720/1809 - loss 0.01674336 - time (sec): 52.68 - samples/sec: 2854.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:06:53,287 epoch 7 - iter 900/1809 - loss 0.01690733 - time (sec): 65.06 - samples/sec: 2876.06 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:07:06,360 epoch 7 - iter 1080/1809 - loss 0.01654140 - time (sec): 78.13 - samples/sec: 2883.90 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:07:19,309 epoch 7 - iter 1260/1809 - loss 0.01661279 - time (sec): 91.08 - samples/sec: 2898.11 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:07:32,299 epoch 7 - iter 1440/1809 - loss 0.01634721 - time (sec): 104.07 - samples/sec: 2894.84 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:07:44,924 epoch 7 - iter 1620/1809 - loss 0.01679032 - time (sec): 116.70 - samples/sec: 2910.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:07:57,626 epoch 7 - iter 1800/1809 - loss 0.01648901 - time (sec): 129.40 - samples/sec: 2921.97 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:07:58,230 ----------------------------------------------------------------------------------------------------
2023-10-17 19:07:58,230 EPOCH 7 done: loss 0.0165 - lr: 0.000010
2023-10-17 19:08:05,481 DEV : loss 0.34659674763679504 - f1-score (micro avg) 0.632
2023-10-17 19:08:05,522 ----------------------------------------------------------------------------------------------------
2023-10-17 19:08:18,671 epoch 8 - iter 180/1809 - loss 0.00915215 - time (sec): 13.15 - samples/sec: 2787.60 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:08:31,720 epoch 8 - iter 360/1809 - loss 0.00930872 - time (sec): 26.20 - samples/sec: 2811.43 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:08:45,704 epoch 8 - iter 540/1809 - loss 0.01003110 - time (sec): 40.18 - samples/sec: 2780.09 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:08:59,427 epoch 8 - iter 720/1809 - loss 0.01169323 - time (sec): 53.90 - samples/sec: 2763.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:09:13,561 epoch 8 - iter 900/1809 - loss 0.01138645 - time (sec): 68.04 - samples/sec: 2763.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:09:28,039 epoch 8 - iter 1080/1809 - loss 0.01169517 - time (sec): 82.51 - samples/sec: 2733.67 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:09:41,516 epoch 8 - iter 1260/1809 - loss 0.01153351 - time (sec): 95.99 - samples/sec: 2735.49 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:09:55,382 epoch 8 - iter 1440/1809 - loss 0.01223461 - time (sec): 109.86 - samples/sec: 2738.43 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:10:09,151 epoch 8 - iter 1620/1809 - loss 0.01213000 - time (sec): 123.63 - samples/sec: 2743.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:10:23,760 epoch 8 - iter 1800/1809 - loss 0.01167306 - time (sec): 138.24 - samples/sec: 2735.18 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:10:24,463 ----------------------------------------------------------------------------------------------------
2023-10-17 19:10:24,463 EPOCH 8 done: loss 0.0118 - lr: 0.000007
2023-10-17 19:10:30,861 DEV : loss 0.3742707073688507 - f1-score (micro avg) 0.6569
2023-10-17 19:10:30,904 ----------------------------------------------------------------------------------------------------
2023-10-17 19:10:45,043 epoch 9 - iter 180/1809 - loss 0.01079518 - time (sec): 14.14 - samples/sec: 2667.85 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:10:59,146 epoch 9 - iter 360/1809 - loss 0.00876799 - time (sec): 28.24 - samples/sec: 2658.25 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:11:13,369 epoch 9 - iter 540/1809 - loss 0.00753337 - time (sec): 42.46 - samples/sec: 2635.77 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:11:27,223 epoch 9 - iter 720/1809 - loss 0.00737339 - time (sec): 56.32 - samples/sec: 2633.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:11:40,338 epoch 9 - iter 900/1809 - loss 0.00736290 - time (sec): 69.43 - samples/sec: 2684.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:11:52,078 epoch 9 - iter 1080/1809 - loss 0.00776742 - time (sec): 81.17 - samples/sec: 2771.41 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:12:04,124 epoch 9 - iter 1260/1809 - loss 0.00759926 - time (sec): 93.22 - samples/sec: 2820.58 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:12:15,927 epoch 9 - iter 1440/1809 - loss 0.00718286 - time (sec): 105.02 - samples/sec: 2865.46 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:12:27,689 epoch 9 - iter 1620/1809 - loss 0.00764832 - time (sec): 116.78 - samples/sec: 2915.65 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:12:39,260 epoch 9 - iter 1800/1809 - loss 0.00771228 - time (sec): 128.35 - samples/sec: 2948.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:12:39,777 ----------------------------------------------------------------------------------------------------
2023-10-17 19:12:39,777 EPOCH 9 done: loss 0.0077 - lr: 0.000003
2023-10-17 19:12:46,785 DEV : loss 0.3763969838619232 - f1-score (micro avg) 0.657
2023-10-17 19:12:46,835 ----------------------------------------------------------------------------------------------------
2023-10-17 19:13:00,924 epoch 10 - iter 180/1809 - loss 0.00366108 - time (sec): 14.09 - samples/sec: 2664.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:13:14,564 epoch 10 - iter 360/1809 - loss 0.00360582 - time (sec): 27.73 - samples/sec: 2693.32 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:13:27,619 epoch 10 - iter 540/1809 - loss 0.00387035 - time (sec): 40.78 - samples/sec: 2769.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:13:41,002 epoch 10 - iter 720/1809 - loss 0.00394948 - time (sec): 54.17 - samples/sec: 2800.28 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:13:54,695 epoch 10 - iter 900/1809 - loss 0.00373659 - time (sec): 67.86 - samples/sec: 2815.70 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:14:08,138 epoch 10 - iter 1080/1809 - loss 0.00427956 - time (sec): 81.30 - samples/sec: 2815.21 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:14:22,036 epoch 10 - iter 1260/1809 - loss 0.00426728 - time (sec): 95.20 - samples/sec: 2798.15 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:14:35,919 epoch 10 - iter 1440/1809 - loss 0.00454799 - time (sec): 109.08 - samples/sec: 2790.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:14:49,267 epoch 10 - iter 1620/1809 - loss 0.00463397 - time (sec): 122.43 - samples/sec: 2784.15 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:15:03,058 epoch 10 - iter 1800/1809 - loss 0.00481728 - time (sec): 136.22 - samples/sec: 2776.43 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:15:03,681 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:03,682 EPOCH 10 done: loss 0.0048 - lr: 0.000000
2023-10-17 19:15:10,958 DEV : loss 0.3807702958583832 - f1-score (micro avg) 0.6587
2023-10-17 19:15:11,520 ----------------------------------------------------------------------------------------------------
2023-10-17 19:15:11,522 Loading model from best epoch ...
2023-10-17 19:15:13,239 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 19:15:21,331
Results:
- F-score (micro) 0.6784
- F-score (macro) 0.4668
- Accuracy 0.5229
By class:
              precision    recall  f1-score   support

         loc     0.6593    0.8122    0.7278       591
        pers     0.6018    0.7619    0.6724       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.6319    0.7322    0.6784      1027
   macro avg     0.4204    0.5247    0.4668      1027
weighted avg     0.5886    0.7322    0.6526      1027
2023-10-17 19:15:21,332 ----------------------------------------------------------------------------------------------------
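Note (not part of the original log output): the aggregate rows in the final report can be cross-checked from the per-class rows. The sketch below is a minimal illustration, assuming the usual definitions: macro F1 is the unweighted mean of per-class F1, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and recall. The helper names are hypothetical and the inputs are the rounded values printed above, so results agree with the log only to about three decimal places.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
# Values are the rounded figures from the log, not the raw counts.
rows = {
    "loc": (0.6593, 0.8122, 0.7278, 591),
    "pers": (0.6018, 0.7619, 0.6724, 357),
    "org": (0.0000, 0.0000, 0.0000, 79),
}


def macro_f1(table):
    """Unweighted mean of per-class F1 (hypothetical helper)."""
    return sum(f1 for _, _, f1, _ in table.values()) / len(table)


def weighted_f1(table):
    """Support-weighted mean of per-class F1 (hypothetical helper)."""
    total = sum(support for _, _, _, support in table.values())
    return sum(f1 * support for _, _, f1, support in table.values()) / total


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, _, support in rows.values())
macro = macro_f1(rows)
weighted = weighted_f1(rows)
micro = f1_from_pr(0.6319, 0.7322)  # micro avg precision / recall from the log

print(f"support={total_support} macro={macro:.3f} "
      f"weighted={weighted:.3f} micro={micro:.3f}")
```

The zero F1 on the org class (79 of 1027 test entities) is what pulls the macro average well below the micro average in this run.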