2023-10-17 12:38:39,938 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,939 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 12:38:39,939 ----------------------------------------------------------------------------------------------------
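The architecture printout above describes a Flair SequenceTagger whose decoder is a single Linear(768 -> 17) layer with locked dropout 0.5 on top of a fine-tuned ELECTRA encoder (12 layers, hidden size 768); there is no CRF and no RNN. Below is a minimal sketch of the embedding stack, assuming the backbone checkpoint named in the training base path further down; the full tagger is wired up in the sketch after the corpus block.

```python
# Minimal sketch (not the original training script) of the embedding stack above.
# The checkpoint name is inferred from the training base path and is an assumption.
from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings(
    "hmteams/teams-base-historic-multilingual-discriminator",  # assumed backbone
    layers="-1",               # last layer only, matching "layers-1" in the base path
    subtoken_pooling="first",  # matching "poolingfirst" in the base path
    fine_tune=True,
)
```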
2023-10-17 12:38:39,940 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
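The corpus line above refers to Flair's NER_HIPE_2022 dataset loader (NewsEye, French, with document separators). A hedged sketch of loading it and assembling the tagger summarized earlier; the loader's argument names are assumptions inferred from the cache path, and `embeddings` comes from the previous sketch.

```python
# Sketch of corpus loading and tagger assembly. NER_HIPE_2022 is a real Flair
# loader, but the argument names used here (dataset_name, language,
# add_document_separator) are assumptions inferred from the cache path.
from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(dataset_name="newseye", language="fr", add_document_separator=True)
print(corpus)  # expected: 7142 train + 698 dev + 2570 test sentences

# Label dictionary for the "ner" layer; the tagger expands it internally to the
# 17 BIOES tags (O plus S/B/E/I for PER, LOC, ORG, HumanProd) listed at the end
# of this log.
label_dictionary = corpus.make_label_dictionary(label_type="ner")

tagger = SequenceTagger(
    hidden_size=256,              # unused when use_rnn=False, kept for the signature
    embeddings=embeddings,        # from the previous sketch
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,                # "crfFalse" in the base path
    use_rnn=False,                # embeddings feed the Linear(768 -> 17) head directly
    reproject_embeddings=False,
)
```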
2023-10-17 12:38:39,940 Train: 7142 sentences
2023-10-17 12:38:39,940 (train_with_dev=False, train_with_test=False)
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,940 Training Params:
2023-10-17 12:38:39,940 - learning_rate: "3e-05"
2023-10-17 12:38:39,940 - mini_batch_size: "4"
2023-10-17 12:38:39,940 - max_epochs: "10"
2023-10-17 12:38:39,940 - shuffle: "True"
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,940 Plugins:
2023-10-17 12:38:39,940 - TensorboardLogger
2023-10-17 12:38:39,940 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,940 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:38:39,940 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,940 Computation:
2023-10-17 12:38:39,940 - compute on device: cuda:0
2023-10-17 12:38:39,940 - embedding storage: none
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,940 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:39,940 ----------------------------------------------------------------------------------------------------
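The configuration blocks above (training parameters, plugins, selection metric, device, and base path) correspond to a fine-tuning run driven by Flair's ModelTrainer. A hedged sketch follows, reusing `tagger` and `corpus` from the earlier sketches; the logged LinearScheduler with warmup_fraction 0.1 matches the linear warmup that fine_tune applies by default in recent Flair versions, and the TensorboardLogger line indicates a TensorBoard plugin was additionally attached (its exact wiring is not shown here).

```python
# Sketch of a fine_tune call matching the logged configuration; `tagger` and
# `corpus` come from the sketches above. The learning-rate ramp-up visible in the
# "lr:" column of epoch 1 reflects the linear schedule with 10% warmup.
import torch
import flair
from flair.trainers import ModelTrainer

flair.device = torch.device("cuda:0")  # "compute on device: cuda:0"

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    embeddings_storage_mode="none",                    # "embedding storage: none"
    main_evaluation_metric=("micro avg", "f1-score"),  # best-model selection metric
)
```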
2023-10-17 12:38:39,940 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:38:48,805 epoch 1 - iter 178/1786 - loss 2.58419137 - time (sec): 8.86 - samples/sec: 2764.45 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:38:57,446 epoch 1 - iter 356/1786 - loss 1.60835755 - time (sec): 17.50 - samples/sec: 2837.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:39:06,220 epoch 1 - iter 534/1786 - loss 1.19630048 - time (sec): 26.28 - samples/sec: 2857.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:39:14,892 epoch 1 - iter 712/1786 - loss 0.98846743 - time (sec): 34.95 - samples/sec: 2812.48 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:39:23,628 epoch 1 - iter 890/1786 - loss 0.84098822 - time (sec): 43.69 - samples/sec: 2818.02 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:39:32,454 epoch 1 - iter 1068/1786 - loss 0.73229552 - time (sec): 52.51 - samples/sec: 2824.31 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:39:41,095 epoch 1 - iter 1246/1786 - loss 0.65469828 - time (sec): 61.15 - samples/sec: 2822.73 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:39:49,717 epoch 1 - iter 1424/1786 - loss 0.58726625 - time (sec): 69.78 - samples/sec: 2840.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:39:58,326 epoch 1 - iter 1602/1786 - loss 0.54044553 - time (sec): 78.38 - samples/sec: 2838.36 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:40:07,261 epoch 1 - iter 1780/1786 - loss 0.50050960 - time (sec): 87.32 - samples/sec: 2839.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:40:07,548 ----------------------------------------------------------------------------------------------------
2023-10-17 12:40:07,548 EPOCH 1 done: loss 0.4994 - lr: 0.000030
2023-10-17 12:40:10,607 DEV : loss 0.12284992635250092 - f1-score (micro avg) 0.7364
2023-10-17 12:40:10,624 saving best model
2023-10-17 12:40:10,972 ----------------------------------------------------------------------------------------------------
2023-10-17 12:40:19,706 epoch 2 - iter 178/1786 - loss 0.14323297 - time (sec): 8.73 - samples/sec: 2696.00 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:40:28,597 epoch 2 - iter 356/1786 - loss 0.13101019 - time (sec): 17.62 - samples/sec: 2759.03 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:40:37,665 epoch 2 - iter 534/1786 - loss 0.12875450 - time (sec): 26.69 - samples/sec: 2766.30 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:40:46,600 epoch 2 - iter 712/1786 - loss 0.12509880 - time (sec): 35.63 - samples/sec: 2767.43 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:40:55,119 epoch 2 - iter 890/1786 - loss 0.12118468 - time (sec): 44.15 - samples/sec: 2790.56 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:41:03,960 epoch 2 - iter 1068/1786 - loss 0.12100919 - time (sec): 52.99 - samples/sec: 2800.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:41:12,795 epoch 2 - iter 1246/1786 - loss 0.12037138 - time (sec): 61.82 - samples/sec: 2787.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:41:21,556 epoch 2 - iter 1424/1786 - loss 0.11693455 - time (sec): 70.58 - samples/sec: 2796.51 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:41:30,702 epoch 2 - iter 1602/1786 - loss 0.11676725 - time (sec): 79.73 - samples/sec: 2820.70 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:41:39,400 epoch 2 - iter 1780/1786 - loss 0.11614911 - time (sec): 88.43 - samples/sec: 2804.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:41:39,695 ----------------------------------------------------------------------------------------------------
2023-10-17 12:41:39,695 EPOCH 2 done: loss 0.1162 - lr: 0.000027
2023-10-17 12:41:44,377 DEV : loss 0.12338940799236298 - f1-score (micro avg) 0.7839
2023-10-17 12:41:44,393 saving best model
2023-10-17 12:41:44,872 ----------------------------------------------------------------------------------------------------
2023-10-17 12:41:53,888 epoch 3 - iter 178/1786 - loss 0.07449215 - time (sec): 9.01 - samples/sec: 2885.63 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:42:02,741 epoch 3 - iter 356/1786 - loss 0.07691580 - time (sec): 17.87 - samples/sec: 2828.11 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:42:11,936 epoch 3 - iter 534/1786 - loss 0.07682986 - time (sec): 27.06 - samples/sec: 2816.30 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:42:20,884 epoch 3 - iter 712/1786 - loss 0.07692302 - time (sec): 36.01 - samples/sec: 2858.19 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:42:29,900 epoch 3 - iter 890/1786 - loss 0.07803257 - time (sec): 45.03 - samples/sec: 2847.16 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:42:38,522 epoch 3 - iter 1068/1786 - loss 0.08020088 - time (sec): 53.65 - samples/sec: 2835.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:42:47,357 epoch 3 - iter 1246/1786 - loss 0.08256246 - time (sec): 62.48 - samples/sec: 2828.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:42:55,914 epoch 3 - iter 1424/1786 - loss 0.08154790 - time (sec): 71.04 - samples/sec: 2824.51 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:43:04,852 epoch 3 - iter 1602/1786 - loss 0.08168387 - time (sec): 79.98 - samples/sec: 2813.12 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:43:13,481 epoch 3 - iter 1780/1786 - loss 0.08241782 - time (sec): 88.61 - samples/sec: 2798.91 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:43:13,791 ----------------------------------------------------------------------------------------------------
2023-10-17 12:43:13,791 EPOCH 3 done: loss 0.0825 - lr: 0.000023
2023-10-17 12:43:17,926 DEV : loss 0.14865444600582123 - f1-score (micro avg) 0.81
2023-10-17 12:43:17,943 saving best model
2023-10-17 12:43:18,414 ----------------------------------------------------------------------------------------------------
2023-10-17 12:43:27,142 epoch 4 - iter 178/1786 - loss 0.05085513 - time (sec): 8.73 - samples/sec: 2870.18 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:43:35,981 epoch 4 - iter 356/1786 - loss 0.05118122 - time (sec): 17.56 - samples/sec: 2838.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:43:45,038 epoch 4 - iter 534/1786 - loss 0.05465020 - time (sec): 26.62 - samples/sec: 2834.92 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:43:53,763 epoch 4 - iter 712/1786 - loss 0.05592902 - time (sec): 35.35 - samples/sec: 2823.21 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:44:02,634 epoch 4 - iter 890/1786 - loss 0.05561688 - time (sec): 44.22 - samples/sec: 2823.23 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:44:11,294 epoch 4 - iter 1068/1786 - loss 0.05734254 - time (sec): 52.88 - samples/sec: 2847.91 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:44:20,737 epoch 4 - iter 1246/1786 - loss 0.05604538 - time (sec): 62.32 - samples/sec: 2818.67 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:44:29,480 epoch 4 - iter 1424/1786 - loss 0.05505510 - time (sec): 71.06 - samples/sec: 2795.24 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:44:38,262 epoch 4 - iter 1602/1786 - loss 0.05522979 - time (sec): 79.85 - samples/sec: 2789.62 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:44:47,144 epoch 4 - iter 1780/1786 - loss 0.05563942 - time (sec): 88.73 - samples/sec: 2796.45 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:44:47,417 ----------------------------------------------------------------------------------------------------
2023-10-17 12:44:47,418 EPOCH 4 done: loss 0.0557 - lr: 0.000020
2023-10-17 12:44:51,659 DEV : loss 0.1732509285211563 - f1-score (micro avg) 0.7968
2023-10-17 12:44:51,675 ----------------------------------------------------------------------------------------------------
2023-10-17 12:45:00,231 epoch 5 - iter 178/1786 - loss 0.04800649 - time (sec): 8.56 - samples/sec: 2861.03 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:45:08,672 epoch 5 - iter 356/1786 - loss 0.04479618 - time (sec): 17.00 - samples/sec: 2806.48 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:45:17,652 epoch 5 - iter 534/1786 - loss 0.04797973 - time (sec): 25.98 - samples/sec: 2798.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:45:26,563 epoch 5 - iter 712/1786 - loss 0.04688139 - time (sec): 34.89 - samples/sec: 2802.90 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:45:35,499 epoch 5 - iter 890/1786 - loss 0.04767877 - time (sec): 43.82 - samples/sec: 2840.66 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:45:44,398 epoch 5 - iter 1068/1786 - loss 0.04614346 - time (sec): 52.72 - samples/sec: 2830.04 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:45:53,388 epoch 5 - iter 1246/1786 - loss 0.04538086 - time (sec): 61.71 - samples/sec: 2834.72 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:46:02,100 epoch 5 - iter 1424/1786 - loss 0.04455120 - time (sec): 70.42 - samples/sec: 2835.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:46:11,083 epoch 5 - iter 1602/1786 - loss 0.04412066 - time (sec): 79.41 - samples/sec: 2831.98 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:46:19,606 epoch 5 - iter 1780/1786 - loss 0.04400326 - time (sec): 87.93 - samples/sec: 2821.87 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:46:19,874 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:19,874 EPOCH 5 done: loss 0.0440 - lr: 0.000017
2023-10-17 12:46:24,615 DEV : loss 0.17575478553771973 - f1-score (micro avg) 0.8176
2023-10-17 12:46:24,631 saving best model
2023-10-17 12:46:25,082 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:34,054 epoch 6 - iter 178/1786 - loss 0.03021896 - time (sec): 8.97 - samples/sec: 2804.62 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:46:42,641 epoch 6 - iter 356/1786 - loss 0.02818833 - time (sec): 17.55 - samples/sec: 2907.95 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:46:51,341 epoch 6 - iter 534/1786 - loss 0.03091021 - time (sec): 26.25 - samples/sec: 2877.64 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:47:00,933 epoch 6 - iter 712/1786 - loss 0.03157363 - time (sec): 35.85 - samples/sec: 2792.54 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:47:10,770 epoch 6 - iter 890/1786 - loss 0.03226007 - time (sec): 45.68 - samples/sec: 2747.95 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:47:19,864 epoch 6 - iter 1068/1786 - loss 0.03366693 - time (sec): 54.78 - samples/sec: 2773.82 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:47:28,608 epoch 6 - iter 1246/1786 - loss 0.03408422 - time (sec): 63.52 - samples/sec: 2776.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:47:37,057 epoch 6 - iter 1424/1786 - loss 0.03358057 - time (sec): 71.97 - samples/sec: 2780.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:47:45,656 epoch 6 - iter 1602/1786 - loss 0.03267451 - time (sec): 80.57 - samples/sec: 2774.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:47:54,484 epoch 6 - iter 1780/1786 - loss 0.03278461 - time (sec): 89.40 - samples/sec: 2774.25 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:47:54,777 ----------------------------------------------------------------------------------------------------
2023-10-17 12:47:54,777 EPOCH 6 done: loss 0.0328 - lr: 0.000013
2023-10-17 12:47:59,058 DEV : loss 0.18483255803585052 - f1-score (micro avg) 0.8194
2023-10-17 12:47:59,076 saving best model
2023-10-17 12:47:59,571 ----------------------------------------------------------------------------------------------------
2023-10-17 12:48:08,260 epoch 7 - iter 178/1786 - loss 0.01835655 - time (sec): 8.69 - samples/sec: 2684.07 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:48:17,298 epoch 7 - iter 356/1786 - loss 0.02049370 - time (sec): 17.73 - samples/sec: 2810.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:48:26,095 epoch 7 - iter 534/1786 - loss 0.02307035 - time (sec): 26.52 - samples/sec: 2782.84 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:48:35,199 epoch 7 - iter 712/1786 - loss 0.02161123 - time (sec): 35.63 - samples/sec: 2825.58 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:48:43,775 epoch 7 - iter 890/1786 - loss 0.02268236 - time (sec): 44.20 - samples/sec: 2853.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:48:52,265 epoch 7 - iter 1068/1786 - loss 0.02277251 - time (sec): 52.69 - samples/sec: 2819.47 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:49:00,956 epoch 7 - iter 1246/1786 - loss 0.02387286 - time (sec): 61.38 - samples/sec: 2802.17 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:49:09,676 epoch 7 - iter 1424/1786 - loss 0.02364284 - time (sec): 70.10 - samples/sec: 2798.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:49:18,563 epoch 7 - iter 1602/1786 - loss 0.02334329 - time (sec): 78.99 - samples/sec: 2812.50 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:49:28,118 epoch 7 - iter 1780/1786 - loss 0.02349736 - time (sec): 88.55 - samples/sec: 2802.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:49:28,393 ----------------------------------------------------------------------------------------------------
2023-10-17 12:49:28,394 EPOCH 7 done: loss 0.0234 - lr: 0.000010
2023-10-17 12:49:32,541 DEV : loss 0.19320163130760193 - f1-score (micro avg) 0.8236
2023-10-17 12:49:32,559 saving best model
2023-10-17 12:49:33,026 ----------------------------------------------------------------------------------------------------
2023-10-17 12:49:41,645 epoch 8 - iter 178/1786 - loss 0.01560280 - time (sec): 8.62 - samples/sec: 3039.75 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:49:50,439 epoch 8 - iter 356/1786 - loss 0.01462824 - time (sec): 17.41 - samples/sec: 2947.26 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:49:59,265 epoch 8 - iter 534/1786 - loss 0.01544668 - time (sec): 26.24 - samples/sec: 2902.15 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:50:07,965 epoch 8 - iter 712/1786 - loss 0.01648773 - time (sec): 34.94 - samples/sec: 2856.18 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:50:16,852 epoch 8 - iter 890/1786 - loss 0.01642399 - time (sec): 43.82 - samples/sec: 2847.43 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:50:25,719 epoch 8 - iter 1068/1786 - loss 0.01759485 - time (sec): 52.69 - samples/sec: 2849.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:50:34,770 epoch 8 - iter 1246/1786 - loss 0.01639187 - time (sec): 61.74 - samples/sec: 2845.88 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:50:43,717 epoch 8 - iter 1424/1786 - loss 0.01647365 - time (sec): 70.69 - samples/sec: 2863.89 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:50:52,381 epoch 8 - iter 1602/1786 - loss 0.01678055 - time (sec): 79.35 - samples/sec: 2847.54 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:51:00,775 epoch 8 - iter 1780/1786 - loss 0.01689960 - time (sec): 87.75 - samples/sec: 2827.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:51:01,071 ----------------------------------------------------------------------------------------------------
2023-10-17 12:51:01,071 EPOCH 8 done: loss 0.0169 - lr: 0.000007
2023-10-17 12:51:05,268 DEV : loss 0.19824054837226868 - f1-score (micro avg) 0.8207
2023-10-17 12:51:05,284 ----------------------------------------------------------------------------------------------------
2023-10-17 12:51:14,686 epoch 9 - iter 178/1786 - loss 0.01052900 - time (sec): 9.40 - samples/sec: 2720.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:51:23,649 epoch 9 - iter 356/1786 - loss 0.01469236 - time (sec): 18.36 - samples/sec: 2771.46 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:51:32,564 epoch 9 - iter 534/1786 - loss 0.01355989 - time (sec): 27.28 - samples/sec: 2805.94 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:51:40,994 epoch 9 - iter 712/1786 - loss 0.01249401 - time (sec): 35.71 - samples/sec: 2814.57 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:51:49,608 epoch 9 - iter 890/1786 - loss 0.01257532 - time (sec): 44.32 - samples/sec: 2829.89 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:51:58,268 epoch 9 - iter 1068/1786 - loss 0.01305676 - time (sec): 52.98 - samples/sec: 2831.92 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:52:07,045 epoch 9 - iter 1246/1786 - loss 0.01263776 - time (sec): 61.76 - samples/sec: 2824.51 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:52:15,718 epoch 9 - iter 1424/1786 - loss 0.01215768 - time (sec): 70.43 - samples/sec: 2808.67 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:52:24,522 epoch 9 - iter 1602/1786 - loss 0.01172614 - time (sec): 79.24 - samples/sec: 2815.65 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:52:33,450 epoch 9 - iter 1780/1786 - loss 0.01183726 - time (sec): 88.16 - samples/sec: 2814.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:52:33,742 ----------------------------------------------------------------------------------------------------
2023-10-17 12:52:33,742 EPOCH 9 done: loss 0.0118 - lr: 0.000003
2023-10-17 12:52:37,926 DEV : loss 0.20676946640014648 - f1-score (micro avg) 0.8172
2023-10-17 12:52:37,944 ----------------------------------------------------------------------------------------------------
2023-10-17 12:52:46,932 epoch 10 - iter 178/1786 - loss 0.01278157 - time (sec): 8.99 - samples/sec: 2814.97 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:52:55,908 epoch 10 - iter 356/1786 - loss 0.01285362 - time (sec): 17.96 - samples/sec: 2786.82 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:53:04,725 epoch 10 - iter 534/1786 - loss 0.01038423 - time (sec): 26.78 - samples/sec: 2775.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:53:13,650 epoch 10 - iter 712/1786 - loss 0.01035496 - time (sec): 35.70 - samples/sec: 2804.32 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:53:22,361 epoch 10 - iter 890/1786 - loss 0.00954459 - time (sec): 44.42 - samples/sec: 2809.79 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:53:30,976 epoch 10 - iter 1068/1786 - loss 0.00937681 - time (sec): 53.03 - samples/sec: 2799.34 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:53:39,358 epoch 10 - iter 1246/1786 - loss 0.00888181 - time (sec): 61.41 - samples/sec: 2807.33 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:53:47,865 epoch 10 - iter 1424/1786 - loss 0.00852283 - time (sec): 69.92 - samples/sec: 2809.14 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:53:56,420 epoch 10 - iter 1602/1786 - loss 0.00837085 - time (sec): 78.47 - samples/sec: 2821.52 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:54:05,123 epoch 10 - iter 1780/1786 - loss 0.00851493 - time (sec): 87.18 - samples/sec: 2846.18 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:54:05,392 ----------------------------------------------------------------------------------------------------
2023-10-17 12:54:05,392 EPOCH 10 done: loss 0.0085 - lr: 0.000000
2023-10-17 12:54:10,017 DEV : loss 0.21040573716163635 - f1-score (micro avg) 0.83
2023-10-17 12:54:10,033 saving best model
2023-10-17 12:54:10,788 ----------------------------------------------------------------------------------------------------
2023-10-17 12:54:10,789 Loading model from best epoch ...
2023-10-17 12:54:12,118 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
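best-model.pt is the checkpoint with the highest dev micro-F1 (0.83, reached in epoch 10). A short sketch of loading it and tagging a sentence; the example sentence is hypothetical.

```python
# Sketch of loading the saved best checkpoint and running prediction with it.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Jean Dupont est arrivé à Paris .")  # hypothetical example
tagger.predict(sentence)
for entity in sentence.get_spans("ner"):
    print(entity)
```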
2023-10-17 12:54:21,698
Results:
- F-score (micro) 0.7085
- F-score (macro) 0.642
- Accuracy 0.5666
By class:
              precision    recall  f1-score   support

         LOC     0.7307    0.7014    0.7158      1095
         PER     0.7868    0.7915    0.7892      1012
         ORG     0.4543    0.5574    0.5006       357
   HumanProd     0.4286    0.8182    0.5625        33

   micro avg     0.6984    0.7189    0.7085      2497
   macro avg     0.6001    0.7171    0.6420      2497
weighted avg     0.7100    0.7189    0.7127      2497
2023-10-17 12:54:21,698 ----------------------------------------------------------------------------------------------------
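The reported micro F-score can be cross-checked from the micro precision and recall in the table above, since F1 is their harmonic mean:

```python
# Sanity check: micro-average F1 is the harmonic mean of micro precision and recall.
precision, recall = 0.6984, 0.7189
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.7085, matching the "F-score (micro)" line above
```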