2023-10-17 19:43:42,471 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 19:43:42,472 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-17 19:43:42,472 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 Train: 5901 sentences
2023-10-17 19:43:42,472 (train_with_dev=False, train_with_test=False)
2023-10-17 19:43:42,472 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,472 Training Params:
2023-10-17 19:43:42,473 - learning_rate: "3e-05"
2023-10-17 19:43:42,473 - mini_batch_size: "8"
2023-10-17 19:43:42,473 - max_epochs: "10"
2023-10-17 19:43:42,473 - shuffle: "True"
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Plugins:
2023-10-17 19:43:42,473 - TensorboardLogger
2023-10-17 19:43:42,473 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:43:42,473 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Computation:
2023-10-17 19:43:42,473 - compute on device: cuda:0
2023-10-17 19:43:42,473 - embedding storage: none
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Model training base path: "hmbench-hipe2020/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 ----------------------------------------------------------------------------------------------------
2023-10-17 19:43:42,473 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:43:48,751 epoch 1 - iter 73/738 - loss 3.18129576 - time (sec): 6.28 - samples/sec: 2801.48 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:43:53,251 epoch 1 - iter 146/738 - loss 2.19020696 - time (sec): 10.78 - samples/sec: 3050.59 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:43:59,111 epoch 1 - iter 219/738 - loss 1.59513825 - time (sec): 16.64 - samples/sec: 3079.50 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:44:05,045 epoch 1 - iter 292/738 - loss 1.27764894 - time (sec): 22.57 - samples/sec: 3059.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:44:10,208 epoch 1 - iter 365/738 - loss 1.09648871 - time (sec): 27.73 - samples/sec: 3067.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:44:14,907 epoch 1 - iter 438/738 - loss 0.97918922 - time (sec): 32.43 - samples/sec: 3080.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:44:19,805 epoch 1 - iter 511/738 - loss 0.88580507 - time (sec): 37.33 - samples/sec: 3085.82 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:44:24,890 epoch 1 - iter 584/738 - loss 0.80554273 - time (sec): 42.42 - samples/sec: 3097.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:44:30,040 epoch 1 - iter 657/738 - loss 0.73810377 - time (sec): 47.57 - samples/sec: 3101.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:44:35,093 epoch 1 - iter 730/738 - loss 0.67997669 - time (sec): 52.62 - samples/sec: 3129.86 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:44:35,557 ----------------------------------------------------------------------------------------------------
2023-10-17 19:44:35,557 EPOCH 1 done: loss 0.6742 - lr: 0.000030
2023-10-17 19:44:41,721 DEV : loss 0.12336786091327667 - f1-score (micro avg) 0.7553
2023-10-17 19:44:41,759 saving best model
2023-10-17 19:44:42,193 ----------------------------------------------------------------------------------------------------
2023-10-17 19:44:48,004 epoch 2 - iter 73/738 - loss 0.15982108 - time (sec): 5.81 - samples/sec: 2868.17 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:44:53,384 epoch 2 - iter 146/738 - loss 0.15219823 - time (sec): 11.19 - samples/sec: 3108.03 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:44:58,736 epoch 2 - iter 219/738 - loss 0.14354631 - time (sec): 16.54 - samples/sec: 3140.24 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:03,750 epoch 2 - iter 292/738 - loss 0.13970635 - time (sec): 21.55 - samples/sec: 3123.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:45:08,549 epoch 2 - iter 365/738 - loss 0.13505059 - time (sec): 26.35 - samples/sec: 3108.53 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:13,319 epoch 2 - iter 438/738 - loss 0.13319422 - time (sec): 31.12 - samples/sec: 3120.46 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:18,429 epoch 2 - iter 511/738 - loss 0.12880087 - time (sec): 36.23 - samples/sec: 3131.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:45:23,596 epoch 2 - iter 584/738 - loss 0.12583910 - time (sec): 41.40 - samples/sec: 3127.20 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:29,201 epoch 2 - iter 657/738 - loss 0.12490350 - time (sec): 47.01 - samples/sec: 3139.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:34,718 epoch 2 - iter 730/738 - loss 0.12241339 - time (sec): 52.52 - samples/sec: 3133.31 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:45:35,417 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:35,417 EPOCH 2 done: loss 0.1220 - lr: 0.000027
2023-10-17 19:45:46,937 DEV : loss 0.09578868746757507 - f1-score (micro avg) 0.8146
2023-10-17 19:45:46,965 saving best model
2023-10-17 19:45:47,493 ----------------------------------------------------------------------------------------------------
2023-10-17 19:45:53,270 epoch 3 - iter 73/738 - loss 0.07059181 - time (sec): 5.77 - samples/sec: 3049.57 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:45:58,525 epoch 3 - iter 146/738 - loss 0.07590889 - time (sec): 11.03 - samples/sec: 3161.90 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:03,634 epoch 3 - iter 219/738 - loss 0.07091362 - time (sec): 16.14 - samples/sec: 3199.78 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:46:08,697 epoch 3 - iter 292/738 - loss 0.06883331 - time (sec): 21.20 - samples/sec: 3180.84 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:13,735 epoch 3 - iter 365/738 - loss 0.07059965 - time (sec): 26.24 - samples/sec: 3176.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:18,955 epoch 3 - iter 438/738 - loss 0.07205663 - time (sec): 31.46 - samples/sec: 3143.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:46:24,569 epoch 3 - iter 511/738 - loss 0.07270546 - time (sec): 37.07 - samples/sec: 3157.14 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:29,790 epoch 3 - iter 584/738 - loss 0.07234630 - time (sec): 42.29 - samples/sec: 3145.32 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:34,895 epoch 3 - iter 657/738 - loss 0.07186579 - time (sec): 47.40 - samples/sec: 3143.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:46:39,719 epoch 3 - iter 730/738 - loss 0.07250374 - time (sec): 52.22 - samples/sec: 3159.36 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:46:40,195 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:40,195 EPOCH 3 done: loss 0.0722 - lr: 0.000023
2023-10-17 19:46:51,335 DEV : loss 0.10583800822496414 - f1-score (micro avg) 0.833
2023-10-17 19:46:51,364 saving best model
2023-10-17 19:46:51,881 ----------------------------------------------------------------------------------------------------
2023-10-17 19:46:57,112 epoch 4 - iter 73/738 - loss 0.05206689 - time (sec): 5.23 - samples/sec: 3037.15 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:47:02,454 epoch 4 - iter 146/738 - loss 0.04622094 - time (sec): 10.57 - samples/sec: 3169.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:47:07,164 epoch 4 - iter 219/738 - loss 0.05037936 - time (sec): 15.28 - samples/sec: 3196.93 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:47:12,269 epoch 4 - iter 292/738 - loss 0.05234898 - time (sec): 20.39 - samples/sec: 3188.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:47:17,049 epoch 4 - iter 365/738 - loss 0.05275237 - time (sec): 25.17 - samples/sec: 3168.35 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:47:22,024 epoch 4 - iter 438/738 - loss 0.05040956 - time (sec): 30.14 - samples/sec: 3198.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:47:26,845 epoch 4 - iter 511/738 - loss 0.04932344 - time (sec): 34.96 - samples/sec: 3209.89 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:47:32,525 epoch 4 - iter 584/738 - loss 0.04781680 - time (sec): 40.64 - samples/sec: 3196.99 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:47:37,836 epoch 4 - iter 657/738 - loss 0.04819721 - time (sec): 45.95 - samples/sec: 3181.08 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:47:43,898 epoch 4 - iter 730/738 - loss 0.04886480 - time (sec): 52.02 - samples/sec: 3167.06 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:47:44,407 ----------------------------------------------------------------------------------------------------
2023-10-17 19:47:44,407 EPOCH 4 done: loss 0.0485 - lr: 0.000020
2023-10-17 19:47:55,491 DEV : loss 0.122743621468544 - f1-score (micro avg) 0.8485
2023-10-17 19:47:55,519 saving best model
2023-10-17 19:47:56,054 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:01,493 epoch 5 - iter 73/738 - loss 0.02654138 - time (sec): 5.44 - samples/sec: 3265.24 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:48:06,555 epoch 5 - iter 146/738 - loss 0.03073291 - time (sec): 10.50 - samples/sec: 3185.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:48:11,730 epoch 5 - iter 219/738 - loss 0.03108548 - time (sec): 15.67 - samples/sec: 3161.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:48:16,908 epoch 5 - iter 292/738 - loss 0.03530107 - time (sec): 20.85 - samples/sec: 3182.99 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:48:21,927 epoch 5 - iter 365/738 - loss 0.03345830 - time (sec): 25.87 - samples/sec: 3204.30 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:27,157 epoch 5 - iter 438/738 - loss 0.03329768 - time (sec): 31.10 - samples/sec: 3199.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:32,613 epoch 5 - iter 511/738 - loss 0.03272234 - time (sec): 36.56 - samples/sec: 3155.39 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:48:37,419 epoch 5 - iter 584/738 - loss 0.03290570 - time (sec): 41.36 - samples/sec: 3159.03 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:48:42,520 epoch 5 - iter 657/738 - loss 0.03309942 - time (sec): 46.46 - samples/sec: 3171.10 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:48:47,829 epoch 5 - iter 730/738 - loss 0.03334347 - time (sec): 51.77 - samples/sec: 3170.29 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:48:48,658 ----------------------------------------------------------------------------------------------------
2023-10-17 19:48:48,658 EPOCH 5 done: loss 0.0335 - lr: 0.000017
2023-10-17 19:48:59,704 DEV : loss 0.15168806910514832 - f1-score (micro avg) 0.8543
2023-10-17 19:48:59,733 saving best model
2023-10-17 19:49:00,281 ----------------------------------------------------------------------------------------------------
2023-10-17 19:49:05,290 epoch 6 - iter 73/738 - loss 0.03010752 - time (sec): 5.01 - samples/sec: 3140.40 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:49:10,877 epoch 6 - iter 146/738 - loss 0.02499735 - time (sec): 10.59 - samples/sec: 3101.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:49:16,327 epoch 6 - iter 219/738 - loss 0.02175671 - time (sec): 16.05 - samples/sec: 3082.57 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:49:21,571 epoch 6 - iter 292/738 - loss 0.02484654 - time (sec): 21.29 - samples/sec: 3068.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:49:26,919 epoch 6 - iter 365/738 - loss 0.02430030 - time (sec): 26.64 - samples/sec: 3058.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:49:32,199 epoch 6 - iter 438/738 - loss 0.02418786 - time (sec): 31.92 - samples/sec: 3041.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:49:37,745 epoch 6 - iter 511/738 - loss 0.02390168 - time (sec): 37.46 - samples/sec: 3036.04 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:49:42,959 epoch 6 - iter 584/738 - loss 0.02289941 - time (sec): 42.68 - samples/sec: 3066.21 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:49:48,171 epoch 6 - iter 657/738 - loss 0.02342068 - time (sec): 47.89 - samples/sec: 3066.73 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:49:53,684 epoch 6 - iter 730/738 - loss 0.02460435 - time (sec): 53.40 - samples/sec: 3080.94 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:49:54,408 ----------------------------------------------------------------------------------------------------
2023-10-17 19:49:54,408 EPOCH 6 done: loss 0.0246 - lr: 0.000013
2023-10-17 19:50:05,542 DEV : loss 0.15729978680610657 - f1-score (micro avg) 0.8547
2023-10-17 19:50:05,573 saving best model
2023-10-17 19:50:06,189 ----------------------------------------------------------------------------------------------------
2023-10-17 19:50:11,582 epoch 7 - iter 73/738 - loss 0.01280464 - time (sec): 5.39 - samples/sec: 3121.20 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:50:16,753 epoch 7 - iter 146/738 - loss 0.01448297 - time (sec): 10.56 - samples/sec: 3123.91 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:50:22,379 epoch 7 - iter 219/738 - loss 0.01626473 - time (sec): 16.18 - samples/sec: 3133.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:50:27,912 epoch 7 - iter 292/738 - loss 0.01707587 - time (sec): 21.72 - samples/sec: 3140.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:50:33,061 epoch 7 - iter 365/738 - loss 0.01952204 - time (sec): 26.87 - samples/sec: 3121.60 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:50:38,280 epoch 7 - iter 438/738 - loss 0.01973386 - time (sec): 32.09 - samples/sec: 3134.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:50:43,038 epoch 7 - iter 511/738 - loss 0.01921562 - time (sec): 36.84 - samples/sec: 3158.94 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:50:48,334 epoch 7 - iter 584/738 - loss 0.01911006 - time (sec): 42.14 - samples/sec: 3166.35 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:50:53,630 epoch 7 - iter 657/738 - loss 0.01943647 - time (sec): 47.44 - samples/sec: 3163.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:50:58,375 epoch 7 - iter 730/738 - loss 0.01835721 - time (sec): 52.18 - samples/sec: 3160.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:50:58,850 ----------------------------------------------------------------------------------------------------
2023-10-17 19:50:58,850 EPOCH 7 done: loss 0.0183 - lr: 0.000010
2023-10-17 19:51:09,961 DEV : loss 0.18084457516670227 - f1-score (micro avg) 0.8646
2023-10-17 19:51:09,994 saving best model
2023-10-17 19:51:10,719 ----------------------------------------------------------------------------------------------------
2023-10-17 19:51:15,879 epoch 8 - iter 73/738 - loss 0.00764996 - time (sec): 5.16 - samples/sec: 3152.05 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:51:22,133 epoch 8 - iter 146/738 - loss 0.01104963 - time (sec): 11.41 - samples/sec: 2973.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:51:27,099 epoch 8 - iter 219/738 - loss 0.01198235 - time (sec): 16.38 - samples/sec: 3023.19 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:51:32,039 epoch 8 - iter 292/738 - loss 0.01113988 - time (sec): 21.32 - samples/sec: 3063.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:51:37,404 epoch 8 - iter 365/738 - loss 0.01216710 - time (sec): 26.68 - samples/sec: 3071.41 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:51:42,200 epoch 8 - iter 438/738 - loss 0.01124920 - time (sec): 31.48 - samples/sec: 3103.15 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:51:48,209 epoch 8 - iter 511/738 - loss 0.01270444 - time (sec): 37.49 - samples/sec: 3113.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:51:53,103 epoch 8 - iter 584/738 - loss 0.01213295 - time (sec): 42.38 - samples/sec: 3133.78 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:51:58,181 epoch 8 - iter 657/738 - loss 0.01173209 - time (sec): 47.46 - samples/sec: 3143.35 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:52:03,086 epoch 8 - iter 730/738 - loss 0.01264595 - time (sec): 52.37 - samples/sec: 3142.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:52:03,702 ----------------------------------------------------------------------------------------------------
2023-10-17 19:52:03,702 EPOCH 8 done: loss 0.0126 - lr: 0.000007
2023-10-17 19:52:15,277 DEV : loss 0.19484567642211914 - f1-score (micro avg) 0.8602
2023-10-17 19:52:15,312 ----------------------------------------------------------------------------------------------------
2023-10-17 19:52:20,982 epoch 9 - iter 73/738 - loss 0.00488093 - time (sec): 5.67 - samples/sec: 3162.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:52:26,399 epoch 9 - iter 146/738 - loss 0.00805781 - time (sec): 11.09 - samples/sec: 3264.48 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:52:32,156 epoch 9 - iter 219/738 - loss 0.00865579 - time (sec): 16.84 - samples/sec: 3235.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:52:37,449 epoch 9 - iter 292/738 - loss 0.00824878 - time (sec): 22.14 - samples/sec: 3143.10 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:52:42,684 epoch 9 - iter 365/738 - loss 0.00755746 - time (sec): 27.37 - samples/sec: 3121.26 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:52:47,474 epoch 9 - iter 438/738 - loss 0.00745069 - time (sec): 32.16 - samples/sec: 3162.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:52:52,032 epoch 9 - iter 511/738 - loss 0.00907204 - time (sec): 36.72 - samples/sec: 3177.35 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:52:56,789 epoch 9 - iter 584/738 - loss 0.00911722 - time (sec): 41.48 - samples/sec: 3186.87 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:53:02,404 epoch 9 - iter 657/738 - loss 0.00897147 - time (sec): 47.09 - samples/sec: 3187.92 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:53:06,847 epoch 9 - iter 730/738 - loss 0.00954450 - time (sec): 51.53 - samples/sec: 3191.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:53:07,402 ----------------------------------------------------------------------------------------------------
2023-10-17 19:53:07,402 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-17 19:53:18,725 DEV : loss 0.1866709142923355 - f1-score (micro avg) 0.8599
2023-10-17 19:53:18,756 ----------------------------------------------------------------------------------------------------
2023-10-17 19:53:24,448 epoch 10 - iter 73/738 - loss 0.00755432 - time (sec): 5.69 - samples/sec: 3440.94 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:53:29,523 epoch 10 - iter 146/738 - loss 0.00658582 - time (sec): 10.77 - samples/sec: 3419.43 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:53:34,251 epoch 10 - iter 219/738 - loss 0.00573246 - time (sec): 15.49 - samples/sec: 3424.02 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:53:38,964 epoch 10 - iter 292/738 - loss 0.00503723 - time (sec): 20.21 - samples/sec: 3347.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:53:43,952 epoch 10 - iter 365/738 - loss 0.00546895 - time (sec): 25.19 - samples/sec: 3299.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:53:48,719 epoch 10 - iter 438/738 - loss 0.00520885 - time (sec): 29.96 - samples/sec: 3314.36 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:53:54,231 epoch 10 - iter 511/738 - loss 0.00544848 - time (sec): 35.47 - samples/sec: 3288.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:53:58,768 epoch 10 - iter 584/738 - loss 0.00613301 - time (sec): 40.01 - samples/sec: 3293.93 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:54:03,447 epoch 10 - iter 657/738 - loss 0.00650287 - time (sec): 44.69 - samples/sec: 3300.55 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:54:08,736 epoch 10 - iter 730/738 - loss 0.00664178 - time (sec): 49.98 - samples/sec: 3299.34 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:54:09,185 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:09,185 EPOCH 10 done: loss 0.0067 - lr: 0.000000
2023-10-17 19:54:20,252 DEV : loss 0.1855737715959549 - f1-score (micro avg) 0.8652
2023-10-17 19:54:20,280 saving best model
2023-10-17 19:54:21,202 ----------------------------------------------------------------------------------------------------
2023-10-17 19:54:21,204 Loading model from best epoch ...
2023-10-17 19:54:23,111 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-17 19:54:29,618
Results:
- F-score (micro) 0.8059
- F-score (macro) 0.7164
- Accuracy 0.6929
By class:
             precision    recall  f1-score   support

         loc    0.8570    0.8800    0.8683       858
        pers    0.7792    0.8082    0.7934       537
         org    0.5467    0.6212    0.5816       132
        prod    0.7679    0.7049    0.7350        61
        time    0.5645    0.6481    0.6034        54

   micro avg    0.7907    0.8216    0.8059      1642
   macro avg    0.7030    0.7325    0.7164      1642
weighted avg    0.7937    0.8216    0.8071      1642
2023-10-17 19:54:29,618 ----------------------------------------------------------------------------------------------------
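
The averaged rows in the final report can be cross-checked from the per-class rows. The sketch below is not part of the training run; it recomputes the micro, macro, and weighted F-scores from the four-decimal figures printed above. The names `per_class` and `f1` are illustrative, and small deviations in the last digit are expected because the inputs are already rounded.

```python
# Per-class (precision, recall, f1, support), copied from the report above.
per_class = {
    "loc":  (0.8570, 0.8800, 0.8683, 858),
    "pers": (0.7792, 0.8082, 0.7934, 537),
    "org":  (0.5467, 0.6212, 0.5816, 132),
    "prod": (0.7679, 0.7049, 0.7350, 61),
    "time": (0.5645, 0.6481, 0.6034, 54),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Total number of gold entities across all classes.
total_support = sum(s for _, _, _, s in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by class support.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.7907, 0.8216)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The micro average is the metric this run used for model selection, so it is dominated by the large `loc` and `pers` classes; the lower macro average reflects the weaker `org` and `time` scores.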