2023-10-15 00:34:00,672 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,673 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
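The module shapes printed in the summary above fully determine the model's parameter count. A quick sanity check in plain Python (numbers taken from the printed shapes, not read from the checkpoint):

```python
# Parameter count implied by the printed shapes: BERT-base encoder + 13-tag linear head.
H, FF, V, P, T, L, TAGS = 768, 3072, 32001, 512, 2, 12, 13

embeddings = V * H + P * H + T * H + 2 * H  # word/position/type embeddings + LayerNorm
per_layer = (
    3 * (H * H + H)          # query/key/value projections
    + (H * H + H) + 2 * H    # attention output dense + LayerNorm
    + (H * FF + FF)          # intermediate dense
    + (FF * H + H) + 2 * H   # output dense + LayerNorm
)
pooler = H * H + H
head = H * TAGS + TAGS       # the (linear): Linear(768 -> 13) tagging head

total = embeddings + L * per_layer + pooler + head
print(f"{total:,}")  # 110,628,109 -> the usual ~110.6M of a BERT-base model
```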
2023-10-15 00:34:00,673 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,673 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Train: 14465 sentences
2023-10-15 00:34:00,674 (train_with_dev=False, train_with_test=False)
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Training Params:
2023-10-15 00:34:00,674 - learning_rate: "3e-05"
2023-10-15 00:34:00,674 - mini_batch_size: "8"
2023-10-15 00:34:00,674 - max_epochs: "10"
2023-10-15 00:34:00,674 - shuffle: "True"
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Plugins:
2023-10-15 00:34:00,674 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 00:34:00,674 - metric: "('micro avg', 'f1-score')"
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Computation:
2023-10-15 00:34:00,674 - compute on device: cuda:0
2023-10-15 00:34:00,674 - embedding storage: none
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
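The lr column in the iteration lines below follows the LinearScheduler plugin: with warmup_fraction 0.1 over 10 epochs of 1809 iterations each (14465 sentences at mini-batch size 8), the rate ramps from 0 to 3e-05 across epoch 1 and then decays linearly to 0. A minimal re-implementation of that schedule (a sketch for reading the log, not Flair's actual code):

```python
def linear_schedule_lr(step, total_steps, warmup_fraction=0.1, peak_lr=3e-5):
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1809 * 10                            # ceil(14465 / 8) = 1809 iters per epoch
print(linear_schedule_lr(180, total))        # ~3e-06, matching the first logged lr
print(linear_schedule_lr(1809, total))       # peak 3e-05 at the end of epoch 1
print(linear_schedule_lr(total, total))      # 0.0 at the end of training
```

This explains why the logged lr rises through epoch 1 and then shrinks toward 0.000000 by epoch 10.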
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:12,020 epoch 1 - iter 180/1809 - loss 1.69578880 - time (sec): 11.35 - samples/sec: 3329.84 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:34:23,157 epoch 1 - iter 360/1809 - loss 0.96774144 - time (sec): 22.48 - samples/sec: 3335.91 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:34:34,271 epoch 1 - iter 540/1809 - loss 0.70357347 - time (sec): 33.60 - samples/sec: 3365.18 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:34:45,056 epoch 1 - iter 720/1809 - loss 0.56603696 - time (sec): 44.38 - samples/sec: 3386.28 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:34:56,166 epoch 1 - iter 900/1809 - loss 0.47915304 - time (sec): 55.49 - samples/sec: 3400.37 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:35:07,591 epoch 1 - iter 1080/1809 - loss 0.41979242 - time (sec): 66.92 - samples/sec: 3381.28 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:35:18,537 epoch 1 - iter 1260/1809 - loss 0.37659632 - time (sec): 77.86 - samples/sec: 3380.44 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:35:29,412 epoch 1 - iter 1440/1809 - loss 0.34144756 - time (sec): 88.74 - samples/sec: 3392.55 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:35:40,488 epoch 1 - iter 1620/1809 - loss 0.31444292 - time (sec): 99.81 - samples/sec: 3405.87 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:35:51,789 epoch 1 - iter 1800/1809 - loss 0.29260619 - time (sec): 111.11 - samples/sec: 3405.01 - lr: 0.000030 - momentum: 0.000000
2023-10-15 00:35:52,293 ----------------------------------------------------------------------------------------------------
2023-10-15 00:35:52,293 EPOCH 1 done: loss 0.2920 - lr: 0.000030
2023-10-15 00:35:56,993 DEV : loss 0.12198404222726822 - f1-score (micro avg) 0.5722
2023-10-15 00:35:57,022 saving best model
2023-10-15 00:35:57,380 ----------------------------------------------------------------------------------------------------
2023-10-15 00:36:08,464 epoch 2 - iter 180/1809 - loss 0.09620289 - time (sec): 11.08 - samples/sec: 3499.09 - lr: 0.000030 - momentum: 0.000000
2023-10-15 00:36:19,654 epoch 2 - iter 360/1809 - loss 0.08888247 - time (sec): 22.27 - samples/sec: 3425.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 00:36:30,545 epoch 2 - iter 540/1809 - loss 0.08750531 - time (sec): 33.16 - samples/sec: 3430.82 - lr: 0.000029 - momentum: 0.000000
2023-10-15 00:36:41,703 epoch 2 - iter 720/1809 - loss 0.08597305 - time (sec): 44.32 - samples/sec: 3446.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 00:36:52,772 epoch 2 - iter 900/1809 - loss 0.08644518 - time (sec): 55.39 - samples/sec: 3439.68 - lr: 0.000028 - momentum: 0.000000
2023-10-15 00:37:04,106 epoch 2 - iter 1080/1809 - loss 0.08478312 - time (sec): 66.73 - samples/sec: 3445.62 - lr: 0.000028 - momentum: 0.000000
2023-10-15 00:37:15,095 epoch 2 - iter 1260/1809 - loss 0.08491575 - time (sec): 77.71 - samples/sec: 3434.67 - lr: 0.000028 - momentum: 0.000000
2023-10-15 00:37:26,060 epoch 2 - iter 1440/1809 - loss 0.08292218 - time (sec): 88.68 - samples/sec: 3420.93 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:37:37,259 epoch 2 - iter 1620/1809 - loss 0.08203160 - time (sec): 99.88 - samples/sec: 3418.03 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:37:48,250 epoch 2 - iter 1800/1809 - loss 0.08159459 - time (sec): 110.87 - samples/sec: 3412.35 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:37:48,794 ----------------------------------------------------------------------------------------------------
2023-10-15 00:37:48,794 EPOCH 2 done: loss 0.0814 - lr: 0.000027
2023-10-15 00:37:55,062 DEV : loss 0.11705530434846878 - f1-score (micro avg) 0.6312
2023-10-15 00:37:55,091 saving best model
2023-10-15 00:37:55,626 ----------------------------------------------------------------------------------------------------
2023-10-15 00:38:06,588 epoch 3 - iter 180/1809 - loss 0.05232030 - time (sec): 10.96 - samples/sec: 3449.48 - lr: 0.000026 - momentum: 0.000000
2023-10-15 00:38:18,020 epoch 3 - iter 360/1809 - loss 0.05666857 - time (sec): 22.39 - samples/sec: 3430.40 - lr: 0.000026 - momentum: 0.000000
2023-10-15 00:38:28,583 epoch 3 - iter 540/1809 - loss 0.05594377 - time (sec): 32.95 - samples/sec: 3441.67 - lr: 0.000026 - momentum: 0.000000
2023-10-15 00:38:39,674 epoch 3 - iter 720/1809 - loss 0.05871684 - time (sec): 44.05 - samples/sec: 3425.15 - lr: 0.000025 - momentum: 0.000000
2023-10-15 00:38:51,076 epoch 3 - iter 900/1809 - loss 0.05950701 - time (sec): 55.45 - samples/sec: 3374.92 - lr: 0.000025 - momentum: 0.000000
2023-10-15 00:39:02,741 epoch 3 - iter 1080/1809 - loss 0.05920344 - time (sec): 67.11 - samples/sec: 3358.55 - lr: 0.000025 - momentum: 0.000000
2023-10-15 00:39:14,716 epoch 3 - iter 1260/1809 - loss 0.05784277 - time (sec): 79.09 - samples/sec: 3328.74 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:39:26,435 epoch 3 - iter 1440/1809 - loss 0.05840880 - time (sec): 90.81 - samples/sec: 3318.84 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:39:37,770 epoch 3 - iter 1620/1809 - loss 0.05800859 - time (sec): 102.14 - samples/sec: 3315.52 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:39:49,682 epoch 3 - iter 1800/1809 - loss 0.05721469 - time (sec): 114.05 - samples/sec: 3318.60 - lr: 0.000023 - momentum: 0.000000
2023-10-15 00:39:50,192 ----------------------------------------------------------------------------------------------------
2023-10-15 00:39:50,192 EPOCH 3 done: loss 0.0574 - lr: 0.000023
2023-10-15 00:39:56,519 DEV : loss 0.15803837776184082 - f1-score (micro avg) 0.6323
2023-10-15 00:39:56,553 saving best model
2023-10-15 00:39:57,023 ----------------------------------------------------------------------------------------------------
2023-10-15 00:40:07,803 epoch 4 - iter 180/1809 - loss 0.03675662 - time (sec): 10.78 - samples/sec: 3513.50 - lr: 0.000023 - momentum: 0.000000
2023-10-15 00:40:18,830 epoch 4 - iter 360/1809 - loss 0.03782429 - time (sec): 21.80 - samples/sec: 3425.01 - lr: 0.000023 - momentum: 0.000000
2023-10-15 00:40:30,431 epoch 4 - iter 540/1809 - loss 0.04025441 - time (sec): 33.40 - samples/sec: 3433.46 - lr: 0.000022 - momentum: 0.000000
2023-10-15 00:40:41,758 epoch 4 - iter 720/1809 - loss 0.03972038 - time (sec): 44.73 - samples/sec: 3398.75 - lr: 0.000022 - momentum: 0.000000
2023-10-15 00:40:52,956 epoch 4 - iter 900/1809 - loss 0.04064138 - time (sec): 55.93 - samples/sec: 3390.54 - lr: 0.000022 - momentum: 0.000000
2023-10-15 00:41:04,080 epoch 4 - iter 1080/1809 - loss 0.04035289 - time (sec): 67.05 - samples/sec: 3388.44 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:41:15,319 epoch 4 - iter 1260/1809 - loss 0.04089726 - time (sec): 78.29 - samples/sec: 3388.14 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:41:26,299 epoch 4 - iter 1440/1809 - loss 0.04115836 - time (sec): 89.27 - samples/sec: 3390.11 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:41:37,035 epoch 4 - iter 1620/1809 - loss 0.04108914 - time (sec): 100.01 - samples/sec: 3402.00 - lr: 0.000020 - momentum: 0.000000
2023-10-15 00:41:47,994 epoch 4 - iter 1800/1809 - loss 0.04098637 - time (sec): 110.97 - samples/sec: 3408.61 - lr: 0.000020 - momentum: 0.000000
2023-10-15 00:41:48,505 ----------------------------------------------------------------------------------------------------
2023-10-15 00:41:48,506 EPOCH 4 done: loss 0.0412 - lr: 0.000020
2023-10-15 00:41:55,067 DEV : loss 0.26651689410209656 - f1-score (micro avg) 0.6358
2023-10-15 00:41:55,097 saving best model
2023-10-15 00:41:55,617 ----------------------------------------------------------------------------------------------------
2023-10-15 00:42:06,584 epoch 5 - iter 180/1809 - loss 0.03222780 - time (sec): 10.96 - samples/sec: 3384.38 - lr: 0.000020 - momentum: 0.000000
2023-10-15 00:42:17,751 epoch 5 - iter 360/1809 - loss 0.02816632 - time (sec): 22.13 - samples/sec: 3432.68 - lr: 0.000019 - momentum: 0.000000
2023-10-15 00:42:28,823 epoch 5 - iter 540/1809 - loss 0.02611782 - time (sec): 33.20 - samples/sec: 3445.57 - lr: 0.000019 - momentum: 0.000000
2023-10-15 00:42:39,839 epoch 5 - iter 720/1809 - loss 0.02595299 - time (sec): 44.22 - samples/sec: 3431.75 - lr: 0.000019 - momentum: 0.000000
2023-10-15 00:42:51,400 epoch 5 - iter 900/1809 - loss 0.02604561 - time (sec): 55.78 - samples/sec: 3405.01 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:43:02,924 epoch 5 - iter 1080/1809 - loss 0.02815177 - time (sec): 67.30 - samples/sec: 3393.94 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:43:14,004 epoch 5 - iter 1260/1809 - loss 0.02856667 - time (sec): 78.38 - samples/sec: 3416.12 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:43:25,118 epoch 5 - iter 1440/1809 - loss 0.02810936 - time (sec): 89.50 - samples/sec: 3410.79 - lr: 0.000017 - momentum: 0.000000
2023-10-15 00:43:36,034 epoch 5 - iter 1620/1809 - loss 0.02857657 - time (sec): 100.41 - samples/sec: 3407.81 - lr: 0.000017 - momentum: 0.000000
2023-10-15 00:43:47,204 epoch 5 - iter 1800/1809 - loss 0.02823526 - time (sec): 111.58 - samples/sec: 3387.43 - lr: 0.000017 - momentum: 0.000000
2023-10-15 00:43:47,831 ----------------------------------------------------------------------------------------------------
2023-10-15 00:43:47,831 EPOCH 5 done: loss 0.0281 - lr: 0.000017
2023-10-15 00:43:54,965 DEV : loss 0.29827266931533813 - f1-score (micro avg) 0.6512
2023-10-15 00:43:55,016 saving best model
2023-10-15 00:43:55,433 ----------------------------------------------------------------------------------------------------
2023-10-15 00:44:06,476 epoch 6 - iter 180/1809 - loss 0.02021871 - time (sec): 11.04 - samples/sec: 3386.99 - lr: 0.000016 - momentum: 0.000000
2023-10-15 00:44:17,804 epoch 6 - iter 360/1809 - loss 0.02162851 - time (sec): 22.37 - samples/sec: 3390.81 - lr: 0.000016 - momentum: 0.000000
2023-10-15 00:44:29,501 epoch 6 - iter 540/1809 - loss 0.02039145 - time (sec): 34.07 - samples/sec: 3338.77 - lr: 0.000016 - momentum: 0.000000
2023-10-15 00:44:40,865 epoch 6 - iter 720/1809 - loss 0.02010225 - time (sec): 45.43 - samples/sec: 3354.54 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:44:51,991 epoch 6 - iter 900/1809 - loss 0.01961907 - time (sec): 56.56 - samples/sec: 3351.96 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:45:03,033 epoch 6 - iter 1080/1809 - loss 0.02020129 - time (sec): 67.60 - samples/sec: 3361.18 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:45:13,853 epoch 6 - iter 1260/1809 - loss 0.02013371 - time (sec): 78.42 - samples/sec: 3371.39 - lr: 0.000014 - momentum: 0.000000
2023-10-15 00:45:24,859 epoch 6 - iter 1440/1809 - loss 0.01963702 - time (sec): 89.42 - samples/sec: 3378.59 - lr: 0.000014 - momentum: 0.000000
2023-10-15 00:45:35,850 epoch 6 - iter 1620/1809 - loss 0.01979049 - time (sec): 100.41 - samples/sec: 3391.33 - lr: 0.000014 - momentum: 0.000000
2023-10-15 00:45:46,868 epoch 6 - iter 1800/1809 - loss 0.02010880 - time (sec): 111.43 - samples/sec: 3393.44 - lr: 0.000013 - momentum: 0.000000
2023-10-15 00:45:47,411 ----------------------------------------------------------------------------------------------------
2023-10-15 00:45:47,412 EPOCH 6 done: loss 0.0201 - lr: 0.000013
2023-10-15 00:45:54,097 DEV : loss 0.326910138130188 - f1-score (micro avg) 0.6521
2023-10-15 00:45:54,136 saving best model
2023-10-15 00:45:54,554 ----------------------------------------------------------------------------------------------------
2023-10-15 00:46:05,343 epoch 7 - iter 180/1809 - loss 0.01170096 - time (sec): 10.79 - samples/sec: 3427.57 - lr: 0.000013 - momentum: 0.000000
2023-10-15 00:46:16,264 epoch 7 - iter 360/1809 - loss 0.01374452 - time (sec): 21.71 - samples/sec: 3385.53 - lr: 0.000013 - momentum: 0.000000
2023-10-15 00:46:27,180 epoch 7 - iter 540/1809 - loss 0.01405741 - time (sec): 32.62 - samples/sec: 3422.57 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:46:38,027 epoch 7 - iter 720/1809 - loss 0.01354319 - time (sec): 43.47 - samples/sec: 3423.22 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:46:49,010 epoch 7 - iter 900/1809 - loss 0.01402736 - time (sec): 54.45 - samples/sec: 3432.65 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:46:59,959 epoch 7 - iter 1080/1809 - loss 0.01405428 - time (sec): 65.40 - samples/sec: 3441.56 - lr: 0.000011 - momentum: 0.000000
2023-10-15 00:47:11,928 epoch 7 - iter 1260/1809 - loss 0.01369459 - time (sec): 77.37 - samples/sec: 3413.33 - lr: 0.000011 - momentum: 0.000000
2023-10-15 00:47:23,332 epoch 7 - iter 1440/1809 - loss 0.01300601 - time (sec): 88.78 - samples/sec: 3393.45 - lr: 0.000011 - momentum: 0.000000
2023-10-15 00:47:34,578 epoch 7 - iter 1620/1809 - loss 0.01324269 - time (sec): 100.02 - samples/sec: 3412.43 - lr: 0.000010 - momentum: 0.000000
2023-10-15 00:47:45,549 epoch 7 - iter 1800/1809 - loss 0.01359864 - time (sec): 110.99 - samples/sec: 3407.83 - lr: 0.000010 - momentum: 0.000000
2023-10-15 00:47:46,058 ----------------------------------------------------------------------------------------------------
2023-10-15 00:47:46,058 EPOCH 7 done: loss 0.0136 - lr: 0.000010
2023-10-15 00:47:51,722 DEV : loss 0.35920804738998413 - f1-score (micro avg) 0.6464
2023-10-15 00:47:51,760 ----------------------------------------------------------------------------------------------------
2023-10-15 00:48:04,052 epoch 8 - iter 180/1809 - loss 0.00809107 - time (sec): 12.29 - samples/sec: 3016.59 - lr: 0.000010 - momentum: 0.000000
2023-10-15 00:48:15,154 epoch 8 - iter 360/1809 - loss 0.00997925 - time (sec): 23.39 - samples/sec: 3223.35 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:48:26,207 epoch 8 - iter 540/1809 - loss 0.00950308 - time (sec): 34.44 - samples/sec: 3264.79 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:48:37,453 epoch 8 - iter 720/1809 - loss 0.01017881 - time (sec): 45.69 - samples/sec: 3318.90 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:48:48,337 epoch 8 - iter 900/1809 - loss 0.01019771 - time (sec): 56.57 - samples/sec: 3331.55 - lr: 0.000008 - momentum: 0.000000
2023-10-15 00:48:59,241 epoch 8 - iter 1080/1809 - loss 0.00985664 - time (sec): 67.48 - samples/sec: 3352.87 - lr: 0.000008 - momentum: 0.000000
2023-10-15 00:49:10,421 epoch 8 - iter 1260/1809 - loss 0.01007201 - time (sec): 78.66 - samples/sec: 3348.79 - lr: 0.000008 - momentum: 0.000000
2023-10-15 00:49:21,548 epoch 8 - iter 1440/1809 - loss 0.01003360 - time (sec): 89.79 - samples/sec: 3367.68 - lr: 0.000007 - momentum: 0.000000
2023-10-15 00:49:32,423 epoch 8 - iter 1620/1809 - loss 0.00994408 - time (sec): 100.66 - samples/sec: 3376.73 - lr: 0.000007 - momentum: 0.000000
2023-10-15 00:49:43,635 epoch 8 - iter 1800/1809 - loss 0.00961262 - time (sec): 111.87 - samples/sec: 3377.27 - lr: 0.000007 - momentum: 0.000000
2023-10-15 00:49:44,234 ----------------------------------------------------------------------------------------------------
2023-10-15 00:49:44,235 EPOCH 8 done: loss 0.0096 - lr: 0.000007
2023-10-15 00:49:49,924 DEV : loss 0.3747365176677704 - f1-score (micro avg) 0.6497
2023-10-15 00:49:49,967 ----------------------------------------------------------------------------------------------------
2023-10-15 00:50:01,635 epoch 9 - iter 180/1809 - loss 0.00978635 - time (sec): 11.67 - samples/sec: 3241.63 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:50:13,437 epoch 9 - iter 360/1809 - loss 0.00794210 - time (sec): 23.47 - samples/sec: 3243.38 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:50:25,022 epoch 9 - iter 540/1809 - loss 0.00746988 - time (sec): 35.05 - samples/sec: 3250.84 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:50:35,745 epoch 9 - iter 720/1809 - loss 0.00720058 - time (sec): 45.78 - samples/sec: 3297.68 - lr: 0.000005 - momentum: 0.000000
2023-10-15 00:50:47,088 epoch 9 - iter 900/1809 - loss 0.00675225 - time (sec): 57.12 - samples/sec: 3316.81 - lr: 0.000005 - momentum: 0.000000
2023-10-15 00:50:59,223 epoch 9 - iter 1080/1809 - loss 0.00625992 - time (sec): 69.25 - samples/sec: 3281.87 - lr: 0.000005 - momentum: 0.000000
2023-10-15 00:51:10,232 epoch 9 - iter 1260/1809 - loss 0.00599461 - time (sec): 80.26 - samples/sec: 3295.86 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:51:21,369 epoch 9 - iter 1440/1809 - loss 0.00630740 - time (sec): 91.40 - samples/sec: 3316.86 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:51:32,748 epoch 9 - iter 1620/1809 - loss 0.00639991 - time (sec): 102.78 - samples/sec: 3316.07 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:51:43,587 epoch 9 - iter 1800/1809 - loss 0.00679754 - time (sec): 113.62 - samples/sec: 3327.65 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:51:44,107 ----------------------------------------------------------------------------------------------------
2023-10-15 00:51:44,107 EPOCH 9 done: loss 0.0068 - lr: 0.000003
2023-10-15 00:51:49,804 DEV : loss 0.36713123321533203 - f1-score (micro avg) 0.6476
2023-10-15 00:51:49,848 ----------------------------------------------------------------------------------------------------
2023-10-15 00:52:01,561 epoch 10 - iter 180/1809 - loss 0.00501353 - time (sec): 11.71 - samples/sec: 3294.40 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:52:12,931 epoch 10 - iter 360/1809 - loss 0.00564503 - time (sec): 23.08 - samples/sec: 3312.41 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:52:24,067 epoch 10 - iter 540/1809 - loss 0.00553930 - time (sec): 34.22 - samples/sec: 3314.09 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:52:35,168 epoch 10 - iter 720/1809 - loss 0.00474944 - time (sec): 45.32 - samples/sec: 3360.42 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:52:46,049 epoch 10 - iter 900/1809 - loss 0.00441452 - time (sec): 56.20 - samples/sec: 3374.49 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:52:56,767 epoch 10 - iter 1080/1809 - loss 0.00439160 - time (sec): 66.92 - samples/sec: 3382.57 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:53:08,047 epoch 10 - iter 1260/1809 - loss 0.00399218 - time (sec): 78.20 - samples/sec: 3396.59 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:53:18,818 epoch 10 - iter 1440/1809 - loss 0.00419350 - time (sec): 88.97 - samples/sec: 3407.77 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:53:29,758 epoch 10 - iter 1620/1809 - loss 0.00416085 - time (sec): 99.91 - samples/sec: 3411.06 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:53:41,863 epoch 10 - iter 1800/1809 - loss 0.00440667 - time (sec): 112.01 - samples/sec: 3371.59 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:53:42,494 ----------------------------------------------------------------------------------------------------
2023-10-15 00:53:42,494 EPOCH 10 done: loss 0.0044 - lr: 0.000000
2023-10-15 00:53:48,144 DEV : loss 0.3939391076564789 - f1-score (micro avg) 0.6589
2023-10-15 00:53:48,186 saving best model
2023-10-15 00:53:49,083 ----------------------------------------------------------------------------------------------------
2023-10-15 00:53:49,084 Loading model from best epoch ...
2023-10-15 00:53:50,690 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
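The tag dictionary above uses the BIOES scheme: S- marks a single-token entity, while B-/I-/E- mark the begin, inside, and end tokens of a multi-token span. Decoding such a tag sequence into entity spans can be sketched as follows (a simplified decoder for illustration, not Flair's internal one):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                    # open a multi-token span
            start = i
        elif prefix == "E" and start is not None:  # close the open span
            spans.append((label, start, i + 1))
            start = None
        elif prefix != "I":                    # "O" resets any open span
            start = None
    return spans

print(bioes_to_spans(["S-loc", "O", "B-pers", "I-pers", "E-pers"]))
# [('loc', 0, 1), ('pers', 2, 5)]
```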
2023-10-15 00:53:58,307
Results:
- F-score (micro) 0.6652
- F-score (macro) 0.5525
- Accuracy 0.5115
By class:
              precision    recall  f1-score   support

         loc     0.6331    0.8088    0.7103       591
        pers     0.5735    0.7871    0.6635       357
         org     0.2895    0.2785    0.2839        79

   micro avg     0.5912    0.7605    0.6652      1027
   macro avg     0.4987    0.6248    0.5525      1027
weighted avg     0.5859    0.7605    0.6612      1027
2023-10-15 00:53:58,308 ----------------------------------------------------------------------------------------------------
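The aggregate rows of the table are consistent with the per-class numbers: micro F1 is the harmonic mean of micro precision and recall, and the weighted average weighs each class F1 by its support. Recomputed here from the rounded table values (so tiny rounding differences are expected):

```python
# Micro F1 = harmonic mean of micro-avg precision and recall from the table.
p, r = 0.5912, 0.7605
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 4))   # 0.6652, matching the reported F-score (micro)

# Weighted avg F1 = support-weighted mean of the per-class F1 scores.
f1 = {"loc": 0.7103, "pers": 0.6635, "org": 0.2839}
support = {"loc": 591, "pers": 357, "org": 79}
weighted = sum(f1[c] * support[c] for c in f1) / sum(support.values())
print(round(weighted, 4))   # 0.6612, matching the weighted avg row
```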