2023-10-15 00:34:00,672 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,673 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 00:34:00,673 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,673 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Train: 14465 sentences
2023-10-15 00:34:00,674 (train_with_dev=False, train_with_test=False)
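
A sketch of how such a corpus is usually obtained (the loader arguments are an assumption; Flair versions differ in the exact signature of the HIPE-2022 loader):

# Sketch only: load the HIPE-2022 letemps/fr NER data via Flair's dataset loader.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")  # assumed argument names
print(corpus)  # expected: 14465 train + 1392 dev + 2432 test sentences
label_dict = corpus.make_label_dictionary(label_type="ner")
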
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Training Params:
2023-10-15 00:34:00,674 - learning_rate: "3e-05"
2023-10-15 00:34:00,674 - mini_batch_size: "8"
2023-10-15 00:34:00,674 - max_epochs: "10"
2023-10-15 00:34:00,674 - shuffle: "True"
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Plugins:
2023-10-15 00:34:00,674 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 00:34:00,674 - metric: "('micro avg', 'f1-score')"
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Computation:
2023-10-15 00:34:00,674 - compute on device: cuda:0
2023-10-15 00:34:00,674 - embedding storage: none
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:00,674 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
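
These parameters correspond to a standard Flair fine-tuning run; a minimal sketch, reusing the tagger and corpus objects from the sketches above (illustrative, not the exact script that produced this log):

# Sketch only: fine-tune with the logged hyperparameters.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-05,   # Training Params above
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
)
# The logged "LinearScheduler | warmup_fraction: '0.1'" plugin corresponds to Flair's linear
# warmup/decay learning-rate schedule (10% warmup) used for fine-tuning.
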
2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
2023-10-15 00:34:12,020 epoch 1 - iter 180/1809 - loss 1.69578880 - time (sec): 11.35 - samples/sec: 3329.84 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:34:23,157 epoch 1 - iter 360/1809 - loss 0.96774144 - time (sec): 22.48 - samples/sec: 3335.91 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:34:34,271 epoch 1 - iter 540/1809 - loss 0.70357347 - time (sec): 33.60 - samples/sec: 3365.18 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:34:45,056 epoch 1 - iter 720/1809 - loss 0.56603696 - time (sec): 44.38 - samples/sec: 3386.28 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:34:56,166 epoch 1 - iter 900/1809 - loss 0.47915304 - time (sec): 55.49 - samples/sec: 3400.37 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:35:07,591 epoch 1 - iter 1080/1809 - loss 0.41979242 - time (sec): 66.92 - samples/sec: 3381.28 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:35:18,537 epoch 1 - iter 1260/1809 - loss 0.37659632 - time (sec): 77.86 - samples/sec: 3380.44 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:35:29,412 epoch 1 - iter 1440/1809 - loss 0.34144756 - time (sec): 88.74 - samples/sec: 3392.55 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:35:40,488 epoch 1 - iter 1620/1809 - loss 0.31444292 - time (sec): 99.81 - samples/sec: 3405.87 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:35:51,789 epoch 1 - iter 1800/1809 - loss 0.29260619 - time (sec): 111.11 - samples/sec: 3405.01 - lr: 0.000030 - momentum: 0.000000
2023-10-15 00:35:52,293 ----------------------------------------------------------------------------------------------------
2023-10-15 00:35:52,293 EPOCH 1 done: loss 0.2920 - lr: 0.000030
2023-10-15 00:35:56,993 DEV : loss 0.12198404222726822 - f1-score (micro avg) 0.5722
2023-10-15 00:35:57,022 saving best model
2023-10-15 00:35:57,380 ----------------------------------------------------------------------------------------------------
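
The lr column follows the logged LinearScheduler plugin: with warmup_fraction 0.1 and 10 x 1809 = 18090 batches in total, the learning rate ramps linearly from 0 to 3e-05 over epoch 1 and then decays linearly back to 0 by the end of epoch 10, which matches the per-iteration values above and below. A small illustration of that schedule (plain Python, not Flair code):

# Illustration only: linear warmup/decay with a 10% warmup fraction.
def linear_lr(step, total_steps=18090, warmup_fraction=0.1, base_lr=3e-05):
    warmup_steps = int(total_steps * warmup_fraction)  # 1809 steps = exactly epoch 1
    if step < warmup_steps:
        return base_lr * step / warmup_steps                              # warmup: 0 -> 3e-05
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay: 3e-05 -> 0

print(linear_lr(180))    # ~3.0e-06, logged as "lr: 0.000003" at epoch 1, iter 180
print(linear_lr(1800))   # ~2.99e-05, logged (rounded) as "lr: 0.000030" at epoch 1, iter 1800
print(linear_lr(18090))  # 0.0 at the very last step
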
2023-10-15 00:36:08,464 epoch 2 - iter 180/1809 - loss 0.09620289 - time (sec): 11.08 - samples/sec: 3499.09 - lr: 0.000030 - momentum: 0.000000
2023-10-15 00:36:19,654 epoch 2 - iter 360/1809 - loss 0.08888247 - time (sec): 22.27 - samples/sec: 3425.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 00:36:30,545 epoch 2 - iter 540/1809 - loss 0.08750531 - time (sec): 33.16 - samples/sec: 3430.82 - lr: 0.000029 - momentum: 0.000000
2023-10-15 00:36:41,703 epoch 2 - iter 720/1809 - loss 0.08597305 - time (sec): 44.32 - samples/sec: 3446.10 - lr: 0.000029 - momentum: 0.000000
2023-10-15 00:36:52,772 epoch 2 - iter 900/1809 - loss 0.08644518 - time (sec): 55.39 - samples/sec: 3439.68 - lr: 0.000028 - momentum: 0.000000
2023-10-15 00:37:04,106 epoch 2 - iter 1080/1809 - loss 0.08478312 - time (sec): 66.73 - samples/sec: 3445.62 - lr: 0.000028 - momentum: 0.000000
2023-10-15 00:37:15,095 epoch 2 - iter 1260/1809 - loss 0.08491575 - time (sec): 77.71 - samples/sec: 3434.67 - lr: 0.000028 - momentum: 0.000000
2023-10-15 00:37:26,060 epoch 2 - iter 1440/1809 - loss 0.08292218 - time (sec): 88.68 - samples/sec: 3420.93 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:37:37,259 epoch 2 - iter 1620/1809 - loss 0.08203160 - time (sec): 99.88 - samples/sec: 3418.03 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:37:48,250 epoch 2 - iter 1800/1809 - loss 0.08159459 - time (sec): 110.87 - samples/sec: 3412.35 - lr: 0.000027 - momentum: 0.000000
2023-10-15 00:37:48,794 ----------------------------------------------------------------------------------------------------
2023-10-15 00:37:48,794 EPOCH 2 done: loss 0.0814 - lr: 0.000027
2023-10-15 00:37:55,062 DEV : loss 0.11705530434846878 - f1-score (micro avg) 0.6312
2023-10-15 00:37:55,091 saving best model
2023-10-15 00:37:55,626 ----------------------------------------------------------------------------------------------------
2023-10-15 00:38:06,588 epoch 3 - iter 180/1809 - loss 0.05232030 - time (sec): 10.96 - samples/sec: 3449.48 - lr: 0.000026 - momentum: 0.000000
2023-10-15 00:38:18,020 epoch 3 - iter 360/1809 - loss 0.05666857 - time (sec): 22.39 - samples/sec: 3430.40 - lr: 0.000026 - momentum: 0.000000
2023-10-15 00:38:28,583 epoch 3 - iter 540/1809 - loss 0.05594377 - time (sec): 32.95 - samples/sec: 3441.67 - lr: 0.000026 - momentum: 0.000000
2023-10-15 00:38:39,674 epoch 3 - iter 720/1809 - loss 0.05871684 - time (sec): 44.05 - samples/sec: 3425.15 - lr: 0.000025 - momentum: 0.000000
2023-10-15 00:38:51,076 epoch 3 - iter 900/1809 - loss 0.05950701 - time (sec): 55.45 - samples/sec: 3374.92 - lr: 0.000025 - momentum: 0.000000
2023-10-15 00:39:02,741 epoch 3 - iter 1080/1809 - loss 0.05920344 - time (sec): 67.11 - samples/sec: 3358.55 - lr: 0.000025 - momentum: 0.000000
2023-10-15 00:39:14,716 epoch 3 - iter 1260/1809 - loss 0.05784277 - time (sec): 79.09 - samples/sec: 3328.74 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:39:26,435 epoch 3 - iter 1440/1809 - loss 0.05840880 - time (sec): 90.81 - samples/sec: 3318.84 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:39:37,770 epoch 3 - iter 1620/1809 - loss 0.05800859 - time (sec): 102.14 - samples/sec: 3315.52 - lr: 0.000024 - momentum: 0.000000
2023-10-15 00:39:49,682 epoch 3 - iter 1800/1809 - loss 0.05721469 - time (sec): 114.05 - samples/sec: 3318.60 - lr: 0.000023 - momentum: 0.000000
2023-10-15 00:39:50,192 ----------------------------------------------------------------------------------------------------
2023-10-15 00:39:50,192 EPOCH 3 done: loss 0.0574 - lr: 0.000023
2023-10-15 00:39:56,519 DEV : loss 0.15803837776184082 - f1-score (micro avg) 0.6323
2023-10-15 00:39:56,553 saving best model
2023-10-15 00:39:57,023 ----------------------------------------------------------------------------------------------------
2023-10-15 00:40:07,803 epoch 4 - iter 180/1809 - loss 0.03675662 - time (sec): 10.78 - samples/sec: 3513.50 - lr: 0.000023 - momentum: 0.000000
2023-10-15 00:40:18,830 epoch 4 - iter 360/1809 - loss 0.03782429 - time (sec): 21.80 - samples/sec: 3425.01 - lr: 0.000023 - momentum: 0.000000
2023-10-15 00:40:30,431 epoch 4 - iter 540/1809 - loss 0.04025441 - time (sec): 33.40 - samples/sec: 3433.46 - lr: 0.000022 - momentum: 0.000000
2023-10-15 00:40:41,758 epoch 4 - iter 720/1809 - loss 0.03972038 - time (sec): 44.73 - samples/sec: 3398.75 - lr: 0.000022 - momentum: 0.000000
2023-10-15 00:40:52,956 epoch 4 - iter 900/1809 - loss 0.04064138 - time (sec): 55.93 - samples/sec: 3390.54 - lr: 0.000022 - momentum: 0.000000
2023-10-15 00:41:04,080 epoch 4 - iter 1080/1809 - loss 0.04035289 - time (sec): 67.05 - samples/sec: 3388.44 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:41:15,319 epoch 4 - iter 1260/1809 - loss 0.04089726 - time (sec): 78.29 - samples/sec: 3388.14 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:41:26,299 epoch 4 - iter 1440/1809 - loss 0.04115836 - time (sec): 89.27 - samples/sec: 3390.11 - lr: 0.000021 - momentum: 0.000000
2023-10-15 00:41:37,035 epoch 4 - iter 1620/1809 - loss 0.04108914 - time (sec): 100.01 - samples/sec: 3402.00 - lr: 0.000020 - momentum: 0.000000
2023-10-15 00:41:47,994 epoch 4 - iter 1800/1809 - loss 0.04098637 - time (sec): 110.97 - samples/sec: 3408.61 - lr: 0.000020 - momentum: 0.000000
2023-10-15 00:41:48,505 ----------------------------------------------------------------------------------------------------
2023-10-15 00:41:48,506 EPOCH 4 done: loss 0.0412 - lr: 0.000020
2023-10-15 00:41:55,067 DEV : loss 0.26651689410209656 - f1-score (micro avg) 0.6358
2023-10-15 00:41:55,097 saving best model
2023-10-15 00:41:55,617 ----------------------------------------------------------------------------------------------------
2023-10-15 00:42:06,584 epoch 5 - iter 180/1809 - loss 0.03222780 - time (sec): 10.96 - samples/sec: 3384.38 - lr: 0.000020 - momentum: 0.000000
2023-10-15 00:42:17,751 epoch 5 - iter 360/1809 - loss 0.02816632 - time (sec): 22.13 - samples/sec: 3432.68 - lr: 0.000019 - momentum: 0.000000
2023-10-15 00:42:28,823 epoch 5 - iter 540/1809 - loss 0.02611782 - time (sec): 33.20 - samples/sec: 3445.57 - lr: 0.000019 - momentum: 0.000000
2023-10-15 00:42:39,839 epoch 5 - iter 720/1809 - loss 0.02595299 - time (sec): 44.22 - samples/sec: 3431.75 - lr: 0.000019 - momentum: 0.000000
2023-10-15 00:42:51,400 epoch 5 - iter 900/1809 - loss 0.02604561 - time (sec): 55.78 - samples/sec: 3405.01 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:43:02,924 epoch 5 - iter 1080/1809 - loss 0.02815177 - time (sec): 67.30 - samples/sec: 3393.94 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:43:14,004 epoch 5 - iter 1260/1809 - loss 0.02856667 - time (sec): 78.38 - samples/sec: 3416.12 - lr: 0.000018 - momentum: 0.000000
2023-10-15 00:43:25,118 epoch 5 - iter 1440/1809 - loss 0.02810936 - time (sec): 89.50 - samples/sec: 3410.79 - lr: 0.000017 - momentum: 0.000000
2023-10-15 00:43:36,034 epoch 5 - iter 1620/1809 - loss 0.02857657 - time (sec): 100.41 - samples/sec: 3407.81 - lr: 0.000017 - momentum: 0.000000
2023-10-15 00:43:47,204 epoch 5 - iter 1800/1809 - loss 0.02823526 - time (sec): 111.58 - samples/sec: 3387.43 - lr: 0.000017 - momentum: 0.000000
2023-10-15 00:43:47,831 ----------------------------------------------------------------------------------------------------
2023-10-15 00:43:47,831 EPOCH 5 done: loss 0.0281 - lr: 0.000017
2023-10-15 00:43:54,965 DEV : loss 0.29827266931533813 - f1-score (micro avg) 0.6512
2023-10-15 00:43:55,016 saving best model
2023-10-15 00:43:55,433 ----------------------------------------------------------------------------------------------------
2023-10-15 00:44:06,476 epoch 6 - iter 180/1809 - loss 0.02021871 - time (sec): 11.04 - samples/sec: 3386.99 - lr: 0.000016 - momentum: 0.000000
2023-10-15 00:44:17,804 epoch 6 - iter 360/1809 - loss 0.02162851 - time (sec): 22.37 - samples/sec: 3390.81 - lr: 0.000016 - momentum: 0.000000
2023-10-15 00:44:29,501 epoch 6 - iter 540/1809 - loss 0.02039145 - time (sec): 34.07 - samples/sec: 3338.77 - lr: 0.000016 - momentum: 0.000000
2023-10-15 00:44:40,865 epoch 6 - iter 720/1809 - loss 0.02010225 - time (sec): 45.43 - samples/sec: 3354.54 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:44:51,991 epoch 6 - iter 900/1809 - loss 0.01961907 - time (sec): 56.56 - samples/sec: 3351.96 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:45:03,033 epoch 6 - iter 1080/1809 - loss 0.02020129 - time (sec): 67.60 - samples/sec: 3361.18 - lr: 0.000015 - momentum: 0.000000
2023-10-15 00:45:13,853 epoch 6 - iter 1260/1809 - loss 0.02013371 - time (sec): 78.42 - samples/sec: 3371.39 - lr: 0.000014 - momentum: 0.000000
2023-10-15 00:45:24,859 epoch 6 - iter 1440/1809 - loss 0.01963702 - time (sec): 89.42 - samples/sec: 3378.59 - lr: 0.000014 - momentum: 0.000000
2023-10-15 00:45:35,850 epoch 6 - iter 1620/1809 - loss 0.01979049 - time (sec): 100.41 - samples/sec: 3391.33 - lr: 0.000014 - momentum: 0.000000
2023-10-15 00:45:46,868 epoch 6 - iter 1800/1809 - loss 0.02010880 - time (sec): 111.43 - samples/sec: 3393.44 - lr: 0.000013 - momentum: 0.000000
2023-10-15 00:45:47,411 ----------------------------------------------------------------------------------------------------
2023-10-15 00:45:47,412 EPOCH 6 done: loss 0.0201 - lr: 0.000013
2023-10-15 00:45:54,097 DEV : loss 0.326910138130188 - f1-score (micro avg) 0.6521
2023-10-15 00:45:54,136 saving best model
2023-10-15 00:45:54,554 ----------------------------------------------------------------------------------------------------
2023-10-15 00:46:05,343 epoch 7 - iter 180/1809 - loss 0.01170096 - time (sec): 10.79 - samples/sec: 3427.57 - lr: 0.000013 - momentum: 0.000000
2023-10-15 00:46:16,264 epoch 7 - iter 360/1809 - loss 0.01374452 - time (sec): 21.71 - samples/sec: 3385.53 - lr: 0.000013 - momentum: 0.000000
2023-10-15 00:46:27,180 epoch 7 - iter 540/1809 - loss 0.01405741 - time (sec): 32.62 - samples/sec: 3422.57 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:46:38,027 epoch 7 - iter 720/1809 - loss 0.01354319 - time (sec): 43.47 - samples/sec: 3423.22 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:46:49,010 epoch 7 - iter 900/1809 - loss 0.01402736 - time (sec): 54.45 - samples/sec: 3432.65 - lr: 0.000012 - momentum: 0.000000
2023-10-15 00:46:59,959 epoch 7 - iter 1080/1809 - loss 0.01405428 - time (sec): 65.40 - samples/sec: 3441.56 - lr: 0.000011 - momentum: 0.000000
2023-10-15 00:47:11,928 epoch 7 - iter 1260/1809 - loss 0.01369459 - time (sec): 77.37 - samples/sec: 3413.33 - lr: 0.000011 - momentum: 0.000000
2023-10-15 00:47:23,332 epoch 7 - iter 1440/1809 - loss 0.01300601 - time (sec): 88.78 - samples/sec: 3393.45 - lr: 0.000011 - momentum: 0.000000
2023-10-15 00:47:34,578 epoch 7 - iter 1620/1809 - loss 0.01324269 - time (sec): 100.02 - samples/sec: 3412.43 - lr: 0.000010 - momentum: 0.000000
2023-10-15 00:47:45,549 epoch 7 - iter 1800/1809 - loss 0.01359864 - time (sec): 110.99 - samples/sec: 3407.83 - lr: 0.000010 - momentum: 0.000000
2023-10-15 00:47:46,058 ----------------------------------------------------------------------------------------------------
2023-10-15 00:47:46,058 EPOCH 7 done: loss 0.0136 - lr: 0.000010
2023-10-15 00:47:51,722 DEV : loss 0.35920804738998413 - f1-score (micro avg) 0.6464
2023-10-15 00:47:51,760 ----------------------------------------------------------------------------------------------------
2023-10-15 00:48:04,052 epoch 8 - iter 180/1809 - loss 0.00809107 - time (sec): 12.29 - samples/sec: 3016.59 - lr: 0.000010 - momentum: 0.000000
2023-10-15 00:48:15,154 epoch 8 - iter 360/1809 - loss 0.00997925 - time (sec): 23.39 - samples/sec: 3223.35 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:48:26,207 epoch 8 - iter 540/1809 - loss 0.00950308 - time (sec): 34.44 - samples/sec: 3264.79 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:48:37,453 epoch 8 - iter 720/1809 - loss 0.01017881 - time (sec): 45.69 - samples/sec: 3318.90 - lr: 0.000009 - momentum: 0.000000
2023-10-15 00:48:48,337 epoch 8 - iter 900/1809 - loss 0.01019771 - time (sec): 56.57 - samples/sec: 3331.55 - lr: 0.000008 - momentum: 0.000000
2023-10-15 00:48:59,241 epoch 8 - iter 1080/1809 - loss 0.00985664 - time (sec): 67.48 - samples/sec: 3352.87 - lr: 0.000008 - momentum: 0.000000
2023-10-15 00:49:10,421 epoch 8 - iter 1260/1809 - loss 0.01007201 - time (sec): 78.66 - samples/sec: 3348.79 - lr: 0.000008 - momentum: 0.000000
2023-10-15 00:49:21,548 epoch 8 - iter 1440/1809 - loss 0.01003360 - time (sec): 89.79 - samples/sec: 3367.68 - lr: 0.000007 - momentum: 0.000000
2023-10-15 00:49:32,423 epoch 8 - iter 1620/1809 - loss 0.00994408 - time (sec): 100.66 - samples/sec: 3376.73 - lr: 0.000007 - momentum: 0.000000
2023-10-15 00:49:43,635 epoch 8 - iter 1800/1809 - loss 0.00961262 - time (sec): 111.87 - samples/sec: 3377.27 - lr: 0.000007 - momentum: 0.000000
2023-10-15 00:49:44,234 ----------------------------------------------------------------------------------------------------
2023-10-15 00:49:44,235 EPOCH 8 done: loss 0.0096 - lr: 0.000007
2023-10-15 00:49:49,924 DEV : loss 0.3747365176677704 - f1-score (micro avg) 0.6497
2023-10-15 00:49:49,967 ----------------------------------------------------------------------------------------------------
2023-10-15 00:50:01,635 epoch 9 - iter 180/1809 - loss 0.00978635 - time (sec): 11.67 - samples/sec: 3241.63 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:50:13,437 epoch 9 - iter 360/1809 - loss 0.00794210 - time (sec): 23.47 - samples/sec: 3243.38 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:50:25,022 epoch 9 - iter 540/1809 - loss 0.00746988 - time (sec): 35.05 - samples/sec: 3250.84 - lr: 0.000006 - momentum: 0.000000
2023-10-15 00:50:35,745 epoch 9 - iter 720/1809 - loss 0.00720058 - time (sec): 45.78 - samples/sec: 3297.68 - lr: 0.000005 - momentum: 0.000000
2023-10-15 00:50:47,088 epoch 9 - iter 900/1809 - loss 0.00675225 - time (sec): 57.12 - samples/sec: 3316.81 - lr: 0.000005 - momentum: 0.000000
2023-10-15 00:50:59,223 epoch 9 - iter 1080/1809 - loss 0.00625992 - time (sec): 69.25 - samples/sec: 3281.87 - lr: 0.000005 - momentum: 0.000000
2023-10-15 00:51:10,232 epoch 9 - iter 1260/1809 - loss 0.00599461 - time (sec): 80.26 - samples/sec: 3295.86 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:51:21,369 epoch 9 - iter 1440/1809 - loss 0.00630740 - time (sec): 91.40 - samples/sec: 3316.86 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:51:32,748 epoch 9 - iter 1620/1809 - loss 0.00639991 - time (sec): 102.78 - samples/sec: 3316.07 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:51:43,587 epoch 9 - iter 1800/1809 - loss 0.00679754 - time (sec): 113.62 - samples/sec: 3327.65 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:51:44,107 ----------------------------------------------------------------------------------------------------
2023-10-15 00:51:44,107 EPOCH 9 done: loss 0.0068 - lr: 0.000003
2023-10-15 00:51:49,804 DEV : loss 0.36713123321533203 - f1-score (micro avg) 0.6476
2023-10-15 00:51:49,848 ----------------------------------------------------------------------------------------------------
2023-10-15 00:52:01,561 epoch 10 - iter 180/1809 - loss 0.00501353 - time (sec): 11.71 - samples/sec: 3294.40 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:52:12,931 epoch 10 - iter 360/1809 - loss 0.00564503 - time (sec): 23.08 - samples/sec: 3312.41 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:52:24,067 epoch 10 - iter 540/1809 - loss 0.00553930 - time (sec): 34.22 - samples/sec: 3314.09 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:52:35,168 epoch 10 - iter 720/1809 - loss 0.00474944 - time (sec): 45.32 - samples/sec: 3360.42 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:52:46,049 epoch 10 - iter 900/1809 - loss 0.00441452 - time (sec): 56.20 - samples/sec: 3374.49 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:52:56,767 epoch 10 - iter 1080/1809 - loss 0.00439160 - time (sec): 66.92 - samples/sec: 3382.57 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:53:08,047 epoch 10 - iter 1260/1809 - loss 0.00399218 - time (sec): 78.20 - samples/sec: 3396.59 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:53:18,818 epoch 10 - iter 1440/1809 - loss 0.00419350 - time (sec): 88.97 - samples/sec: 3407.77 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:53:29,758 epoch 10 - iter 1620/1809 - loss 0.00416085 - time (sec): 99.91 - samples/sec: 3411.06 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:53:41,863 epoch 10 - iter 1800/1809 - loss 0.00440667 - time (sec): 112.01 - samples/sec: 3371.59 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:53:42,494 ----------------------------------------------------------------------------------------------------
2023-10-15 00:53:42,494 EPOCH 10 done: loss 0.0044 - lr: 0.000000
2023-10-15 00:53:48,144 DEV : loss 0.3939391076564789 - f1-score (micro avg) 0.6589
2023-10-15 00:53:48,186 saving best model
2023-10-15 00:53:49,083 ----------------------------------------------------------------------------------------------------
2023-10-15 00:53:49,084 Loading model from best epoch ...
2023-10-15 00:53:50,690 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-15 00:53:58,307
Results:
- F-score (micro) 0.6652
- F-score (macro) 0.5525
- Accuracy 0.5115

By class:
              precision    recall  f1-score   support

         loc     0.6331    0.8088    0.7103       591
        pers     0.5735    0.7871    0.6635       357
         org     0.2895    0.2785    0.2839        79

   micro avg     0.5912    0.7605    0.6652      1027
   macro avg     0.4987    0.6248    0.5525      1027
weighted avg     0.5859    0.7605    0.6612      1027

2023-10-15 00:53:58,308 ----------------------------------------------------------------------------------------------------
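
A sketch of how the saved best-model.pt can be used for tagging afterwards (the example sentence is arbitrary, and the "ner" label type is an assumption based on the tag dictionary above):

# Sketch only: load the checkpoint written by this run and tag a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon .")  # arbitrary French example
tagger.predict(sentence)
for label in sentence.get_labels("ner"):  # spans decoded from the BIOES tags (loc, pers, org)
    print(label)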