2023-10-25 15:46:38,696 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Train: 7142 sentences
2023-10-25 15:46:38,697 (train_with_dev=False, train_with_test=False)
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Training Params:
2023-10-25 15:46:38,697 - learning_rate: "5e-05"
2023-10-25 15:46:38,697 - mini_batch_size: "4"
2023-10-25 15:46:38,697 - max_epochs: "10"
2023-10-25 15:46:38,697 - shuffle: "True"
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Plugins:
2023-10-25 15:46:38,697 - TensorboardLogger
2023-10-25 15:46:38,697 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
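The per-iteration `lr` values logged below follow from the LinearScheduler settings above: with `warmup_fraction: 0.1` and 10 epochs of 1786 mini-batches, the learning rate ramps linearly from 0 to 5e-05 over the first 1786 steps, then decays linearly back to 0. A minimal sketch of that schedule (plain Python for illustration, not Flair's own implementation):

```python
def linear_schedule_lr(step, total_steps=10 * 1786, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (the logged momentum stays 0)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1786 steps for this run
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Cross-check against the log: lr ~0.000005 at epoch 1, iter 178;
# ~0.000050 at epoch 1, iter 1780; ~0.000044 near the end of epoch 2.
lr_e1_i178 = linear_schedule_lr(178)
lr_e1_i1780 = linear_schedule_lr(1780)
lr_e2_i1780 = linear_schedule_lr(1786 + 1780)
```

Rounded to six decimals, these reproduce the `lr:` column of the log entries that follow.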
2023-10-25 15:46:38,697 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:46:38,697 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Computation:
2023-10-25 15:46:38,698 - compute on device: cuda:0
2023-10-25 15:46:38,698 - embedding storage: none
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:46:48,318 epoch 1 - iter 178/1786 - loss 1.71122294 - time (sec): 9.62 - samples/sec: 2650.79 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:46:57,968 epoch 1 - iter 356/1786 - loss 1.10034904 - time (sec): 19.27 - samples/sec: 2534.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:47:07,424 epoch 1 - iter 534/1786 - loss 0.83110797 - time (sec): 28.73 - samples/sec: 2515.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:47:16,715 epoch 1 - iter 712/1786 - loss 0.67436102 - time (sec): 38.02 - samples/sec: 2546.45 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:47:26,198 epoch 1 - iter 890/1786 - loss 0.57044616 - time (sec): 47.50 - samples/sec: 2567.57 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:47:35,361 epoch 1 - iter 1068/1786 - loss 0.49625403 - time (sec): 56.66 - samples/sec: 2617.57 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:47:44,672 epoch 1 - iter 1246/1786 - loss 0.44959498 - time (sec): 65.97 - samples/sec: 2624.58 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:47:54,339 epoch 1 - iter 1424/1786 - loss 0.41355462 - time (sec): 75.64 - samples/sec: 2617.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:48:03,738 epoch 1 - iter 1602/1786 - loss 0.38486574 - time (sec): 85.04 - samples/sec: 2618.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:48:12,801 epoch 1 - iter 1780/1786 - loss 0.36139254 - time (sec): 94.10 - samples/sec: 2636.46 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:48:13,083 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:13,083 EPOCH 1 done: loss 0.3608 - lr: 0.000050
2023-10-25 15:48:17,092 DEV : loss 0.13503123819828033 - f1-score (micro avg) 0.7202
2023-10-25 15:48:17,116 saving best model
2023-10-25 15:48:17,612 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:27,486 epoch 2 - iter 178/1786 - loss 0.12407203 - time (sec): 9.87 - samples/sec: 2599.84 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:48:37,042 epoch 2 - iter 356/1786 - loss 0.12736955 - time (sec): 19.43 - samples/sec: 2426.93 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:48:46,727 epoch 2 - iter 534/1786 - loss 0.12458317 - time (sec): 29.11 - samples/sec: 2535.20 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:48:56,148 epoch 2 - iter 712/1786 - loss 0.12184214 - time (sec): 38.53 - samples/sec: 2555.59 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:49:05,444 epoch 2 - iter 890/1786 - loss 0.12055895 - time (sec): 47.83 - samples/sec: 2593.23 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:49:14,963 epoch 2 - iter 1068/1786 - loss 0.12103444 - time (sec): 57.35 - samples/sec: 2594.01 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:49:24,474 epoch 2 - iter 1246/1786 - loss 0.11982038 - time (sec): 66.86 - samples/sec: 2615.69 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:49:33,793 epoch 2 - iter 1424/1786 - loss 0.11974673 - time (sec): 76.18 - samples/sec: 2589.45 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:49:43,364 epoch 2 - iter 1602/1786 - loss 0.11956401 - time (sec): 85.75 - samples/sec: 2596.32 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:49:53,170 epoch 2 - iter 1780/1786 - loss 0.11947260 - time (sec): 95.56 - samples/sec: 2592.81 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:49:53,498 ----------------------------------------------------------------------------------------------------
2023-10-25 15:49:53,498 EPOCH 2 done: loss 0.1196 - lr: 0.000044
2023-10-25 15:49:57,539 DEV : loss 0.11700031161308289 - f1-score (micro avg) 0.7623
2023-10-25 15:49:57,560 saving best model
2023-10-25 15:49:58,199 ----------------------------------------------------------------------------------------------------
2023-10-25 15:50:07,682 epoch 3 - iter 178/1786 - loss 0.07945384 - time (sec): 9.48 - samples/sec: 2683.68 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:50:17,051 epoch 3 - iter 356/1786 - loss 0.07503847 - time (sec): 18.85 - samples/sec: 2596.72 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:26,519 epoch 3 - iter 534/1786 - loss 0.07795492 - time (sec): 28.32 - samples/sec: 2648.33 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:35,768 epoch 3 - iter 712/1786 - loss 0.08107118 - time (sec): 37.57 - samples/sec: 2654.03 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:44,402 epoch 3 - iter 890/1786 - loss 0.08084430 - time (sec): 46.20 - samples/sec: 2662.85 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:53,144 epoch 3 - iter 1068/1786 - loss 0.08209011 - time (sec): 54.94 - samples/sec: 2688.32 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:02,410 epoch 3 - iter 1246/1786 - loss 0.08279690 - time (sec): 64.21 - samples/sec: 2707.24 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:11,472 epoch 3 - iter 1424/1786 - loss 0.08293172 - time (sec): 73.27 - samples/sec: 2725.81 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:51:20,625 epoch 3 - iter 1602/1786 - loss 0.08232577 - time (sec): 82.42 - samples/sec: 2731.48 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:51:29,623 epoch 3 - iter 1780/1786 - loss 0.08244948 - time (sec): 91.42 - samples/sec: 2714.74 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:51:29,917 ----------------------------------------------------------------------------------------------------
2023-10-25 15:51:29,917 EPOCH 3 done: loss 0.0825 - lr: 0.000039
2023-10-25 15:51:34,711 DEV : loss 0.1378549337387085 - f1-score (micro avg) 0.755
2023-10-25 15:51:34,733 ----------------------------------------------------------------------------------------------------
2023-10-25 15:51:43,650 epoch 4 - iter 178/1786 - loss 0.05427530 - time (sec): 8.91 - samples/sec: 2798.78 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:51:52,829 epoch 4 - iter 356/1786 - loss 0.06260664 - time (sec): 18.09 - samples/sec: 2759.29 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:52:01,468 epoch 4 - iter 534/1786 - loss 0.06187774 - time (sec): 26.73 - samples/sec: 2746.84 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:52:10,600 epoch 4 - iter 712/1786 - loss 0.06195480 - time (sec): 35.87 - samples/sec: 2765.81 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:52:19,483 epoch 4 - iter 890/1786 - loss 0.06149034 - time (sec): 44.75 - samples/sec: 2776.77 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:52:28,516 epoch 4 - iter 1068/1786 - loss 0.06169674 - time (sec): 53.78 - samples/sec: 2806.92 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:52:37,349 epoch 4 - iter 1246/1786 - loss 0.06445329 - time (sec): 62.61 - samples/sec: 2795.13 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:52:46,539 epoch 4 - iter 1424/1786 - loss 0.06499429 - time (sec): 71.80 - samples/sec: 2758.74 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:52:55,747 epoch 4 - iter 1602/1786 - loss 0.06306222 - time (sec): 81.01 - samples/sec: 2771.19 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:53:04,431 epoch 4 - iter 1780/1786 - loss 0.06282893 - time (sec): 89.70 - samples/sec: 2767.02 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:53:04,706 ----------------------------------------------------------------------------------------------------
2023-10-25 15:53:04,707 EPOCH 4 done: loss 0.0630 - lr: 0.000033
2023-10-25 15:53:09,579 DEV : loss 0.18551814556121826 - f1-score (micro avg) 0.7519
2023-10-25 15:53:09,603 ----------------------------------------------------------------------------------------------------
2023-10-25 15:53:18,813 epoch 5 - iter 178/1786 - loss 0.04430229 - time (sec): 9.21 - samples/sec: 2578.40 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:53:28,613 epoch 5 - iter 356/1786 - loss 0.04196870 - time (sec): 19.01 - samples/sec: 2602.50 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:53:38,464 epoch 5 - iter 534/1786 - loss 0.04425551 - time (sec): 28.86 - samples/sec: 2605.34 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:53:47,762 epoch 5 - iter 712/1786 - loss 0.04299756 - time (sec): 38.16 - samples/sec: 2606.77 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:53:56,812 epoch 5 - iter 890/1786 - loss 0.04398595 - time (sec): 47.21 - samples/sec: 2619.00 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:54:05,975 epoch 5 - iter 1068/1786 - loss 0.04465443 - time (sec): 56.37 - samples/sec: 2659.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:54:14,748 epoch 5 - iter 1246/1786 - loss 0.04413175 - time (sec): 65.14 - samples/sec: 2672.35 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:54:23,729 epoch 5 - iter 1424/1786 - loss 0.04310012 - time (sec): 74.12 - samples/sec: 2673.94 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:54:32,836 epoch 5 - iter 1602/1786 - loss 0.04362101 - time (sec): 83.23 - samples/sec: 2677.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:54:42,313 epoch 5 - iter 1780/1786 - loss 0.04499036 - time (sec): 92.71 - samples/sec: 2675.55 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:54:42,605 ----------------------------------------------------------------------------------------------------
2023-10-25 15:54:42,605 EPOCH 5 done: loss 0.0449 - lr: 0.000028
2023-10-25 15:54:46,479 DEV : loss 0.18028688430786133 - f1-score (micro avg) 0.8033
2023-10-25 15:54:46,503 saving best model
2023-10-25 15:54:47,160 ----------------------------------------------------------------------------------------------------
2023-10-25 15:54:56,791 epoch 6 - iter 178/1786 - loss 0.02633064 - time (sec): 9.63 - samples/sec: 2614.15 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:55:06,138 epoch 6 - iter 356/1786 - loss 0.02540360 - time (sec): 18.97 - samples/sec: 2572.31 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:55:15,646 epoch 6 - iter 534/1786 - loss 0.02731547 - time (sec): 28.48 - samples/sec: 2601.99 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:55:25,066 epoch 6 - iter 712/1786 - loss 0.03161683 - time (sec): 37.90 - samples/sec: 2616.15 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:55:34,736 epoch 6 - iter 890/1786 - loss 0.03258884 - time (sec): 47.57 - samples/sec: 2632.34 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:55:44,052 epoch 6 - iter 1068/1786 - loss 0.03444115 - time (sec): 56.89 - samples/sec: 2608.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:55:53,428 epoch 6 - iter 1246/1786 - loss 0.03455081 - time (sec): 66.27 - samples/sec: 2624.54 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:56:02,885 epoch 6 - iter 1424/1786 - loss 0.03507010 - time (sec): 75.72 - samples/sec: 2631.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:56:11,781 epoch 6 - iter 1602/1786 - loss 0.03617707 - time (sec): 84.62 - samples/sec: 2646.56 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:56:20,674 epoch 6 - iter 1780/1786 - loss 0.03554486 - time (sec): 93.51 - samples/sec: 2654.32 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:56:20,971 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:20,971 EPOCH 6 done: loss 0.0357 - lr: 0.000022
2023-10-25 15:56:25,780 DEV : loss 0.1820164918899536 - f1-score (micro avg) 0.7943
2023-10-25 15:56:25,801 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:35,262 epoch 7 - iter 178/1786 - loss 0.02389501 - time (sec): 9.46 - samples/sec: 2817.92 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:56:44,521 epoch 7 - iter 356/1786 - loss 0.03026582 - time (sec): 18.72 - samples/sec: 2707.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:56:53,894 epoch 7 - iter 534/1786 - loss 0.03128907 - time (sec): 28.09 - samples/sec: 2670.28 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:57:03,184 epoch 7 - iter 712/1786 - loss 0.03038936 - time (sec): 37.38 - samples/sec: 2681.89 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:57:12,520 epoch 7 - iter 890/1786 - loss 0.02884850 - time (sec): 46.72 - samples/sec: 2668.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:57:21,892 epoch 7 - iter 1068/1786 - loss 0.02871245 - time (sec): 56.09 - samples/sec: 2650.46 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:57:31,104 epoch 7 - iter 1246/1786 - loss 0.02822760 - time (sec): 65.30 - samples/sec: 2629.02 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:57:40,635 epoch 7 - iter 1424/1786 - loss 0.02843002 - time (sec): 74.83 - samples/sec: 2645.70 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:57:49,434 epoch 7 - iter 1602/1786 - loss 0.02811501 - time (sec): 83.63 - samples/sec: 2659.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:57:58,178 epoch 7 - iter 1780/1786 - loss 0.02826981 - time (sec): 92.38 - samples/sec: 2685.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:57:58,469 ----------------------------------------------------------------------------------------------------
2023-10-25 15:57:58,470 EPOCH 7 done: loss 0.0282 - lr: 0.000017
2023-10-25 15:58:03,620 DEV : loss 0.2033814787864685 - f1-score (micro avg) 0.7832
2023-10-25 15:58:03,643 ----------------------------------------------------------------------------------------------------
2023-10-25 15:58:13,174 epoch 8 - iter 178/1786 - loss 0.02698386 - time (sec): 9.53 - samples/sec: 2513.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:58:23,042 epoch 8 - iter 356/1786 - loss 0.02189818 - time (sec): 19.40 - samples/sec: 2504.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:58:32,872 epoch 8 - iter 534/1786 - loss 0.02241369 - time (sec): 29.23 - samples/sec: 2531.08 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:58:42,433 epoch 8 - iter 712/1786 - loss 0.02213136 - time (sec): 38.79 - samples/sec: 2517.54 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:58:52,151 epoch 8 - iter 890/1786 - loss 0.02156072 - time (sec): 48.51 - samples/sec: 2522.71 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:59:01,830 epoch 8 - iter 1068/1786 - loss 0.02028695 - time (sec): 58.18 - samples/sec: 2539.64 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:59:11,492 epoch 8 - iter 1246/1786 - loss 0.01971787 - time (sec): 67.85 - samples/sec: 2567.47 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:59:21,218 epoch 8 - iter 1424/1786 - loss 0.01980862 - time (sec): 77.57 - samples/sec: 2547.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:59:30,611 epoch 8 - iter 1602/1786 - loss 0.02006421 - time (sec): 86.97 - samples/sec: 2557.81 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:59:39,890 epoch 8 - iter 1780/1786 - loss 0.01964747 - time (sec): 96.25 - samples/sec: 2577.09 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:59:40,217 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:40,217 EPOCH 8 done: loss 0.0197 - lr: 0.000011
2023-10-25 15:59:44,159 DEV : loss 0.21475903689861298 - f1-score (micro avg) 0.79
2023-10-25 15:59:44,183 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:53,943 epoch 9 - iter 178/1786 - loss 0.00854585 - time (sec): 9.76 - samples/sec: 2529.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:00:03,415 epoch 9 - iter 356/1786 - loss 0.00730957 - time (sec): 19.23 - samples/sec: 2546.52 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:00:12,902 epoch 9 - iter 534/1786 - loss 0.00824375 - time (sec): 28.72 - samples/sec: 2565.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:00:22,790 epoch 9 - iter 712/1786 - loss 0.01124988 - time (sec): 38.61 - samples/sec: 2607.10 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:00:32,365 epoch 9 - iter 890/1786 - loss 0.01242606 - time (sec): 48.18 - samples/sec: 2611.93 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:00:41,872 epoch 9 - iter 1068/1786 - loss 0.01230465 - time (sec): 57.69 - samples/sec: 2624.15 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:00:51,095 epoch 9 - iter 1246/1786 - loss 0.01297617 - time (sec): 66.91 - samples/sec: 2616.87 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:01:00,313 epoch 9 - iter 1424/1786 - loss 0.01268858 - time (sec): 76.13 - samples/sec: 2634.73 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:01:09,907 epoch 9 - iter 1602/1786 - loss 0.01258302 - time (sec): 85.72 - samples/sec: 2619.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:01:19,662 epoch 9 - iter 1780/1786 - loss 0.01218671 - time (sec): 95.48 - samples/sec: 2598.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:01:19,987 ----------------------------------------------------------------------------------------------------
2023-10-25 16:01:19,987 EPOCH 9 done: loss 0.0122 - lr: 0.000006
2023-10-25 16:01:25,365 DEV : loss 0.21478621661663055 - f1-score (micro avg) 0.7938
2023-10-25 16:01:25,386 ----------------------------------------------------------------------------------------------------
2023-10-25 16:01:34,658 epoch 10 - iter 178/1786 - loss 0.01114122 - time (sec): 9.27 - samples/sec: 2577.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:01:44,026 epoch 10 - iter 356/1786 - loss 0.00936457 - time (sec): 18.64 - samples/sec: 2663.26 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:01:53,778 epoch 10 - iter 534/1786 - loss 0.00753507 - time (sec): 28.39 - samples/sec: 2632.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:02:02,735 epoch 10 - iter 712/1786 - loss 0.00724194 - time (sec): 37.35 - samples/sec: 2675.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:02:12,223 epoch 10 - iter 890/1786 - loss 0.00681051 - time (sec): 46.84 - samples/sec: 2606.92 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:02:21,833 epoch 10 - iter 1068/1786 - loss 0.00691939 - time (sec): 56.45 - samples/sec: 2622.33 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:02:31,013 epoch 10 - iter 1246/1786 - loss 0.00730625 - time (sec): 65.63 - samples/sec: 2633.25 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:02:40,785 epoch 10 - iter 1424/1786 - loss 0.00715163 - time (sec): 75.40 - samples/sec: 2622.41 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:02:50,482 epoch 10 - iter 1602/1786 - loss 0.00713533 - time (sec): 85.09 - samples/sec: 2603.48 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:03:00,096 epoch 10 - iter 1780/1786 - loss 0.00759595 - time (sec): 94.71 - samples/sec: 2619.46 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:03:00,416 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:00,416 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-25 16:03:04,801 DEV : loss 0.2166852504014969 - f1-score (micro avg) 0.7957
2023-10-25 16:03:06,067 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:06,068 Loading model from best epoch ...
2023-10-25 16:03:07,869 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
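The 17-tag dictionary above is a BIOES tagset (Single, Begin, Inside, End, plus O) over the four entity types PER, LOC, ORG and HumanProd. A minimal sketch of how such a tag sequence maps back to entity spans (an illustrative helper, not Flair's actual decoder):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence to (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":                     # entity opens here
            start = i
        elif prefix == "E" and start is not None:  # entity closes here
            spans.append((label, start, i + 1))
            start = None
        # "I" continues an open entity; "O" is outside any entity
    return spans

tags = ["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]
# bioes_to_spans(tags) -> [("LOC", 1, 2), ("PER", 2, 5)]
```

This sketch trusts well-formed sequences; a production decoder would also handle inconsistent prefixes, which is why evaluation works on decoded spans rather than raw tags.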
2023-10-25 16:03:20,794
Results:
- F-score (micro) 0.6746
- F-score (macro) 0.5843
- Accuracy 0.521

By class:
              precision    recall  f1-score   support

         LOC     0.6880    0.6484    0.6676      1095
         PER     0.7798    0.7559    0.7677      1012
         ORG     0.4706    0.4706    0.4706       357
   HumanProd     0.3188    0.6667    0.4314        33

   micro avg     0.6827    0.6668    0.6746      2497
   macro avg     0.5643    0.6354    0.5843      2497
weighted avg     0.6892    0.6668    0.6769      2497
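The micro-averaged row is recoverable from the per-class rows: true positives follow from recall × support, and predicted span counts from TP / precision. A quick cross-check (the counts below are reconstructed from the rounded table values, so they are approximate by construction):

```python
# (precision, recall, support) per class, copied from the table above
per_class = {
    "LOC":       (0.6880, 0.6484, 1095),
    "PER":       (0.7798, 0.7559, 1012),
    "ORG":       (0.4706, 0.4706, 357),
    "HumanProd": (0.3188, 0.6667, 33),
}

tp = sum(round(r * s) for p, r, s in per_class.values())               # true positives
pred = sum(round(round(r * s) / p) for p, r, s in per_class.values())  # predicted spans
gold = sum(s for _, _, s in per_class.values())                        # gold spans, 2497

micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
# micro_p ~ 0.6827, micro_r ~ 0.6668, micro_f1 ~ 0.6746, matching the table
```

The micro average pools TP/FP/FN over all classes, which is why it tracks the large LOC and PER classes, while the macro average weights the small HumanProd class equally and lands lower.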
2023-10-25 16:03:20,794 ----------------------------------------------------------------------------------------------------