2023-10-16 14:26:00,861 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,862 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 14:26:00,862 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,862 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-16 14:26:00,862 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,862 Train:  7142 sentences
2023-10-16 14:26:00,862         (train_with_dev=False, train_with_test=False)
2023-10-16 14:26:00,862 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Training Params:
2023-10-16 14:26:00,863  - learning_rate: "5e-05"
2023-10-16 14:26:00,863  - mini_batch_size: "4"
2023-10-16 14:26:00,863  - max_epochs: "10"
2023-10-16 14:26:00,863  - shuffle: "True"
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Plugins:
2023-10-16 14:26:00,863  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 14:26:00,863  - metric: "('micro avg', 'f1-score')"
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Computation:
2023-10-16 14:26:00,863  - compute on device: cuda:0
2023-10-16 14:26:00,863  - embedding storage: none
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
2023-10-16 14:26:00,863 ----------------------------------------------------------------------------------------------------
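The `LinearScheduler | warmup_fraction: '0.1'` plugin, combined with 10 epochs of 1786 mini-batches each, implies a learning rate that ramps linearly from 0 to the peak 5e-05 over the first 10% of steps (roughly epoch 1) and then decays linearly to 0, which matches the `lr:` column in the iteration logs below. A minimal sketch of that schedule (the function name is illustrative, not the Flair API):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    A sketch of the LinearScheduler plugin's behavior as inferred from
    the logged lr values; not the actual Flair implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1786 * 10  # 1786 batches/epoch x 10 epochs
print(linear_schedule_lr(178, total, 5e-05))    # early in epoch 1: ~5e-06
print(linear_schedule_lr(1786, total, 5e-05))   # end of warmup: peak 5e-05
print(linear_schedule_lr(total, total, 5e-05))  # final step: 0.0
```

The logged values line up with this: `lr: 0.000005` at iter 178 of epoch 1, `0.000050` at the end of epoch 1, and `0.000000` at the last iteration of epoch 10.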
2023-10-16 14:26:09,399 epoch 1 - iter 178/1786 - loss 1.72350929 - time (sec): 8.54 - samples/sec: 2893.61 - lr: 0.000005 - momentum: 0.000000
2023-10-16 14:26:18,210 epoch 1 - iter 356/1786 - loss 1.08436518 - time (sec): 17.35 - samples/sec: 2864.09 - lr: 0.000010 - momentum: 0.000000
2023-10-16 14:26:26,893 epoch 1 - iter 534/1786 - loss 0.82712712 - time (sec): 26.03 - samples/sec: 2889.48 - lr: 0.000015 - momentum: 0.000000
2023-10-16 14:26:35,720 epoch 1 - iter 712/1786 - loss 0.67539723 - time (sec): 34.86 - samples/sec: 2901.75 - lr: 0.000020 - momentum: 0.000000
2023-10-16 14:26:44,292 epoch 1 - iter 890/1786 - loss 0.58731779 - time (sec): 43.43 - samples/sec: 2887.71 - lr: 0.000025 - momentum: 0.000000
2023-10-16 14:26:52,634 epoch 1 - iter 1068/1786 - loss 0.52122298 - time (sec): 51.77 - samples/sec: 2875.76 - lr: 0.000030 - momentum: 0.000000
2023-10-16 14:27:01,396 epoch 1 - iter 1246/1786 - loss 0.46746722 - time (sec): 60.53 - samples/sec: 2891.65 - lr: 0.000035 - momentum: 0.000000
2023-10-16 14:27:09,876 epoch 1 - iter 1424/1786 - loss 0.43349886 - time (sec): 69.01 - samples/sec: 2898.54 - lr: 0.000040 - momentum: 0.000000
2023-10-16 14:27:18,516 epoch 1 - iter 1602/1786 - loss 0.40309372 - time (sec): 77.65 - samples/sec: 2881.00 - lr: 0.000045 - momentum: 0.000000
2023-10-16 14:27:27,304 epoch 1 - iter 1780/1786 - loss 0.37831641 - time (sec): 86.44 - samples/sec: 2868.61 - lr: 0.000050 - momentum: 0.000000
2023-10-16 14:27:27,579 ----------------------------------------------------------------------------------------------------
2023-10-16 14:27:27,579 EPOCH 1 done: loss 0.3776 - lr: 0.000050
2023-10-16 14:27:30,550 DEV : loss 0.1278153508901596 - f1-score (micro avg)  0.6762
2023-10-16 14:27:30,565 saving best model
2023-10-16 14:27:30,922 ----------------------------------------------------------------------------------------------------
2023-10-16 14:27:39,497 epoch 2 - iter 178/1786 - loss 0.14556974 - time (sec): 8.57 - samples/sec: 2912.61 - lr: 0.000049 - momentum: 0.000000
2023-10-16 14:27:48,242 epoch 2 - iter 356/1786 - loss 0.13073050 - time (sec): 17.32 - samples/sec: 2972.66 - lr: 0.000049 - momentum: 0.000000
2023-10-16 14:27:56,928 epoch 2 - iter 534/1786 - loss 0.13025242 - time (sec): 26.00 - samples/sec: 2967.14 - lr: 0.000048 - momentum: 0.000000
2023-10-16 14:28:05,608 epoch 2 - iter 712/1786 - loss 0.13292229 - time (sec): 34.68 - samples/sec: 2921.00 - lr: 0.000048 - momentum: 0.000000
2023-10-16 14:28:14,449 epoch 2 - iter 890/1786 - loss 0.13208372 - time (sec): 43.53 - samples/sec: 2900.44 - lr: 0.000047 - momentum: 0.000000
2023-10-16 14:28:23,136 epoch 2 - iter 1068/1786 - loss 0.13177483 - time (sec): 52.21 - samples/sec: 2897.01 - lr: 0.000047 - momentum: 0.000000
2023-10-16 14:28:31,937 epoch 2 - iter 1246/1786 - loss 0.12936425 - time (sec): 61.01 - samples/sec: 2875.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 14:28:40,602 epoch 2 - iter 1424/1786 - loss 0.12693416 - time (sec): 69.68 - samples/sec: 2874.59 - lr: 0.000046 - momentum: 0.000000
2023-10-16 14:28:49,147 epoch 2 - iter 1602/1786 - loss 0.12465130 - time (sec): 78.22 - samples/sec: 2883.32 - lr: 0.000045 - momentum: 0.000000
2023-10-16 14:28:57,347 epoch 2 - iter 1780/1786 - loss 0.12599998 - time (sec): 86.42 - samples/sec: 2865.96 - lr: 0.000044 - momentum: 0.000000
2023-10-16 14:28:57,611 ----------------------------------------------------------------------------------------------------
2023-10-16 14:28:57,611 EPOCH 2 done: loss 0.1259 - lr: 0.000044
2023-10-16 14:29:02,237 DEV : loss 0.1384890377521515 - f1-score (micro avg)  0.7327
2023-10-16 14:29:02,255 saving best model
2023-10-16 14:29:02,722 ----------------------------------------------------------------------------------------------------
2023-10-16 14:29:11,369 epoch 3 - iter 178/1786 - loss 0.08576002 - time (sec): 8.64 - samples/sec: 2753.36 - lr: 0.000044 - momentum: 0.000000
2023-10-16 14:29:20,279 epoch 3 - iter 356/1786 - loss 0.09184061 - time (sec): 17.55 - samples/sec: 2740.32 - lr: 0.000043 - momentum: 0.000000
2023-10-16 14:29:29,385 epoch 3 - iter 534/1786 - loss 0.09010177 - time (sec): 26.66 - samples/sec: 2727.10 - lr: 0.000043 - momentum: 0.000000
2023-10-16 14:29:38,115 epoch 3 - iter 712/1786 - loss 0.08566478 - time (sec): 35.39 - samples/sec: 2755.05 - lr: 0.000042 - momentum: 0.000000
2023-10-16 14:29:46,981 epoch 3 - iter 890/1786 - loss 0.08411972 - time (sec): 44.26 - samples/sec: 2771.07 - lr: 0.000042 - momentum: 0.000000
2023-10-16 14:29:55,567 epoch 3 - iter 1068/1786 - loss 0.08500553 - time (sec): 52.84 - samples/sec: 2780.96 - lr: 0.000041 - momentum: 0.000000
2023-10-16 14:30:04,319 epoch 3 - iter 1246/1786 - loss 0.08416921 - time (sec): 61.59 - samples/sec: 2801.64 - lr: 0.000041 - momentum: 0.000000
2023-10-16 14:30:13,120 epoch 3 - iter 1424/1786 - loss 0.08421054 - time (sec): 70.40 - samples/sec: 2818.47 - lr: 0.000040 - momentum: 0.000000
2023-10-16 14:30:22,250 epoch 3 - iter 1602/1786 - loss 0.08757976 - time (sec): 79.52 - samples/sec: 2800.20 - lr: 0.000039 - momentum: 0.000000
2023-10-16 14:30:31,006 epoch 3 - iter 1780/1786 - loss 0.08731846 - time (sec): 88.28 - samples/sec: 2807.01 - lr: 0.000039 - momentum: 0.000000
2023-10-16 14:30:31,300 ----------------------------------------------------------------------------------------------------
2023-10-16 14:30:31,300 EPOCH 3 done: loss 0.0873 - lr: 0.000039
2023-10-16 14:30:36,115 DEV : loss 0.1713087409734726 - f1-score (micro avg)  0.7429
2023-10-16 14:30:36,132 saving best model
2023-10-16 14:30:36,584 ----------------------------------------------------------------------------------------------------
2023-10-16 14:30:45,423 epoch 4 - iter 178/1786 - loss 0.05739948 - time (sec): 8.84 - samples/sec: 2942.92 - lr: 0.000038 - momentum: 0.000000
2023-10-16 14:30:54,134 epoch 4 - iter 356/1786 - loss 0.05690690 - time (sec): 17.55 - samples/sec: 2888.92 - lr: 0.000038 - momentum: 0.000000
2023-10-16 14:31:02,732 epoch 4 - iter 534/1786 - loss 0.05637698 - time (sec): 26.14 - samples/sec: 2869.13 - lr: 0.000037 - momentum: 0.000000
2023-10-16 14:31:11,563 epoch 4 - iter 712/1786 - loss 0.05914208 - time (sec): 34.98 - samples/sec: 2868.87 - lr: 0.000037 - momentum: 0.000000
2023-10-16 14:31:20,009 epoch 4 - iter 890/1786 - loss 0.05997229 - time (sec): 43.42 - samples/sec: 2864.77 - lr: 0.000036 - momentum: 0.000000
2023-10-16 14:31:28,634 epoch 4 - iter 1068/1786 - loss 0.06052032 - time (sec): 52.05 - samples/sec: 2871.82 - lr: 0.000036 - momentum: 0.000000
2023-10-16 14:31:37,097 epoch 4 - iter 1246/1786 - loss 0.06191085 - time (sec): 60.51 - samples/sec: 2851.50 - lr: 0.000035 - momentum: 0.000000
2023-10-16 14:31:45,688 epoch 4 - iter 1424/1786 - loss 0.06222248 - time (sec): 69.10 - samples/sec: 2853.96 - lr: 0.000034 - momentum: 0.000000
2023-10-16 14:31:54,395 epoch 4 - iter 1602/1786 - loss 0.06351783 - time (sec): 77.81 - samples/sec: 2852.68 - lr: 0.000034 - momentum: 0.000000
2023-10-16 14:32:03,271 epoch 4 - iter 1780/1786 - loss 0.06440999 - time (sec): 86.68 - samples/sec: 2860.50 - lr: 0.000033 - momentum: 0.000000
2023-10-16 14:32:03,570 ----------------------------------------------------------------------------------------------------
2023-10-16 14:32:03,570 EPOCH 4 done: loss 0.0644 - lr: 0.000033
2023-10-16 14:32:07,654 DEV : loss 0.1713092029094696 - f1-score (micro avg)  0.7712
2023-10-16 14:32:07,670 saving best model
2023-10-16 14:32:08,139 ----------------------------------------------------------------------------------------------------
2023-10-16 14:32:16,968 epoch 5 - iter 178/1786 - loss 0.03588763 - time (sec): 8.83 - samples/sec: 2627.03 - lr: 0.000033 - momentum: 0.000000
2023-10-16 14:32:25,763 epoch 5 - iter 356/1786 - loss 0.04421636 - time (sec): 17.62 - samples/sec: 2862.46 - lr: 0.000032 - momentum: 0.000000
2023-10-16 14:32:34,456 epoch 5 - iter 534/1786 - loss 0.04709247 - time (sec): 26.31 - samples/sec: 2875.81 - lr: 0.000032 - momentum: 0.000000
2023-10-16 14:32:43,150 epoch 5 - iter 712/1786 - loss 0.04802956 - time (sec): 35.01 - samples/sec: 2881.20 - lr: 0.000031 - momentum: 0.000000
2023-10-16 14:32:51,542 epoch 5 - iter 890/1786 - loss 0.04796144 - time (sec): 43.40 - samples/sec: 2841.11 - lr: 0.000031 - momentum: 0.000000
2023-10-16 14:33:00,455 epoch 5 - iter 1068/1786 - loss 0.04753006 - time (sec): 52.31 - samples/sec: 2868.11 - lr: 0.000030 - momentum: 0.000000
2023-10-16 14:33:09,033 epoch 5 - iter 1246/1786 - loss 0.04905045 - time (sec): 60.89 - samples/sec: 2863.09 - lr: 0.000029 - momentum: 0.000000
2023-10-16 14:33:17,625 epoch 5 - iter 1424/1786 - loss 0.04974146 - time (sec): 69.48 - samples/sec: 2854.21 - lr: 0.000029 - momentum: 0.000000
2023-10-16 14:33:26,438 epoch 5 - iter 1602/1786 - loss 0.04973847 - time (sec): 78.30 - samples/sec: 2845.43 - lr: 0.000028 - momentum: 0.000000
2023-10-16 14:33:35,208 epoch 5 - iter 1780/1786 - loss 0.04912550 - time (sec): 87.07 - samples/sec: 2851.07 - lr: 0.000028 - momentum: 0.000000
2023-10-16 14:33:35,495 ----------------------------------------------------------------------------------------------------
2023-10-16 14:33:35,495 EPOCH 5 done: loss 0.0492 - lr: 0.000028
2023-10-16 14:33:40,088 DEV : loss 0.1638714075088501 - f1-score (micro avg)  0.786
2023-10-16 14:33:40,104 saving best model
2023-10-16 14:33:40,563 ----------------------------------------------------------------------------------------------------
2023-10-16 14:33:49,049 epoch 6 - iter 178/1786 - loss 0.04047784 - time (sec): 8.48 - samples/sec: 2804.18 - lr: 0.000027 - momentum: 0.000000
2023-10-16 14:33:57,790 epoch 6 - iter 356/1786 - loss 0.03466111 - time (sec): 17.22 - samples/sec: 2894.43 - lr: 0.000027 - momentum: 0.000000
2023-10-16 14:34:06,507 epoch 6 - iter 534/1786 - loss 0.03645853 - time (sec): 25.94 - samples/sec: 2876.13 - lr: 0.000026 - momentum: 0.000000
2023-10-16 14:34:15,323 epoch 6 - iter 712/1786 - loss 0.03843536 - time (sec): 34.76 - samples/sec: 2876.10 - lr: 0.000026 - momentum: 0.000000
2023-10-16 14:34:23,863 epoch 6 - iter 890/1786 - loss 0.04112112 - time (sec): 43.30 - samples/sec: 2866.24 - lr: 0.000025 - momentum: 0.000000
2023-10-16 14:34:32,461 epoch 6 - iter 1068/1786 - loss 0.04047041 - time (sec): 51.89 - samples/sec: 2912.93 - lr: 0.000024 - momentum: 0.000000
2023-10-16 14:34:40,749 epoch 6 - iter 1246/1786 - loss 0.04134837 - time (sec): 60.18 - samples/sec: 2923.91 - lr: 0.000024 - momentum: 0.000000
2023-10-16 14:34:49,172 epoch 6 - iter 1424/1786 - loss 0.04218921 - time (sec): 68.61 - samples/sec: 2929.62 - lr: 0.000023 - momentum: 0.000000
2023-10-16 14:34:57,673 epoch 6 - iter 1602/1786 - loss 0.04256479 - time (sec): 77.11 - samples/sec: 2907.31 - lr: 0.000023 - momentum: 0.000000
2023-10-16 14:35:06,362 epoch 6 - iter 1780/1786 - loss 0.04276236 - time (sec): 85.80 - samples/sec: 2893.03 - lr: 0.000022 - momentum: 0.000000
2023-10-16 14:35:06,640 ----------------------------------------------------------------------------------------------------
2023-10-16 14:35:06,640 EPOCH 6 done: loss 0.0429 - lr: 0.000022
2023-10-16 14:35:11,228 DEV : loss 0.16154661774635315 - f1-score (micro avg)  0.7973
2023-10-16 14:35:11,244 saving best model
2023-10-16 14:35:11,739 ----------------------------------------------------------------------------------------------------
2023-10-16 14:35:20,356 epoch 7 - iter 178/1786 - loss 0.03094916 - time (sec): 8.61 - samples/sec: 2725.79 - lr: 0.000022 - momentum: 0.000000
2023-10-16 14:35:28,854 epoch 7 - iter 356/1786 - loss 0.03031893 - time (sec): 17.11 - samples/sec: 2747.67 - lr: 0.000021 - momentum: 0.000000
2023-10-16 14:35:37,566 epoch 7 - iter 534/1786 - loss 0.03010387 - time (sec): 25.82 - samples/sec: 2799.25 - lr: 0.000021 - momentum: 0.000000
2023-10-16 14:35:46,181 epoch 7 - iter 712/1786 - loss 0.03065030 - time (sec): 34.44 - samples/sec: 2835.81 - lr: 0.000020 - momentum: 0.000000
2023-10-16 14:35:54,743 epoch 7 - iter 890/1786 - loss 0.02853794 - time (sec): 43.00 - samples/sec: 2820.94 - lr: 0.000019 - momentum: 0.000000
2023-10-16 14:36:03,394 epoch 7 - iter 1068/1786 - loss 0.02784728 - time (sec): 51.65 - samples/sec: 2829.24 - lr: 0.000019 - momentum: 0.000000
2023-10-16 14:36:12,171 epoch 7 - iter 1246/1786 - loss 0.02834536 - time (sec): 60.43 - samples/sec: 2854.33 - lr: 0.000018 - momentum: 0.000000
2023-10-16 14:36:21,073 epoch 7 - iter 1424/1786 - loss 0.02869632 - time (sec): 69.33 - samples/sec: 2875.27 - lr: 0.000018 - momentum: 0.000000
2023-10-16 14:36:29,562 epoch 7 - iter 1602/1786 - loss 0.02920281 - time (sec): 77.82 - samples/sec: 2867.77 - lr: 0.000017 - momentum: 0.000000
2023-10-16 14:36:38,162 epoch 7 - iter 1780/1786 - loss 0.03013120 - time (sec): 86.42 - samples/sec: 2870.16 - lr: 0.000017 - momentum: 0.000000
2023-10-16 14:36:38,448 ----------------------------------------------------------------------------------------------------
2023-10-16 14:36:38,449 EPOCH 7 done: loss 0.0301 - lr: 0.000017
2023-10-16 14:36:42,497 DEV : loss 0.18691599369049072 - f1-score (micro avg)  0.8076
2023-10-16 14:36:42,513 saving best model
2023-10-16 14:36:42,974 ----------------------------------------------------------------------------------------------------
2023-10-16 14:36:51,660 epoch 8 - iter 178/1786 - loss 0.02062719 - time (sec): 8.68 - samples/sec: 2748.88 - lr: 0.000016 - momentum: 0.000000
2023-10-16 14:37:00,179 epoch 8 - iter 356/1786 - loss 0.02050486 - time (sec): 17.20 - samples/sec: 2833.37 - lr: 0.000016 - momentum: 0.000000
2023-10-16 14:37:08,847 epoch 8 - iter 534/1786 - loss 0.02086256 - time (sec): 25.87 - samples/sec: 2853.42 - lr: 0.000015 - momentum: 0.000000
2023-10-16 14:37:17,490 epoch 8 - iter 712/1786 - loss 0.02131782 - time (sec): 34.51 - samples/sec: 2873.72 - lr: 0.000014 - momentum: 0.000000
2023-10-16 14:37:25,774 epoch 8 - iter 890/1786 - loss 0.02156856 - time (sec): 42.80 - samples/sec: 2852.53 - lr: 0.000014 - momentum: 0.000000
2023-10-16 14:37:34,448 epoch 8 - iter 1068/1786 - loss 0.02181934 - time (sec): 51.47 - samples/sec: 2841.35 - lr: 0.000013 - momentum: 0.000000
2023-10-16 14:37:43,284 epoch 8 - iter 1246/1786 - loss 0.02187373 - time (sec): 60.31 - samples/sec: 2873.93 - lr: 0.000013 - momentum: 0.000000
2023-10-16 14:37:52,258 epoch 8 - iter 1424/1786 - loss 0.02158534 - time (sec): 69.28 - samples/sec: 2876.68 - lr: 0.000012 - momentum: 0.000000
2023-10-16 14:38:00,408 epoch 8 - iter 1602/1786 - loss 0.02156756 - time (sec): 77.43 - samples/sec: 2868.83 - lr: 0.000012 - momentum: 0.000000
2023-10-16 14:38:08,929 epoch 8 - iter 1780/1786 - loss 0.02125784 - time (sec): 85.95 - samples/sec: 2880.58 - lr: 0.000011 - momentum: 0.000000
2023-10-16 14:38:09,269 ----------------------------------------------------------------------------------------------------
2023-10-16 14:38:09,269 EPOCH 8 done: loss 0.0212 - lr: 0.000011
2023-10-16 14:38:13,927 DEV : loss 0.19830918312072754 - f1-score (micro avg)  0.8
2023-10-16 14:38:13,943 ----------------------------------------------------------------------------------------------------
2023-10-16 14:38:22,614 epoch 9 - iter 178/1786 - loss 0.01444712 - time (sec): 8.67 - samples/sec: 2747.32 - lr: 0.000011 - momentum: 0.000000
2023-10-16 14:38:31,308 epoch 9 - iter 356/1786 - loss 0.01386542 - time (sec): 17.36 - samples/sec: 2790.61 - lr: 0.000010 - momentum: 0.000000
2023-10-16 14:38:39,923 epoch 9 - iter 534/1786 - loss 0.01456268 - time (sec): 25.98 - samples/sec: 2781.99 - lr: 0.000009 - momentum: 0.000000
2023-10-16 14:38:48,588 epoch 9 - iter 712/1786 - loss 0.01617640 - time (sec): 34.64 - samples/sec: 2799.31 - lr: 0.000009 - momentum: 0.000000
2023-10-16 14:38:57,041 epoch 9 - iter 890/1786 - loss 0.01623150 - time (sec): 43.10 - samples/sec: 2795.59 - lr: 0.000008 - momentum: 0.000000
2023-10-16 14:39:05,722 epoch 9 - iter 1068/1786 - loss 0.01612223 - time (sec): 51.78 - samples/sec: 2819.35 - lr: 0.000008 - momentum: 0.000000
2023-10-16 14:39:14,680 epoch 9 - iter 1246/1786 - loss 0.01568525 - time (sec): 60.74 - samples/sec: 2811.42 - lr: 0.000007 - momentum: 0.000000
2023-10-16 14:39:23,598 epoch 9 - iter 1424/1786 - loss 0.01603606 - time (sec): 69.65 - samples/sec: 2832.39 - lr: 0.000007 - momentum: 0.000000
2023-10-16 14:39:32,277 epoch 9 - iter 1602/1786 - loss 0.01544125 - time (sec): 78.33 - samples/sec: 2846.53 - lr: 0.000006 - momentum: 0.000000
2023-10-16 14:39:40,960 epoch 9 - iter 1780/1786 - loss 0.01507366 - time (sec): 87.02 - samples/sec: 2849.40 - lr: 0.000006 - momentum: 0.000000
2023-10-16 14:39:41,259 ----------------------------------------------------------------------------------------------------
2023-10-16 14:39:41,260 EPOCH 9 done: loss 0.0151 - lr: 0.000006
2023-10-16 14:39:45,962 DEV : loss 0.20294949412345886 - f1-score (micro avg)  0.8003
2023-10-16 14:39:45,978 ----------------------------------------------------------------------------------------------------
2023-10-16 14:39:54,760 epoch 10 - iter 178/1786 - loss 0.01198823 - time (sec): 8.78 - samples/sec: 2808.38 - lr: 0.000005 - momentum: 0.000000
2023-10-16 14:40:03,269 epoch 10 - iter 356/1786 - loss 0.01082116 - time (sec): 17.29 - samples/sec: 2772.31 - lr: 0.000004 - momentum: 0.000000
2023-10-16 14:40:11,969 epoch 10 - iter 534/1786 - loss 0.00989378 - time (sec): 25.99 - samples/sec: 2800.29 - lr: 0.000004 - momentum: 0.000000
2023-10-16 14:40:20,848 epoch 10 - iter 712/1786 - loss 0.01074502 - time (sec): 34.87 - samples/sec: 2810.52 - lr: 0.000003 - momentum: 0.000000
2023-10-16 14:40:29,477 epoch 10 - iter 890/1786 - loss 0.00999905 - time (sec): 43.50 - samples/sec: 2816.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 14:40:38,192 epoch 10 - iter 1068/1786 - loss 0.00997578 - time (sec): 52.21 - samples/sec: 2819.62 - lr: 0.000002 - momentum: 0.000000
2023-10-16 14:40:46,875 epoch 10 - iter 1246/1786 - loss 0.00978065 - time (sec): 60.90 - samples/sec: 2821.42 - lr: 0.000002 - momentum: 0.000000
2023-10-16 14:40:55,564 epoch 10 - iter 1424/1786 - loss 0.00992214 - time (sec): 69.58 - samples/sec: 2829.76 - lr: 0.000001 - momentum: 0.000000
2023-10-16 14:41:04,111 epoch 10 - iter 1602/1786 - loss 0.01021399 - time (sec): 78.13 - samples/sec: 2835.76 - lr: 0.000001 - momentum: 0.000000
2023-10-16 14:41:12,876 epoch 10 - iter 1780/1786 - loss 0.01056216 - time (sec): 86.90 - samples/sec: 2851.53 - lr: 0.000000 - momentum: 0.000000
2023-10-16 14:41:13,167 ----------------------------------------------------------------------------------------------------
2023-10-16 14:41:13,167 EPOCH 10 done: loss 0.0105 - lr: 0.000000
2023-10-16 14:41:17,252 DEV : loss 0.20407141745090485 - f1-score (micro avg)  0.8005
2023-10-16 14:41:17,631 ----------------------------------------------------------------------------------------------------
2023-10-16 14:41:17,632 Loading model from best epoch ...
2023-10-16 14:41:19,064 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
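The 17-tag dictionary logged above is the BIOES encoding of the corpus's four entity types: each type gets S(ingle)/B(egin)/E(nd)/I(nside) variants, plus the shared O (outside) tag, which also explains the tagger's `out_features=17` linear head. A small illustrative check (the list below simply mirrors the logged order):

```python
# BIOES scheme: S/B/E/I per entity type, plus the single O tag.
entity_types = ["PER", "LOC", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 17, matching the linear layer's output dimension
print(tags[:5])   # ['O', 'S-PER', 'B-PER', 'E-PER', 'I-PER']
```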
2023-10-16 14:41:28,416
Results:
- F-score (micro) 0.6717
- F-score (macro) 0.5958
- Accuracy 0.5213

By class:
              precision    recall  f1-score   support

         LOC     0.6643    0.6721    0.6682      1095
         PER     0.7561    0.7292    0.7425      1012
         ORG     0.5041    0.5182    0.5110       357
   HumanProd     0.4000    0.5455    0.4615        33

   micro avg     0.6719    0.6716    0.6717      2497
   macro avg     0.5811    0.6163    0.5958      2497
weighted avg     0.6751    0.6716    0.6731      2497

2023-10-16 14:41:28,416 ----------------------------------------------------------------------------------------------------
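As a sanity check on the table above, the micro-averaged F-score is the harmonic mean of the micro precision and recall:

```python
# Micro-average values from the final test evaluation above.
precision, recall = 0.6719, 0.6716
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.6717, matching the logged F-score (micro)
```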