|
2023-10-16 18:23:52,773 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,774 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-16 18:23:52,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,774 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-16 18:23:52,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,774 Train: 1166 sentences |
|
2023-10-16 18:23:52,774 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 18:23:52,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,774 Training Params: |
|
2023-10-16 18:23:52,774 - learning_rate: "5e-05" |
|
2023-10-16 18:23:52,774 - mini_batch_size: "4" |
|
2023-10-16 18:23:52,774 - max_epochs: "10" |
|
2023-10-16 18:23:52,774 - shuffle: "True" |
|
2023-10-16 18:23:52,774 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,774 Plugins: |
|
2023-10-16 18:23:52,775 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 18:23:52,775 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,775 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 18:23:52,775 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 18:23:52,775 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,775 Computation: |
|
2023-10-16 18:23:52,775 - compute on device: cuda:0 |
|
2023-10-16 18:23:52,775 - embedding storage: none |
|
2023-10-16 18:23:52,775 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,775 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-16 18:23:52,775 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:52,775 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:23:54,355 epoch 1 - iter 29/292 - loss 2.82861770 - time (sec): 1.58 - samples/sec: 2569.51 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:23:55,895 epoch 1 - iter 58/292 - loss 2.27049059 - time (sec): 3.12 - samples/sec: 2418.01 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:23:57,801 epoch 1 - iter 87/292 - loss 1.50862869 - time (sec): 5.02 - samples/sec: 2581.74 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:23:59,391 epoch 1 - iter 116/292 - loss 1.29469874 - time (sec): 6.62 - samples/sec: 2575.35 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:24:00,959 epoch 1 - iter 145/292 - loss 1.16113362 - time (sec): 8.18 - samples/sec: 2577.50 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:24:02,460 epoch 1 - iter 174/292 - loss 1.04202470 - time (sec): 9.68 - samples/sec: 2554.20 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:24:04,304 epoch 1 - iter 203/292 - loss 0.92592921 - time (sec): 11.53 - samples/sec: 2608.57 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:24:06,039 epoch 1 - iter 232/292 - loss 0.82865037 - time (sec): 13.26 - samples/sec: 2633.60 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:24:07,694 epoch 1 - iter 261/292 - loss 0.76127572 - time (sec): 14.92 - samples/sec: 2634.58 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:24:09,386 epoch 1 - iter 290/292 - loss 0.70436125 - time (sec): 16.61 - samples/sec: 2661.85 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:24:09,480 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:24:09,480 EPOCH 1 done: loss 0.7015 - lr: 0.000049 |
|
2023-10-16 18:24:10,491 DEV : loss 0.198698028922081 - f1-score (micro avg) 0.4183 |
|
2023-10-16 18:24:10,496 saving best model |
|
2023-10-16 18:24:10,894 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:24:12,579 epoch 2 - iter 29/292 - loss 0.21870353 - time (sec): 1.68 - samples/sec: 2890.96 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:24:14,360 epoch 2 - iter 58/292 - loss 0.19710288 - time (sec): 3.46 - samples/sec: 2792.34 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:24:15,863 epoch 2 - iter 87/292 - loss 0.20636583 - time (sec): 4.97 - samples/sec: 2748.51 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 18:24:17,463 epoch 2 - iter 116/292 - loss 0.19745140 - time (sec): 6.57 - samples/sec: 2680.86 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 18:24:19,192 epoch 2 - iter 145/292 - loss 0.19457600 - time (sec): 8.30 - samples/sec: 2670.91 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 18:24:20,873 epoch 2 - iter 174/292 - loss 0.19655143 - time (sec): 9.98 - samples/sec: 2701.15 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 18:24:22,548 epoch 2 - iter 203/292 - loss 0.19279805 - time (sec): 11.65 - samples/sec: 2695.02 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 18:24:24,046 epoch 2 - iter 232/292 - loss 0.18956283 - time (sec): 13.15 - samples/sec: 2679.43 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 18:24:25,691 epoch 2 - iter 261/292 - loss 0.18794363 - time (sec): 14.80 - samples/sec: 2702.80 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:24:27,392 epoch 2 - iter 290/292 - loss 0.18131152 - time (sec): 16.50 - samples/sec: 2688.04 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:24:27,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:24:27,495 EPOCH 2 done: loss 0.1809 - lr: 0.000045 |
|
2023-10-16 18:24:28,829 DEV : loss 0.12991848587989807 - f1-score (micro avg) 0.6681 |
|
2023-10-16 18:24:28,836 saving best model |
|
2023-10-16 18:24:29,353 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:24:31,427 epoch 3 - iter 29/292 - loss 0.14634692 - time (sec): 2.07 - samples/sec: 2612.34 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 18:24:33,032 epoch 3 - iter 58/292 - loss 0.15575082 - time (sec): 3.68 - samples/sec: 2612.36 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 18:24:34,896 epoch 3 - iter 87/292 - loss 0.14045427 - time (sec): 5.54 - samples/sec: 2653.09 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 18:24:36,530 epoch 3 - iter 116/292 - loss 0.12675597 - time (sec): 7.17 - samples/sec: 2688.94 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 18:24:38,274 epoch 3 - iter 145/292 - loss 0.12307056 - time (sec): 8.92 - samples/sec: 2729.57 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 18:24:39,809 epoch 3 - iter 174/292 - loss 0.12034832 - time (sec): 10.45 - samples/sec: 2692.55 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 18:24:41,378 epoch 3 - iter 203/292 - loss 0.11494198 - time (sec): 12.02 - samples/sec: 2652.33 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 18:24:42,914 epoch 3 - iter 232/292 - loss 0.11259923 - time (sec): 13.56 - samples/sec: 2679.55 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:24:44,531 epoch 3 - iter 261/292 - loss 0.11038783 - time (sec): 15.18 - samples/sec: 2660.06 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:24:46,091 epoch 3 - iter 290/292 - loss 0.10544311 - time (sec): 16.74 - samples/sec: 2644.00 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 18:24:46,179 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:24:46,180 EPOCH 3 done: loss 0.1058 - lr: 0.000039 |
|
2023-10-16 18:24:47,660 DEV : loss 0.14769691228866577 - f1-score (micro avg) 0.6436 |
|
2023-10-16 18:24:47,665 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:24:49,228 epoch 4 - iter 29/292 - loss 0.07509580 - time (sec): 1.56 - samples/sec: 2524.12 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 18:24:50,823 epoch 4 - iter 58/292 - loss 0.07886125 - time (sec): 3.16 - samples/sec: 2606.97 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 18:24:52,434 epoch 4 - iter 87/292 - loss 0.08386891 - time (sec): 4.77 - samples/sec: 2589.31 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 18:24:54,155 epoch 4 - iter 116/292 - loss 0.07402682 - time (sec): 6.49 - samples/sec: 2590.61 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 18:24:55,743 epoch 4 - iter 145/292 - loss 0.07089413 - time (sec): 8.08 - samples/sec: 2597.15 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 18:24:57,409 epoch 4 - iter 174/292 - loss 0.07525889 - time (sec): 9.74 - samples/sec: 2623.03 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 18:24:59,095 epoch 4 - iter 203/292 - loss 0.07936455 - time (sec): 11.43 - samples/sec: 2579.90 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:25:00,743 epoch 4 - iter 232/292 - loss 0.08057674 - time (sec): 13.08 - samples/sec: 2607.69 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:25:02,362 epoch 4 - iter 261/292 - loss 0.07932140 - time (sec): 14.70 - samples/sec: 2619.95 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 18:25:04,245 epoch 4 - iter 290/292 - loss 0.07381505 - time (sec): 16.58 - samples/sec: 2668.67 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 18:25:04,334 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:04,334 EPOCH 4 done: loss 0.0737 - lr: 0.000033 |
|
2023-10-16 18:25:05,677 DEV : loss 0.12834103405475616 - f1-score (micro avg) 0.7359 |
|
2023-10-16 18:25:05,683 saving best model |
|
2023-10-16 18:25:06,300 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:08,081 epoch 5 - iter 29/292 - loss 0.07177539 - time (sec): 1.78 - samples/sec: 2425.89 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 18:25:09,758 epoch 5 - iter 58/292 - loss 0.06584511 - time (sec): 3.46 - samples/sec: 2491.49 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 18:25:11,555 epoch 5 - iter 87/292 - loss 0.05710451 - time (sec): 5.25 - samples/sec: 2522.65 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 18:25:13,217 epoch 5 - iter 116/292 - loss 0.05716308 - time (sec): 6.92 - samples/sec: 2475.43 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 18:25:14,897 epoch 5 - iter 145/292 - loss 0.05603785 - time (sec): 8.59 - samples/sec: 2511.36 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 18:25:16,552 epoch 5 - iter 174/292 - loss 0.05766367 - time (sec): 10.25 - samples/sec: 2515.05 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:25:18,212 epoch 5 - iter 203/292 - loss 0.05955690 - time (sec): 11.91 - samples/sec: 2543.13 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:25:19,868 epoch 5 - iter 232/292 - loss 0.05935010 - time (sec): 13.57 - samples/sec: 2596.39 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:25:21,531 epoch 5 - iter 261/292 - loss 0.05678007 - time (sec): 15.23 - samples/sec: 2597.33 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:25:23,191 epoch 5 - iter 290/292 - loss 0.05537101 - time (sec): 16.89 - samples/sec: 2620.66 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:25:23,284 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:23,284 EPOCH 5 done: loss 0.0552 - lr: 0.000028 |
|
2023-10-16 18:25:24,543 DEV : loss 0.14645273983478546 - f1-score (micro avg) 0.7352 |
|
2023-10-16 18:25:24,548 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:26,308 epoch 6 - iter 29/292 - loss 0.04452680 - time (sec): 1.76 - samples/sec: 2936.92 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:25:27,791 epoch 6 - iter 58/292 - loss 0.03953523 - time (sec): 3.24 - samples/sec: 2783.92 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:25:29,353 epoch 6 - iter 87/292 - loss 0.03594242 - time (sec): 4.80 - samples/sec: 2699.21 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:25:31,087 epoch 6 - iter 116/292 - loss 0.03355498 - time (sec): 6.54 - samples/sec: 2682.21 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:25:32,641 epoch 6 - iter 145/292 - loss 0.03011679 - time (sec): 8.09 - samples/sec: 2752.84 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:25:34,188 epoch 6 - iter 174/292 - loss 0.03277362 - time (sec): 9.64 - samples/sec: 2715.24 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:25:35,821 epoch 6 - iter 203/292 - loss 0.03272890 - time (sec): 11.27 - samples/sec: 2703.01 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:25:37,583 epoch 6 - iter 232/292 - loss 0.03614478 - time (sec): 13.03 - samples/sec: 2692.34 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:25:39,388 epoch 6 - iter 261/292 - loss 0.04456812 - time (sec): 14.84 - samples/sec: 2713.55 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:25:41,070 epoch 6 - iter 290/292 - loss 0.04463049 - time (sec): 16.52 - samples/sec: 2677.04 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:25:41,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:41,157 EPOCH 6 done: loss 0.0444 - lr: 0.000022 |
|
2023-10-16 18:25:42,447 DEV : loss 0.14799444377422333 - f1-score (micro avg) 0.7484 |
|
2023-10-16 18:25:42,452 saving best model |
|
2023-10-16 18:25:42,958 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:44,786 epoch 7 - iter 29/292 - loss 0.03561790 - time (sec): 1.83 - samples/sec: 3036.38 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:25:46,453 epoch 7 - iter 58/292 - loss 0.02551739 - time (sec): 3.49 - samples/sec: 2816.66 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:25:48,083 epoch 7 - iter 87/292 - loss 0.02590902 - time (sec): 5.12 - samples/sec: 2751.37 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:25:49,738 epoch 7 - iter 116/292 - loss 0.02747668 - time (sec): 6.78 - samples/sec: 2682.94 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:25:51,399 epoch 7 - iter 145/292 - loss 0.03466284 - time (sec): 8.44 - samples/sec: 2669.94 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:25:53,181 epoch 7 - iter 174/292 - loss 0.03184043 - time (sec): 10.22 - samples/sec: 2664.73 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 18:25:54,741 epoch 7 - iter 203/292 - loss 0.02941846 - time (sec): 11.78 - samples/sec: 2660.51 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:25:56,439 epoch 7 - iter 232/292 - loss 0.02738168 - time (sec): 13.48 - samples/sec: 2675.12 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:25:58,056 epoch 7 - iter 261/292 - loss 0.03070679 - time (sec): 15.10 - samples/sec: 2674.75 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:25:59,711 epoch 7 - iter 290/292 - loss 0.03072038 - time (sec): 16.75 - samples/sec: 2647.79 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:25:59,802 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:25:59,802 EPOCH 7 done: loss 0.0310 - lr: 0.000017 |
|
2023-10-16 18:26:01,110 DEV : loss 0.19859679043293 - f1-score (micro avg) 0.7 |
|
2023-10-16 18:26:01,122 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:02,791 epoch 8 - iter 29/292 - loss 0.02307857 - time (sec): 1.67 - samples/sec: 2596.11 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:26:04,483 epoch 8 - iter 58/292 - loss 0.01807390 - time (sec): 3.36 - samples/sec: 2719.87 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:26:06,451 epoch 8 - iter 87/292 - loss 0.02050120 - time (sec): 5.33 - samples/sec: 2515.58 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:26:08,050 epoch 8 - iter 116/292 - loss 0.01843879 - time (sec): 6.93 - samples/sec: 2494.41 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:26:09,701 epoch 8 - iter 145/292 - loss 0.02220405 - time (sec): 8.58 - samples/sec: 2566.57 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:26:11,313 epoch 8 - iter 174/292 - loss 0.01973760 - time (sec): 10.19 - samples/sec: 2591.01 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:26:13,004 epoch 8 - iter 203/292 - loss 0.02021888 - time (sec): 11.88 - samples/sec: 2585.02 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:26:14,724 epoch 8 - iter 232/292 - loss 0.02196907 - time (sec): 13.60 - samples/sec: 2614.75 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:26:16,202 epoch 8 - iter 261/292 - loss 0.02142726 - time (sec): 15.08 - samples/sec: 2600.76 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:26:17,917 epoch 8 - iter 290/292 - loss 0.02133524 - time (sec): 16.79 - samples/sec: 2629.43 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:26:18,016 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:18,017 EPOCH 8 done: loss 0.0217 - lr: 0.000011 |
|
2023-10-16 18:26:19,265 DEV : loss 0.18336039781570435 - f1-score (micro avg) 0.7265 |
|
2023-10-16 18:26:19,269 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:20,781 epoch 9 - iter 29/292 - loss 0.00973806 - time (sec): 1.51 - samples/sec: 2912.72 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:26:22,671 epoch 9 - iter 58/292 - loss 0.01556620 - time (sec): 3.40 - samples/sec: 2708.61 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:26:24,279 epoch 9 - iter 87/292 - loss 0.01591292 - time (sec): 5.01 - samples/sec: 2680.62 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:26:25,971 epoch 9 - iter 116/292 - loss 0.01447470 - time (sec): 6.70 - samples/sec: 2738.05 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:26:27,665 epoch 9 - iter 145/292 - loss 0.01441480 - time (sec): 8.39 - samples/sec: 2754.42 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:26:29,466 epoch 9 - iter 174/292 - loss 0.01538477 - time (sec): 10.20 - samples/sec: 2754.27 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:26:31,091 epoch 9 - iter 203/292 - loss 0.01607027 - time (sec): 11.82 - samples/sec: 2727.62 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:26:32,659 epoch 9 - iter 232/292 - loss 0.01513598 - time (sec): 13.39 - samples/sec: 2699.68 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:26:34,245 epoch 9 - iter 261/292 - loss 0.01547309 - time (sec): 14.97 - samples/sec: 2677.88 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:26:35,816 epoch 9 - iter 290/292 - loss 0.01443794 - time (sec): 16.55 - samples/sec: 2666.50 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:26:35,924 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:35,924 EPOCH 9 done: loss 0.0148 - lr: 0.000006 |
|
2023-10-16 18:26:37,193 DEV : loss 0.18265648186206818 - f1-score (micro avg) 0.7039 |
|
2023-10-16 18:26:37,198 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:38,851 epoch 10 - iter 29/292 - loss 0.00403559 - time (sec): 1.65 - samples/sec: 2836.55 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:26:40,560 epoch 10 - iter 58/292 - loss 0.00536696 - time (sec): 3.36 - samples/sec: 2893.01 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:26:42,186 epoch 10 - iter 87/292 - loss 0.01069592 - time (sec): 4.99 - samples/sec: 2799.15 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 18:26:43,781 epoch 10 - iter 116/292 - loss 0.00956979 - time (sec): 6.58 - samples/sec: 2751.92 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:26:45,406 epoch 10 - iter 145/292 - loss 0.01094773 - time (sec): 8.21 - samples/sec: 2733.42 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:26:47,151 epoch 10 - iter 174/292 - loss 0.01223287 - time (sec): 9.95 - samples/sec: 2766.65 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:26:48,801 epoch 10 - iter 203/292 - loss 0.01118601 - time (sec): 11.60 - samples/sec: 2747.13 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:26:50,388 epoch 10 - iter 232/292 - loss 0.01079535 - time (sec): 13.19 - samples/sec: 2709.71 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:26:52,076 epoch 10 - iter 261/292 - loss 0.01043899 - time (sec): 14.88 - samples/sec: 2713.94 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:26:53,572 epoch 10 - iter 290/292 - loss 0.01121750 - time (sec): 16.37 - samples/sec: 2700.50 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 18:26:53,665 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:53,665 EPOCH 10 done: loss 0.0112 - lr: 0.000000 |
|
2023-10-16 18:26:54,953 DEV : loss 0.1769292950630188 - f1-score (micro avg) 0.7277 |
|
2023-10-16 18:26:55,295 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:26:55,296 Loading model from best epoch ... |
|
2023-10-16 18:26:56,968 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-16 18:26:59,639
Results:
- F-score (micro) 0.7629
- F-score (macro) 0.6925
- Accuracy 0.639

By class:
              precision    recall  f1-score   support

         PER     0.7989    0.8448    0.8212       348
         LOC     0.6804    0.8238    0.7452       261
         ORG     0.4651    0.3846    0.4211        52
   HumanProd     0.7500    0.8182    0.7826        22

   micro avg     0.7284    0.8009    0.7629       683
   macro avg     0.6736    0.7178    0.6925       683
weighted avg     0.7266    0.8009    0.7605       683
|
2023-10-16 18:26:59,639 ---------------------------------------------------------------------------------------------------- |
|
|
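The summary rows of the final test report can be cross-checked against the per-class rows. The following is a minimal sketch in plain Python, not part of the Flair output. It assumes the usual definitions: macro F1 is the unweighted mean of the per-class F1 scores, weighted F1 is their support-weighted mean, and micro F1 is the harmonic mean of the micro-averaged precision and recall. The variable names are illustrative, and the numbers are copied from the "By class" report above.

```python
# Per-class rows copied from the "By class" report:
# label -> (precision, recall, f1, support)
per_class = {
    "PER": (0.7989, 0.8448, 0.8212, 348),
    "LOC": (0.6804, 0.8238, 0.7452, 261),
    "ORG": (0.4651, 0.3846, 0.4211, 52),
    "HumanProd": (0.7500, 0.8182, 0.7826, 22),
}

# Precision and recall from the "micro avg" row of the same report.
micro_precision, micro_recall = 0.7284, 0.8009


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 scores weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1: derived from the micro-averaged precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values can only be expected to match the logged ones up to rounding.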