2023-10-16 18:20:17,455 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Train: 1166 sentences
2023-10-16 18:20:17,456 (train_with_dev=False, train_with_test=False)
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Training Params:
2023-10-16 18:20:17,456 - learning_rate: "3e-05"
2023-10-16 18:20:17,456 - mini_batch_size: "4"
2023-10-16 18:20:17,456 - max_epochs: "10"
2023-10-16 18:20:17,456 - shuffle: "True"
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Plugins:
2023-10-16 18:20:17,456 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
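The LinearScheduler plugin warms the learning rate up over the first 10% of optimizer steps and then decays it linearly to zero. With 1166 training sentences and mini_batch_size 4 there are ceil(1166/4) = 292 steps per epoch, i.e. 2920 steps over 10 epochs and a 292-step warmup. A minimal sketch of that schedule (the function below is illustrative, not Flair's actual scheduler implementation):

```python
import math

def linear_warmup_lr(step, peak_lr=3e-05, total_steps=2920, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 292 warmup steps
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

steps_per_epoch = math.ceil(1166 / 4)  # 292, matching "iter .../292" below
lr_early = linear_warmup_lr(29)        # ≈ 3e-06, logged as "lr: 0.000003" at epoch 1, iter 29
lr_peak = linear_warmup_lr(292)        # ≈ 3e-05, the full rate once warmup ends
lr_final = linear_warmup_lr(2920)      # 0.0, logged as "lr: 0.000000" at the end of epoch 10
```

This explains why the logged lr climbs through epoch 1 and then falls steadily from 0.000030 to 0.000000.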
|
2023-10-16 18:20:17,456 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:20:17,456 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:20:17,456 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,456 Computation:
2023-10-16 18:20:17,456 - compute on device: cuda:0
2023-10-16 18:20:17,456 - embedding storage: none
2023-10-16 18:20:17,457 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,457 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 18:20:17,457 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:17,457 ----------------------------------------------------------------------------------------------------
|
2023-10-16 18:20:19,012 epoch 1 - iter 29/292 - loss 2.89452854 - time (sec): 1.55 - samples/sec: 2610.85 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:20:20,543 epoch 1 - iter 58/292 - loss 2.56030484 - time (sec): 3.09 - samples/sec: 2444.98 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:20:22,417 epoch 1 - iter 87/292 - loss 1.72394676 - time (sec): 4.96 - samples/sec: 2616.09 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:20:23,974 epoch 1 - iter 116/292 - loss 1.47947989 - time (sec): 6.52 - samples/sec: 2614.50 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:20:25,566 epoch 1 - iter 145/292 - loss 1.33553521 - time (sec): 8.11 - samples/sec: 2601.11 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:20:27,124 epoch 1 - iter 174/292 - loss 1.19681158 - time (sec): 9.67 - samples/sec: 2559.05 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:20:28,991 epoch 1 - iter 203/292 - loss 1.05794959 - time (sec): 11.53 - samples/sec: 2607.47 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:20:30,683 epoch 1 - iter 232/292 - loss 0.94552175 - time (sec): 13.23 - samples/sec: 2641.20 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:20:32,284 epoch 1 - iter 261/292 - loss 0.86717661 - time (sec): 14.83 - samples/sec: 2650.81 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:33,975 epoch 1 - iter 290/292 - loss 0.79892846 - time (sec): 16.52 - samples/sec: 2676.74 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:20:34,068 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:34,068 EPOCH 1 done: loss 0.7958 - lr: 0.000030
2023-10-16 18:20:35,329 DEV : loss 0.18650421500205994 - f1-score (micro avg) 0.5083
2023-10-16 18:20:35,336 saving best model
2023-10-16 18:20:35,710 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:37,446 epoch 2 - iter 29/292 - loss 0.22066234 - time (sec): 1.73 - samples/sec: 2805.89 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:20:39,219 epoch 2 - iter 58/292 - loss 0.20180903 - time (sec): 3.51 - samples/sec: 2757.79 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:20:40,760 epoch 2 - iter 87/292 - loss 0.21524233 - time (sec): 5.05 - samples/sec: 2703.85 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:20:42,386 epoch 2 - iter 116/292 - loss 0.20437921 - time (sec): 6.67 - samples/sec: 2637.60 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:20:44,125 epoch 2 - iter 145/292 - loss 0.20407849 - time (sec): 8.41 - samples/sec: 2633.71 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:20:45,791 epoch 2 - iter 174/292 - loss 0.21017489 - time (sec): 10.08 - samples/sec: 2673.66 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:20:47,443 epoch 2 - iter 203/292 - loss 0.20169462 - time (sec): 11.73 - samples/sec: 2676.79 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:20:48,926 epoch 2 - iter 232/292 - loss 0.19676618 - time (sec): 13.21 - samples/sec: 2666.39 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:50,536 epoch 2 - iter 261/292 - loss 0.19762036 - time (sec): 14.82 - samples/sec: 2697.55 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:52,157 epoch 2 - iter 290/292 - loss 0.19024544 - time (sec): 16.45 - samples/sec: 2696.27 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:20:52,241 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:52,241 EPOCH 2 done: loss 0.1899 - lr: 0.000027
2023-10-16 18:20:53,557 DEV : loss 0.12093638628721237 - f1-score (micro avg) 0.6793
2023-10-16 18:20:53,566 saving best model
2023-10-16 18:20:54,094 ----------------------------------------------------------------------------------------------------
2023-10-16 18:20:56,278 epoch 3 - iter 29/292 - loss 0.17753294 - time (sec): 2.18 - samples/sec: 2480.59 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:20:58,072 epoch 3 - iter 58/292 - loss 0.18357718 - time (sec): 3.98 - samples/sec: 2415.49 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:20:59,986 epoch 3 - iter 87/292 - loss 0.16005714 - time (sec): 5.89 - samples/sec: 2495.71 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:21:01,620 epoch 3 - iter 116/292 - loss 0.14422489 - time (sec): 7.52 - samples/sec: 2563.83 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:21:03,360 epoch 3 - iter 145/292 - loss 0.13857187 - time (sec): 9.26 - samples/sec: 2627.64 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:21:05,026 epoch 3 - iter 174/292 - loss 0.13043156 - time (sec): 10.93 - samples/sec: 2575.18 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:21:06,711 epoch 3 - iter 203/292 - loss 0.12565357 - time (sec): 12.62 - samples/sec: 2527.63 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:21:08,371 epoch 3 - iter 232/292 - loss 0.12072889 - time (sec): 14.27 - samples/sec: 2545.07 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:21:10,093 epoch 3 - iter 261/292 - loss 0.12031084 - time (sec): 16.00 - samples/sec: 2523.45 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:21:11,747 epoch 3 - iter 290/292 - loss 0.11654858 - time (sec): 17.65 - samples/sec: 2506.89 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:21:11,847 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:11,847 EPOCH 3 done: loss 0.1165 - lr: 0.000023
2023-10-16 18:21:13,137 DEV : loss 0.12300916761159897 - f1-score (micro avg) 0.6891
2023-10-16 18:21:13,142 saving best model
2023-10-16 18:21:13,598 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:15,278 epoch 4 - iter 29/292 - loss 0.07691043 - time (sec): 1.68 - samples/sec: 2351.61 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:21:17,065 epoch 4 - iter 58/292 - loss 0.07953942 - time (sec): 3.46 - samples/sec: 2375.70 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:21:18,849 epoch 4 - iter 87/292 - loss 0.08591602 - time (sec): 5.25 - samples/sec: 2352.10 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:21:20,722 epoch 4 - iter 116/292 - loss 0.08257757 - time (sec): 7.12 - samples/sec: 2360.53 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:21:22,305 epoch 4 - iter 145/292 - loss 0.07813624 - time (sec): 8.70 - samples/sec: 2409.82 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:21:23,960 epoch 4 - iter 174/292 - loss 0.07972020 - time (sec): 10.36 - samples/sec: 2466.85 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:21:25,683 epoch 4 - iter 203/292 - loss 0.08347839 - time (sec): 12.08 - samples/sec: 2440.27 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:21:27,464 epoch 4 - iter 232/292 - loss 0.08291309 - time (sec): 13.86 - samples/sec: 2459.88 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:21:29,179 epoch 4 - iter 261/292 - loss 0.08335854 - time (sec): 15.58 - samples/sec: 2471.62 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:21:31,135 epoch 4 - iter 290/292 - loss 0.07773530 - time (sec): 17.53 - samples/sec: 2523.16 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:21:31,236 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:31,236 EPOCH 4 done: loss 0.0777 - lr: 0.000020
2023-10-16 18:21:32,765 DEV : loss 0.12332110106945038 - f1-score (micro avg) 0.74
2023-10-16 18:21:32,770 saving best model
2023-10-16 18:21:33,320 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:35,152 epoch 5 - iter 29/292 - loss 0.07471586 - time (sec): 1.83 - samples/sec: 2357.99 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:21:36,892 epoch 5 - iter 58/292 - loss 0.06384738 - time (sec): 3.57 - samples/sec: 2411.66 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:21:38,709 epoch 5 - iter 87/292 - loss 0.05322672 - time (sec): 5.39 - samples/sec: 2459.72 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:21:40,429 epoch 5 - iter 116/292 - loss 0.05394182 - time (sec): 7.11 - samples/sec: 2408.65 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:21:42,049 epoch 5 - iter 145/292 - loss 0.05371064 - time (sec): 8.73 - samples/sec: 2473.41 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:21:43,756 epoch 5 - iter 174/292 - loss 0.05377309 - time (sec): 10.43 - samples/sec: 2470.82 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:21:45,483 epoch 5 - iter 203/292 - loss 0.05506603 - time (sec): 12.16 - samples/sec: 2490.49 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:21:47,281 epoch 5 - iter 232/292 - loss 0.05569026 - time (sec): 13.96 - samples/sec: 2523.27 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:21:49,057 epoch 5 - iter 261/292 - loss 0.05310602 - time (sec): 15.73 - samples/sec: 2513.82 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:21:50,796 epoch 5 - iter 290/292 - loss 0.05168977 - time (sec): 17.47 - samples/sec: 2532.96 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:21:50,895 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:50,895 EPOCH 5 done: loss 0.0515 - lr: 0.000017
2023-10-16 18:21:52,154 DEV : loss 0.12363986670970917 - f1-score (micro avg) 0.7598
2023-10-16 18:21:52,158 saving best model
2023-10-16 18:21:52,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:21:54,644 epoch 6 - iter 29/292 - loss 0.03999197 - time (sec): 1.85 - samples/sec: 2786.58 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:21:56,201 epoch 6 - iter 58/292 - loss 0.03737188 - time (sec): 3.41 - samples/sec: 2646.70 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:21:57,780 epoch 6 - iter 87/292 - loss 0.03607318 - time (sec): 4.99 - samples/sec: 2599.09 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:21:59,610 epoch 6 - iter 116/292 - loss 0.03250214 - time (sec): 6.82 - samples/sec: 2571.56 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:22:01,167 epoch 6 - iter 145/292 - loss 0.03164073 - time (sec): 8.38 - samples/sec: 2659.55 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:22:02,699 epoch 6 - iter 174/292 - loss 0.03264454 - time (sec): 9.91 - samples/sec: 2641.66 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:22:04,383 epoch 6 - iter 203/292 - loss 0.03064583 - time (sec): 11.59 - samples/sec: 2628.28 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:22:06,149 epoch 6 - iter 232/292 - loss 0.03250064 - time (sec): 13.36 - samples/sec: 2627.00 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:22:07,985 epoch 6 - iter 261/292 - loss 0.03958890 - time (sec): 15.19 - samples/sec: 2650.19 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:22:09,627 epoch 6 - iter 290/292 - loss 0.04044838 - time (sec): 16.84 - samples/sec: 2627.04 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:22:09,715 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:09,716 EPOCH 6 done: loss 0.0403 - lr: 0.000013
2023-10-16 18:22:10,948 DEV : loss 0.13180699944496155 - f1-score (micro avg) 0.75
2023-10-16 18:22:10,953 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:12,761 epoch 7 - iter 29/292 - loss 0.02941895 - time (sec): 1.81 - samples/sec: 3067.01 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:22:14,410 epoch 7 - iter 58/292 - loss 0.02180690 - time (sec): 3.46 - samples/sec: 2846.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:22:16,073 epoch 7 - iter 87/292 - loss 0.02244354 - time (sec): 5.12 - samples/sec: 2753.55 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:22:17,742 epoch 7 - iter 116/292 - loss 0.02270389 - time (sec): 6.79 - samples/sec: 2678.76 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:22:19,376 epoch 7 - iter 145/292 - loss 0.02838678 - time (sec): 8.42 - samples/sec: 2675.03 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:22:21,149 epoch 7 - iter 174/292 - loss 0.03006600 - time (sec): 10.19 - samples/sec: 2671.55 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:22:22,667 epoch 7 - iter 203/292 - loss 0.02892158 - time (sec): 11.71 - samples/sec: 2675.80 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:22:24,337 epoch 7 - iter 232/292 - loss 0.02871850 - time (sec): 13.38 - samples/sec: 2694.09 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:22:25,903 epoch 7 - iter 261/292 - loss 0.03081396 - time (sec): 14.95 - samples/sec: 2700.95 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:22:27,578 epoch 7 - iter 290/292 - loss 0.03280251 - time (sec): 16.62 - samples/sec: 2668.03 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:22:27,670 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:27,670 EPOCH 7 done: loss 0.0327 - lr: 0.000010
2023-10-16 18:22:28,952 DEV : loss 0.15362893044948578 - f1-score (micro avg) 0.7722
2023-10-16 18:22:28,957 saving best model
2023-10-16 18:22:29,505 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:31,139 epoch 8 - iter 29/292 - loss 0.02176979 - time (sec): 1.63 - samples/sec: 2657.39 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:22:32,842 epoch 8 - iter 58/292 - loss 0.01588976 - time (sec): 3.33 - samples/sec: 2742.97 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:22:34,581 epoch 8 - iter 87/292 - loss 0.01818793 - time (sec): 5.07 - samples/sec: 2642.71 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:22:36,177 epoch 8 - iter 116/292 - loss 0.01807534 - time (sec): 6.67 - samples/sec: 2591.67 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:22:37,886 epoch 8 - iter 145/292 - loss 0.01972452 - time (sec): 8.38 - samples/sec: 2628.33 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:22:39,553 epoch 8 - iter 174/292 - loss 0.02020388 - time (sec): 10.04 - samples/sec: 2629.04 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:22:41,327 epoch 8 - iter 203/292 - loss 0.02214309 - time (sec): 11.82 - samples/sec: 2598.89 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:22:43,085 epoch 8 - iter 232/292 - loss 0.02534168 - time (sec): 13.57 - samples/sec: 2619.77 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:22:44,552 epoch 8 - iter 261/292 - loss 0.02444304 - time (sec): 15.04 - samples/sec: 2607.05 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:22:46,287 epoch 8 - iter 290/292 - loss 0.02378208 - time (sec): 16.78 - samples/sec: 2631.90 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:22:46,394 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:46,395 EPOCH 8 done: loss 0.0239 - lr: 0.000007
2023-10-16 18:22:47,882 DEV : loss 0.15757989883422852 - f1-score (micro avg) 0.7409
2023-10-16 18:22:47,887 ----------------------------------------------------------------------------------------------------
2023-10-16 18:22:49,421 epoch 9 - iter 29/292 - loss 0.01149395 - time (sec): 1.53 - samples/sec: 2871.12 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:22:51,306 epoch 9 - iter 58/292 - loss 0.01863564 - time (sec): 3.42 - samples/sec: 2695.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:22:52,911 epoch 9 - iter 87/292 - loss 0.02194483 - time (sec): 5.02 - samples/sec: 2673.40 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:22:54,625 epoch 9 - iter 116/292 - loss 0.02014668 - time (sec): 6.74 - samples/sec: 2723.47 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:22:56,304 epoch 9 - iter 145/292 - loss 0.01858049 - time (sec): 8.42 - samples/sec: 2747.63 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:22:58,135 epoch 9 - iter 174/292 - loss 0.01892734 - time (sec): 10.25 - samples/sec: 2740.43 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:22:59,789 epoch 9 - iter 203/292 - loss 0.02059484 - time (sec): 11.90 - samples/sec: 2709.42 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:23:01,371 epoch 9 - iter 232/292 - loss 0.01991698 - time (sec): 13.48 - samples/sec: 2680.80 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:23:03,051 epoch 9 - iter 261/292 - loss 0.01931874 - time (sec): 15.16 - samples/sec: 2644.81 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:23:04,650 epoch 9 - iter 290/292 - loss 0.01784804 - time (sec): 16.76 - samples/sec: 2632.11 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:23:04,759 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:04,759 EPOCH 9 done: loss 0.0180 - lr: 0.000003
2023-10-16 18:23:06,017 DEV : loss 0.16514389216899872 - f1-score (micro avg) 0.742
2023-10-16 18:23:06,022 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:07,633 epoch 10 - iter 29/292 - loss 0.00947002 - time (sec): 1.61 - samples/sec: 2910.42 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:23:09,292 epoch 10 - iter 58/292 - loss 0.01129762 - time (sec): 3.27 - samples/sec: 2974.22 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:23:10,905 epoch 10 - iter 87/292 - loss 0.01868934 - time (sec): 4.88 - samples/sec: 2859.56 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:23:12,479 epoch 10 - iter 116/292 - loss 0.01805683 - time (sec): 6.46 - samples/sec: 2805.78 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:23:14,118 epoch 10 - iter 145/292 - loss 0.01706716 - time (sec): 8.10 - samples/sec: 2771.34 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:23:15,875 epoch 10 - iter 174/292 - loss 0.01588592 - time (sec): 9.85 - samples/sec: 2794.65 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:23:17,533 epoch 10 - iter 203/292 - loss 0.01507664 - time (sec): 11.51 - samples/sec: 2768.98 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:23:19,143 epoch 10 - iter 232/292 - loss 0.01490814 - time (sec): 13.12 - samples/sec: 2723.95 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:23:20,862 epoch 10 - iter 261/292 - loss 0.01407332 - time (sec): 14.84 - samples/sec: 2720.91 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:23:22,436 epoch 10 - iter 290/292 - loss 0.01532912 - time (sec): 16.41 - samples/sec: 2693.94 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:23:22,535 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:22,536 EPOCH 10 done: loss 0.0153 - lr: 0.000000
2023-10-16 18:23:23,815 DEV : loss 0.16304908692836761 - f1-score (micro avg) 0.7319
2023-10-16 18:23:24,214 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:24,216 Loading model from best epoch ...
2023-10-16 18:23:25,920 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
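The 17-entry tag dictionary follows the BIOES scheme: one O tag plus S(ingle)/B(egin)/E(nd)/I(nside) markers for each of the four entity types. As a quick consistency check, the inventory can be reconstructed (a sketch; the variable names are illustrative):

```python
# BIOES tag inventory: O plus S/B/E/I markers per entity type,
# in the same order as the dictionary logged above
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
# len(tags) == 17, matching the head Linear(in_features=768, out_features=17)
```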
|
2023-10-16 18:23:28,554 
Results:
- F-score (micro) 0.7389
- F-score (macro) 0.6609
- Accuracy 0.609

By class:
              precision    recall  f1-score   support

         PER     0.7801    0.8563    0.8164       348
         LOC     0.6350    0.7931    0.7053       261
         ORG     0.3750    0.3462    0.3600        52
   HumanProd     0.8000    0.7273    0.7619        22

   micro avg     0.6946    0.7892    0.7389       683
   macro avg     0.6475    0.6807    0.6609       683
weighted avg     0.6944    0.7892    0.7375       683
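The average rows are consistent with the per-class rows: micro averaging pools true positives over all classes, the macro average is the unweighted mean of per-class F1, and the weighted average weights per-class F1 by support. A small sanity check (the TP and predicted-span counts below are reconstructed from the reported precision, recall, and support, not taken from the log):

```python
# per-class (precision, recall, support), copied from the table above
per_class = {
    "PER":       (0.7801, 0.8563, 348),
    "LOC":       (0.6350, 0.7931, 261),
    "ORG":       (0.3750, 0.3462, 52),
    "HumanProd": (0.8000, 0.7273, 22),
}

# reconstruct integer counts: true positives = recall * support,
# predicted spans = true positives / precision
tp = {c: round(r * s) for c, (p, r, s) in per_class.items()}
pred = {c: round(tp[c] / p) for c, (p, r, s) in per_class.items()}
f1 = {c: 2 * tp[c] / (pred[c] + s) for c, (p, r, s) in per_class.items()}

support_total = sum(s for _, _, s in per_class.values())  # 683
macro_f1 = sum(f1.values()) / len(f1)                     # ≈ 0.6609
weighted_f1 = (sum(f1[c] * s for c, (_, _, s) in per_class.items())
               / support_total)                           # ≈ 0.7375
micro_f1 = (2 * sum(tp.values())
            / (sum(pred.values()) + support_total))       # ≈ 0.7389
```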
|
2023-10-16 18:23:28,554 ----------------------------------------------------------------------------------------------------