2023-10-16 18:27:24,156 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 Train: 1166 sentences
2023-10-16 18:27:24,157 (train_with_dev=False, train_with_test=False)
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Training Params:
2023-10-16 18:27:24,158 - learning_rate: "3e-05"
2023-10-16 18:27:24,158 - mini_batch_size: "8"
2023-10-16 18:27:24,158 - max_epochs: "10"
2023-10-16 18:27:24,158 - shuffle: "True"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Plugins:
2023-10-16 18:27:24,158 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
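The LinearScheduler plugin with warmup_fraction 0.1 produces the learning-rate trajectory visible in the per-iteration lines that follow: the lr climbs linearly to the 3e-05 peak during the first ~10% of steps (all of epoch 1), then decays linearly to zero by the last step. A minimal sketch of that shape in pure Python (the helper name is hypothetical, and this approximates the schedule rather than reproducing Flair's exact implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (hypothetical helper)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup phase: ramp from ~0 up to peak_lr
        return peak_lr * (step + 1) / warmup_steps
    # decay phase: interpolate linearly from peak_lr down to 0
    return peak_lr * (total_steps - step - 1) / (total_steps - warmup_steps)

total = 10 * 146  # 10 epochs x 146 mini-batches per epoch, as in this run
lrs = [linear_schedule_lr(s, total) for s in range(total)]
# peaks near 3e-05 at the end of warmup, reaches 0 at the final step
```

With warmup_fraction 0.1 and 1460 total steps, warmup spans 146 steps, which is exactly one epoch here; that is why epoch 1 below logs rising lr values (0.000003 ... 0.000029) while later epochs log a steady decline toward 0.000000.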
2023-10-16 18:27:24,158 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:27:24,158 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Computation:
2023-10-16 18:27:24,158 - compute on device: cuda:0
2023-10-16 18:27:24,158 - embedding storage: none
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:25,317 epoch 1 - iter 14/146 - loss 2.93513404 - time (sec): 1.16 - samples/sec: 3436.39 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:27:26,505 epoch 1 - iter 28/146 - loss 2.77389228 - time (sec): 2.35 - samples/sec: 3099.80 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:27:28,131 epoch 1 - iter 42/146 - loss 2.21274536 - time (sec): 3.97 - samples/sec: 3082.74 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:27:29,576 epoch 1 - iter 56/146 - loss 1.86758775 - time (sec): 5.42 - samples/sec: 3038.81 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:27:30,809 epoch 1 - iter 70/146 - loss 1.63388705 - time (sec): 6.65 - samples/sec: 2992.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:27:32,063 epoch 1 - iter 84/146 - loss 1.51508448 - time (sec): 7.90 - samples/sec: 2991.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:27:33,978 epoch 1 - iter 98/146 - loss 1.33298731 - time (sec): 9.82 - samples/sec: 2942.53 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:27:35,323 epoch 1 - iter 112/146 - loss 1.20821560 - time (sec): 11.16 - samples/sec: 2986.86 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:27:36,996 epoch 1 - iter 126/146 - loss 1.09396532 - time (sec): 12.84 - samples/sec: 2965.34 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:27:38,420 epoch 1 - iter 140/146 - loss 1.00755548 - time (sec): 14.26 - samples/sec: 2973.03 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:39,093 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:39,094 EPOCH 1 done: loss 0.9748 - lr: 0.000029
2023-10-16 18:27:39,905 DEV : loss 0.21248747408390045 - f1-score (micro avg) 0.4671
2023-10-16 18:27:39,911 saving best model
2023-10-16 18:27:40,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:41,907 epoch 2 - iter 14/146 - loss 0.25741814 - time (sec): 1.53 - samples/sec: 3144.33 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:27:43,594 epoch 2 - iter 28/146 - loss 0.25094580 - time (sec): 3.21 - samples/sec: 2960.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:44,866 epoch 2 - iter 42/146 - loss 0.25091985 - time (sec): 4.49 - samples/sec: 2947.16 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:46,323 epoch 2 - iter 56/146 - loss 0.23948423 - time (sec): 5.94 - samples/sec: 2914.89 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:47,677 epoch 2 - iter 70/146 - loss 0.23078029 - time (sec): 7.30 - samples/sec: 2876.03 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:49,415 epoch 2 - iter 84/146 - loss 0.25123902 - time (sec): 9.03 - samples/sec: 2871.89 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:50,999 epoch 2 - iter 98/146 - loss 0.24030979 - time (sec): 10.62 - samples/sec: 2881.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:52,231 epoch 2 - iter 112/146 - loss 0.23215042 - time (sec): 11.85 - samples/sec: 2885.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:53,529 epoch 2 - iter 126/146 - loss 0.22725978 - time (sec): 13.15 - samples/sec: 2926.76 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:55,155 epoch 2 - iter 140/146 - loss 0.21838231 - time (sec): 14.77 - samples/sec: 2921.39 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:55,621 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:55,621 EPOCH 2 done: loss 0.2171 - lr: 0.000027
2023-10-16 18:27:56,858 DEV : loss 0.1434197872877121 - f1-score (micro avg) 0.6021
2023-10-16 18:27:56,863 saving best model
2023-10-16 18:27:57,673 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:59,758 epoch 3 - iter 14/146 - loss 0.18575960 - time (sec): 2.08 - samples/sec: 2489.05 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:01,045 epoch 3 - iter 28/146 - loss 0.18887110 - time (sec): 3.37 - samples/sec: 2772.46 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:02,583 epoch 3 - iter 42/146 - loss 0.17188404 - time (sec): 4.91 - samples/sec: 2876.25 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:04,041 epoch 3 - iter 56/146 - loss 0.15696548 - time (sec): 6.37 - samples/sec: 2941.00 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:05,630 epoch 3 - iter 70/146 - loss 0.14502167 - time (sec): 7.96 - samples/sec: 2928.47 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:06,896 epoch 3 - iter 84/146 - loss 0.14099742 - time (sec): 9.22 - samples/sec: 2934.25 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:08,331 epoch 3 - iter 98/146 - loss 0.13734271 - time (sec): 10.66 - samples/sec: 2924.49 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:09,528 epoch 3 - iter 112/146 - loss 0.13345700 - time (sec): 11.85 - samples/sec: 2950.12 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:11,017 epoch 3 - iter 126/146 - loss 0.13132961 - time (sec): 13.34 - samples/sec: 2945.01 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:12,211 epoch 3 - iter 140/146 - loss 0.12890036 - time (sec): 14.54 - samples/sec: 2958.40 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:12,652 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:12,652 EPOCH 3 done: loss 0.1276 - lr: 0.000024
2023-10-16 18:28:13,931 DEV : loss 0.11977185308933258 - f1-score (micro avg) 0.6652
2023-10-16 18:28:13,937 saving best model
2023-10-16 18:28:14,450 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:15,788 epoch 4 - iter 14/146 - loss 0.07996901 - time (sec): 1.34 - samples/sec: 2911.28 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:28:17,109 epoch 4 - iter 28/146 - loss 0.08203620 - time (sec): 2.66 - samples/sec: 2996.09 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:28:18,473 epoch 4 - iter 42/146 - loss 0.09414321 - time (sec): 4.02 - samples/sec: 2900.82 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:20,069 epoch 4 - iter 56/146 - loss 0.08273818 - time (sec): 5.62 - samples/sec: 2934.25 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:21,354 epoch 4 - iter 70/146 - loss 0.08429318 - time (sec): 6.90 - samples/sec: 2935.80 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:22,724 epoch 4 - iter 84/146 - loss 0.08662969 - time (sec): 8.27 - samples/sec: 2941.49 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:24,096 epoch 4 - iter 98/146 - loss 0.08750805 - time (sec): 9.64 - samples/sec: 2935.41 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:25,651 epoch 4 - iter 112/146 - loss 0.08853932 - time (sec): 11.20 - samples/sec: 2939.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:26,988 epoch 4 - iter 126/146 - loss 0.08671444 - time (sec): 12.54 - samples/sec: 2961.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:28,792 epoch 4 - iter 140/146 - loss 0.08282451 - time (sec): 14.34 - samples/sec: 2977.22 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:28:29,335 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:29,336 EPOCH 4 done: loss 0.0822 - lr: 0.000020
2023-10-16 18:28:30,578 DEV : loss 0.10638927668333054 - f1-score (micro avg) 0.7168
2023-10-16 18:28:30,583 saving best model
2023-10-16 18:28:31,176 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:32,716 epoch 5 - iter 14/146 - loss 0.10632934 - time (sec): 1.54 - samples/sec: 2747.36 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:28:34,221 epoch 5 - iter 28/146 - loss 0.07986467 - time (sec): 3.04 - samples/sec: 2757.96 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:35,865 epoch 5 - iter 42/146 - loss 0.06670757 - time (sec): 4.68 - samples/sec: 2733.59 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:37,137 epoch 5 - iter 56/146 - loss 0.06636050 - time (sec): 5.96 - samples/sec: 2773.06 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:38,628 epoch 5 - iter 70/146 - loss 0.06634048 - time (sec): 7.45 - samples/sec: 2793.51 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:40,017 epoch 5 - iter 84/146 - loss 0.06502579 - time (sec): 8.84 - samples/sec: 2819.90 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:41,496 epoch 5 - iter 98/146 - loss 0.06323217 - time (sec): 10.32 - samples/sec: 2827.88 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:42,923 epoch 5 - iter 112/146 - loss 0.06339836 - time (sec): 11.74 - samples/sec: 2888.38 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:44,351 epoch 5 - iter 126/146 - loss 0.06263725 - time (sec): 13.17 - samples/sec: 2901.58 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:28:45,678 epoch 5 - iter 140/146 - loss 0.06146587 - time (sec): 14.50 - samples/sec: 2909.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:28:46,393 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:46,393 EPOCH 5 done: loss 0.0599 - lr: 0.000017
2023-10-16 18:28:47,691 DEV : loss 0.11089115589857101 - f1-score (micro avg) 0.7484
2023-10-16 18:28:47,696 saving best model
2023-10-16 18:28:48,226 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:50,125 epoch 6 - iter 14/146 - loss 0.03612688 - time (sec): 1.90 - samples/sec: 2635.80 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:51,353 epoch 6 - iter 28/146 - loss 0.03686816 - time (sec): 3.13 - samples/sec: 2817.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:52,703 epoch 6 - iter 42/146 - loss 0.03616211 - time (sec): 4.48 - samples/sec: 2829.65 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:54,290 epoch 6 - iter 56/146 - loss 0.03447287 - time (sec): 6.06 - samples/sec: 2763.87 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:55,642 epoch 6 - iter 70/146 - loss 0.03490954 - time (sec): 7.41 - samples/sec: 2901.81 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:56,811 epoch 6 - iter 84/146 - loss 0.03505656 - time (sec): 8.58 - samples/sec: 2936.17 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:58,271 epoch 6 - iter 98/146 - loss 0.03312980 - time (sec): 10.04 - samples/sec: 2955.51 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:59,521 epoch 6 - iter 112/146 - loss 0.03670852 - time (sec): 11.29 - samples/sec: 2951.86 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:01,313 epoch 6 - iter 126/146 - loss 0.03829386 - time (sec): 13.09 - samples/sec: 2986.27 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:02,439 epoch 6 - iter 140/146 - loss 0.04191597 - time (sec): 14.21 - samples/sec: 2979.92 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:03,316 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:03,316 EPOCH 6 done: loss 0.0433 - lr: 0.000014
2023-10-16 18:29:04,619 DEV : loss 0.12725397944450378 - f1-score (micro avg) 0.7152
2023-10-16 18:29:04,626 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:06,333 epoch 7 - iter 14/146 - loss 0.03280998 - time (sec): 1.71 - samples/sec: 3099.99 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:29:07,641 epoch 7 - iter 28/146 - loss 0.02824876 - time (sec): 3.01 - samples/sec: 3047.06 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:29:09,289 epoch 7 - iter 42/146 - loss 0.02707074 - time (sec): 4.66 - samples/sec: 2956.74 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:10,811 epoch 7 - iter 56/146 - loss 0.02889321 - time (sec): 6.18 - samples/sec: 2871.09 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:12,358 epoch 7 - iter 70/146 - loss 0.03251612 - time (sec): 7.73 - samples/sec: 2848.77 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:13,920 epoch 7 - iter 84/146 - loss 0.02978633 - time (sec): 9.29 - samples/sec: 2857.28 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:15,091 epoch 7 - iter 98/146 - loss 0.02989419 - time (sec): 10.46 - samples/sec: 2884.06 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:16,500 epoch 7 - iter 112/146 - loss 0.02868727 - time (sec): 11.87 - samples/sec: 2879.60 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:17,885 epoch 7 - iter 126/146 - loss 0.03081760 - time (sec): 13.26 - samples/sec: 2928.91 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:19,190 epoch 7 - iter 140/146 - loss 0.03274991 - time (sec): 14.56 - samples/sec: 2931.80 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:29:19,925 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:19,926 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-16 18:29:21,217 DEV : loss 0.12044133991003036 - f1-score (micro avg) 0.766
2023-10-16 18:29:21,222 saving best model
2023-10-16 18:29:21,806 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:23,128 epoch 8 - iter 14/146 - loss 0.02741513 - time (sec): 1.32 - samples/sec: 3165.44 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:29:24,529 epoch 8 - iter 28/146 - loss 0.01990358 - time (sec): 2.72 - samples/sec: 3158.95 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:26,092 epoch 8 - iter 42/146 - loss 0.02296771 - time (sec): 4.28 - samples/sec: 2976.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:27,603 epoch 8 - iter 56/146 - loss 0.02253618 - time (sec): 5.80 - samples/sec: 2889.22 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:29,125 epoch 8 - iter 70/146 - loss 0.02333529 - time (sec): 7.32 - samples/sec: 2902.57 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:30,335 epoch 8 - iter 84/146 - loss 0.02425044 - time (sec): 8.53 - samples/sec: 2940.76 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:32,104 epoch 8 - iter 98/146 - loss 0.02550230 - time (sec): 10.30 - samples/sec: 2902.86 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:33,562 epoch 8 - iter 112/146 - loss 0.02765254 - time (sec): 11.75 - samples/sec: 2930.06 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:34,824 epoch 8 - iter 126/146 - loss 0.02724778 - time (sec): 13.02 - samples/sec: 2926.68 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:29:36,217 epoch 8 - iter 140/146 - loss 0.02645279 - time (sec): 14.41 - samples/sec: 2956.07 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:29:36,861 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:36,861 EPOCH 8 done: loss 0.0261 - lr: 0.000007
2023-10-16 18:29:38,320 DEV : loss 0.12693804502487183 - f1-score (micro avg) 0.7526
2023-10-16 18:29:38,324 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:39,576 epoch 9 - iter 14/146 - loss 0.01650818 - time (sec): 1.25 - samples/sec: 3366.30 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:41,284 epoch 9 - iter 28/146 - loss 0.02351489 - time (sec): 2.96 - samples/sec: 2916.17 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:42,771 epoch 9 - iter 42/146 - loss 0.02544422 - time (sec): 4.45 - samples/sec: 2897.68 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:44,255 epoch 9 - iter 56/146 - loss 0.02923101 - time (sec): 5.93 - samples/sec: 2977.74 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:45,955 epoch 9 - iter 70/146 - loss 0.02603799 - time (sec): 7.63 - samples/sec: 2934.50 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:47,435 epoch 9 - iter 84/146 - loss 0.02466823 - time (sec): 9.11 - samples/sec: 2929.35 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:48,776 epoch 9 - iter 98/146 - loss 0.02570875 - time (sec): 10.45 - samples/sec: 2950.34 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:50,231 epoch 9 - iter 112/146 - loss 0.02418524 - time (sec): 11.91 - samples/sec: 2947.42 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:51,536 epoch 9 - iter 126/146 - loss 0.02378156 - time (sec): 13.21 - samples/sec: 2941.77 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:53,032 epoch 9 - iter 140/146 - loss 0.02245718 - time (sec): 14.71 - samples/sec: 2913.45 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:53,512 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:53,512 EPOCH 9 done: loss 0.0224 - lr: 0.000004
2023-10-16 18:29:54,818 DEV : loss 0.13801662623882294 - f1-score (micro avg) 0.7417
2023-10-16 18:29:54,822 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:56,258 epoch 10 - iter 14/146 - loss 0.00940816 - time (sec): 1.43 - samples/sec: 3070.97 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:29:57,875 epoch 10 - iter 28/146 - loss 0.01387647 - time (sec): 3.05 - samples/sec: 3120.84 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:29:59,255 epoch 10 - iter 42/146 - loss 0.02669714 - time (sec): 4.43 - samples/sec: 3025.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:30:00,595 epoch 10 - iter 56/146 - loss 0.02370339 - time (sec): 5.77 - samples/sec: 3054.06 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:02,063 epoch 10 - iter 70/146 - loss 0.02208689 - time (sec): 7.24 - samples/sec: 2976.98 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:03,748 epoch 10 - iter 84/146 - loss 0.02139677 - time (sec): 8.92 - samples/sec: 3007.22 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:05,069 epoch 10 - iter 98/146 - loss 0.01968660 - time (sec): 10.25 - samples/sec: 3022.30 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:06,371 epoch 10 - iter 112/146 - loss 0.01902306 - time (sec): 11.55 - samples/sec: 2995.40 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:07,887 epoch 10 - iter 126/146 - loss 0.01748688 - time (sec): 13.06 - samples/sec: 2969.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:09,235 epoch 10 - iter 140/146 - loss 0.01846671 - time (sec): 14.41 - samples/sec: 2990.60 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:30:09,714 ----------------------------------------------------------------------------------------------------
2023-10-16 18:30:09,714 EPOCH 10 done: loss 0.0184 - lr: 0.000000
2023-10-16 18:30:11,005 DEV : loss 0.144253209233284 - f1-score (micro avg) 0.7318
2023-10-16 18:30:11,430 ----------------------------------------------------------------------------------------------------
2023-10-16 18:30:11,431 Loading model from best epoch ...
2023-10-16 18:30:13,049 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
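The 17-tag dictionary is the BIOES encoding of the four NewsEye entity types (LOC, PER, ORG, HumanProd) plus the outside tag O: 4 types x 4 positional prefixes + 1 = 17, which also explains the model's `Linear(in_features=768, out_features=17)` output layer. A minimal sketch of how such a tagset is enumerated:

```python
# BIOES scheme: S- marks single-token spans; B-/I-/E- mark the beginning,
# inside, and end of multi-token spans; O marks tokens outside any entity.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in ("S", "B", "E", "I")]
print(len(tags))  # 17
```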
2023-10-16 18:30:15,723
Results:
- F-score (micro) 0.7512
- F-score (macro) 0.6702
- Accuracy 0.6244

By class:
              precision    recall  f1-score   support

         PER     0.7962    0.8420    0.8184       348
         LOC     0.6503    0.8123    0.7223       261
         ORG     0.5000    0.4231    0.4583        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7132    0.7936    0.7512       683
   macro avg     0.6571    0.6898    0.6702       683
weighted avg     0.7142    0.7936    0.7499       683
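Each f1-score column entry is the harmonic mean of the precision and recall on its row, so the table can be sanity-checked directly. A small sketch using the micro avg row (the result agrees with the reported 0.7512 to within rounding of the displayed precision and recall):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row of the per-class table
micro_f1 = f1(0.7132, 0.7936)
```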

2023-10-16 18:30:15,723 ----------------------------------------------------------------------------------------------------