|
2023-10-13 08:18:44,052 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=25, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 MultiCorpus: 1100 train + 206 dev + 240 test sentences |
|
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 Train: 1100 sentences |
|
2023-10-13 08:18:44,053 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 Training Params: |
|
2023-10-13 08:18:44,053 - learning_rate: "5e-05" |
|
2023-10-13 08:18:44,053 - mini_batch_size: "8" |
|
2023-10-13 08:18:44,053 - max_epochs: "10" |
|
2023-10-13 08:18:44,053 - shuffle: "True" |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 Plugins: |
|
2023-10-13 08:18:44,053 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
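Note on the parameters above: 1100 training sentences at a mini-batch size of 8 give 138 iterations per epoch, which matches the `iter …/138` counters in the log, and 1380 optimizer steps over 10 epochs. The sketch below shows what a linear schedule with a warmup fraction of 0.1 could look like under those numbers. It assumes the rate rises linearly to the peak over the first 10% of steps and then decays linearly to zero. The function name `linear_warmup_lr` is hypothetical and this is not Flair's own implementation, so individual values may differ from the logged `lr` column by a step or by rounding.

```python
import math


def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate after `step` updates, under the assumed schedule shape."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from the peak rate down to 0.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 1100 sentences, batch size 8, 10 epochs, lr 5e-05.
steps_per_epoch = math.ceil(1100 / 8)
total_steps = steps_per_epoch * 10

for step in (13, 69, 138, 759, 1380):
    lr = linear_warmup_lr(step, total_steps, 5e-05, 0.1)
    print(f"step {step:4d}  lr {lr:.6f}")
```

Under this assumed shape the peak is reached at the end of epoch 1 and the rate reaches zero at the last step of epoch 10, which is consistent with the overall trend of the `lr` values logged below.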
|
2023-10-13 08:18:44,053 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 08:18:44,053 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 Computation: |
|
2023-10-13 08:18:44,053 - compute on device: cuda:0 |
|
2023-10-13 08:18:44,053 - embedding storage: none |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:44,769 epoch 1 - iter 13/138 - loss 3.36216384 - time (sec): 0.71 - samples/sec: 2638.24 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:18:45,511 epoch 1 - iter 26/138 - loss 3.05778578 - time (sec): 1.46 - samples/sec: 2880.89 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:18:46,213 epoch 1 - iter 39/138 - loss 2.50663084 - time (sec): 2.16 - samples/sec: 2928.60 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:18:46,974 epoch 1 - iter 52/138 - loss 2.12168365 - time (sec): 2.92 - samples/sec: 2885.36 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:18:47,664 epoch 1 - iter 65/138 - loss 1.85127161 - time (sec): 3.61 - samples/sec: 2900.21 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:18:48,458 epoch 1 - iter 78/138 - loss 1.62474225 - time (sec): 4.40 - samples/sec: 2900.66 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 08:18:49,213 epoch 1 - iter 91/138 - loss 1.46221903 - time (sec): 5.16 - samples/sec: 2915.04 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 08:18:49,925 epoch 1 - iter 104/138 - loss 1.35389479 - time (sec): 5.87 - samples/sec: 2891.43 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 08:18:50,720 epoch 1 - iter 117/138 - loss 1.24029074 - time (sec): 6.67 - samples/sec: 2888.87 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 08:18:51,440 epoch 1 - iter 130/138 - loss 1.16080197 - time (sec): 7.39 - samples/sec: 2894.16 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 08:18:51,871 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:51,872 EPOCH 1 done: loss 1.1099 - lr: 0.000047 |
|
2023-10-13 08:18:52,835 DEV : loss 0.2852579355239868 - f1-score (micro avg) 0.6815 |
|
2023-10-13 08:18:52,842 saving best model |
|
2023-10-13 08:18:53,246 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:18:53,964 epoch 2 - iter 13/138 - loss 0.29008713 - time (sec): 0.72 - samples/sec: 2803.74 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 08:18:54,676 epoch 2 - iter 26/138 - loss 0.27894573 - time (sec): 1.43 - samples/sec: 2883.49 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 08:18:55,401 epoch 2 - iter 39/138 - loss 0.26387076 - time (sec): 2.15 - samples/sec: 2796.71 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 08:18:56,141 epoch 2 - iter 52/138 - loss 0.24752298 - time (sec): 2.89 - samples/sec: 2876.04 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 08:18:56,868 epoch 2 - iter 65/138 - loss 0.22805221 - time (sec): 3.62 - samples/sec: 2912.31 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 08:18:57,629 epoch 2 - iter 78/138 - loss 0.22365996 - time (sec): 4.38 - samples/sec: 2892.94 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 08:18:58,350 epoch 2 - iter 91/138 - loss 0.21579324 - time (sec): 5.10 - samples/sec: 2910.70 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 08:18:59,086 epoch 2 - iter 104/138 - loss 0.20462341 - time (sec): 5.84 - samples/sec: 2910.63 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 08:18:59,845 epoch 2 - iter 117/138 - loss 0.19740281 - time (sec): 6.60 - samples/sec: 2923.65 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 08:19:00,577 epoch 2 - iter 130/138 - loss 0.19452383 - time (sec): 7.33 - samples/sec: 2914.09 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 08:19:01,051 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:01,051 EPOCH 2 done: loss 0.1977 - lr: 0.000045 |
|
2023-10-13 08:19:01,753 DEV : loss 0.14731210470199585 - f1-score (micro avg) 0.8041 |
|
2023-10-13 08:19:01,758 saving best model |
|
2023-10-13 08:19:02,198 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:02,903 epoch 3 - iter 13/138 - loss 0.12039031 - time (sec): 0.70 - samples/sec: 2866.43 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-13 08:19:03,675 epoch 3 - iter 26/138 - loss 0.09369394 - time (sec): 1.47 - samples/sec: 2908.65 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 08:19:04,473 epoch 3 - iter 39/138 - loss 0.10326361 - time (sec): 2.27 - samples/sec: 2888.04 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 08:19:05,196 epoch 3 - iter 52/138 - loss 0.10553197 - time (sec): 3.00 - samples/sec: 2938.08 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 08:19:05,892 epoch 3 - iter 65/138 - loss 0.11195550 - time (sec): 3.69 - samples/sec: 2951.29 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 08:19:06,665 epoch 3 - iter 78/138 - loss 0.10537692 - time (sec): 4.46 - samples/sec: 2904.53 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 08:19:07,387 epoch 3 - iter 91/138 - loss 0.10329309 - time (sec): 5.19 - samples/sec: 2935.01 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 08:19:08,155 epoch 3 - iter 104/138 - loss 0.10645368 - time (sec): 5.96 - samples/sec: 2935.54 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 08:19:08,832 epoch 3 - iter 117/138 - loss 0.10399949 - time (sec): 6.63 - samples/sec: 2922.79 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 08:19:09,542 epoch 3 - iter 130/138 - loss 0.10393272 - time (sec): 7.34 - samples/sec: 2914.56 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 08:19:09,974 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:09,975 EPOCH 3 done: loss 0.1024 - lr: 0.000039 |
|
2023-10-13 08:19:10,669 DEV : loss 0.13100138306617737 - f1-score (micro avg) 0.8446 |
|
2023-10-13 08:19:10,674 saving best model |
|
2023-10-13 08:19:11,144 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:11,917 epoch 4 - iter 13/138 - loss 0.06021214 - time (sec): 0.77 - samples/sec: 2989.30 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 08:19:12,689 epoch 4 - iter 26/138 - loss 0.05825819 - time (sec): 1.54 - samples/sec: 2978.65 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 08:19:13,424 epoch 4 - iter 39/138 - loss 0.06845418 - time (sec): 2.27 - samples/sec: 2977.41 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 08:19:14,151 epoch 4 - iter 52/138 - loss 0.06522582 - time (sec): 3.00 - samples/sec: 2905.17 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 08:19:14,914 epoch 4 - iter 65/138 - loss 0.06752163 - time (sec): 3.76 - samples/sec: 2893.53 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 08:19:15,628 epoch 4 - iter 78/138 - loss 0.06464366 - time (sec): 4.48 - samples/sec: 2880.99 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 08:19:16,386 epoch 4 - iter 91/138 - loss 0.07140504 - time (sec): 5.24 - samples/sec: 2905.51 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 08:19:17,117 epoch 4 - iter 104/138 - loss 0.06658007 - time (sec): 5.97 - samples/sec: 2878.12 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 08:19:17,882 epoch 4 - iter 117/138 - loss 0.06558565 - time (sec): 6.73 - samples/sec: 2869.24 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 08:19:18,619 epoch 4 - iter 130/138 - loss 0.06645078 - time (sec): 7.47 - samples/sec: 2885.99 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 08:19:19,072 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:19,072 EPOCH 4 done: loss 0.0696 - lr: 0.000034 |
|
2023-10-13 08:19:19,781 DEV : loss 0.1526499092578888 - f1-score (micro avg) 0.8148 |
|
2023-10-13 08:19:19,786 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:20,455 epoch 5 - iter 13/138 - loss 0.03833787 - time (sec): 0.67 - samples/sec: 2816.94 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 08:19:21,163 epoch 5 - iter 26/138 - loss 0.03221155 - time (sec): 1.38 - samples/sec: 2885.15 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 08:19:21,922 epoch 5 - iter 39/138 - loss 0.03577119 - time (sec): 2.13 - samples/sec: 2999.40 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 08:19:22,647 epoch 5 - iter 52/138 - loss 0.04000067 - time (sec): 2.86 - samples/sec: 3018.36 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-13 08:19:23,435 epoch 5 - iter 65/138 - loss 0.04109399 - time (sec): 3.65 - samples/sec: 3003.21 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-13 08:19:24,174 epoch 5 - iter 78/138 - loss 0.04030136 - time (sec): 4.39 - samples/sec: 3024.11 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 08:19:24,887 epoch 5 - iter 91/138 - loss 0.04252457 - time (sec): 5.10 - samples/sec: 3003.16 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 08:19:25,575 epoch 5 - iter 104/138 - loss 0.04785958 - time (sec): 5.79 - samples/sec: 2990.90 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:19:26,315 epoch 5 - iter 117/138 - loss 0.05214743 - time (sec): 6.53 - samples/sec: 2985.14 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:19:27,024 epoch 5 - iter 130/138 - loss 0.05201815 - time (sec): 7.24 - samples/sec: 2968.00 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 08:19:27,473 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:27,474 EPOCH 5 done: loss 0.0509 - lr: 0.000028 |
|
2023-10-13 08:19:28,153 DEV : loss 0.12619180977344513 - f1-score (micro avg) 0.8616 |
|
2023-10-13 08:19:28,158 saving best model |
|
2023-10-13 08:19:28,572 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:29,300 epoch 6 - iter 13/138 - loss 0.03249176 - time (sec): 0.72 - samples/sec: 2647.19 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:19:30,065 epoch 6 - iter 26/138 - loss 0.04087283 - time (sec): 1.49 - samples/sec: 2782.72 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:19:30,792 epoch 6 - iter 39/138 - loss 0.03129313 - time (sec): 2.22 - samples/sec: 2835.49 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:19:31,514 epoch 6 - iter 52/138 - loss 0.04084296 - time (sec): 2.94 - samples/sec: 2897.95 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:19:32,211 epoch 6 - iter 65/138 - loss 0.03888486 - time (sec): 3.64 - samples/sec: 2915.67 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:19:32,928 epoch 6 - iter 78/138 - loss 0.03995297 - time (sec): 4.35 - samples/sec: 2923.81 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:19:33,701 epoch 6 - iter 91/138 - loss 0.03603623 - time (sec): 5.13 - samples/sec: 2943.54 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:19:34,443 epoch 6 - iter 104/138 - loss 0.03370356 - time (sec): 5.87 - samples/sec: 2909.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:19:35,220 epoch 6 - iter 117/138 - loss 0.03376515 - time (sec): 6.65 - samples/sec: 2903.31 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:19:35,946 epoch 6 - iter 130/138 - loss 0.03817809 - time (sec): 7.37 - samples/sec: 2911.05 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:19:36,414 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:36,415 EPOCH 6 done: loss 0.0381 - lr: 0.000023 |
|
2023-10-13 08:19:37,126 DEV : loss 0.1473054736852646 - f1-score (micro avg) 0.8674 |
|
2023-10-13 08:19:37,132 saving best model |
|
2023-10-13 08:19:37,583 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:38,371 epoch 7 - iter 13/138 - loss 0.01997817 - time (sec): 0.78 - samples/sec: 3006.86 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 08:19:39,068 epoch 7 - iter 26/138 - loss 0.02454367 - time (sec): 1.48 - samples/sec: 2879.53 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:19:39,717 epoch 7 - iter 39/138 - loss 0.01992186 - time (sec): 2.13 - samples/sec: 2893.57 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:19:40,457 epoch 7 - iter 52/138 - loss 0.02502743 - time (sec): 2.87 - samples/sec: 2922.55 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:19:41,140 epoch 7 - iter 65/138 - loss 0.03753523 - time (sec): 3.55 - samples/sec: 2902.17 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:19:41,896 epoch 7 - iter 78/138 - loss 0.03107997 - time (sec): 4.30 - samples/sec: 2926.49 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:19:42,589 epoch 7 - iter 91/138 - loss 0.02773132 - time (sec): 5.00 - samples/sec: 2925.34 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:19:43,372 epoch 7 - iter 104/138 - loss 0.02730363 - time (sec): 5.78 - samples/sec: 2945.17 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:19:44,061 epoch 7 - iter 117/138 - loss 0.02855620 - time (sec): 6.47 - samples/sec: 2944.97 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:19:44,758 epoch 7 - iter 130/138 - loss 0.02921022 - time (sec): 7.17 - samples/sec: 2966.83 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 08:19:45,225 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:45,225 EPOCH 7 done: loss 0.0291 - lr: 0.000017 |
|
2023-10-13 08:19:45,890 DEV : loss 0.15323472023010254 - f1-score (micro avg) 0.8816 |
|
2023-10-13 08:19:45,895 saving best model |
|
2023-10-13 08:19:46,305 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:47,051 epoch 8 - iter 13/138 - loss 0.02038702 - time (sec): 0.74 - samples/sec: 3037.11 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:19:47,801 epoch 8 - iter 26/138 - loss 0.01412744 - time (sec): 1.49 - samples/sec: 2977.30 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:19:48,567 epoch 8 - iter 39/138 - loss 0.02947389 - time (sec): 2.26 - samples/sec: 2865.50 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:19:49,279 epoch 8 - iter 52/138 - loss 0.02461736 - time (sec): 2.97 - samples/sec: 2874.74 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:19:49,978 epoch 8 - iter 65/138 - loss 0.02189776 - time (sec): 3.67 - samples/sec: 2919.69 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:19:50,680 epoch 8 - iter 78/138 - loss 0.02300181 - time (sec): 4.37 - samples/sec: 2913.35 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:19:51,401 epoch 8 - iter 91/138 - loss 0.02120278 - time (sec): 5.09 - samples/sec: 2943.64 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 08:19:52,092 epoch 8 - iter 104/138 - loss 0.02079869 - time (sec): 5.78 - samples/sec: 2947.80 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 08:19:52,867 epoch 8 - iter 117/138 - loss 0.02160055 - time (sec): 6.56 - samples/sec: 2934.89 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:19:53,596 epoch 8 - iter 130/138 - loss 0.02068289 - time (sec): 7.29 - samples/sec: 2940.32 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:19:54,067 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:54,068 EPOCH 8 done: loss 0.0204 - lr: 0.000012 |
|
2023-10-13 08:19:54,835 DEV : loss 0.1583411544561386 - f1-score (micro avg) 0.869 |
|
2023-10-13 08:19:54,841 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:19:55,621 epoch 9 - iter 13/138 - loss 0.00691619 - time (sec): 0.78 - samples/sec: 2842.13 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 08:19:56,322 epoch 9 - iter 26/138 - loss 0.01096592 - time (sec): 1.48 - samples/sec: 2899.25 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:19:57,137 epoch 9 - iter 39/138 - loss 0.01429396 - time (sec): 2.29 - samples/sec: 2944.59 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:19:57,879 epoch 9 - iter 52/138 - loss 0.01763747 - time (sec): 3.04 - samples/sec: 2938.06 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:19:58,620 epoch 9 - iter 65/138 - loss 0.01457269 - time (sec): 3.78 - samples/sec: 2921.83 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:19:59,360 epoch 9 - iter 78/138 - loss 0.01700158 - time (sec): 4.52 - samples/sec: 2974.50 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 08:20:00,073 epoch 9 - iter 91/138 - loss 0.01610376 - time (sec): 5.23 - samples/sec: 2951.92 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 08:20:00,816 epoch 9 - iter 104/138 - loss 0.01519905 - time (sec): 5.97 - samples/sec: 2916.25 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:20:01,535 epoch 9 - iter 117/138 - loss 0.01372095 - time (sec): 6.69 - samples/sec: 2900.86 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:20:02,235 epoch 9 - iter 130/138 - loss 0.01660732 - time (sec): 7.39 - samples/sec: 2934.46 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:20:02,642 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:20:02,642 EPOCH 9 done: loss 0.0164 - lr: 0.000006 |
|
2023-10-13 08:20:03,404 DEV : loss 0.15242387354373932 - f1-score (micro avg) 0.8786 |
|
2023-10-13 08:20:03,409 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:20:04,151 epoch 10 - iter 13/138 - loss 0.00567269 - time (sec): 0.74 - samples/sec: 2961.98 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:20:04,915 epoch 10 - iter 26/138 - loss 0.00887180 - time (sec): 1.50 - samples/sec: 3105.29 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:20:05,704 epoch 10 - iter 39/138 - loss 0.00671609 - time (sec): 2.29 - samples/sec: 2986.45 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:20:06,488 epoch 10 - iter 52/138 - loss 0.00747113 - time (sec): 3.08 - samples/sec: 2893.02 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:20:07,205 epoch 10 - iter 65/138 - loss 0.00834283 - time (sec): 3.79 - samples/sec: 2944.79 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:20:07,910 epoch 10 - iter 78/138 - loss 0.00826691 - time (sec): 4.50 - samples/sec: 2948.72 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:20:08,616 epoch 10 - iter 91/138 - loss 0.00802626 - time (sec): 5.21 - samples/sec: 2922.30 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:20:09,384 epoch 10 - iter 104/138 - loss 0.00935538 - time (sec): 5.97 - samples/sec: 2900.14 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:20:10,136 epoch 10 - iter 117/138 - loss 0.01226645 - time (sec): 6.73 - samples/sec: 2892.25 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 08:20:10,897 epoch 10 - iter 130/138 - loss 0.01177825 - time (sec): 7.49 - samples/sec: 2888.89 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 08:20:11,295 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:20:11,295 EPOCH 10 done: loss 0.0113 - lr: 0.000000 |
|
2023-10-13 08:20:12,004 DEV : loss 0.15443024039268494 - f1-score (micro avg) 0.8786 |
|
2023-10-13 08:20:12,332 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:20:12,333 Loading model from best epoch ... |
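Note on best-epoch selection: the `saving best model` lines above appear after epochs 1, 2, 3, 5, 6 and 7. The sketch below reproduces that pattern from the logged dev scores, assuming a checkpoint is written only when the dev micro F1 strictly exceeds the best value seen so far. That rule is an assumption inferred from the log, not a statement about Flair's internals.

```python
# Dev f1-score (micro avg) per epoch, copied from the DEV lines of this log.
dev_f1 = [0.6815, 0.8041, 0.8446, 0.8148, 0.8616,
          0.8674, 0.8816, 0.869, 0.8786, 0.8786]

best_score = float("-inf")
saved_epochs = []
for epoch, score in enumerate(dev_f1, start=1):
    # Assumed rule: save only on a strict improvement over the best so far.
    if score > best_score:
        best_score = score
        saved_epochs.append(epoch)

best_epoch = saved_epochs[-1]
print("checkpoints written after epochs:", saved_epochs)
print("best epoch:", best_epoch, "dev micro F1:", best_score)
```

Under this rule the model loaded for the final evaluation is the one from epoch 7, with a dev micro F1 of 0.8816.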
|
2023-10-13 08:20:13,698 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date |
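Note on the tag dictionary: the 25 tags are the `O` tag plus four position prefixes (`S`, `B`, `E`, `I`) for each of the six entity types, which is why the final `linear` layer in the model summary has `out_features=25`. A minimal sketch that rebuilds the list in the order printed above (the variable names are illustrative only):

```python
# Entity types and prefixes as they appear in the logged dictionary.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
prefixes = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" marks tokens outside any entity; every type gets one tag per prefix.
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```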
|
2023-10-13 08:20:14,577 |
|
Results: |
|
- F-score (micro) 0.9155 |
|
- F-score (macro) 0.806 |
|
- Accuracy 0.8564 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
       scope     0.8962    0.9318    0.9136       176 |
|
        pers     0.9609    0.9609    0.9609       128 |
|
        work     0.8732    0.8378    0.8552        74 |
|
         loc     0.6667    1.0000    0.8000         2 |
|
      object     0.5000    0.5000    0.5000         2 |
|
|
|
   micro avg     0.9096    0.9215    0.9155       382 |
|
   macro avg     0.7794    0.8461    0.8060       382 |
|
weighted avg     0.9102    0.9215    0.9154       382 |
|
|
|
2023-10-13 08:20:14,577 ---------------------------------------------------------------------------------------------------- |
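Note on the averages: the summary rows of the per-class table can be recomputed from the rows above them. The sketch below uses the rounded values printed in the log, so results agree with the logged figures only up to rounding. It takes the macro average as the unweighted mean of per-class F1, the weighted average as the support-weighted mean, and micro F1 as the harmonic mean of the logged micro precision and recall. The `date` type has no row in the table, so it is not part of any average here.

```python
# (precision, recall, f1, support) per class, copied from the table above.
report = {
    "scope": (0.8962, 0.9318, 0.9136, 176),
    "pers": (0.9609, 0.9609, 0.9609, 128),
    "work": (0.8732, 0.8378, 0.8552, 74),
    "loc": (0.6667, 1.0000, 0.8000, 2),
    "object": (0.5000, 0.5000, 0.5000, 2),
}

total_support = sum(support for _, _, _, support in report.values())

# Macro average: every class counts equally, regardless of its support.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in report.values()) / total_support

# Micro F1: harmonic mean of the logged micro-averaged precision and recall.
micro_precision, micro_recall = 0.9096, 0.9215
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support {total_support}")
print(f"macro F1 {macro_f1:.4f}  weighted F1 {weighted_f1:.4f}  micro F1 {micro_f1:.4f}")
```

The gap between micro F1 (0.9155) and macro F1 (0.806) comes mostly from `loc` and `object`, which have only 2 test entities each but carry the same weight as `scope` and `pers` in the macro average.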
|
|