2023-10-16 18:40:58,252 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:40:58,253 ----------------------------------------------------------------------------------------------------
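As a sanity check on the printed shapes, the parameter count of the BertEmbeddings block can be tallied directly (a minimal sketch; assumes bias-free Embedding layers and an affine LayerNorm, as PyTorch defines them, and ignores the dropout, which has no parameters):

```python
# Parameter tally for the BertEmbeddings block from the shapes printed above.
word_embeddings = 32001 * 768        # vocab size x hidden size
position_embeddings = 512 * 768      # max positions x hidden size
token_type_embeddings = 2 * 768      # segment types x hidden size
layer_norm = 768 + 768               # weight + bias (elementwise_affine=True)

total = word_embeddings + position_embeddings + token_type_embeddings + layer_norm
print(f"{total:,}")  # 24,973,056 parameters
```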
2023-10-16 18:40:58,253 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Train: 1166 sentences
2023-10-16 18:40:58,254 (train_with_dev=False, train_with_test=False)
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Training Params:
2023-10-16 18:40:58,254 - learning_rate: "3e-05"
2023-10-16 18:40:58,254 - mini_batch_size: "8"
2023-10-16 18:40:58,254 - max_epochs: "10"
2023-10-16 18:40:58,254 - shuffle: "True"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Plugins:
2023-10-16 18:40:58,254 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
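The per-iteration lr values in the training log below are consistent with this LinearScheduler: the learning rate warms up linearly from 0 to the peak 3e-05 over the first warmup_fraction of all optimizer steps, then decays linearly back to 0. A minimal sketch (assuming total_steps = max_epochs × 146 iterations per epoch; the function name is illustrative, not Flair's API):

```python
def linear_warmup_decay_lr(step, peak_lr=3e-05, total_steps=10 * 146, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 146 steps = all of epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Epoch 1 is pure warmup (lr climbs to ~3e-05 by its last iterations); epochs 2-10
# decay back toward 0, matching the lr column logged every 14 iterations below.
```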
2023-10-16 18:40:58,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:40:58,254 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Computation:
2023-10-16 18:40:58,254 - compute on device: cuda:0
2023-10-16 18:40:58,254 - embedding storage: none
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:59,717 epoch 1 - iter 14/146 - loss 2.97390099 - time (sec): 1.46 - samples/sec: 3017.24 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:41:00,900 epoch 1 - iter 28/146 - loss 2.77526217 - time (sec): 2.64 - samples/sec: 3044.65 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:41:02,535 epoch 1 - iter 42/146 - loss 2.37063522 - time (sec): 4.28 - samples/sec: 2990.83 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:41:04,316 epoch 1 - iter 56/146 - loss 1.94618735 - time (sec): 6.06 - samples/sec: 2861.03 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:41:05,651 epoch 1 - iter 70/146 - loss 1.73354724 - time (sec): 7.40 - samples/sec: 2856.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:41:07,365 epoch 1 - iter 84/146 - loss 1.56843020 - time (sec): 9.11 - samples/sec: 2831.84 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:41:08,780 epoch 1 - iter 98/146 - loss 1.41105109 - time (sec): 10.53 - samples/sec: 2856.75 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:41:10,204 epoch 1 - iter 112/146 - loss 1.27309307 - time (sec): 11.95 - samples/sec: 2885.22 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:11,681 epoch 1 - iter 126/146 - loss 1.16627535 - time (sec): 13.43 - samples/sec: 2901.13 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:12,960 epoch 1 - iter 140/146 - loss 1.08434747 - time (sec): 14.70 - samples/sec: 2924.63 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:13,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:13,475 EPOCH 1 done: loss 1.0599 - lr: 0.000029
2023-10-16 18:41:14,283 DEV : loss 0.22420375049114227 - f1-score (micro avg) 0.3689
2023-10-16 18:41:14,287 saving best model
2023-10-16 18:41:14,730 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:16,109 epoch 2 - iter 14/146 - loss 0.29776445 - time (sec): 1.38 - samples/sec: 3082.48 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:41:17,774 epoch 2 - iter 28/146 - loss 0.31816820 - time (sec): 3.04 - samples/sec: 3091.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:19,469 epoch 2 - iter 42/146 - loss 0.33197211 - time (sec): 4.74 - samples/sec: 2877.88 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:21,233 epoch 2 - iter 56/146 - loss 0.29603529 - time (sec): 6.50 - samples/sec: 2823.99 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:22,578 epoch 2 - iter 70/146 - loss 0.28356557 - time (sec): 7.85 - samples/sec: 2835.26 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:23,993 epoch 2 - iter 84/146 - loss 0.27591257 - time (sec): 9.26 - samples/sec: 2858.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:25,233 epoch 2 - iter 98/146 - loss 0.26858017 - time (sec): 10.50 - samples/sec: 2878.25 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:26,511 epoch 2 - iter 112/146 - loss 0.25364260 - time (sec): 11.78 - samples/sec: 2913.62 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:28,100 epoch 2 - iter 126/146 - loss 0.24500735 - time (sec): 13.37 - samples/sec: 2896.51 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:29,365 epoch 2 - iter 140/146 - loss 0.23870700 - time (sec): 14.63 - samples/sec: 2907.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:30,053 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:30,053 EPOCH 2 done: loss 0.2330 - lr: 0.000027
2023-10-16 18:41:31,652 DEV : loss 0.13390937447547913 - f1-score (micro avg) 0.6225
2023-10-16 18:41:31,657 saving best model
2023-10-16 18:41:32,194 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:33,744 epoch 3 - iter 14/146 - loss 0.10987697 - time (sec): 1.55 - samples/sec: 3085.26 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:35,595 epoch 3 - iter 28/146 - loss 0.12399653 - time (sec): 3.40 - samples/sec: 2866.15 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:36,764 epoch 3 - iter 42/146 - loss 0.13044732 - time (sec): 4.57 - samples/sec: 2925.68 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:37,927 epoch 3 - iter 56/146 - loss 0.12721141 - time (sec): 5.73 - samples/sec: 2937.70 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:39,222 epoch 3 - iter 70/146 - loss 0.12759325 - time (sec): 7.03 - samples/sec: 2956.17 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:40,773 epoch 3 - iter 84/146 - loss 0.12258261 - time (sec): 8.58 - samples/sec: 2989.78 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:42,438 epoch 3 - iter 98/146 - loss 0.12529590 - time (sec): 10.24 - samples/sec: 3008.63 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:43,950 epoch 3 - iter 112/146 - loss 0.12515723 - time (sec): 11.75 - samples/sec: 2983.18 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:45,318 epoch 3 - iter 126/146 - loss 0.12458492 - time (sec): 13.12 - samples/sec: 2974.87 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:46,842 epoch 3 - iter 140/146 - loss 0.12705098 - time (sec): 14.65 - samples/sec: 2933.52 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:47,343 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:47,343 EPOCH 3 done: loss 0.1269 - lr: 0.000024
2023-10-16 18:41:48,612 DEV : loss 0.1261664777994156 - f1-score (micro avg) 0.7047
2023-10-16 18:41:48,617 saving best model
2023-10-16 18:41:49,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:50,833 epoch 4 - iter 14/146 - loss 0.08426860 - time (sec): 1.64 - samples/sec: 3083.74 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:52,488 epoch 4 - iter 28/146 - loss 0.09754780 - time (sec): 3.29 - samples/sec: 2865.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:53,974 epoch 4 - iter 42/146 - loss 0.08349681 - time (sec): 4.78 - samples/sec: 2919.36 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:55,124 epoch 4 - iter 56/146 - loss 0.08357742 - time (sec): 5.93 - samples/sec: 2923.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:56,664 epoch 4 - iter 70/146 - loss 0.08252609 - time (sec): 7.47 - samples/sec: 2933.63 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:58,302 epoch 4 - iter 84/146 - loss 0.08292927 - time (sec): 9.11 - samples/sec: 2881.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:41:59,580 epoch 4 - iter 98/146 - loss 0.08204986 - time (sec): 10.39 - samples/sec: 2876.23 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:01,361 epoch 4 - iter 112/146 - loss 0.08113285 - time (sec): 12.17 - samples/sec: 2880.19 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:02,669 epoch 4 - iter 126/146 - loss 0.08204963 - time (sec): 13.47 - samples/sec: 2909.89 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:03,984 epoch 4 - iter 140/146 - loss 0.08229389 - time (sec): 14.79 - samples/sec: 2910.43 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:04,425 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:04,425 EPOCH 4 done: loss 0.0823 - lr: 0.000020
2023-10-16 18:42:05,739 DEV : loss 0.11765624582767487 - f1-score (micro avg) 0.7382
2023-10-16 18:42:05,746 saving best model
2023-10-16 18:42:06,266 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:07,846 epoch 5 - iter 14/146 - loss 0.05045388 - time (sec): 1.58 - samples/sec: 2745.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:09,412 epoch 5 - iter 28/146 - loss 0.04670668 - time (sec): 3.14 - samples/sec: 2646.53 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:10,934 epoch 5 - iter 42/146 - loss 0.04342810 - time (sec): 4.66 - samples/sec: 2726.93 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:12,625 epoch 5 - iter 56/146 - loss 0.05072506 - time (sec): 6.35 - samples/sec: 2831.43 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:14,211 epoch 5 - iter 70/146 - loss 0.04968675 - time (sec): 7.94 - samples/sec: 2863.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:15,525 epoch 5 - iter 84/146 - loss 0.05401489 - time (sec): 9.25 - samples/sec: 2881.42 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:17,136 epoch 5 - iter 98/146 - loss 0.05351693 - time (sec): 10.87 - samples/sec: 2916.64 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:18,316 epoch 5 - iter 112/146 - loss 0.05569047 - time (sec): 12.04 - samples/sec: 2930.09 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:19,708 epoch 5 - iter 126/146 - loss 0.05757850 - time (sec): 13.44 - samples/sec: 2926.24 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:20,948 epoch 5 - iter 140/146 - loss 0.05972547 - time (sec): 14.68 - samples/sec: 2944.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:21,431 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:21,431 EPOCH 5 done: loss 0.0601 - lr: 0.000017
2023-10-16 18:42:22,719 DEV : loss 0.10323068499565125 - f1-score (micro avg) 0.7639
2023-10-16 18:42:22,724 saving best model
2023-10-16 18:42:23,609 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:24,951 epoch 6 - iter 14/146 - loss 0.06420642 - time (sec): 1.34 - samples/sec: 3160.55 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:26,500 epoch 6 - iter 28/146 - loss 0.05228968 - time (sec): 2.89 - samples/sec: 3171.17 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:27,809 epoch 6 - iter 42/146 - loss 0.04666758 - time (sec): 4.20 - samples/sec: 3131.77 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:29,236 epoch 6 - iter 56/146 - loss 0.04726276 - time (sec): 5.62 - samples/sec: 3044.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:30,757 epoch 6 - iter 70/146 - loss 0.04494478 - time (sec): 7.14 - samples/sec: 3007.68 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:32,150 epoch 6 - iter 84/146 - loss 0.04396196 - time (sec): 8.54 - samples/sec: 2974.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:33,587 epoch 6 - iter 98/146 - loss 0.04190146 - time (sec): 9.97 - samples/sec: 3005.86 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:34,981 epoch 6 - iter 112/146 - loss 0.04097606 - time (sec): 11.37 - samples/sec: 3021.65 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:36,451 epoch 6 - iter 126/146 - loss 0.04174709 - time (sec): 12.84 - samples/sec: 3030.85 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:37,774 epoch 6 - iter 140/146 - loss 0.04259495 - time (sec): 14.16 - samples/sec: 3013.59 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:38,410 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:38,411 EPOCH 6 done: loss 0.0429 - lr: 0.000014
2023-10-16 18:42:39,666 DEV : loss 0.12086369842290878 - f1-score (micro avg) 0.7331
2023-10-16 18:42:39,671 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:41,060 epoch 7 - iter 14/146 - loss 0.03536041 - time (sec): 1.39 - samples/sec: 3083.88 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:42,338 epoch 7 - iter 28/146 - loss 0.03135766 - time (sec): 2.67 - samples/sec: 3113.18 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:43,772 epoch 7 - iter 42/146 - loss 0.02784114 - time (sec): 4.10 - samples/sec: 3139.52 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:45,298 epoch 7 - iter 56/146 - loss 0.02751308 - time (sec): 5.63 - samples/sec: 3096.78 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:46,664 epoch 7 - iter 70/146 - loss 0.02605524 - time (sec): 6.99 - samples/sec: 3045.10 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:48,465 epoch 7 - iter 84/146 - loss 0.03041736 - time (sec): 8.79 - samples/sec: 2978.93 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:49,866 epoch 7 - iter 98/146 - loss 0.02942324 - time (sec): 10.19 - samples/sec: 2977.37 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:51,429 epoch 7 - iter 112/146 - loss 0.03021149 - time (sec): 11.76 - samples/sec: 2934.42 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:52,967 epoch 7 - iter 126/146 - loss 0.03017723 - time (sec): 13.30 - samples/sec: 2911.90 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:54,512 epoch 7 - iter 140/146 - loss 0.03237412 - time (sec): 14.84 - samples/sec: 2895.37 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:55,004 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:55,005 EPOCH 7 done: loss 0.0319 - lr: 0.000010
2023-10-16 18:42:56,306 DEV : loss 0.12415074557065964 - f1-score (micro avg) 0.7716
2023-10-16 18:42:56,312 saving best model
2023-10-16 18:42:56,844 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:58,269 epoch 8 - iter 14/146 - loss 0.01649001 - time (sec): 1.42 - samples/sec: 3042.84 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:59,833 epoch 8 - iter 28/146 - loss 0.01761338 - time (sec): 2.99 - samples/sec: 2985.23 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:01,519 epoch 8 - iter 42/146 - loss 0.02810541 - time (sec): 4.67 - samples/sec: 2933.19 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:02,724 epoch 8 - iter 56/146 - loss 0.02894650 - time (sec): 5.88 - samples/sec: 2883.58 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:04,191 epoch 8 - iter 70/146 - loss 0.02727455 - time (sec): 7.34 - samples/sec: 2927.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:05,867 epoch 8 - iter 84/146 - loss 0.02891311 - time (sec): 9.02 - samples/sec: 2917.72 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:07,161 epoch 8 - iter 98/146 - loss 0.02597422 - time (sec): 10.31 - samples/sec: 2963.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:08,650 epoch 8 - iter 112/146 - loss 0.02727534 - time (sec): 11.80 - samples/sec: 2946.61 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:10,154 epoch 8 - iter 126/146 - loss 0.02628321 - time (sec): 13.31 - samples/sec: 2947.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,406 epoch 8 - iter 140/146 - loss 0.02579516 - time (sec): 14.56 - samples/sec: 2952.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,909 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:11,909 EPOCH 8 done: loss 0.0259 - lr: 0.000007
2023-10-16 18:43:13,196 DEV : loss 0.1355086863040924 - f1-score (micro avg) 0.7191
2023-10-16 18:43:13,202 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:14,688 epoch 9 - iter 14/146 - loss 0.05423101 - time (sec): 1.48 - samples/sec: 3174.47 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:16,439 epoch 9 - iter 28/146 - loss 0.04240930 - time (sec): 3.24 - samples/sec: 2894.49 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:17,996 epoch 9 - iter 42/146 - loss 0.03263167 - time (sec): 4.79 - samples/sec: 2719.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:19,384 epoch 9 - iter 56/146 - loss 0.02854858 - time (sec): 6.18 - samples/sec: 2772.88 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:20,598 epoch 9 - iter 70/146 - loss 0.02688335 - time (sec): 7.39 - samples/sec: 2826.17 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:21,940 epoch 9 - iter 84/146 - loss 0.02662945 - time (sec): 8.74 - samples/sec: 2861.28 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:23,486 epoch 9 - iter 98/146 - loss 0.02365962 - time (sec): 10.28 - samples/sec: 2875.77 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:24,796 epoch 9 - iter 112/146 - loss 0.02270041 - time (sec): 11.59 - samples/sec: 2868.17 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:26,452 epoch 9 - iter 126/146 - loss 0.02197911 - time (sec): 13.25 - samples/sec: 2846.24 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,011 epoch 9 - iter 140/146 - loss 0.02169160 - time (sec): 14.81 - samples/sec: 2863.71 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,594 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:28,595 EPOCH 9 done: loss 0.0215 - lr: 0.000004
2023-10-16 18:43:29,856 DEV : loss 0.13685813546180725 - f1-score (micro avg) 0.7484
2023-10-16 18:43:29,862 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:31,152 epoch 10 - iter 14/146 - loss 0.02494954 - time (sec): 1.29 - samples/sec: 2857.52 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:32,476 epoch 10 - iter 28/146 - loss 0.01602085 - time (sec): 2.61 - samples/sec: 3020.65 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:33,794 epoch 10 - iter 42/146 - loss 0.01621527 - time (sec): 3.93 - samples/sec: 3066.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:35,570 epoch 10 - iter 56/146 - loss 0.01678019 - time (sec): 5.71 - samples/sec: 2969.94 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:37,094 epoch 10 - iter 70/146 - loss 0.01639143 - time (sec): 7.23 - samples/sec: 3007.77 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:38,504 epoch 10 - iter 84/146 - loss 0.01946257 - time (sec): 8.64 - samples/sec: 3023.29 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:39,791 epoch 10 - iter 98/146 - loss 0.01874862 - time (sec): 9.93 - samples/sec: 3038.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:41,180 epoch 10 - iter 112/146 - loss 0.01865213 - time (sec): 11.32 - samples/sec: 3027.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:42,660 epoch 10 - iter 126/146 - loss 0.01993260 - time (sec): 12.80 - samples/sec: 3034.99 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:44,108 epoch 10 - iter 140/146 - loss 0.01869978 - time (sec): 14.25 - samples/sec: 3042.21 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:43:44,558 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:44,558 EPOCH 10 done: loss 0.0183 - lr: 0.000000
2023-10-16 18:43:45,833 DEV : loss 0.14017795026302338 - f1-score (micro avg) 0.742
2023-10-16 18:43:46,218 ----------------------------------------------------------------------------------------------------
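The checkpoint loaded next, best-model.pt, is simply the epoch with the highest dev micro-F1 among the ten DEV lines above (epoch 7, 0.7716); the selection amounts to an argmax over the logged scores. A sketch with the values copied from this log:

```python
# Dev micro-F1 per epoch, copied from the DEV lines in the log above.
dev_f1 = [0.3689, 0.6225, 0.7047, 0.7382, 0.7639, 0.7331, 0.7716, 0.7191, 0.7484, 0.742]

# "saving best model" fires whenever this running maximum improves;
# the last such epoch is the one restored as best-model.pt.
best_epoch = max(range(len(dev_f1)), key=dev_f1.__getitem__) + 1  # epochs are 1-based
print(best_epoch, dev_f1[best_epoch - 1])  # 7 0.7716
```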
2023-10-16 18:43:46,219 Loading model from best epoch ...
2023-10-16 18:43:47,826 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
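The 17-tag dictionary is the BIOES encoding of the corpus's four entity types plus the outside tag: 4 types × 4 positional prefixes (S-ingle, B-egin, E-nd, I-nside) + O. A sketch reproducing it:

```python
# BIOES tag set for the four NewsEye entity types, in the order printed above.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in "SBEI"]

print(len(tags))  # 17
```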
2023-10-16 18:43:50,204
Results:
- F-score (micro) 0.7545
- F-score (macro) 0.6678
- Accuracy 0.6271

By class:
              precision    recall  f1-score   support

         PER     0.8027    0.8420    0.8219       348
         LOC     0.6707    0.8429    0.7470       261
         ORG     0.3500    0.4038    0.3750        52
   HumanProd     0.7273    0.7273    0.7273        22

   micro avg     0.7097    0.8053    0.7545       683
   macro avg     0.6377    0.7040    0.6678       683
weighted avg     0.7154    0.8053    0.7562       683
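The summary rows can be re-derived from the per-class rows: the macro average is the unweighted mean over classes, the weighted average weights each class by its support, and the micro F1 is the harmonic mean of the micro precision and recall. A sketch using the figures from the table:

```python
# (precision, recall, f1, support) per class, copied from the table above.
rows = {
    "PER":       (0.8027, 0.8420, 0.8219, 348),
    "LOC":       (0.6707, 0.8429, 0.7470, 261),
    "ORG":       (0.3500, 0.4038, 0.3750,  52),
    "HumanProd": (0.7273, 0.7273, 0.7273,  22),
}

macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)          # plain mean
total = sum(n for _, _, _, n in rows.values())                           # 683 entities
weighted_f1 = sum(f1 * n for _, _, f1, n in rows.values()) / total       # support-weighted

micro_p, micro_r = 0.7097, 0.8053
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)                   # harmonic mean

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))  # 0.6678 0.7562 0.7545
```

Note how the low-support ORG class drags the macro F1 well below the micro F1, which is dominated by PER and LOC.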
2023-10-16 18:43:50,205 ----------------------------------------------------------------------------------------------------