2023-10-19 19:46:38,013 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,013 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 19:46:38,013 ----------------------------------------------------------------------------------------------------
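As a sanity check on the architecture dump, the parameter count can be recomputed from the printed shapes alone. A quick sketch (the helper and the grouping are ours, derived only from the Embedding/Linear/LayerNorm shapes above):

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Parameters of a Linear layer with bias: weight matrix + bias vector."""
    return n_in * n_out + n_out

# Embedding block: word (32001x128) + position (512x128) + token type (2x128)
# + LayerNorm (gamma and beta, 128 each).
embeddings = 32001 * 128 + 512 * 128 + 2 * 128 + 2 * 128

# One BertLayer, per the dump above.
per_layer = (
    3 * linear_params(128, 128)   # query, key, value
    + linear_params(128, 128)     # self-attention output dense
    + 2 * 2 * 128                 # two LayerNorms
    + linear_params(128, 512)     # intermediate (feed-forward up)
    + linear_params(512, 128)     # output (feed-forward down)
)

total = (embeddings
         + 2 * per_layer               # (0-1): 2 x BertLayer
         + linear_params(128, 128)     # pooler dense
         + linear_params(128, 17))     # tagging head (17 tags)
print(total)  # -> 4577425, i.e. ~4.6M parameters
```

Almost all of the budget sits in the word-embedding table (32001 x 128 ≈ 4.1M of the ~4.6M total), as expected for a "tiny" BERT.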
2023-10-19 19:46:38,013 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 Train: 7142 sentences
2023-10-19 19:46:38,014 (train_with_dev=False, train_with_test=False)
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 Training Params:
2023-10-19 19:46:38,014 - learning_rate: "5e-05"
2023-10-19 19:46:38,014 - mini_batch_size: "8"
2023-10-19 19:46:38,014 - max_epochs: "10"
2023-10-19 19:46:38,014 - shuffle: "True"
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 Plugins:
2023-10-19 19:46:38,014 - TensorboardLogger
2023-10-19 19:46:38,014 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
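The learning-rate column below is consistent with a one-cycle linear schedule: warmup over the first `warmup_fraction` of all steps (10 epochs x 893 iterations = 8930 steps, so 893 warmup steps, i.e. exactly epoch 1), then linear decay to zero. A minimal sketch of that shape (constants read off this log; an illustration of the schedule, not Flair's actual LinearScheduler implementation):

```python
def linear_lr(step: int, peak_lr: float = 5e-5,
              total_steps: int = 10 * 893, warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 893 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot-check against the lr values logged below:
assert abs(linear_lr(89) - 0.000005) < 1e-6        # epoch 1, iter 89
assert abs(linear_lr(893 + 89) - 0.000049) < 1e-6  # epoch 2, iter 89
assert linear_lr(10 * 893) == 0.0                  # lr reaches zero at the last step
```

This explains why the logged lr climbs through epoch 1 and then decays toward 0.000000 by the end of epoch 10.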
2023-10-19 19:46:38,014 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 19:46:38,014 - metric: "('micro avg', 'f1-score')"
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 Computation:
2023-10-19 19:46:38,014 - compute on device: cuda:0
2023-10-19 19:46:38,014 - embedding storage: none
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:38,014 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 19:46:40,369 epoch 1 - iter 89/893 - loss 3.37640508 - time (sec): 2.35 - samples/sec: 11373.19 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:46:42,695 epoch 1 - iter 178/893 - loss 3.07826801 - time (sec): 4.68 - samples/sec: 10997.16 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:46:45,034 epoch 1 - iter 267/893 - loss 2.58529612 - time (sec): 7.02 - samples/sec: 10981.00 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:46:47,400 epoch 1 - iter 356/893 - loss 2.20691769 - time (sec): 9.39 - samples/sec: 10658.82 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:46:49,719 epoch 1 - iter 445/893 - loss 1.91781584 - time (sec): 11.70 - samples/sec: 10673.16 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:46:51,911 epoch 1 - iter 534/893 - loss 1.71304019 - time (sec): 13.90 - samples/sec: 10749.52 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:46:54,158 epoch 1 - iter 623/893 - loss 1.56017559 - time (sec): 16.14 - samples/sec: 10794.44 - lr: 0.000035 - momentum: 0.000000
2023-10-19 19:46:56,509 epoch 1 - iter 712/893 - loss 1.43835721 - time (sec): 18.49 - samples/sec: 10850.05 - lr: 0.000040 - momentum: 0.000000
2023-10-19 19:46:58,764 epoch 1 - iter 801/893 - loss 1.33819693 - time (sec): 20.75 - samples/sec: 10911.41 - lr: 0.000045 - momentum: 0.000000
2023-10-19 19:47:00,987 epoch 1 - iter 890/893 - loss 1.26187902 - time (sec): 22.97 - samples/sec: 10805.70 - lr: 0.000050 - momentum: 0.000000
2023-10-19 19:47:01,055 ----------------------------------------------------------------------------------------------------
2023-10-19 19:47:01,055 EPOCH 1 done: loss 1.2607 - lr: 0.000050
2023-10-19 19:47:02,455 DEV : loss 0.3137783408164978 - f1-score (micro avg) 0.1708
2023-10-19 19:47:02,469 saving best model
2023-10-19 19:47:02,503 ----------------------------------------------------------------------------------------------------
2023-10-19 19:47:04,794 epoch 2 - iter 89/893 - loss 0.48754615 - time (sec): 2.29 - samples/sec: 10644.42 - lr: 0.000049 - momentum: 0.000000
2023-10-19 19:47:07,041 epoch 2 - iter 178/893 - loss 0.44684462 - time (sec): 4.54 - samples/sec: 11011.34 - lr: 0.000049 - momentum: 0.000000
2023-10-19 19:47:09,306 epoch 2 - iter 267/893 - loss 0.45018132 - time (sec): 6.80 - samples/sec: 11075.93 - lr: 0.000048 - momentum: 0.000000
2023-10-19 19:47:11,571 epoch 2 - iter 356/893 - loss 0.44058455 - time (sec): 9.07 - samples/sec: 11136.12 - lr: 0.000048 - momentum: 0.000000
2023-10-19 19:47:13,756 epoch 2 - iter 445/893 - loss 0.43272212 - time (sec): 11.25 - samples/sec: 10964.27 - lr: 0.000047 - momentum: 0.000000
2023-10-19 19:47:16,040 epoch 2 - iter 534/893 - loss 0.42221248 - time (sec): 13.54 - samples/sec: 10979.81 - lr: 0.000047 - momentum: 0.000000
2023-10-19 19:47:18,305 epoch 2 - iter 623/893 - loss 0.41838997 - time (sec): 15.80 - samples/sec: 10980.82 - lr: 0.000046 - momentum: 0.000000
2023-10-19 19:47:20,601 epoch 2 - iter 712/893 - loss 0.41881081 - time (sec): 18.10 - samples/sec: 11031.93 - lr: 0.000046 - momentum: 0.000000
2023-10-19 19:47:22,812 epoch 2 - iter 801/893 - loss 0.41420760 - time (sec): 20.31 - samples/sec: 11005.07 - lr: 0.000045 - momentum: 0.000000
2023-10-19 19:47:25,062 epoch 2 - iter 890/893 - loss 0.40871779 - time (sec): 22.56 - samples/sec: 10995.98 - lr: 0.000044 - momentum: 0.000000
2023-10-19 19:47:25,132 ----------------------------------------------------------------------------------------------------
2023-10-19 19:47:25,132 EPOCH 2 done: loss 0.4086 - lr: 0.000044
2023-10-19 19:47:27,987 DEV : loss 0.24208439886569977 - f1-score (micro avg) 0.3709
2023-10-19 19:47:28,001 saving best model
2023-10-19 19:47:28,032 ----------------------------------------------------------------------------------------------------
2023-10-19 19:47:30,273 epoch 3 - iter 89/893 - loss 0.32721434 - time (sec): 2.24 - samples/sec: 10508.71 - lr: 0.000044 - momentum: 0.000000
2023-10-19 19:47:32,458 epoch 3 - iter 178/893 - loss 0.33857062 - time (sec): 4.43 - samples/sec: 10755.95 - lr: 0.000043 - momentum: 0.000000
2023-10-19 19:47:34,698 epoch 3 - iter 267/893 - loss 0.35063974 - time (sec): 6.67 - samples/sec: 10780.80 - lr: 0.000043 - momentum: 0.000000
2023-10-19 19:47:36,942 epoch 3 - iter 356/893 - loss 0.34219793 - time (sec): 8.91 - samples/sec: 10822.72 - lr: 0.000042 - momentum: 0.000000
2023-10-19 19:47:39,216 epoch 3 - iter 445/893 - loss 0.34135107 - time (sec): 11.18 - samples/sec: 10795.37 - lr: 0.000042 - momentum: 0.000000
2023-10-19 19:47:41,497 epoch 3 - iter 534/893 - loss 0.33949812 - time (sec): 13.46 - samples/sec: 10936.08 - lr: 0.000041 - momentum: 0.000000
2023-10-19 19:47:43,746 epoch 3 - iter 623/893 - loss 0.33616037 - time (sec): 15.71 - samples/sec: 10893.89 - lr: 0.000041 - momentum: 0.000000
2023-10-19 19:47:45,924 epoch 3 - iter 712/893 - loss 0.33589533 - time (sec): 17.89 - samples/sec: 11000.90 - lr: 0.000040 - momentum: 0.000000
2023-10-19 19:47:48,205 epoch 3 - iter 801/893 - loss 0.32891389 - time (sec): 20.17 - samples/sec: 11078.12 - lr: 0.000039 - momentum: 0.000000
2023-10-19 19:47:50,605 epoch 3 - iter 890/893 - loss 0.32724615 - time (sec): 22.57 - samples/sec: 10983.70 - lr: 0.000039 - momentum: 0.000000
2023-10-19 19:47:50,681 ----------------------------------------------------------------------------------------------------
2023-10-19 19:47:50,682 EPOCH 3 done: loss 0.3274 - lr: 0.000039
2023-10-19 19:47:53,048 DEV : loss 0.21458815038204193 - f1-score (micro avg) 0.4403
2023-10-19 19:47:53,062 saving best model
2023-10-19 19:47:53,096 ----------------------------------------------------------------------------------------------------
2023-10-19 19:47:55,666 epoch 4 - iter 89/893 - loss 0.31930568 - time (sec): 2.57 - samples/sec: 9137.78 - lr: 0.000038 - momentum: 0.000000
2023-10-19 19:47:57,933 epoch 4 - iter 178/893 - loss 0.30629343 - time (sec): 4.84 - samples/sec: 9979.44 - lr: 0.000038 - momentum: 0.000000
2023-10-19 19:48:00,204 epoch 4 - iter 267/893 - loss 0.30752114 - time (sec): 7.11 - samples/sec: 10470.91 - lr: 0.000037 - momentum: 0.000000
2023-10-19 19:48:02,457 epoch 4 - iter 356/893 - loss 0.30666558 - time (sec): 9.36 - samples/sec: 10340.18 - lr: 0.000037 - momentum: 0.000000
2023-10-19 19:48:04,731 epoch 4 - iter 445/893 - loss 0.30167364 - time (sec): 11.63 - samples/sec: 10404.87 - lr: 0.000036 - momentum: 0.000000
2023-10-19 19:48:07,007 epoch 4 - iter 534/893 - loss 0.30292471 - time (sec): 13.91 - samples/sec: 10505.98 - lr: 0.000036 - momentum: 0.000000
2023-10-19 19:48:09,367 epoch 4 - iter 623/893 - loss 0.29585318 - time (sec): 16.27 - samples/sec: 10565.88 - lr: 0.000035 - momentum: 0.000000
2023-10-19 19:48:11,669 epoch 4 - iter 712/893 - loss 0.29505339 - time (sec): 18.57 - samples/sec: 10615.90 - lr: 0.000034 - momentum: 0.000000
2023-10-19 19:48:14,082 epoch 4 - iter 801/893 - loss 0.29152164 - time (sec): 20.98 - samples/sec: 10611.08 - lr: 0.000034 - momentum: 0.000000
2023-10-19 19:48:16,356 epoch 4 - iter 890/893 - loss 0.28915070 - time (sec): 23.26 - samples/sec: 10667.55 - lr: 0.000033 - momentum: 0.000000
2023-10-19 19:48:16,433 ----------------------------------------------------------------------------------------------------
2023-10-19 19:48:16,433 EPOCH 4 done: loss 0.2894 - lr: 0.000033
2023-10-19 19:48:18,802 DEV : loss 0.1998962014913559 - f1-score (micro avg) 0.4765
2023-10-19 19:48:18,816 saving best model
2023-10-19 19:48:18,849 ----------------------------------------------------------------------------------------------------
2023-10-19 19:48:21,129 epoch 5 - iter 89/893 - loss 0.25613882 - time (sec): 2.28 - samples/sec: 11164.30 - lr: 0.000033 - momentum: 0.000000
2023-10-19 19:48:23,397 epoch 5 - iter 178/893 - loss 0.27252700 - time (sec): 4.55 - samples/sec: 11100.29 - lr: 0.000032 - momentum: 0.000000
2023-10-19 19:48:25,700 epoch 5 - iter 267/893 - loss 0.27336272 - time (sec): 6.85 - samples/sec: 11156.33 - lr: 0.000032 - momentum: 0.000000
2023-10-19 19:48:27,927 epoch 5 - iter 356/893 - loss 0.27685354 - time (sec): 9.08 - samples/sec: 10868.80 - lr: 0.000031 - momentum: 0.000000
2023-10-19 19:48:30,230 epoch 5 - iter 445/893 - loss 0.27440157 - time (sec): 11.38 - samples/sec: 10863.36 - lr: 0.000031 - momentum: 0.000000
2023-10-19 19:48:32,485 epoch 5 - iter 534/893 - loss 0.27014232 - time (sec): 13.64 - samples/sec: 10901.27 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:48:34,732 epoch 5 - iter 623/893 - loss 0.26848829 - time (sec): 15.88 - samples/sec: 10920.06 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:48:36,997 epoch 5 - iter 712/893 - loss 0.26557942 - time (sec): 18.15 - samples/sec: 11008.98 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:48:39,195 epoch 5 - iter 801/893 - loss 0.26457269 - time (sec): 20.35 - samples/sec: 11029.48 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:48:41,498 epoch 5 - iter 890/893 - loss 0.26123523 - time (sec): 22.65 - samples/sec: 10960.40 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:48:41,560 ----------------------------------------------------------------------------------------------------
2023-10-19 19:48:41,560 EPOCH 5 done: loss 0.2615 - lr: 0.000028
2023-10-19 19:48:44,453 DEV : loss 0.1972755491733551 - f1-score (micro avg) 0.4981
2023-10-19 19:48:44,467 saving best model
2023-10-19 19:48:44,501 ----------------------------------------------------------------------------------------------------
2023-10-19 19:48:46,848 epoch 6 - iter 89/893 - loss 0.24100338 - time (sec): 2.35 - samples/sec: 10514.47 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:48:49,063 epoch 6 - iter 178/893 - loss 0.23380207 - time (sec): 4.56 - samples/sec: 11087.19 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:48:51,355 epoch 6 - iter 267/893 - loss 0.23677290 - time (sec): 6.85 - samples/sec: 11100.58 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:48:53,695 epoch 6 - iter 356/893 - loss 0.24138562 - time (sec): 9.19 - samples/sec: 10936.16 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:48:56,034 epoch 6 - iter 445/893 - loss 0.24357171 - time (sec): 11.53 - samples/sec: 10970.53 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:48:58,314 epoch 6 - iter 534/893 - loss 0.24264433 - time (sec): 13.81 - samples/sec: 10920.65 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:49:00,559 epoch 6 - iter 623/893 - loss 0.24318955 - time (sec): 16.06 - samples/sec: 10850.02 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:49:02,912 epoch 6 - iter 712/893 - loss 0.24272678 - time (sec): 18.41 - samples/sec: 10794.09 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:49:05,159 epoch 6 - iter 801/893 - loss 0.23973048 - time (sec): 20.66 - samples/sec: 10805.29 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:49:07,493 epoch 6 - iter 890/893 - loss 0.24145450 - time (sec): 22.99 - samples/sec: 10797.15 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:49:07,570 ----------------------------------------------------------------------------------------------------
2023-10-19 19:49:07,570 EPOCH 6 done: loss 0.2419 - lr: 0.000022
2023-10-19 19:49:09,955 DEV : loss 0.19272524118423462 - f1-score (micro avg) 0.516
2023-10-19 19:49:09,970 saving best model
2023-10-19 19:49:10,008 ----------------------------------------------------------------------------------------------------
2023-10-19 19:49:12,723 epoch 7 - iter 89/893 - loss 0.22796823 - time (sec): 2.71 - samples/sec: 8518.77 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:49:14,969 epoch 7 - iter 178/893 - loss 0.23136940 - time (sec): 4.96 - samples/sec: 9701.00 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:49:17,252 epoch 7 - iter 267/893 - loss 0.22245665 - time (sec): 7.24 - samples/sec: 10135.25 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:49:19,552 epoch 7 - iter 356/893 - loss 0.22754160 - time (sec): 9.54 - samples/sec: 10250.16 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:49:21,900 epoch 7 - iter 445/893 - loss 0.22918453 - time (sec): 11.89 - samples/sec: 10353.62 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:49:24,340 epoch 7 - iter 534/893 - loss 0.22544731 - time (sec): 14.33 - samples/sec: 10390.19 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:49:26,677 epoch 7 - iter 623/893 - loss 0.22755112 - time (sec): 16.67 - samples/sec: 10437.07 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:49:28,977 epoch 7 - iter 712/893 - loss 0.22920826 - time (sec): 18.97 - samples/sec: 10434.60 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:49:31,322 epoch 7 - iter 801/893 - loss 0.22920268 - time (sec): 21.31 - samples/sec: 10436.99 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:49:33,503 epoch 7 - iter 890/893 - loss 0.23055208 - time (sec): 23.49 - samples/sec: 10561.79 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:49:33,573 ----------------------------------------------------------------------------------------------------
2023-10-19 19:49:33,573 EPOCH 7 done: loss 0.2301 - lr: 0.000017
2023-10-19 19:49:35,969 DEV : loss 0.18620598316192627 - f1-score (micro avg) 0.5239
2023-10-19 19:49:35,983 saving best model
2023-10-19 19:49:36,015 ----------------------------------------------------------------------------------------------------
2023-10-19 19:49:38,247 epoch 8 - iter 89/893 - loss 0.22861556 - time (sec): 2.23 - samples/sec: 11349.33 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:49:40,516 epoch 8 - iter 178/893 - loss 0.22939806 - time (sec): 4.50 - samples/sec: 11160.25 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:49:42,859 epoch 8 - iter 267/893 - loss 0.22523571 - time (sec): 6.84 - samples/sec: 10745.49 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:49:45,145 epoch 8 - iter 356/893 - loss 0.21832618 - time (sec): 9.13 - samples/sec: 10843.98 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:49:47,397 epoch 8 - iter 445/893 - loss 0.22014114 - time (sec): 11.38 - samples/sec: 10815.63 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:49:49,623 epoch 8 - iter 534/893 - loss 0.21661746 - time (sec): 13.61 - samples/sec: 11040.16 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:49:51,768 epoch 8 - iter 623/893 - loss 0.21759510 - time (sec): 15.75 - samples/sec: 10999.95 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:49:54,097 epoch 8 - iter 712/893 - loss 0.21475664 - time (sec): 18.08 - samples/sec: 10978.49 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:49:56,368 epoch 8 - iter 801/893 - loss 0.21609813 - time (sec): 20.35 - samples/sec: 11024.99 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:49:58,703 epoch 8 - iter 890/893 - loss 0.21775578 - time (sec): 22.69 - samples/sec: 10919.33 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:49:58,779 ----------------------------------------------------------------------------------------------------
2023-10-19 19:49:58,779 EPOCH 8 done: loss 0.2172 - lr: 0.000011
2023-10-19 19:50:01,670 DEV : loss 0.18776430189609528 - f1-score (micro avg) 0.5486
2023-10-19 19:50:01,685 saving best model
2023-10-19 19:50:01,718 ----------------------------------------------------------------------------------------------------
2023-10-19 19:50:04,038 epoch 9 - iter 89/893 - loss 0.21006407 - time (sec): 2.32 - samples/sec: 10497.41 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:50:06,373 epoch 9 - iter 178/893 - loss 0.22112362 - time (sec): 4.65 - samples/sec: 10482.48 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:50:08,623 epoch 9 - iter 267/893 - loss 0.20970277 - time (sec): 6.91 - samples/sec: 10544.38 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:50:10,913 epoch 9 - iter 356/893 - loss 0.21231648 - time (sec): 9.20 - samples/sec: 10709.98 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:50:13,170 epoch 9 - iter 445/893 - loss 0.21024936 - time (sec): 11.45 - samples/sec: 10693.91 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:50:15,419 epoch 9 - iter 534/893 - loss 0.21014782 - time (sec): 13.70 - samples/sec: 10849.46 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:50:17,678 epoch 9 - iter 623/893 - loss 0.21038853 - time (sec): 15.96 - samples/sec: 10786.72 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:50:20,022 epoch 9 - iter 712/893 - loss 0.20926168 - time (sec): 18.30 - samples/sec: 10819.07 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:50:22,302 epoch 9 - iter 801/893 - loss 0.21189301 - time (sec): 20.58 - samples/sec: 10819.48 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:50:24,507 epoch 9 - iter 890/893 - loss 0.20858457 - time (sec): 22.79 - samples/sec: 10877.76 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:50:24,575 ----------------------------------------------------------------------------------------------------
2023-10-19 19:50:24,575 EPOCH 9 done: loss 0.2085 - lr: 0.000006
2023-10-19 19:50:26,942 DEV : loss 0.18552501499652863 - f1-score (micro avg) 0.538
2023-10-19 19:50:26,958 ----------------------------------------------------------------------------------------------------
2023-10-19 19:50:29,763 epoch 10 - iter 89/893 - loss 0.21678179 - time (sec): 2.81 - samples/sec: 8898.69 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:50:32,002 epoch 10 - iter 178/893 - loss 0.21145976 - time (sec): 5.04 - samples/sec: 9977.49 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:50:34,285 epoch 10 - iter 267/893 - loss 0.21440312 - time (sec): 7.33 - samples/sec: 10370.80 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:50:36,505 epoch 10 - iter 356/893 - loss 0.20785955 - time (sec): 9.55 - samples/sec: 10407.62 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:50:38,637 epoch 10 - iter 445/893 - loss 0.21105629 - time (sec): 11.68 - samples/sec: 10548.47 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:50:40,903 epoch 10 - iter 534/893 - loss 0.21185853 - time (sec): 13.94 - samples/sec: 10608.63 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:50:43,177 epoch 10 - iter 623/893 - loss 0.21162765 - time (sec): 16.22 - samples/sec: 10659.71 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:50:45,454 epoch 10 - iter 712/893 - loss 0.20746545 - time (sec): 18.50 - samples/sec: 10715.57 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:50:47,720 epoch 10 - iter 801/893 - loss 0.20692386 - time (sec): 20.76 - samples/sec: 10705.21 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:50:49,986 epoch 10 - iter 890/893 - loss 0.20621958 - time (sec): 23.03 - samples/sec: 10778.66 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:50:50,062 ----------------------------------------------------------------------------------------------------
2023-10-19 19:50:50,063 EPOCH 10 done: loss 0.2062 - lr: 0.000000
2023-10-19 19:50:52,460 DEV : loss 0.184996098279953 - f1-score (micro avg) 0.5351
2023-10-19 19:50:52,503 ----------------------------------------------------------------------------------------------------
2023-10-19 19:50:52,504 Loading model from best epoch ...
2023-10-19 19:50:52,583 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 19:50:57,158
Results:
- F-score (micro) 0.4218
- F-score (macro) 0.263
- Accuracy 0.2766

By class:
              precision    recall  f1-score   support

         LOC     0.4249    0.4959    0.4576      1095
         PER     0.4409    0.4901    0.4642      1012
         ORG     0.1827    0.1008    0.1300       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.4135    0.4305    0.4218      2497
   macro avg     0.2621    0.2717    0.2630      2497
weighted avg     0.3911    0.4305    0.4074      2497
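The average rows follow directly from the per-class rows: macro-F1 is the unweighted mean of the per-class F1 scores, micro-F1 is the harmonic mean of the pooled precision and recall, and the weighted average weights each class by its support. A quick check (numbers copied from the table above; tolerances absorb the log's 4-digit rounding):

```python
# (precision, recall, f1, support) per class, copied from the table above.
per_class = {
    "LOC":       (0.4249, 0.4959, 0.4576, 1095),
    "PER":       (0.4409, 0.4901, 0.4642, 1012),
    "ORG":       (0.1827, 0.1008, 0.1300, 357),
    "HumanProd": (0.0000, 0.0000, 0.0000, 33),
}

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
assert abs(macro_f1 - 0.2630) < 1e-4

# Micro average: harmonic mean of the pooled precision and recall.
micro_p, micro_r = 0.4135, 0.4305
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
assert abs(micro_f1 - 0.4218) < 1e-4

# Weighted average: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support
assert abs(weighted_f1 - 0.4074) < 1e-4
```

The gap between macro (0.2630) and micro (0.4218) F1 reflects the class imbalance: ORG and especially HumanProd (33 test mentions, zero F1) drag the unweighted mean down.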
2023-10-19 19:50:57,158 ----------------------------------------------------------------------------------------------------