2023-10-19 20:58:16,394 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,394 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
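For orientation, a tagger with exactly this shape (the 2-layer, 128-dimensional historic multilingual BERT-tiny encoder feeding a 128 -> 17 linear tag head, no CRF, no RNN) could be assembled roughly as below. This is a minimal sketch, not the original hmbench training script: the checkpoint name is read off the base path logged further down, and the HIPE-2022 loader's argument names (dataset_name, language) are assumptions that may differ between Flair releases.

```python
# Sketch only -- mirrors the module dump above, not the exact hmbench training code.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# French "newseye" part of HIPE-2022 (argument names are assumptions).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Last transformer layer only, first-subtoken pooling, fine-tuned end to end
# (mirrors "poolingfirst-layers-1" in the base path below).
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,            # no effect here since no RNN is used; kept for the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # matches "crfFalse" in the base path and the CrossEntropyLoss above
    use_rnn=False,              # embeddings go straight into the 128 -> 17 linear layer
)
```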
2023-10-19 20:58:16,394 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,394 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 20:58:16,394 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,394 Train: 7142 sentences
2023-10-19 20:58:16,394 (train_with_dev=False, train_with_test=False)
2023-10-19 20:58:16,394 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 Training Params:
2023-10-19 20:58:16,395 - learning_rate: "5e-05"
2023-10-19 20:58:16,395 - mini_batch_size: "4"
2023-10-19 20:58:16,395 - max_epochs: "10"
2023-10-19 20:58:16,395 - shuffle: "True"
2023-10-19 20:58:16,395 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 Plugins:
2023-10-19 20:58:16,395 - TensorboardLogger
2023-10-19 20:58:16,395 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 20:58:16,395 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 20:58:16,395 - metric: "('micro avg', 'f1-score')"
2023-10-19 20:58:16,395 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 Computation:
2023-10-19 20:58:16,395 - compute on device: cuda:0
2023-10-19 20:58:16,395 - embedding storage: none
2023-10-19 20:58:16,395 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-19 20:58:16,395 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:16,395 Logging anything other than scalars to TensorBoard is currently not supported.
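Continuing the sketch above, the logged hyperparameters (learning rate 5e-05, mini-batch size 4, 10 epochs, shuffling on) map onto a standard ModelTrainer.fine_tune() call. How the TensorboardLogger and LinearScheduler plugins are attached differs between Flair versions, so they are only indicated in comments here.

```python
# Sketch only: `tagger` and `corpus` as constructed in the previous sketch.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() typically uses AdamW with a linear warmup/decay schedule, which is
# consistent with the "LinearScheduler | warmup_fraction: '0.1'" plugin and the
# zero-momentum readings in this log. TensorBoard logging would be attached via
# the trainer's plugin mechanism (import path varies by Flair version).
trainer.fine_tune(
    "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
)
```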
2023-10-19 20:58:19,025 epoch 1 - iter 178/1786 - loss 3.39433925 - time (sec): 2.63 - samples/sec: 9379.21 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:58:21,707 epoch 1 - iter 356/1786 - loss 2.91442988 - time (sec): 5.31 - samples/sec: 9619.01 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:58:24,768 epoch 1 - iter 534/1786 - loss 2.29912039 - time (sec): 8.37 - samples/sec: 9185.83 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:58:27,960 epoch 1 - iter 712/1786 - loss 1.91163547 - time (sec): 11.56 - samples/sec: 8856.81 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:58:31,072 epoch 1 - iter 890/1786 - loss 1.69293519 - time (sec): 14.68 - samples/sec: 8590.79 - lr: 0.000025 - momentum: 0.000000
2023-10-19 20:58:34,159 epoch 1 - iter 1068/1786 - loss 1.52968510 - time (sec): 17.76 - samples/sec: 8452.84 - lr: 0.000030 - momentum: 0.000000
2023-10-19 20:58:37,316 epoch 1 - iter 1246/1786 - loss 1.39348187 - time (sec): 20.92 - samples/sec: 8397.32 - lr: 0.000035 - momentum: 0.000000
2023-10-19 20:58:40,371 epoch 1 - iter 1424/1786 - loss 1.28507688 - time (sec): 23.98 - samples/sec: 8375.29 - lr: 0.000040 - momentum: 0.000000
2023-10-19 20:58:43,370 epoch 1 - iter 1602/1786 - loss 1.19861795 - time (sec): 26.97 - samples/sec: 8325.77 - lr: 0.000045 - momentum: 0.000000
2023-10-19 20:58:46,435 epoch 1 - iter 1780/1786 - loss 1.13141747 - time (sec): 30.04 - samples/sec: 8257.10 - lr: 0.000050 - momentum: 0.000000
2023-10-19 20:58:46,527 ----------------------------------------------------------------------------------------------------
2023-10-19 20:58:46,527 EPOCH 1 done: loss 1.1299 - lr: 0.000050
2023-10-19 20:58:47,918 DEV : loss 0.3072870969772339 - f1-score (micro avg) 0.2262
2023-10-19 20:58:47,932 saving best model
2023-10-19 20:58:47,966 ----------------------------------------------------------------------------------------------------
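The lr column above documents the warmup phase: with warmup_fraction 0.1 over 10 x 1786 mini-batches, the learning rate climbs linearly to its 5e-05 peak during epoch 1 (178/1786 x 5e-05 is roughly the 0.000005 shown at the first logged step) and then decays linearly to 0 by the end of epoch 10; momentum reads 0.000000 throughout because the AdamW-style optimizer used for fine-tuning has no momentum term. A small illustrative helper (not Flair's scheduler code) that reproduces those values:

```python
# Illustrative only: linear warmup then linear decay, as implied by
# "LinearScheduler | warmup_fraction: '0.1'" and the lr values in this log.
def linear_schedule_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)                             # ramp up
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)   # decay

total = 1786 * 10                        # mini-batches per epoch x epochs
print(linear_schedule_lr(178, total))    # ~5.0e-06, cf. iter 178 of epoch 1 above
print(linear_schedule_lr(1786, total))   # ~5.0e-05, end of warmup (end of epoch 1)
print(linear_schedule_lr(total, total))  # 0.0 at the final step
```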
2023-10-19 20:58:51,058 epoch 2 - iter 178/1786 - loss 0.46644868 - time (sec): 3.09 - samples/sec: 7639.57 - lr: 0.000049 - momentum: 0.000000
2023-10-19 20:58:54,074 epoch 2 - iter 356/1786 - loss 0.46322785 - time (sec): 6.11 - samples/sec: 7992.93 - lr: 0.000049 - momentum: 0.000000
2023-10-19 20:58:57,117 epoch 2 - iter 534/1786 - loss 0.45211784 - time (sec): 9.15 - samples/sec: 8006.32 - lr: 0.000048 - momentum: 0.000000
2023-10-19 20:59:00,061 epoch 2 - iter 712/1786 - loss 0.44022463 - time (sec): 12.09 - samples/sec: 8050.83 - lr: 0.000048 - momentum: 0.000000
2023-10-19 20:59:03,120 epoch 2 - iter 890/1786 - loss 0.42492323 - time (sec): 15.15 - samples/sec: 8078.68 - lr: 0.000047 - momentum: 0.000000
2023-10-19 20:59:06,209 epoch 2 - iter 1068/1786 - loss 0.41906923 - time (sec): 18.24 - samples/sec: 8122.04 - lr: 0.000047 - momentum: 0.000000
2023-10-19 20:59:09,470 epoch 2 - iter 1246/1786 - loss 0.41285425 - time (sec): 21.50 - samples/sec: 8094.77 - lr: 0.000046 - momentum: 0.000000
2023-10-19 20:59:12,467 epoch 2 - iter 1424/1786 - loss 0.41328103 - time (sec): 24.50 - samples/sec: 8047.76 - lr: 0.000046 - momentum: 0.000000
2023-10-19 20:59:15,495 epoch 2 - iter 1602/1786 - loss 0.40602105 - time (sec): 27.53 - samples/sec: 8059.81 - lr: 0.000045 - momentum: 0.000000
2023-10-19 20:59:18,579 epoch 2 - iter 1780/1786 - loss 0.40352887 - time (sec): 30.61 - samples/sec: 8101.49 - lr: 0.000044 - momentum: 0.000000
2023-10-19 20:59:18,683 ----------------------------------------------------------------------------------------------------
2023-10-19 20:59:18,683 EPOCH 2 done: loss 0.4037 - lr: 0.000044
2023-10-19 20:59:21,544 DEV : loss 0.23947705328464508 - f1-score (micro avg) 0.3628
2023-10-19 20:59:21,558 saving best model
2023-10-19 20:59:21,594 ----------------------------------------------------------------------------------------------------
2023-10-19 20:59:24,722 epoch 3 - iter 178/1786 - loss 0.30638540 - time (sec): 3.13 - samples/sec: 8248.30 - lr: 0.000044 - momentum: 0.000000
2023-10-19 20:59:27,738 epoch 3 - iter 356/1786 - loss 0.32648450 - time (sec): 6.14 - samples/sec: 8067.28 - lr: 0.000043 - momentum: 0.000000
2023-10-19 20:59:30,745 epoch 3 - iter 534/1786 - loss 0.33121133 - time (sec): 9.15 - samples/sec: 8004.37 - lr: 0.000043 - momentum: 0.000000
2023-10-19 20:59:33,798 epoch 3 - iter 712/1786 - loss 0.32647076 - time (sec): 12.20 - samples/sec: 8063.71 - lr: 0.000042 - momentum: 0.000000
2023-10-19 20:59:36,949 epoch 3 - iter 890/1786 - loss 0.32878218 - time (sec): 15.35 - samples/sec: 8065.26 - lr: 0.000042 - momentum: 0.000000
2023-10-19 20:59:40,305 epoch 3 - iter 1068/1786 - loss 0.33069697 - time (sec): 18.71 - samples/sec: 7989.44 - lr: 0.000041 - momentum: 0.000000
2023-10-19 20:59:43,433 epoch 3 - iter 1246/1786 - loss 0.32906811 - time (sec): 21.84 - samples/sec: 7995.43 - lr: 0.000041 - momentum: 0.000000
2023-10-19 20:59:46,484 epoch 3 - iter 1424/1786 - loss 0.32966262 - time (sec): 24.89 - samples/sec: 7989.64 - lr: 0.000040 - momentum: 0.000000
2023-10-19 20:59:49,454 epoch 3 - iter 1602/1786 - loss 0.32998018 - time (sec): 27.86 - samples/sec: 7989.40 - lr: 0.000039 - momentum: 0.000000
2023-10-19 20:59:52,398 epoch 3 - iter 1780/1786 - loss 0.32830930 - time (sec): 30.80 - samples/sec: 8052.42 - lr: 0.000039 - momentum: 0.000000
2023-10-19 20:59:52,479 ----------------------------------------------------------------------------------------------------
2023-10-19 20:59:52,479 EPOCH 3 done: loss 0.3291 - lr: 0.000039
2023-10-19 20:59:54,832 DEV : loss 0.21711337566375732 - f1-score (micro avg) 0.4412
2023-10-19 20:59:54,845 saving best model
2023-10-19 20:59:54,880 ----------------------------------------------------------------------------------------------------
2023-10-19 20:59:58,253 epoch 4 - iter 178/1786 - loss 0.28910294 - time (sec): 3.37 - samples/sec: 7599.18 - lr: 0.000038 - momentum: 0.000000
2023-10-19 21:00:01,330 epoch 4 - iter 356/1786 - loss 0.28554259 - time (sec): 6.45 - samples/sec: 7654.04 - lr: 0.000038 - momentum: 0.000000
2023-10-19 21:00:04,474 epoch 4 - iter 534/1786 - loss 0.28752177 - time (sec): 9.59 - samples/sec: 7716.32 - lr: 0.000037 - momentum: 0.000000
2023-10-19 21:00:07,494 epoch 4 - iter 712/1786 - loss 0.29481804 - time (sec): 12.61 - samples/sec: 7746.74 - lr: 0.000037 - momentum: 0.000000
2023-10-19 21:00:10,542 epoch 4 - iter 890/1786 - loss 0.29209292 - time (sec): 15.66 - samples/sec: 7825.10 - lr: 0.000036 - momentum: 0.000000
2023-10-19 21:00:13,639 epoch 4 - iter 1068/1786 - loss 0.29321400 - time (sec): 18.76 - samples/sec: 7799.47 - lr: 0.000036 - momentum: 0.000000
2023-10-19 21:00:16,855 epoch 4 - iter 1246/1786 - loss 0.29088719 - time (sec): 21.97 - samples/sec: 7788.31 - lr: 0.000035 - momentum: 0.000000
2023-10-19 21:00:20,030 epoch 4 - iter 1424/1786 - loss 0.28877714 - time (sec): 25.15 - samples/sec: 7893.11 - lr: 0.000034 - momentum: 0.000000
2023-10-19 21:00:23,060 epoch 4 - iter 1602/1786 - loss 0.28795049 - time (sec): 28.18 - samples/sec: 7909.66 - lr: 0.000034 - momentum: 0.000000
2023-10-19 21:00:26,192 epoch 4 - iter 1780/1786 - loss 0.28608450 - time (sec): 31.31 - samples/sec: 7908.48 - lr: 0.000033 - momentum: 0.000000
2023-10-19 21:00:26,298 ----------------------------------------------------------------------------------------------------
2023-10-19 21:00:26,298 EPOCH 4 done: loss 0.2856 - lr: 0.000033
2023-10-19 21:00:29,109 DEV : loss 0.21065032482147217 - f1-score (micro avg) 0.4851
2023-10-19 21:00:29,122 saving best model
2023-10-19 21:00:29,158 ----------------------------------------------------------------------------------------------------
2023-10-19 21:00:32,181 epoch 5 - iter 178/1786 - loss 0.26936022 - time (sec): 3.02 - samples/sec: 7810.87 - lr: 0.000033 - momentum: 0.000000
2023-10-19 21:00:35,233 epoch 5 - iter 356/1786 - loss 0.25893596 - time (sec): 6.07 - samples/sec: 8113.38 - lr: 0.000032 - momentum: 0.000000
2023-10-19 21:00:38,268 epoch 5 - iter 534/1786 - loss 0.26522534 - time (sec): 9.11 - samples/sec: 8061.37 - lr: 0.000032 - momentum: 0.000000
2023-10-19 21:00:41,439 epoch 5 - iter 712/1786 - loss 0.26176498 - time (sec): 12.28 - samples/sec: 8024.11 - lr: 0.000031 - momentum: 0.000000
2023-10-19 21:00:44,487 epoch 5 - iter 890/1786 - loss 0.26297763 - time (sec): 15.33 - samples/sec: 8044.77 - lr: 0.000031 - momentum: 0.000000
2023-10-19 21:00:47,482 epoch 5 - iter 1068/1786 - loss 0.25919324 - time (sec): 18.32 - samples/sec: 8016.32 - lr: 0.000030 - momentum: 0.000000
2023-10-19 21:00:50,611 epoch 5 - iter 1246/1786 - loss 0.25806711 - time (sec): 21.45 - samples/sec: 8119.10 - lr: 0.000029 - momentum: 0.000000
2023-10-19 21:00:53,748 epoch 5 - iter 1424/1786 - loss 0.25466980 - time (sec): 24.59 - samples/sec: 8072.99 - lr: 0.000029 - momentum: 0.000000
2023-10-19 21:00:56,856 epoch 5 - iter 1602/1786 - loss 0.25598645 - time (sec): 27.70 - samples/sec: 8070.63 - lr: 0.000028 - momentum: 0.000000
2023-10-19 21:00:59,939 epoch 5 - iter 1780/1786 - loss 0.25465192 - time (sec): 30.78 - samples/sec: 8048.62 - lr: 0.000028 - momentum: 0.000000
2023-10-19 21:01:00,052 ----------------------------------------------------------------------------------------------------
2023-10-19 21:01:00,052 EPOCH 5 done: loss 0.2543 - lr: 0.000028
2023-10-19 21:01:02,433 DEV : loss 0.20281952619552612 - f1-score (micro avg) 0.4948
2023-10-19 21:01:02,448 saving best model
2023-10-19 21:01:02,483 ----------------------------------------------------------------------------------------------------
2023-10-19 21:01:05,510 epoch 6 - iter 178/1786 - loss 0.23393227 - time (sec): 3.03 - samples/sec: 7808.95 - lr: 0.000027 - momentum: 0.000000
2023-10-19 21:01:08,545 epoch 6 - iter 356/1786 - loss 0.23081599 - time (sec): 6.06 - samples/sec: 8163.67 - lr: 0.000027 - momentum: 0.000000
2023-10-19 21:01:11,648 epoch 6 - iter 534/1786 - loss 0.23479214 - time (sec): 9.16 - samples/sec: 8155.13 - lr: 0.000026 - momentum: 0.000000
2023-10-19 21:01:14,701 epoch 6 - iter 712/1786 - loss 0.23298850 - time (sec): 12.22 - samples/sec: 8207.54 - lr: 0.000026 - momentum: 0.000000
2023-10-19 21:01:17,670 epoch 6 - iter 890/1786 - loss 0.23743133 - time (sec): 15.19 - samples/sec: 8100.39 - lr: 0.000025 - momentum: 0.000000
2023-10-19 21:01:20,467 epoch 6 - iter 1068/1786 - loss 0.23775517 - time (sec): 17.98 - samples/sec: 8153.61 - lr: 0.000024 - momentum: 0.000000
2023-10-19 21:01:23,606 epoch 6 - iter 1246/1786 - loss 0.23641985 - time (sec): 21.12 - samples/sec: 8149.38 - lr: 0.000024 - momentum: 0.000000
2023-10-19 21:01:26,829 epoch 6 - iter 1424/1786 - loss 0.23416317 - time (sec): 24.35 - samples/sec: 8123.71 - lr: 0.000023 - momentum: 0.000000
2023-10-19 21:01:29,978 epoch 6 - iter 1602/1786 - loss 0.23280312 - time (sec): 27.49 - samples/sec: 8121.44 - lr: 0.000023 - momentum: 0.000000
2023-10-19 21:01:33,180 epoch 6 - iter 1780/1786 - loss 0.23351653 - time (sec): 30.70 - samples/sec: 8085.38 - lr: 0.000022 - momentum: 0.000000
2023-10-19 21:01:33,290 ----------------------------------------------------------------------------------------------------
2023-10-19 21:01:33,291 EPOCH 6 done: loss 0.2338 - lr: 0.000022
2023-10-19 21:01:36,137 DEV : loss 0.2004702240228653 - f1-score (micro avg) 0.5138
2023-10-19 21:01:36,151 saving best model
2023-10-19 21:01:36,187 ----------------------------------------------------------------------------------------------------
2023-10-19 21:01:39,079 epoch 7 - iter 178/1786 - loss 0.23612974 - time (sec): 2.89 - samples/sec: 8198.00 - lr: 0.000022 - momentum: 0.000000
2023-10-19 21:01:42,193 epoch 7 - iter 356/1786 - loss 0.21613429 - time (sec): 6.01 - samples/sec: 8097.03 - lr: 0.000021 - momentum: 0.000000
2023-10-19 21:01:45,222 epoch 7 - iter 534/1786 - loss 0.21090820 - time (sec): 9.03 - samples/sec: 8043.86 - lr: 0.000021 - momentum: 0.000000
2023-10-19 21:01:48,279 epoch 7 - iter 712/1786 - loss 0.21151190 - time (sec): 12.09 - samples/sec: 8048.66 - lr: 0.000020 - momentum: 0.000000
2023-10-19 21:01:51,326 epoch 7 - iter 890/1786 - loss 0.21415014 - time (sec): 15.14 - samples/sec: 8097.58 - lr: 0.000019 - momentum: 0.000000
2023-10-19 21:01:54,448 epoch 7 - iter 1068/1786 - loss 0.22230949 - time (sec): 18.26 - samples/sec: 8094.44 - lr: 0.000019 - momentum: 0.000000
2023-10-19 21:01:57,518 epoch 7 - iter 1246/1786 - loss 0.22203000 - time (sec): 21.33 - samples/sec: 8038.74 - lr: 0.000018 - momentum: 0.000000
2023-10-19 21:02:00,602 epoch 7 - iter 1424/1786 - loss 0.22048490 - time (sec): 24.41 - samples/sec: 8093.24 - lr: 0.000018 - momentum: 0.000000
2023-10-19 21:02:03,686 epoch 7 - iter 1602/1786 - loss 0.21897496 - time (sec): 27.50 - samples/sec: 8107.85 - lr: 0.000017 - momentum: 0.000000
2023-10-19 21:02:06,657 epoch 7 - iter 1780/1786 - loss 0.21764711 - time (sec): 30.47 - samples/sec: 8128.14 - lr: 0.000017 - momentum: 0.000000
2023-10-19 21:02:06,760 ----------------------------------------------------------------------------------------------------
2023-10-19 21:02:06,761 EPOCH 7 done: loss 0.2174 - lr: 0.000017
2023-10-19 21:02:09,120 DEV : loss 0.20184864103794098 - f1-score (micro avg) 0.5363
2023-10-19 21:02:09,133 saving best model
2023-10-19 21:02:09,167 ----------------------------------------------------------------------------------------------------
2023-10-19 21:02:12,237 epoch 8 - iter 178/1786 - loss 0.19631205 - time (sec): 3.07 - samples/sec: 8420.53 - lr: 0.000016 - momentum: 0.000000
2023-10-19 21:02:15,400 epoch 8 - iter 356/1786 - loss 0.20351488 - time (sec): 6.23 - samples/sec: 8448.43 - lr: 0.000016 - momentum: 0.000000
2023-10-19 21:02:18,524 epoch 8 - iter 534/1786 - loss 0.19623338 - time (sec): 9.36 - samples/sec: 8444.47 - lr: 0.000015 - momentum: 0.000000
2023-10-19 21:02:21,610 epoch 8 - iter 712/1786 - loss 0.20086781 - time (sec): 12.44 - samples/sec: 8326.28 - lr: 0.000014 - momentum: 0.000000
2023-10-19 21:02:24,651 epoch 8 - iter 890/1786 - loss 0.20639441 - time (sec): 15.48 - samples/sec: 8245.95 - lr: 0.000014 - momentum: 0.000000
2023-10-19 21:02:27,907 epoch 8 - iter 1068/1786 - loss 0.20336709 - time (sec): 18.74 - samples/sec: 8209.92 - lr: 0.000013 - momentum: 0.000000
2023-10-19 21:02:30,983 epoch 8 - iter 1246/1786 - loss 0.20354696 - time (sec): 21.81 - samples/sec: 8112.38 - lr: 0.000013 - momentum: 0.000000
2023-10-19 21:02:34,043 epoch 8 - iter 1424/1786 - loss 0.20314656 - time (sec): 24.88 - samples/sec: 8074.10 - lr: 0.000012 - momentum: 0.000000
2023-10-19 21:02:37,160 epoch 8 - iter 1602/1786 - loss 0.20316820 - time (sec): 27.99 - samples/sec: 8039.11 - lr: 0.000012 - momentum: 0.000000
2023-10-19 21:02:40,110 epoch 8 - iter 1780/1786 - loss 0.20620940 - time (sec): 30.94 - samples/sec: 8015.46 - lr: 0.000011 - momentum: 0.000000
2023-10-19 21:02:40,204 ----------------------------------------------------------------------------------------------------
2023-10-19 21:02:40,205 EPOCH 8 done: loss 0.2060 - lr: 0.000011
2023-10-19 21:02:43,005 DEV : loss 0.19445763528347015 - f1-score (micro avg) 0.5412
2023-10-19 21:02:43,020 saving best model
2023-10-19 21:02:43,054 ----------------------------------------------------------------------------------------------------
2023-10-19 21:02:46,193 epoch 9 - iter 178/1786 - loss 0.20595102 - time (sec): 3.14 - samples/sec: 8098.61 - lr: 0.000011 - momentum: 0.000000
2023-10-19 21:02:49,523 epoch 9 - iter 356/1786 - loss 0.20915004 - time (sec): 6.47 - samples/sec: 7781.12 - lr: 0.000010 - momentum: 0.000000
2023-10-19 21:02:52,616 epoch 9 - iter 534/1786 - loss 0.21073170 - time (sec): 9.56 - samples/sec: 7757.88 - lr: 0.000009 - momentum: 0.000000
2023-10-19 21:02:55,656 epoch 9 - iter 712/1786 - loss 0.21153248 - time (sec): 12.60 - samples/sec: 7796.02 - lr: 0.000009 - momentum: 0.000000
2023-10-19 21:02:58,639 epoch 9 - iter 890/1786 - loss 0.20761614 - time (sec): 15.58 - samples/sec: 7833.49 - lr: 0.000008 - momentum: 0.000000
2023-10-19 21:03:01,755 epoch 9 - iter 1068/1786 - loss 0.20308539 - time (sec): 18.70 - samples/sec: 7955.21 - lr: 0.000008 - momentum: 0.000000
2023-10-19 21:03:04,756 epoch 9 - iter 1246/1786 - loss 0.20231833 - time (sec): 21.70 - samples/sec: 7957.84 - lr: 0.000007 - momentum: 0.000000
2023-10-19 21:03:08,025 epoch 9 - iter 1424/1786 - loss 0.20265135 - time (sec): 24.97 - samples/sec: 7928.87 - lr: 0.000007 - momentum: 0.000000
2023-10-19 21:03:11,119 epoch 9 - iter 1602/1786 - loss 0.20136577 - time (sec): 28.06 - samples/sec: 7934.50 - lr: 0.000006 - momentum: 0.000000
2023-10-19 21:03:14,180 epoch 9 - iter 1780/1786 - loss 0.19963572 - time (sec): 31.12 - samples/sec: 7968.46 - lr: 0.000006 - momentum: 0.000000
2023-10-19 21:03:14,278 ----------------------------------------------------------------------------------------------------
2023-10-19 21:03:14,279 EPOCH 9 done: loss 0.1998 - lr: 0.000006
2023-10-19 21:03:16,670 DEV : loss 0.19829225540161133 - f1-score (micro avg) 0.5422
2023-10-19 21:03:16,685 saving best model
2023-10-19 21:03:16,720 ----------------------------------------------------------------------------------------------------
2023-10-19 21:03:19,410 epoch 10 - iter 178/1786 - loss 0.18647044 - time (sec): 2.69 - samples/sec: 8782.65 - lr: 0.000005 - momentum: 0.000000
2023-10-19 21:03:22,378 epoch 10 - iter 356/1786 - loss 0.18609814 - time (sec): 5.66 - samples/sec: 8421.25 - lr: 0.000004 - momentum: 0.000000
2023-10-19 21:03:25,553 epoch 10 - iter 534/1786 - loss 0.18905060 - time (sec): 8.83 - samples/sec: 8418.27 - lr: 0.000004 - momentum: 0.000000
2023-10-19 21:03:28,473 epoch 10 - iter 712/1786 - loss 0.18325220 - time (sec): 11.75 - samples/sec: 8332.62 - lr: 0.000003 - momentum: 0.000000
2023-10-19 21:03:31,603 epoch 10 - iter 890/1786 - loss 0.18627484 - time (sec): 14.88 - samples/sec: 8203.30 - lr: 0.000003 - momentum: 0.000000
2023-10-19 21:03:34,723 epoch 10 - iter 1068/1786 - loss 0.18580625 - time (sec): 18.00 - samples/sec: 8209.34 - lr: 0.000002 - momentum: 0.000000
2023-10-19 21:03:37,964 epoch 10 - iter 1246/1786 - loss 0.18908200 - time (sec): 21.24 - samples/sec: 8087.53 - lr: 0.000002 - momentum: 0.000000
2023-10-19 21:03:41,184 epoch 10 - iter 1424/1786 - loss 0.19346145 - time (sec): 24.46 - samples/sec: 8079.92 - lr: 0.000001 - momentum: 0.000000
2023-10-19 21:03:44,560 epoch 10 - iter 1602/1786 - loss 0.19442863 - time (sec): 27.84 - samples/sec: 8041.93 - lr: 0.000001 - momentum: 0.000000
2023-10-19 21:03:47,563 epoch 10 - iter 1780/1786 - loss 0.19490021 - time (sec): 30.84 - samples/sec: 8043.92 - lr: 0.000000 - momentum: 0.000000
2023-10-19 21:03:47,661 ----------------------------------------------------------------------------------------------------
2023-10-19 21:03:47,661 EPOCH 10 done: loss 0.1948 - lr: 0.000000
2023-10-19 21:03:50,516 DEV : loss 0.19669239223003387 - f1-score (micro avg) 0.539
2023-10-19 21:03:50,560 ----------------------------------------------------------------------------------------------------
2023-10-19 21:03:50,560 Loading model from best epoch ...
2023-10-19 21:03:50,642 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
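The 17 tags are the BIOES expansion of the four entity types (PER, LOC, ORG, HumanProd) plus O. Once training has finished, the saved checkpoint can be used for tagging roughly as follows (a sketch; the path is the base path from this run plus best-model.pt, and the French example sentence is purely illustrative):

```python
# Sketch: load the best checkpoint from this run and tag a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Gustave Flaubert est né à Rouen .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 4))
```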
2023-10-19 21:03:55,238
Results:
- F-score (micro) 0.4292
- F-score (macro) 0.2773
- Accuracy 0.2824

By class:
              precision    recall  f1-score   support

         LOC     0.4196    0.5361    0.4707      1095
         PER     0.4366    0.4931    0.4631      1012
         ORG     0.1986    0.1569    0.1753       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.4044    0.4573    0.4292      2497
   macro avg     0.2637    0.2965    0.2773      2497
weighted avg     0.3893    0.4573    0.4192      2497

2023-10-19 21:03:55,238 ----------------------------------------------------------------------------------------------------
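The figures above are the final test-set evaluation of best-model.pt on the 2,570 test sentences (micro F1 0.4292, noticeably below the best dev score of 0.5422 from epoch 9). A report of this form can be regenerated along these lines (sketch; `tagger` and `corpus` as in the earlier sketches):

```python
# Sketch: score the reloaded best model on the held-out test split.
result = tagger.evaluate(
    corpus.test,
    gold_label_type="ner",
    mini_batch_size=4,  # illustrative
)
print(result.detailed_results)  # per-class precision/recall/F1 table as above
print(result.main_score)        # micro-averaged F1
```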