2023-10-19 19:56:56,385 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
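A rough parameter count can be read off the printed layer shapes above (this is a back-of-the-envelope sketch from the repr, not an authoritative count of the checkpoint):

```python
# Rough parameter count for the tagger, derived only from the layer shapes
# printed in the model summary above (BERT-tiny: 2 layers, hidden size 128).

def linear(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix + bias

def layer_norm(dim):
    return 2 * dim  # scale + shift

hidden, inter, vocab = 128, 512, 32001

embeddings = (
    vocab * hidden        # word_embeddings: Embedding(32001, 128)
    + 512 * hidden        # position_embeddings: Embedding(512, 128)
    + 2 * hidden          # token_type_embeddings: Embedding(2, 128)
    + layer_norm(hidden)
)

per_layer = (
    3 * linear(hidden, hidden)   # query, key, value
    + linear(hidden, hidden)     # self-attention output dense
    + layer_norm(hidden)
    + linear(hidden, inter)      # intermediate
    + linear(inter, hidden)      # output
    + layer_norm(hidden)
)

total = (
    embeddings
    + 2 * per_layer              # (0-1): 2 x BertLayer
    + linear(hidden, hidden)     # pooler
    + linear(hidden, 17)         # tagger head: 17 tags
)
print(total)  # 4577425 (~4.6M parameters)
```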
|
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Train: 7142 sentences
2023-10-19 19:56:56,386 (train_with_dev=False, train_with_test=False)
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Training Params:
2023-10-19 19:56:56,386 - learning_rate: "5e-05"
2023-10-19 19:56:56,386 - mini_batch_size: "4"
2023-10-19 19:56:56,386 - max_epochs: "10"
2023-10-19 19:56:56,386 - shuffle: "True"
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,386 Plugins:
2023-10-19 19:56:56,386 - TensorboardLogger
2023-10-19 19:56:56,386 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 19:56:56,386 ----------------------------------------------------------------------------------------------------
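The LinearScheduler plugin explains the lr column in the epoch logs below: with warmup_fraction 0.1 over 10 epochs x 1786 iterations = 17 860 steps, the learning rate climbs linearly to 5e-05 during epoch 1 and then decays linearly to zero. A minimal sketch of such a schedule (the formula is an assumption consistent with the logged values, not Flair's exact implementation):

```python
# Linear warmup/decay schedule, sketched from the logged hyperparameters:
# 10 epochs x 1786 iterations = 17860 steps, warmup_fraction 0.1.
MAX_LR = 5e-05
TOTAL_STEPS = 10 * 1786
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 1786 steps = one epoch

def lr_at(step):
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        return MAX_LR * step / WARMUP_STEPS
    return MAX_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the logged lr values to 6 decimal places:
print(round(lr_at(178), 6))              # epoch 1, iter 178   -> 5e-06
print(round(lr_at(1786 + 178), 6))       # epoch 2, iter 178   -> 4.9e-05
print(round(lr_at(9 * 1786 + 1780), 6))  # epoch 10, iter 1780 -> 0.0
```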
|
2023-10-19 19:56:56,386 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 19:56:56,386 - metric: "('micro avg', 'f1-score')"
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 Computation:
2023-10-19 19:56:56,387 - compute on device: cuda:0
2023-10-19 19:56:56,387 - embedding storage: none
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 ----------------------------------------------------------------------------------------------------
2023-10-19 19:56:56,387 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-19 19:56:59,484 epoch 1 - iter 178/1786 - loss 2.72583835 - time (sec): 3.10 - samples/sec: 8628.89 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:57:02,567 epoch 1 - iter 356/1786 - loss 2.28399604 - time (sec): 6.18 - samples/sec: 8233.47 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:57:05,501 epoch 1 - iter 534/1786 - loss 1.84616762 - time (sec): 9.11 - samples/sec: 8156.39 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:57:08,543 epoch 1 - iter 712/1786 - loss 1.55839783 - time (sec): 12.16 - samples/sec: 8168.51 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:57:11,561 epoch 1 - iter 890/1786 - loss 1.38530571 - time (sec): 15.17 - samples/sec: 8105.25 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:57:14,566 epoch 1 - iter 1068/1786 - loss 1.26744376 - time (sec): 18.18 - samples/sec: 8093.41 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:57:17,760 epoch 1 - iter 1246/1786 - loss 1.15679850 - time (sec): 21.37 - samples/sec: 8089.28 - lr: 0.000035 - momentum: 0.000000
2023-10-19 19:57:20,780 epoch 1 - iter 1424/1786 - loss 1.07406304 - time (sec): 24.39 - samples/sec: 8059.87 - lr: 0.000040 - momentum: 0.000000
2023-10-19 19:57:23,840 epoch 1 - iter 1602/1786 - loss 1.00605621 - time (sec): 27.45 - samples/sec: 8084.14 - lr: 0.000045 - momentum: 0.000000
2023-10-19 19:57:26,889 epoch 1 - iter 1780/1786 - loss 0.94992355 - time (sec): 30.50 - samples/sec: 8133.35 - lr: 0.000050 - momentum: 0.000000
2023-10-19 19:57:26,989 ----------------------------------------------------------------------------------------------------
2023-10-19 19:57:26,989 EPOCH 1 done: loss 0.9486 - lr: 0.000050
2023-10-19 19:57:28,453 DEV : loss 0.29305753111839294 - f1-score (micro avg) 0.249
2023-10-19 19:57:28,468 saving best model
2023-10-19 19:57:28,503 ----------------------------------------------------------------------------------------------------
|
2023-10-19 19:57:31,169 epoch 2 - iter 178/1786 - loss 0.44594390 - time (sec): 2.67 - samples/sec: 8876.07 - lr: 0.000049 - momentum: 0.000000
2023-10-19 19:57:34,235 epoch 2 - iter 356/1786 - loss 0.41159547 - time (sec): 5.73 - samples/sec: 8474.76 - lr: 0.000049 - momentum: 0.000000
2023-10-19 19:57:37,249 epoch 2 - iter 534/1786 - loss 0.41144726 - time (sec): 8.75 - samples/sec: 8158.94 - lr: 0.000048 - momentum: 0.000000
2023-10-19 19:57:40,308 epoch 2 - iter 712/1786 - loss 0.39931129 - time (sec): 11.80 - samples/sec: 8135.11 - lr: 0.000048 - momentum: 0.000000
2023-10-19 19:57:43,366 epoch 2 - iter 890/1786 - loss 0.40220612 - time (sec): 14.86 - samples/sec: 8169.71 - lr: 0.000047 - momentum: 0.000000
2023-10-19 19:57:46,440 epoch 2 - iter 1068/1786 - loss 0.39783851 - time (sec): 17.94 - samples/sec: 8200.55 - lr: 0.000047 - momentum: 0.000000
2023-10-19 19:57:49,549 epoch 2 - iter 1246/1786 - loss 0.39066605 - time (sec): 21.05 - samples/sec: 8278.83 - lr: 0.000046 - momentum: 0.000000
2023-10-19 19:57:52,827 epoch 2 - iter 1424/1786 - loss 0.38832131 - time (sec): 24.32 - samples/sec: 8199.66 - lr: 0.000046 - momentum: 0.000000
2023-10-19 19:57:56,026 epoch 2 - iter 1602/1786 - loss 0.38666982 - time (sec): 27.52 - samples/sec: 8141.75 - lr: 0.000045 - momentum: 0.000000
2023-10-19 19:57:59,146 epoch 2 - iter 1780/1786 - loss 0.38480581 - time (sec): 30.64 - samples/sec: 8100.47 - lr: 0.000044 - momentum: 0.000000
2023-10-19 19:57:59,235 ----------------------------------------------------------------------------------------------------
2023-10-19 19:57:59,236 EPOCH 2 done: loss 0.3847 - lr: 0.000044
2023-10-19 19:58:01,589 DEV : loss 0.23073670268058777 - f1-score (micro avg) 0.4494
2023-10-19 19:58:01,602 saving best model
2023-10-19 19:58:01,635 ----------------------------------------------------------------------------------------------------
|
2023-10-19 19:58:04,884 epoch 3 - iter 178/1786 - loss 0.31405377 - time (sec): 3.25 - samples/sec: 7326.64 - lr: 0.000044 - momentum: 0.000000
2023-10-19 19:58:08,542 epoch 3 - iter 356/1786 - loss 0.29918815 - time (sec): 6.91 - samples/sec: 7336.99 - lr: 0.000043 - momentum: 0.000000
2023-10-19 19:58:11,630 epoch 3 - iter 534/1786 - loss 0.29653379 - time (sec): 9.99 - samples/sec: 7560.52 - lr: 0.000043 - momentum: 0.000000
2023-10-19 19:58:14,695 epoch 3 - iter 712/1786 - loss 0.30267607 - time (sec): 13.06 - samples/sec: 7649.08 - lr: 0.000042 - momentum: 0.000000
2023-10-19 19:58:17,742 epoch 3 - iter 890/1786 - loss 0.30690130 - time (sec): 16.11 - samples/sec: 7818.28 - lr: 0.000042 - momentum: 0.000000
2023-10-19 19:58:20,754 epoch 3 - iter 1068/1786 - loss 0.30838373 - time (sec): 19.12 - samples/sec: 7925.28 - lr: 0.000041 - momentum: 0.000000
2023-10-19 19:58:23,746 epoch 3 - iter 1246/1786 - loss 0.30686090 - time (sec): 22.11 - samples/sec: 7898.97 - lr: 0.000041 - momentum: 0.000000
2023-10-19 19:58:26,798 epoch 3 - iter 1424/1786 - loss 0.30975257 - time (sec): 25.16 - samples/sec: 7919.77 - lr: 0.000040 - momentum: 0.000000
2023-10-19 19:58:29,717 epoch 3 - iter 1602/1786 - loss 0.31078462 - time (sec): 28.08 - samples/sec: 7993.49 - lr: 0.000039 - momentum: 0.000000
2023-10-19 19:58:32,673 epoch 3 - iter 1780/1786 - loss 0.30834283 - time (sec): 31.04 - samples/sec: 7989.82 - lr: 0.000039 - momentum: 0.000000
2023-10-19 19:58:32,767 ----------------------------------------------------------------------------------------------------
2023-10-19 19:58:32,767 EPOCH 3 done: loss 0.3085 - lr: 0.000039
2023-10-19 19:58:35,142 DEV : loss 0.20818448066711426 - f1-score (micro avg) 0.4907
2023-10-19 19:58:35,155 saving best model
2023-10-19 19:58:35,189 ----------------------------------------------------------------------------------------------------
|
2023-10-19 19:58:38,191 epoch 4 - iter 178/1786 - loss 0.27724988 - time (sec): 3.00 - samples/sec: 8636.87 - lr: 0.000038 - momentum: 0.000000
2023-10-19 19:58:41,275 epoch 4 - iter 356/1786 - loss 0.28611619 - time (sec): 6.09 - samples/sec: 8168.75 - lr: 0.000038 - momentum: 0.000000
2023-10-19 19:58:44,340 epoch 4 - iter 534/1786 - loss 0.29478782 - time (sec): 9.15 - samples/sec: 8075.94 - lr: 0.000037 - momentum: 0.000000
2023-10-19 19:58:47,333 epoch 4 - iter 712/1786 - loss 0.28033530 - time (sec): 12.14 - samples/sec: 8200.91 - lr: 0.000037 - momentum: 0.000000
2023-10-19 19:58:50,386 epoch 4 - iter 890/1786 - loss 0.27617105 - time (sec): 15.20 - samples/sec: 8187.17 - lr: 0.000036 - momentum: 0.000000
2023-10-19 19:58:53,226 epoch 4 - iter 1068/1786 - loss 0.27197505 - time (sec): 18.04 - samples/sec: 8226.99 - lr: 0.000036 - momentum: 0.000000
2023-10-19 19:58:56,053 epoch 4 - iter 1246/1786 - loss 0.27230716 - time (sec): 20.86 - samples/sec: 8251.50 - lr: 0.000035 - momentum: 0.000000
2023-10-19 19:58:59,031 epoch 4 - iter 1424/1786 - loss 0.27153860 - time (sec): 23.84 - samples/sec: 8246.27 - lr: 0.000034 - momentum: 0.000000
2023-10-19 19:59:02,112 epoch 4 - iter 1602/1786 - loss 0.27197949 - time (sec): 26.92 - samples/sec: 8260.76 - lr: 0.000034 - momentum: 0.000000
2023-10-19 19:59:05,192 epoch 4 - iter 1780/1786 - loss 0.26875111 - time (sec): 30.00 - samples/sec: 8272.93 - lr: 0.000033 - momentum: 0.000000
2023-10-19 19:59:05,285 ----------------------------------------------------------------------------------------------------
2023-10-19 19:59:05,285 EPOCH 4 done: loss 0.2689 - lr: 0.000033
2023-10-19 19:59:08,121 DEV : loss 0.191745325922966 - f1-score (micro avg) 0.499
2023-10-19 19:59:08,135 saving best model
2023-10-19 19:59:08,168 ----------------------------------------------------------------------------------------------------
|
2023-10-19 19:59:11,255 epoch 5 - iter 178/1786 - loss 0.26076343 - time (sec): 3.09 - samples/sec: 8103.56 - lr: 0.000033 - momentum: 0.000000
2023-10-19 19:59:14,449 epoch 5 - iter 356/1786 - loss 0.25312425 - time (sec): 6.28 - samples/sec: 8157.09 - lr: 0.000032 - momentum: 0.000000
2023-10-19 19:59:17,516 epoch 5 - iter 534/1786 - loss 0.24657920 - time (sec): 9.35 - samples/sec: 8112.84 - lr: 0.000032 - momentum: 0.000000
2023-10-19 19:59:20,617 epoch 5 - iter 712/1786 - loss 0.24929645 - time (sec): 12.45 - samples/sec: 7994.69 - lr: 0.000031 - momentum: 0.000000
2023-10-19 19:59:23,609 epoch 5 - iter 890/1786 - loss 0.24850076 - time (sec): 15.44 - samples/sec: 7921.48 - lr: 0.000031 - momentum: 0.000000
2023-10-19 19:59:26,734 epoch 5 - iter 1068/1786 - loss 0.24097893 - time (sec): 18.56 - samples/sec: 7956.00 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:59:29,745 epoch 5 - iter 1246/1786 - loss 0.24335773 - time (sec): 21.58 - samples/sec: 7925.61 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:59:32,778 epoch 5 - iter 1424/1786 - loss 0.24082207 - time (sec): 24.61 - samples/sec: 7979.59 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:59:35,918 epoch 5 - iter 1602/1786 - loss 0.24015223 - time (sec): 27.75 - samples/sec: 8023.83 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:59:38,964 epoch 5 - iter 1780/1786 - loss 0.24046767 - time (sec): 30.80 - samples/sec: 8050.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:59:39,044 ----------------------------------------------------------------------------------------------------
2023-10-19 19:59:39,044 EPOCH 5 done: loss 0.2403 - lr: 0.000028
2023-10-19 19:59:41,414 DEV : loss 0.1918220967054367 - f1-score (micro avg) 0.5135
2023-10-19 19:59:41,429 saving best model
2023-10-19 19:59:41,465 ----------------------------------------------------------------------------------------------------
|
2023-10-19 19:59:44,427 epoch 6 - iter 178/1786 - loss 0.21180179 - time (sec): 2.96 - samples/sec: 8480.65 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:59:47,394 epoch 6 - iter 356/1786 - loss 0.21702266 - time (sec): 5.93 - samples/sec: 8183.50 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:59:50,443 epoch 6 - iter 534/1786 - loss 0.22223314 - time (sec): 8.98 - samples/sec: 8050.91 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:59:53,514 epoch 6 - iter 712/1786 - loss 0.21924966 - time (sec): 12.05 - samples/sec: 8177.54 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:59:56,555 epoch 6 - iter 890/1786 - loss 0.21767231 - time (sec): 15.09 - samples/sec: 8284.64 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:59:59,654 epoch 6 - iter 1068/1786 - loss 0.21749357 - time (sec): 18.19 - samples/sec: 8202.54 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:00:02,945 epoch 6 - iter 1246/1786 - loss 0.21862146 - time (sec): 21.48 - samples/sec: 8051.91 - lr: 0.000024 - momentum: 0.000000
2023-10-19 20:00:06,069 epoch 6 - iter 1424/1786 - loss 0.22007018 - time (sec): 24.60 - samples/sec: 8027.26 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:00:09,135 epoch 6 - iter 1602/1786 - loss 0.21938150 - time (sec): 27.67 - samples/sec: 8073.43 - lr: 0.000023 - momentum: 0.000000
2023-10-19 20:00:12,296 epoch 6 - iter 1780/1786 - loss 0.21988434 - time (sec): 30.83 - samples/sec: 8046.76 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:00:12,397 ----------------------------------------------------------------------------------------------------
2023-10-19 20:00:12,397 EPOCH 6 done: loss 0.2200 - lr: 0.000022
2023-10-19 20:00:15,251 DEV : loss 0.18987227976322174 - f1-score (micro avg) 0.5335
2023-10-19 20:00:15,264 saving best model
2023-10-19 20:00:15,297 ----------------------------------------------------------------------------------------------------
|
2023-10-19 20:00:18,364 epoch 7 - iter 178/1786 - loss 0.19179214 - time (sec): 3.07 - samples/sec: 8611.48 - lr: 0.000022 - momentum: 0.000000
2023-10-19 20:00:21,320 epoch 7 - iter 356/1786 - loss 0.20316963 - time (sec): 6.02 - samples/sec: 8591.59 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:00:24,042 epoch 7 - iter 534/1786 - loss 0.20106304 - time (sec): 8.74 - samples/sec: 8595.51 - lr: 0.000021 - momentum: 0.000000
2023-10-19 20:00:26,699 epoch 7 - iter 712/1786 - loss 0.20457425 - time (sec): 11.40 - samples/sec: 8635.39 - lr: 0.000020 - momentum: 0.000000
2023-10-19 20:00:29,586 epoch 7 - iter 890/1786 - loss 0.20559868 - time (sec): 14.29 - samples/sec: 8632.86 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:00:32,638 epoch 7 - iter 1068/1786 - loss 0.20441274 - time (sec): 17.34 - samples/sec: 8538.94 - lr: 0.000019 - momentum: 0.000000
2023-10-19 20:00:35,776 epoch 7 - iter 1246/1786 - loss 0.20404307 - time (sec): 20.48 - samples/sec: 8477.96 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:00:38,866 epoch 7 - iter 1424/1786 - loss 0.20331972 - time (sec): 23.57 - samples/sec: 8506.69 - lr: 0.000018 - momentum: 0.000000
2023-10-19 20:00:41,876 epoch 7 - iter 1602/1786 - loss 0.20559511 - time (sec): 26.58 - samples/sec: 8436.74 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:00:44,948 epoch 7 - iter 1780/1786 - loss 0.20567843 - time (sec): 29.65 - samples/sec: 8375.37 - lr: 0.000017 - momentum: 0.000000
2023-10-19 20:00:45,045 ----------------------------------------------------------------------------------------------------
2023-10-19 20:00:45,045 EPOCH 7 done: loss 0.2058 - lr: 0.000017
2023-10-19 20:00:47,397 DEV : loss 0.19306714832782745 - f1-score (micro avg) 0.5548
2023-10-19 20:00:47,412 saving best model
2023-10-19 20:00:47,447 ----------------------------------------------------------------------------------------------------
|
2023-10-19 20:00:50,568 epoch 8 - iter 178/1786 - loss 0.19689046 - time (sec): 3.12 - samples/sec: 8040.04 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:00:53,580 epoch 8 - iter 356/1786 - loss 0.18936570 - time (sec): 6.13 - samples/sec: 8150.05 - lr: 0.000016 - momentum: 0.000000
2023-10-19 20:00:56,635 epoch 8 - iter 534/1786 - loss 0.19552394 - time (sec): 9.19 - samples/sec: 8034.75 - lr: 0.000015 - momentum: 0.000000
2023-10-19 20:00:59,723 epoch 8 - iter 712/1786 - loss 0.19544811 - time (sec): 12.28 - samples/sec: 8039.73 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:01:02,832 epoch 8 - iter 890/1786 - loss 0.19406914 - time (sec): 15.38 - samples/sec: 8042.01 - lr: 0.000014 - momentum: 0.000000
2023-10-19 20:01:05,960 epoch 8 - iter 1068/1786 - loss 0.19371324 - time (sec): 18.51 - samples/sec: 8062.53 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:01:09,034 epoch 8 - iter 1246/1786 - loss 0.19346585 - time (sec): 21.59 - samples/sec: 8060.43 - lr: 0.000013 - momentum: 0.000000
2023-10-19 20:01:12,085 epoch 8 - iter 1424/1786 - loss 0.19281594 - time (sec): 24.64 - samples/sec: 8080.93 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:01:14,967 epoch 8 - iter 1602/1786 - loss 0.19551404 - time (sec): 27.52 - samples/sec: 8117.22 - lr: 0.000012 - momentum: 0.000000
2023-10-19 20:01:18,146 epoch 8 - iter 1780/1786 - loss 0.19565402 - time (sec): 30.70 - samples/sec: 8077.28 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:01:18,246 ----------------------------------------------------------------------------------------------------
2023-10-19 20:01:18,246 EPOCH 8 done: loss 0.1955 - lr: 0.000011
2023-10-19 20:01:21,157 DEV : loss 0.1908378303050995 - f1-score (micro avg) 0.5511
2023-10-19 20:01:21,172 ----------------------------------------------------------------------------------------------------
|
2023-10-19 20:01:24,362 epoch 9 - iter 178/1786 - loss 0.18088382 - time (sec): 3.19 - samples/sec: 8303.42 - lr: 0.000011 - momentum: 0.000000
2023-10-19 20:01:27,480 epoch 9 - iter 356/1786 - loss 0.17917792 - time (sec): 6.31 - samples/sec: 8210.43 - lr: 0.000010 - momentum: 0.000000
2023-10-19 20:01:30,457 epoch 9 - iter 534/1786 - loss 0.17920050 - time (sec): 9.28 - samples/sec: 8204.99 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:01:33,384 epoch 9 - iter 712/1786 - loss 0.18318395 - time (sec): 12.21 - samples/sec: 8147.49 - lr: 0.000009 - momentum: 0.000000
2023-10-19 20:01:36,370 epoch 9 - iter 890/1786 - loss 0.18476519 - time (sec): 15.20 - samples/sec: 8104.66 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:01:39,482 epoch 9 - iter 1068/1786 - loss 0.18662129 - time (sec): 18.31 - samples/sec: 8129.74 - lr: 0.000008 - momentum: 0.000000
2023-10-19 20:01:42,674 epoch 9 - iter 1246/1786 - loss 0.18931705 - time (sec): 21.50 - samples/sec: 8094.90 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:01:45,671 epoch 9 - iter 1424/1786 - loss 0.18853491 - time (sec): 24.50 - samples/sec: 8096.88 - lr: 0.000007 - momentum: 0.000000
2023-10-19 20:01:48,900 epoch 9 - iter 1602/1786 - loss 0.18944300 - time (sec): 27.73 - samples/sec: 8070.14 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:01:51,967 epoch 9 - iter 1780/1786 - loss 0.18742971 - time (sec): 30.79 - samples/sec: 8054.92 - lr: 0.000006 - momentum: 0.000000
2023-10-19 20:01:52,069 ----------------------------------------------------------------------------------------------------
2023-10-19 20:01:52,069 EPOCH 9 done: loss 0.1871 - lr: 0.000006
2023-10-19 20:01:54,434 DEV : loss 0.1917891502380371 - f1-score (micro avg) 0.5576
2023-10-19 20:01:54,448 saving best model
2023-10-19 20:01:54,482 ----------------------------------------------------------------------------------------------------
|
2023-10-19 20:01:57,521 epoch 10 - iter 178/1786 - loss 0.19254167 - time (sec): 3.04 - samples/sec: 7641.91 - lr: 0.000005 - momentum: 0.000000
2023-10-19 20:02:00,659 epoch 10 - iter 356/1786 - loss 0.19593691 - time (sec): 6.18 - samples/sec: 7746.14 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:02:03,654 epoch 10 - iter 534/1786 - loss 0.19352313 - time (sec): 9.17 - samples/sec: 7904.24 - lr: 0.000004 - momentum: 0.000000
2023-10-19 20:02:06,740 epoch 10 - iter 712/1786 - loss 0.19368257 - time (sec): 12.26 - samples/sec: 7965.43 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:02:09,806 epoch 10 - iter 890/1786 - loss 0.18973332 - time (sec): 15.32 - samples/sec: 7925.22 - lr: 0.000003 - momentum: 0.000000
2023-10-19 20:02:12,863 epoch 10 - iter 1068/1786 - loss 0.18637461 - time (sec): 18.38 - samples/sec: 7953.10 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:02:16,003 epoch 10 - iter 1246/1786 - loss 0.18229577 - time (sec): 21.52 - samples/sec: 8003.31 - lr: 0.000002 - momentum: 0.000000
2023-10-19 20:02:19,049 epoch 10 - iter 1424/1786 - loss 0.17769788 - time (sec): 24.57 - samples/sec: 8063.36 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:02:22,100 epoch 10 - iter 1602/1786 - loss 0.17934314 - time (sec): 27.62 - samples/sec: 8089.31 - lr: 0.000001 - momentum: 0.000000
2023-10-19 20:02:25,155 epoch 10 - iter 1780/1786 - loss 0.18135445 - time (sec): 30.67 - samples/sec: 8090.82 - lr: 0.000000 - momentum: 0.000000
2023-10-19 20:02:25,250 ----------------------------------------------------------------------------------------------------
2023-10-19 20:02:25,250 EPOCH 10 done: loss 0.1819 - lr: 0.000000
2023-10-19 20:02:28,074 DEV : loss 0.19165924191474915 - f1-score (micro avg) 0.5584
2023-10-19 20:02:28,088 saving best model
2023-10-19 20:02:28,154 ----------------------------------------------------------------------------------------------------
|
2023-10-19 20:02:28,154 Loading model from best epoch ...
2023-10-19 20:02:28,235 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
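The 17-tag dictionary above is the BIOES encoding of the four entity types: one O tag plus S(ingle)/B(egin)/E(nd)/I(nside) variants of PER, LOC, ORG and HumanProd. A quick sanity check:

```python
# Rebuild the 17-tag BIOES dictionary from its four entity types,
# in the same order as the logged Dictionary.
ENTITY_TYPES = ["PER", "LOC", "ORG", "HumanProd"]

tags = ["O"] + [f"{prefix}-{etype}"
                for etype in ENTITY_TYPES
                for prefix in "SBEI"]

print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-PER', 'B-PER', 'E-PER', 'I-PER']
```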
|
2023-10-19 20:02:32,857
Results:
- F-score (micro) 0.449
- F-score (macro) 0.2899
- Accuracy 0.2994

By class:
              precision    recall  f1-score   support

         LOC     0.4268    0.5562    0.4830      1095
         PER     0.4745    0.5247    0.4984      1012
         ORG     0.1980    0.1625    0.1785       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.4220    0.4798    0.4490      2497
   macro avg     0.2748    0.3108    0.2899      2497
weighted avg     0.4078    0.4798    0.4393      2497
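The macro and weighted averages in the table can be reproduced from the per-class rows (macro agrees only to about three decimals here, since the per-class f1 values are themselves rounded):

```python
# Reproduce the macro and weighted f1 averages from the per-class rows above.
# Each entry: (class, f1-score, support)
rows = [("LOC", 0.4830, 1095),
        ("PER", 0.4984, 1012),
        ("ORG", 0.1785, 357),
        ("HumanProd", 0.0000, 33)]

total_support = sum(s for _, _, s in rows)                      # 2497
macro_f1 = sum(f1 for _, f1, _ in rows) / len(rows)             # unweighted mean
weighted_f1 = sum(f1 * s for _, f1, s in rows) / total_support  # support-weighted

print(total_support)          # 2497
print(round(weighted_f1, 4))  # 0.4393
```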
|
2023-10-19 20:02:32,857 ----------------------------------------------------------------------------------------------------