2023-10-10 21:12:02,183 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,186 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
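
The dump above corresponds to a ByT5-small-sized encoder: byte-level vocabulary of 384, hidden size (d_model) 1472, gated feed-forward width (d_ff) 3584, 12 encoder blocks, and 6 attention heads (the q/k/v projections map 1472 -> 384 = 6 x 64). A minimal sketch for checking these numbers against the checkpoint config is shown below; the Hub ID is an assumption inferred from the training base path further down and may differ.

```python
# Sketch: inspect the encoder configuration behind the module dump above.
# The Hub ID is an assumption inferred from the logged base path; adjust if it differs.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax")
print(config.vocab_size)  # expected 384  -> Embedding(384, 1472)
print(config.d_model)     # expected 1472 (hidden size)
print(config.d_ff)        # expected 3584 (gated feed-forward width)
print(config.num_layers)  # expected 12   (encoder blocks)
print(config.num_heads)   # expected 6    (heads x d_kv 64 -> q/k/v out_features=384)
```
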
2023-10-10 21:12:02,186 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,186 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-10 21:12:02,186 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,186 Train: 7142 sentences
2023-10-10 21:12:02,186 (train_with_dev=False, train_with_test=False)
2023-10-10 21:12:02,186 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,186 Training Params:
2023-10-10 21:12:02,187 - learning_rate: "0.00016"
2023-10-10 21:12:02,187 - mini_batch_size: "8"
2023-10-10 21:12:02,187 - max_epochs: "10"
2023-10-10 21:12:02,187 - shuffle: "True"
2023-10-10 21:12:02,187 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,187 Plugins:
2023-10-10 21:12:02,187 - TensorboardLogger
2023-10-10 21:12:02,187 - LinearScheduler | warmup_fraction: '0.1'
2023-10-10 21:12:02,187 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,187 Final evaluation on model from best epoch (best-model.pt)
2023-10-10 21:12:02,187 - metric: "('micro avg', 'f1-score')"
2023-10-10 21:12:02,187 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,187 Computation:
2023-10-10 21:12:02,187 - compute on device: cuda:0
2023-10-10 21:12:02,188 - embedding storage: none
2023-10-10 21:12:02,188 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,188 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-10 21:12:02,188 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,188 ----------------------------------------------------------------------------------------------------
2023-10-10 21:12:02,188 Logging anything other than scalars to TensorBoard is currently not supported.
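
The configuration block above maps onto Flair's fine-tuning API roughly as sketched below. This is a reconstruction from the logged parameters rather than the original training script: the Hub ID of the hmByT5 checkpoint and the NER_HIPE_2022 constructor arguments are assumptions inferred from the base path and the corpus line, and argument names can differ between Flair versions.

```python
# Reconstruction (not the original script) of the logged setup: HIPE-2022 newseye/fr,
# byte-level hmByT5 encoder, batch size 8, peak lr 0.00016, 10 epochs, no CRF, no RNN.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Assumption: constructor arguments for the newseye French split.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# "poolingfirst-layers-1" in the base path suggests subtoken_pooling="first", layers="-1".
# The Hub ID is inferred from the base path, not logged explicitly.
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# The module dump shows only LockedDropout plus a 1472 -> 17 linear head: no CRF, no RNN,
# no reprojection of the embeddings.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
# fine_tune applies a linear schedule with warmup, matching the LinearScheduler plugin
# (warmup_fraction 0.1) and the lr column below. The original run also attached a
# TensorboardLogger plugin, omitted here.
trainer.fine_tune(
    "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
)
```

With warmup_fraction 0.1 over 10 x 893 = 8,930 optimizer steps, warmup covers roughly the first 893 steps, which is why the logged lr climbs to about 0.000159 by the end of epoch 1 and then decays linearly to zero by the end of epoch 10.
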
2023-10-10 21:12:55,812 epoch 1 - iter 89/893 - loss 2.82831623 - time (sec): 53.62 - samples/sec: 475.14 - lr: 0.000016 - momentum: 0.000000
2023-10-10 21:13:49,028 epoch 1 - iter 178/893 - loss 2.77409191 - time (sec): 106.84 - samples/sec: 471.84 - lr: 0.000032 - momentum: 0.000000
2023-10-10 21:14:41,947 epoch 1 - iter 267/893 - loss 2.58893902 - time (sec): 159.76 - samples/sec: 469.74 - lr: 0.000048 - momentum: 0.000000
2023-10-10 21:15:34,481 epoch 1 - iter 356/893 - loss 2.35977449 - time (sec): 212.29 - samples/sec: 468.65 - lr: 0.000064 - momentum: 0.000000
2023-10-10 21:16:29,178 epoch 1 - iter 445/893 - loss 2.09834694 - time (sec): 266.99 - samples/sec: 470.82 - lr: 0.000080 - momentum: 0.000000
2023-10-10 21:17:21,664 epoch 1 - iter 534/893 - loss 1.88678651 - time (sec): 319.47 - samples/sec: 465.84 - lr: 0.000095 - momentum: 0.000000
2023-10-10 21:18:14,012 epoch 1 - iter 623/893 - loss 1.70594199 - time (sec): 371.82 - samples/sec: 463.66 - lr: 0.000111 - momentum: 0.000000
2023-10-10 21:19:05,643 epoch 1 - iter 712/893 - loss 1.54482663 - time (sec): 423.45 - samples/sec: 466.76 - lr: 0.000127 - momentum: 0.000000
2023-10-10 21:19:56,253 epoch 1 - iter 801/893 - loss 1.41640141 - time (sec): 474.06 - samples/sec: 471.26 - lr: 0.000143 - momentum: 0.000000
2023-10-10 21:20:46,274 epoch 1 - iter 890/893 - loss 1.31458924 - time (sec): 524.08 - samples/sec: 473.16 - lr: 0.000159 - momentum: 0.000000
2023-10-10 21:20:47,807 ----------------------------------------------------------------------------------------------------
2023-10-10 21:20:47,807 EPOCH 1 done: loss 1.3116 - lr: 0.000159
2023-10-10 21:21:07,697 DEV : loss 0.283738374710083 - f1-score (micro avg) 0.2694
2023-10-10 21:21:07,728 saving best model
2023-10-10 21:21:08,577 ----------------------------------------------------------------------------------------------------
2023-10-10 21:22:00,389 epoch 2 - iter 89/893 - loss 0.31517413 - time (sec): 51.81 - samples/sec: 510.50 - lr: 0.000158 - momentum: 0.000000
2023-10-10 21:22:49,744 epoch 2 - iter 178/893 - loss 0.30864388 - time (sec): 101.16 - samples/sec: 498.63 - lr: 0.000156 - momentum: 0.000000
2023-10-10 21:23:41,490 epoch 2 - iter 267/893 - loss 0.29140884 - time (sec): 152.91 - samples/sec: 485.83 - lr: 0.000155 - momentum: 0.000000
2023-10-10 21:24:33,788 epoch 2 - iter 356/893 - loss 0.27172420 - time (sec): 205.21 - samples/sec: 484.09 - lr: 0.000153 - momentum: 0.000000
2023-10-10 21:25:26,569 epoch 2 - iter 445/893 - loss 0.25547354 - time (sec): 257.99 - samples/sec: 481.26 - lr: 0.000151 - momentum: 0.000000
2023-10-10 21:26:18,173 epoch 2 - iter 534/893 - loss 0.24503363 - time (sec): 309.59 - samples/sec: 478.48 - lr: 0.000149 - momentum: 0.000000
2023-10-10 21:27:10,956 epoch 2 - iter 623/893 - loss 0.23162160 - time (sec): 362.38 - samples/sec: 478.84 - lr: 0.000148 - momentum: 0.000000
2023-10-10 21:28:04,603 epoch 2 - iter 712/893 - loss 0.22016367 - time (sec): 416.02 - samples/sec: 479.91 - lr: 0.000146 - momentum: 0.000000
2023-10-10 21:28:55,788 epoch 2 - iter 801/893 - loss 0.21046665 - time (sec): 467.21 - samples/sec: 478.78 - lr: 0.000144 - momentum: 0.000000
2023-10-10 21:29:46,943 epoch 2 - iter 890/893 - loss 0.20217199 - time (sec): 518.36 - samples/sec: 478.66 - lr: 0.000142 - momentum: 0.000000
2023-10-10 21:29:48,400 ----------------------------------------------------------------------------------------------------
2023-10-10 21:29:48,400 EPOCH 2 done: loss 0.2019 - lr: 0.000142
2023-10-10 21:30:10,827 DEV : loss 0.11071376502513885 - f1-score (micro avg) 0.7305
2023-10-10 21:30:10,860 saving best model
2023-10-10 21:30:19,632 ----------------------------------------------------------------------------------------------------
2023-10-10 21:31:11,573 epoch 3 - iter 89/893 - loss 0.09192063 - time (sec): 51.94 - samples/sec: 459.99 - lr: 0.000140 - momentum: 0.000000
2023-10-10 21:32:03,261 epoch 3 - iter 178/893 - loss 0.08757251 - time (sec): 103.62 - samples/sec: 476.56 - lr: 0.000139 - momentum: 0.000000
2023-10-10 21:32:55,633 epoch 3 - iter 267/893 - loss 0.08958475 - time (sec): 156.00 - samples/sec: 474.65 - lr: 0.000137 - momentum: 0.000000
2023-10-10 21:33:46,581 epoch 3 - iter 356/893 - loss 0.09204390 - time (sec): 206.94 - samples/sec: 470.62 - lr: 0.000135 - momentum: 0.000000
2023-10-10 21:34:40,061 epoch 3 - iter 445/893 - loss 0.09030135 - time (sec): 260.42 - samples/sec: 473.36 - lr: 0.000133 - momentum: 0.000000
2023-10-10 21:35:33,000 epoch 3 - iter 534/893 - loss 0.08756515 - time (sec): 313.36 - samples/sec: 472.67 - lr: 0.000132 - momentum: 0.000000
2023-10-10 21:36:28,115 epoch 3 - iter 623/893 - loss 0.08431353 - time (sec): 368.48 - samples/sec: 469.26 - lr: 0.000130 - momentum: 0.000000
2023-10-10 21:37:19,051 epoch 3 - iter 712/893 - loss 0.08334354 - time (sec): 419.41 - samples/sec: 473.01 - lr: 0.000128 - momentum: 0.000000
2023-10-10 21:38:10,924 epoch 3 - iter 801/893 - loss 0.08188831 - time (sec): 471.29 - samples/sec: 477.66 - lr: 0.000126 - momentum: 0.000000
2023-10-10 21:38:59,366 epoch 3 - iter 890/893 - loss 0.08186637 - time (sec): 519.73 - samples/sec: 477.20 - lr: 0.000125 - momentum: 0.000000
2023-10-10 21:39:00,986 ----------------------------------------------------------------------------------------------------
2023-10-10 21:39:00,986 EPOCH 3 done: loss 0.0818 - lr: 0.000125
2023-10-10 21:39:23,280 DEV : loss 0.1093674823641777 - f1-score (micro avg) 0.7573
2023-10-10 21:39:23,310 saving best model
2023-10-10 21:39:29,432 ----------------------------------------------------------------------------------------------------
2023-10-10 21:40:20,072 epoch 4 - iter 89/893 - loss 0.04796913 - time (sec): 50.64 - samples/sec: 492.08 - lr: 0.000123 - momentum: 0.000000
2023-10-10 21:41:10,758 epoch 4 - iter 178/893 - loss 0.05118746 - time (sec): 101.32 - samples/sec: 485.71 - lr: 0.000121 - momentum: 0.000000
2023-10-10 21:42:02,462 epoch 4 - iter 267/893 - loss 0.05230464 - time (sec): 153.03 - samples/sec: 482.23 - lr: 0.000119 - momentum: 0.000000
2023-10-10 21:42:54,950 epoch 4 - iter 356/893 - loss 0.05482020 - time (sec): 205.51 - samples/sec: 482.78 - lr: 0.000117 - momentum: 0.000000
2023-10-10 21:43:49,144 epoch 4 - iter 445/893 - loss 0.05300902 - time (sec): 259.71 - samples/sec: 483.40 - lr: 0.000116 - momentum: 0.000000
2023-10-10 21:44:41,491 epoch 4 - iter 534/893 - loss 0.05241872 - time (sec): 312.05 - samples/sec: 483.03 - lr: 0.000114 - momentum: 0.000000
2023-10-10 21:45:34,715 epoch 4 - iter 623/893 - loss 0.05109125 - time (sec): 365.28 - samples/sec: 484.73 - lr: 0.000112 - momentum: 0.000000
2023-10-10 21:46:26,148 epoch 4 - iter 712/893 - loss 0.05123523 - time (sec): 416.71 - samples/sec: 483.62 - lr: 0.000110 - momentum: 0.000000
2023-10-10 21:47:16,467 epoch 4 - iter 801/893 - loss 0.05153884 - time (sec): 467.03 - samples/sec: 481.68 - lr: 0.000109 - momentum: 0.000000
2023-10-10 21:48:05,530 epoch 4 - iter 890/893 - loss 0.05165032 - time (sec): 516.09 - samples/sec: 480.77 - lr: 0.000107 - momentum: 0.000000
2023-10-10 21:48:06,971 ----------------------------------------------------------------------------------------------------
2023-10-10 21:48:06,972 EPOCH 4 done: loss 0.0518 - lr: 0.000107
2023-10-10 21:48:29,692 DEV : loss 0.11296474188566208 - f1-score (micro avg) 0.782
2023-10-10 21:48:29,723 saving best model
2023-10-10 21:48:35,782 ----------------------------------------------------------------------------------------------------
2023-10-10 21:49:28,784 epoch 5 - iter 89/893 - loss 0.03597354 - time (sec): 53.00 - samples/sec: 479.80 - lr: 0.000105 - momentum: 0.000000
2023-10-10 21:50:20,053 epoch 5 - iter 178/893 - loss 0.03649107 - time (sec): 104.27 - samples/sec: 466.65 - lr: 0.000103 - momentum: 0.000000
2023-10-10 21:51:12,981 epoch 5 - iter 267/893 - loss 0.03580546 - time (sec): 157.19 - samples/sec: 473.76 - lr: 0.000101 - momentum: 0.000000
2023-10-10 21:52:06,072 epoch 5 - iter 356/893 - loss 0.03754465 - time (sec): 210.29 - samples/sec: 478.75 - lr: 0.000100 - momentum: 0.000000
2023-10-10 21:52:56,908 epoch 5 - iter 445/893 - loss 0.03813979 - time (sec): 261.12 - samples/sec: 472.86 - lr: 0.000098 - momentum: 0.000000
2023-10-10 21:53:47,123 epoch 5 - iter 534/893 - loss 0.03795036 - time (sec): 311.34 - samples/sec: 473.37 - lr: 0.000096 - momentum: 0.000000
2023-10-10 21:54:40,551 epoch 5 - iter 623/893 - loss 0.03843597 - time (sec): 364.76 - samples/sec: 472.83 - lr: 0.000094 - momentum: 0.000000
2023-10-10 21:55:32,349 epoch 5 - iter 712/893 - loss 0.03898741 - time (sec): 416.56 - samples/sec: 475.76 - lr: 0.000093 - momentum: 0.000000
2023-10-10 21:56:23,023 epoch 5 - iter 801/893 - loss 0.03832088 - time (sec): 467.24 - samples/sec: 477.46 - lr: 0.000091 - momentum: 0.000000
2023-10-10 21:57:12,653 epoch 5 - iter 890/893 - loss 0.03812875 - time (sec): 516.87 - samples/sec: 479.88 - lr: 0.000089 - momentum: 0.000000
2023-10-10 21:57:14,185 ----------------------------------------------------------------------------------------------------
2023-10-10 21:57:14,186 EPOCH 5 done: loss 0.0382 - lr: 0.000089
2023-10-10 21:57:36,202 DEV : loss 0.13481374084949493 - f1-score (micro avg) 0.7888
2023-10-10 21:57:36,234 saving best model
2023-10-10 21:57:45,079 ----------------------------------------------------------------------------------------------------
2023-10-10 21:58:34,561 epoch 6 - iter 89/893 - loss 0.02499157 - time (sec): 49.48 - samples/sec: 503.84 - lr: 0.000087 - momentum: 0.000000
2023-10-10 21:59:24,317 epoch 6 - iter 178/893 - loss 0.02831728 - time (sec): 99.23 - samples/sec: 499.20 - lr: 0.000085 - momentum: 0.000000
2023-10-10 22:00:14,930 epoch 6 - iter 267/893 - loss 0.02752569 - time (sec): 149.85 - samples/sec: 501.57 - lr: 0.000084 - momentum: 0.000000
2023-10-10 22:01:06,016 epoch 6 - iter 356/893 - loss 0.02810037 - time (sec): 200.93 - samples/sec: 493.70 - lr: 0.000082 - momentum: 0.000000
2023-10-10 22:01:56,477 epoch 6 - iter 445/893 - loss 0.02752164 - time (sec): 251.39 - samples/sec: 489.39 - lr: 0.000080 - momentum: 0.000000
2023-10-10 22:02:47,904 epoch 6 - iter 534/893 - loss 0.02786169 - time (sec): 302.82 - samples/sec: 488.32 - lr: 0.000078 - momentum: 0.000000
2023-10-10 22:03:40,128 epoch 6 - iter 623/893 - loss 0.02761400 - time (sec): 355.05 - samples/sec: 490.71 - lr: 0.000077 - momentum: 0.000000
2023-10-10 22:04:30,228 epoch 6 - iter 712/893 - loss 0.02792058 - time (sec): 405.15 - samples/sec: 491.45 - lr: 0.000075 - momentum: 0.000000
2023-10-10 22:05:21,277 epoch 6 - iter 801/893 - loss 0.02858658 - time (sec): 456.19 - samples/sec: 492.50 - lr: 0.000073 - momentum: 0.000000
2023-10-10 22:06:11,127 epoch 6 - iter 890/893 - loss 0.02899685 - time (sec): 506.04 - samples/sec: 490.13 - lr: 0.000071 - momentum: 0.000000
2023-10-10 22:06:12,768 ----------------------------------------------------------------------------------------------------
2023-10-10 22:06:12,769 EPOCH 6 done: loss 0.0289 - lr: 0.000071
2023-10-10 22:06:34,365 DEV : loss 0.16970570385456085 - f1-score (micro avg) 0.7684
2023-10-10 22:06:34,396 ----------------------------------------------------------------------------------------------------
2023-10-10 22:07:25,120 epoch 7 - iter 89/893 - loss 0.01834186 - time (sec): 50.72 - samples/sec: 500.26 - lr: 0.000069 - momentum: 0.000000
2023-10-10 22:08:14,621 epoch 7 - iter 178/893 - loss 0.02010496 - time (sec): 100.22 - samples/sec: 486.08 - lr: 0.000068 - momentum: 0.000000
2023-10-10 22:09:05,764 epoch 7 - iter 267/893 - loss 0.02031563 - time (sec): 151.37 - samples/sec: 490.13 - lr: 0.000066 - momentum: 0.000000
2023-10-10 22:09:56,686 epoch 7 - iter 356/893 - loss 0.02131291 - time (sec): 202.29 - samples/sec: 489.17 - lr: 0.000064 - momentum: 0.000000
2023-10-10 22:10:46,378 epoch 7 - iter 445/893 - loss 0.02126158 - time (sec): 251.98 - samples/sec: 487.98 - lr: 0.000062 - momentum: 0.000000
2023-10-10 22:11:36,949 epoch 7 - iter 534/893 - loss 0.02104177 - time (sec): 302.55 - samples/sec: 490.11 - lr: 0.000061 - momentum: 0.000000
2023-10-10 22:12:28,625 epoch 7 - iter 623/893 - loss 0.02189809 - time (sec): 354.23 - samples/sec: 488.81 - lr: 0.000059 - momentum: 0.000000
2023-10-10 22:13:19,313 epoch 7 - iter 712/893 - loss 0.02158960 - time (sec): 404.91 - samples/sec: 485.19 - lr: 0.000057 - momentum: 0.000000
2023-10-10 22:14:11,525 epoch 7 - iter 801/893 - loss 0.02192818 - time (sec): 457.13 - samples/sec: 487.63 - lr: 0.000055 - momentum: 0.000000
2023-10-10 22:15:02,258 epoch 7 - iter 890/893 - loss 0.02245709 - time (sec): 507.86 - samples/sec: 488.52 - lr: 0.000053 - momentum: 0.000000
2023-10-10 22:15:03,904 ----------------------------------------------------------------------------------------------------
2023-10-10 22:15:03,904 EPOCH 7 done: loss 0.0224 - lr: 0.000053
2023-10-10 22:15:27,246 DEV : loss 0.16878585517406464 - f1-score (micro avg) 0.781
2023-10-10 22:15:27,280 ----------------------------------------------------------------------------------------------------
2023-10-10 22:16:19,605 epoch 8 - iter 89/893 - loss 0.01907382 - time (sec): 52.32 - samples/sec: 469.29 - lr: 0.000052 - momentum: 0.000000
2023-10-10 22:17:10,982 epoch 8 - iter 178/893 - loss 0.01745040 - time (sec): 103.70 - samples/sec: 467.76 - lr: 0.000050 - momentum: 0.000000
2023-10-10 22:18:02,971 epoch 8 - iter 267/893 - loss 0.01934552 - time (sec): 155.69 - samples/sec: 462.81 - lr: 0.000048 - momentum: 0.000000
2023-10-10 22:18:55,894 epoch 8 - iter 356/893 - loss 0.01839673 - time (sec): 208.61 - samples/sec: 470.60 - lr: 0.000046 - momentum: 0.000000
2023-10-10 22:19:49,404 epoch 8 - iter 445/893 - loss 0.01773640 - time (sec): 262.12 - samples/sec: 468.09 - lr: 0.000045 - momentum: 0.000000
2023-10-10 22:20:42,717 epoch 8 - iter 534/893 - loss 0.01726269 - time (sec): 315.44 - samples/sec: 462.59 - lr: 0.000043 - momentum: 0.000000
2023-10-10 22:21:35,308 epoch 8 - iter 623/893 - loss 0.01741947 - time (sec): 368.03 - samples/sec: 464.14 - lr: 0.000041 - momentum: 0.000000
2023-10-10 22:22:26,901 epoch 8 - iter 712/893 - loss 0.01689766 - time (sec): 419.62 - samples/sec: 465.07 - lr: 0.000039 - momentum: 0.000000
2023-10-10 22:23:20,021 epoch 8 - iter 801/893 - loss 0.01726890 - time (sec): 472.74 - samples/sec: 467.86 - lr: 0.000037 - momentum: 0.000000
2023-10-10 22:24:13,002 epoch 8 - iter 890/893 - loss 0.01701346 - time (sec): 525.72 - samples/sec: 471.25 - lr: 0.000036 - momentum: 0.000000
2023-10-10 22:24:14,799 ----------------------------------------------------------------------------------------------------
2023-10-10 22:24:14,799 EPOCH 8 done: loss 0.0171 - lr: 0.000036
2023-10-10 22:24:38,300 DEV : loss 0.183110311627388 - f1-score (micro avg) 0.7858
2023-10-10 22:24:38,331 ----------------------------------------------------------------------------------------------------
2023-10-10 22:25:29,529 epoch 9 - iter 89/893 - loss 0.01593252 - time (sec): 51.20 - samples/sec: 486.33 - lr: 0.000034 - momentum: 0.000000
2023-10-10 22:26:22,330 epoch 9 - iter 178/893 - loss 0.01574775 - time (sec): 104.00 - samples/sec: 469.56 - lr: 0.000032 - momentum: 0.000000
2023-10-10 22:27:13,972 epoch 9 - iter 267/893 - loss 0.01645156 - time (sec): 155.64 - samples/sec: 480.85 - lr: 0.000030 - momentum: 0.000000
2023-10-10 22:28:04,643 epoch 9 - iter 356/893 - loss 0.01555361 - time (sec): 206.31 - samples/sec: 473.98 - lr: 0.000029 - momentum: 0.000000
2023-10-10 22:28:57,093 epoch 9 - iter 445/893 - loss 0.01523364 - time (sec): 258.76 - samples/sec: 470.80 - lr: 0.000027 - momentum: 0.000000
2023-10-10 22:29:48,190 epoch 9 - iter 534/893 - loss 0.01528624 - time (sec): 309.86 - samples/sec: 471.86 - lr: 0.000025 - momentum: 0.000000
2023-10-10 22:30:38,406 epoch 9 - iter 623/893 - loss 0.01463531 - time (sec): 360.07 - samples/sec: 473.08 - lr: 0.000023 - momentum: 0.000000
2023-10-10 22:31:30,450 epoch 9 - iter 712/893 - loss 0.01411081 - time (sec): 412.12 - samples/sec: 475.02 - lr: 0.000022 - momentum: 0.000000
2023-10-10 22:32:22,181 epoch 9 - iter 801/893 - loss 0.01404007 - time (sec): 463.85 - samples/sec: 476.81 - lr: 0.000020 - momentum: 0.000000
2023-10-10 22:33:14,754 epoch 9 - iter 890/893 - loss 0.01366814 - time (sec): 516.42 - samples/sec: 480.08 - lr: 0.000018 - momentum: 0.000000
2023-10-10 22:33:16,429 ----------------------------------------------------------------------------------------------------
2023-10-10 22:33:16,430 EPOCH 9 done: loss 0.0137 - lr: 0.000018
2023-10-10 22:33:39,324 DEV : loss 0.19475506246089935 - f1-score (micro avg) 0.7882
2023-10-10 22:33:39,354 ----------------------------------------------------------------------------------------------------
2023-10-10 22:34:30,591 epoch 10 - iter 89/893 - loss 0.01252786 - time (sec): 51.23 - samples/sec: 492.58 - lr: 0.000016 - momentum: 0.000000
2023-10-10 22:35:22,707 epoch 10 - iter 178/893 - loss 0.01292130 - time (sec): 103.35 - samples/sec: 476.58 - lr: 0.000014 - momentum: 0.000000
2023-10-10 22:36:14,189 epoch 10 - iter 267/893 - loss 0.01258759 - time (sec): 154.83 - samples/sec: 466.66 - lr: 0.000013 - momentum: 0.000000
2023-10-10 22:37:09,377 epoch 10 - iter 356/893 - loss 0.01217698 - time (sec): 210.02 - samples/sec: 469.59 - lr: 0.000011 - momentum: 0.000000
2023-10-10 22:38:01,970 epoch 10 - iter 445/893 - loss 0.01170757 - time (sec): 262.61 - samples/sec: 475.29 - lr: 0.000009 - momentum: 0.000000
2023-10-10 22:38:54,782 epoch 10 - iter 534/893 - loss 0.01179636 - time (sec): 315.43 - samples/sec: 471.41 - lr: 0.000007 - momentum: 0.000000
2023-10-10 22:39:47,080 epoch 10 - iter 623/893 - loss 0.01205257 - time (sec): 367.72 - samples/sec: 476.23 - lr: 0.000006 - momentum: 0.000000
2023-10-10 22:40:38,533 epoch 10 - iter 712/893 - loss 0.01244404 - time (sec): 419.18 - samples/sec: 473.93 - lr: 0.000004 - momentum: 0.000000
2023-10-10 22:41:28,980 epoch 10 - iter 801/893 - loss 0.01197188 - time (sec): 469.62 - samples/sec: 473.93 - lr: 0.000002 - momentum: 0.000000
2023-10-10 22:42:22,734 epoch 10 - iter 890/893 - loss 0.01189933 - time (sec): 523.38 - samples/sec: 473.94 - lr: 0.000000 - momentum: 0.000000
2023-10-10 22:42:24,281 ----------------------------------------------------------------------------------------------------
2023-10-10 22:42:24,282 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-10 22:42:46,904 DEV : loss 0.20220361649990082 - f1-score (micro avg) 0.7859
2023-10-10 22:42:47,805 ----------------------------------------------------------------------------------------------------
2023-10-10 22:42:47,807 Loading model from best epoch ...
2023-10-10 22:42:52,366 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-10 22:44:04,150
Results:
- F-score (micro) 0.6943
- F-score (macro) 0.6066
- Accuracy 0.5472

By class:
              precision    recall  f1-score   support

         LOC     0.7046    0.7014    0.7030      1095
         PER     0.7580    0.7737    0.7658      1012
         ORG     0.4698    0.5658    0.5133       357
   HumanProd     0.3509    0.6061    0.4444        33

   micro avg     0.6793    0.7101    0.6943      2497
   macro avg     0.5708    0.6617    0.6066      2497
weighted avg     0.6880    0.7101    0.6979      2497
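
As a quick sanity check, the micro F1 above is the harmonic mean of the micro-averaged precision and recall in the table (up to rounding of the displayed values):

```python
# Micro F1 from the micro-averaged precision and recall reported above.
p, r = 0.6793, 0.7101
f1 = 2 * p * r / (p + r)
print(round(f1, 4))  # 0.6944, matching the logged 0.6943 up to rounding of p and r
```
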
2023-10-10 22:44:04,150 ----------------------------------------------------------------------------------------------------
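
The saved checkpoint can be used with Flair's standard loading and prediction API; the path below assumes best-model.pt was written under the logged base path, and the example sentence is illustrative only.

```python
# Load the best checkpoint from this run and tag a French sentence.
# The path is assumed from the logged base path; point it at your copy of best-model.pt.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Gustave Flaubert est né à Rouen.")
tagger.predict(sentence)

# The BIOES tags in the dictionary above (S-/B-/E-/I- prefixes) are decoded into spans.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.score)
```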
|