2023-10-12 22:05:12,690 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,692 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-12 22:05:12,692 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,692 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-12 22:05:12,692 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,693 Train: 7936 sentences
2023-10-12 22:05:12,693 (train_with_dev=False, train_with_test=False)
2023-10-12 22:05:12,693 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,693 Training Params:
2023-10-12 22:05:12,693 - learning_rate: "0.00016"
2023-10-12 22:05:12,693 - mini_batch_size: "8"
2023-10-12 22:05:12,693 - max_epochs: "10"
2023-10-12 22:05:12,693 - shuffle: "True"
2023-10-12 22:05:12,693 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,693 Plugins:
2023-10-12 22:05:12,693 - TensorboardLogger
2023-10-12 22:05:12,693 - LinearScheduler | warmup_fraction: '0.1'
2023-10-12 22:05:12,693 ----------------------------------------------------------------------------------------------------
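Note on the `lr` column: the values logged in the epoch lines are consistent with a linear warmup over the first 10% of steps (992 of 10 x 992 = 9920 mini-batches) up to the peak rate of 0.00016, followed by a linear decay to zero. The sketch below is an illustration only, not Flair's LinearScheduler implementation; the function name `linear_schedule_lr` and the exact step-counting convention are assumptions made for this example.

```python
def linear_schedule_lr(step, peak_lr=0.00016, total_steps=9920, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (hypothetical helper).

    Assumes linear warmup from 0 to peak_lr over the warmup steps,
    then linear decay from peak_lr to 0 over the remaining steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


# Compare against a few logged values (epoch, iter) -> global step.
iters_per_epoch = 992
for epoch, it in [(1, 99), (1, 990), (2, 99), (5, 396), (10, 990)]:
    step = (epoch - 1) * iters_per_epoch + it
    print(f"epoch {epoch} - iter {it}/992 - lr: {linear_schedule_lr(step):.6f}")
```

With these assumptions the formatted values match the log at the sampled points, for example 0.000016 at epoch 1 iter 99 and 0.000158 at epoch 2 iter 99.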
2023-10-12 22:05:12,693 Final evaluation on model from best epoch (best-model.pt)
2023-10-12 22:05:12,694 - metric: "('micro avg', 'f1-score')"
2023-10-12 22:05:12,694 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,694 Computation:
2023-10-12 22:05:12,694 - compute on device: cuda:0
2023-10-12 22:05:12,694 - embedding storage: none
2023-10-12 22:05:12,694 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,694 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-12 22:05:12,694 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,694 ----------------------------------------------------------------------------------------------------
2023-10-12 22:05:12,694 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-12 22:06:02,586 epoch 1 - iter 99/992 - loss 2.53866750 - time (sec): 49.89 - samples/sec: 321.99 - lr: 0.000016 - momentum: 0.000000
2023-10-12 22:06:51,649 epoch 1 - iter 198/992 - loss 2.44203287 - time (sec): 98.95 - samples/sec: 321.09 - lr: 0.000032 - momentum: 0.000000
2023-10-12 22:07:42,422 epoch 1 - iter 297/992 - loss 2.21386372 - time (sec): 149.73 - samples/sec: 325.56 - lr: 0.000048 - momentum: 0.000000
2023-10-12 22:08:33,193 epoch 1 - iter 396/992 - loss 1.97014077 - time (sec): 200.50 - samples/sec: 325.47 - lr: 0.000064 - momentum: 0.000000
2023-10-12 22:09:25,877 epoch 1 - iter 495/992 - loss 1.71310585 - time (sec): 253.18 - samples/sec: 324.30 - lr: 0.000080 - momentum: 0.000000
2023-10-12 22:10:17,664 epoch 1 - iter 594/992 - loss 1.49579279 - time (sec): 304.97 - samples/sec: 322.49 - lr: 0.000096 - momentum: 0.000000
2023-10-12 22:11:05,041 epoch 1 - iter 693/992 - loss 1.33675130 - time (sec): 352.35 - samples/sec: 323.61 - lr: 0.000112 - momentum: 0.000000
2023-10-12 22:11:53,190 epoch 1 - iter 792/992 - loss 1.20207715 - time (sec): 400.49 - samples/sec: 325.89 - lr: 0.000128 - momentum: 0.000000
2023-10-12 22:12:41,655 epoch 1 - iter 891/992 - loss 1.09115549 - time (sec): 448.96 - samples/sec: 328.46 - lr: 0.000144 - momentum: 0.000000
2023-10-12 22:13:30,799 epoch 1 - iter 990/992 - loss 1.00336945 - time (sec): 498.10 - samples/sec: 328.65 - lr: 0.000160 - momentum: 0.000000
2023-10-12 22:13:31,769 ----------------------------------------------------------------------------------------------------
2023-10-12 22:13:31,769 EPOCH 1 done: loss 1.0021 - lr: 0.000160
2023-10-12 22:13:55,906 DEV : loss 0.16479156911373138 - f1-score (micro avg) 0.4881
2023-10-12 22:13:55,944 saving best model
2023-10-12 22:13:56,845 ----------------------------------------------------------------------------------------------------
2023-10-12 22:14:45,524 epoch 2 - iter 99/992 - loss 0.16749492 - time (sec): 48.68 - samples/sec: 345.97 - lr: 0.000158 - momentum: 0.000000
2023-10-12 22:15:33,867 epoch 2 - iter 198/992 - loss 0.16513327 - time (sec): 97.02 - samples/sec: 340.72 - lr: 0.000156 - momentum: 0.000000
2023-10-12 22:16:21,989 epoch 2 - iter 297/992 - loss 0.16266570 - time (sec): 145.14 - samples/sec: 340.03 - lr: 0.000155 - momentum: 0.000000
2023-10-12 22:17:09,262 epoch 2 - iter 396/992 - loss 0.15468695 - time (sec): 192.42 - samples/sec: 343.81 - lr: 0.000153 - momentum: 0.000000
2023-10-12 22:17:57,533 epoch 2 - iter 495/992 - loss 0.15190678 - time (sec): 240.69 - samples/sec: 341.26 - lr: 0.000151 - momentum: 0.000000
2023-10-12 22:18:44,949 epoch 2 - iter 594/992 - loss 0.14770062 - time (sec): 288.10 - samples/sec: 343.88 - lr: 0.000149 - momentum: 0.000000
2023-10-12 22:19:32,810 epoch 2 - iter 693/992 - loss 0.14365046 - time (sec): 335.96 - samples/sec: 344.85 - lr: 0.000148 - momentum: 0.000000
2023-10-12 22:20:20,690 epoch 2 - iter 792/992 - loss 0.14111146 - time (sec): 383.84 - samples/sec: 342.67 - lr: 0.000146 - momentum: 0.000000
2023-10-12 22:21:07,617 epoch 2 - iter 891/992 - loss 0.13769670 - time (sec): 430.77 - samples/sec: 342.53 - lr: 0.000144 - momentum: 0.000000
2023-10-12 22:21:54,279 epoch 2 - iter 990/992 - loss 0.13551472 - time (sec): 477.43 - samples/sec: 342.97 - lr: 0.000142 - momentum: 0.000000
2023-10-12 22:21:55,197 ----------------------------------------------------------------------------------------------------
2023-10-12 22:21:55,197 EPOCH 2 done: loss 0.1354 - lr: 0.000142
2023-10-12 22:22:20,768 DEV : loss 0.08708374202251434 - f1-score (micro avg) 0.7258
2023-10-12 22:22:20,817 saving best model
2023-10-12 22:22:23,472 ----------------------------------------------------------------------------------------------------
2023-10-12 22:23:12,687 epoch 3 - iter 99/992 - loss 0.07926069 - time (sec): 49.21 - samples/sec: 348.70 - lr: 0.000140 - momentum: 0.000000
2023-10-12 22:24:00,548 epoch 3 - iter 198/992 - loss 0.08319851 - time (sec): 97.07 - samples/sec: 341.76 - lr: 0.000139 - momentum: 0.000000
2023-10-12 22:24:48,441 epoch 3 - iter 297/992 - loss 0.07989106 - time (sec): 144.96 - samples/sec: 337.38 - lr: 0.000137 - momentum: 0.000000
2023-10-12 22:25:35,848 epoch 3 - iter 396/992 - loss 0.07688124 - time (sec): 192.37 - samples/sec: 336.27 - lr: 0.000135 - momentum: 0.000000
2023-10-12 22:26:24,723 epoch 3 - iter 495/992 - loss 0.07665381 - time (sec): 241.24 - samples/sec: 337.07 - lr: 0.000133 - momentum: 0.000000
2023-10-12 22:27:13,187 epoch 3 - iter 594/992 - loss 0.07779634 - time (sec): 289.71 - samples/sec: 336.66 - lr: 0.000132 - momentum: 0.000000
2023-10-12 22:28:07,961 epoch 3 - iter 693/992 - loss 0.07743573 - time (sec): 344.48 - samples/sec: 329.27 - lr: 0.000130 - momentum: 0.000000
2023-10-12 22:28:59,877 epoch 3 - iter 792/992 - loss 0.07691112 - time (sec): 396.40 - samples/sec: 327.63 - lr: 0.000128 - momentum: 0.000000
2023-10-12 22:29:48,612 epoch 3 - iter 891/992 - loss 0.07602786 - time (sec): 445.13 - samples/sec: 329.00 - lr: 0.000126 - momentum: 0.000000
2023-10-12 22:30:37,961 epoch 3 - iter 990/992 - loss 0.07645169 - time (sec): 494.48 - samples/sec: 331.20 - lr: 0.000125 - momentum: 0.000000
2023-10-12 22:30:38,911 ----------------------------------------------------------------------------------------------------
2023-10-12 22:30:38,912 EPOCH 3 done: loss 0.0765 - lr: 0.000125
2023-10-12 22:31:03,320 DEV : loss 0.08860880136489868 - f1-score (micro avg) 0.7359
2023-10-12 22:31:03,362 saving best model
2023-10-12 22:31:05,994 ----------------------------------------------------------------------------------------------------
2023-10-12 22:31:55,030 epoch 4 - iter 99/992 - loss 0.05266051 - time (sec): 49.03 - samples/sec: 355.22 - lr: 0.000123 - momentum: 0.000000
2023-10-12 22:32:41,837 epoch 4 - iter 198/992 - loss 0.05108763 - time (sec): 95.84 - samples/sec: 346.53 - lr: 0.000121 - momentum: 0.000000
2023-10-12 22:33:29,635 epoch 4 - iter 297/992 - loss 0.05139942 - time (sec): 143.63 - samples/sec: 344.00 - lr: 0.000119 - momentum: 0.000000
2023-10-12 22:34:17,373 epoch 4 - iter 396/992 - loss 0.04998338 - time (sec): 191.37 - samples/sec: 343.45 - lr: 0.000117 - momentum: 0.000000
2023-10-12 22:35:05,184 epoch 4 - iter 495/992 - loss 0.05009155 - time (sec): 239.18 - samples/sec: 343.37 - lr: 0.000116 - momentum: 0.000000
2023-10-12 22:35:52,496 epoch 4 - iter 594/992 - loss 0.05093715 - time (sec): 286.49 - samples/sec: 342.57 - lr: 0.000114 - momentum: 0.000000
2023-10-12 22:36:39,570 epoch 4 - iter 693/992 - loss 0.05159004 - time (sec): 333.57 - samples/sec: 340.90 - lr: 0.000112 - momentum: 0.000000
2023-10-12 22:37:32,868 epoch 4 - iter 792/992 - loss 0.05202951 - time (sec): 386.87 - samples/sec: 337.64 - lr: 0.000110 - momentum: 0.000000
2023-10-12 22:38:26,673 epoch 4 - iter 891/992 - loss 0.05194402 - time (sec): 440.67 - samples/sec: 333.04 - lr: 0.000109 - momentum: 0.000000
2023-10-12 22:39:21,320 epoch 4 - iter 990/992 - loss 0.05169748 - time (sec): 495.32 - samples/sec: 330.46 - lr: 0.000107 - momentum: 0.000000
2023-10-12 22:39:22,402 ----------------------------------------------------------------------------------------------------
2023-10-12 22:39:22,402 EPOCH 4 done: loss 0.0519 - lr: 0.000107
2023-10-12 22:39:49,852 DEV : loss 0.10738497972488403 - f1-score (micro avg) 0.7452
2023-10-12 22:39:49,898 saving best model
2023-10-12 22:39:52,641 ----------------------------------------------------------------------------------------------------
2023-10-12 22:40:45,393 epoch 5 - iter 99/992 - loss 0.03744860 - time (sec): 52.75 - samples/sec: 294.32 - lr: 0.000105 - momentum: 0.000000
2023-10-12 22:41:36,500 epoch 5 - iter 198/992 - loss 0.03310035 - time (sec): 103.85 - samples/sec: 307.26 - lr: 0.000103 - momentum: 0.000000
2023-10-12 22:42:23,692 epoch 5 - iter 297/992 - loss 0.03625772 - time (sec): 151.05 - samples/sec: 320.74 - lr: 0.000101 - momentum: 0.000000
2023-10-12 22:43:11,285 epoch 5 - iter 396/992 - loss 0.03810834 - time (sec): 198.64 - samples/sec: 320.58 - lr: 0.000100 - momentum: 0.000000
2023-10-12 22:43:59,504 epoch 5 - iter 495/992 - loss 0.03758287 - time (sec): 246.86 - samples/sec: 321.20 - lr: 0.000098 - momentum: 0.000000
2023-10-12 22:44:48,523 epoch 5 - iter 594/992 - loss 0.03713120 - time (sec): 295.88 - samples/sec: 324.15 - lr: 0.000096 - momentum: 0.000000
2023-10-12 22:45:37,246 epoch 5 - iter 693/992 - loss 0.03650804 - time (sec): 344.60 - samples/sec: 330.08 - lr: 0.000094 - momentum: 0.000000
2023-10-12 22:46:26,597 epoch 5 - iter 792/992 - loss 0.03660515 - time (sec): 393.95 - samples/sec: 330.86 - lr: 0.000093 - momentum: 0.000000
2023-10-12 22:47:16,389 epoch 5 - iter 891/992 - loss 0.03830079 - time (sec): 443.74 - samples/sec: 330.21 - lr: 0.000091 - momentum: 0.000000
2023-10-12 22:48:05,714 epoch 5 - iter 990/992 - loss 0.03848700 - time (sec): 493.07 - samples/sec: 331.93 - lr: 0.000089 - momentum: 0.000000
2023-10-12 22:48:06,780 ----------------------------------------------------------------------------------------------------
2023-10-12 22:48:06,781 EPOCH 5 done: loss 0.0385 - lr: 0.000089
2023-10-12 22:48:32,775 DEV : loss 0.12247787415981293 - f1-score (micro avg) 0.765
2023-10-12 22:48:32,824 saving best model
2023-10-12 22:48:35,498 ----------------------------------------------------------------------------------------------------
2023-10-12 22:49:23,615 epoch 6 - iter 99/992 - loss 0.03956974 - time (sec): 48.10 - samples/sec: 337.82 - lr: 0.000087 - momentum: 0.000000
2023-10-12 22:50:13,289 epoch 6 - iter 198/992 - loss 0.03197861 - time (sec): 97.78 - samples/sec: 329.46 - lr: 0.000085 - momentum: 0.000000
2023-10-12 22:51:03,269 epoch 6 - iter 297/992 - loss 0.03209333 - time (sec): 147.76 - samples/sec: 330.30 - lr: 0.000084 - momentum: 0.000000
2023-10-12 22:51:57,716 epoch 6 - iter 396/992 - loss 0.03241736 - time (sec): 202.21 - samples/sec: 323.77 - lr: 0.000082 - momentum: 0.000000
2023-10-12 22:52:50,732 epoch 6 - iter 495/992 - loss 0.03193044 - time (sec): 255.22 - samples/sec: 321.77 - lr: 0.000080 - momentum: 0.000000
2023-10-12 22:53:41,437 epoch 6 - iter 594/992 - loss 0.03185070 - time (sec): 305.93 - samples/sec: 320.29 - lr: 0.000078 - momentum: 0.000000
2023-10-12 22:54:31,597 epoch 6 - iter 693/992 - loss 0.03141795 - time (sec): 356.09 - samples/sec: 323.41 - lr: 0.000077 - momentum: 0.000000
2023-10-12 22:55:20,841 epoch 6 - iter 792/992 - loss 0.02995177 - time (sec): 405.33 - samples/sec: 324.97 - lr: 0.000075 - momentum: 0.000000
2023-10-12 22:56:13,061 epoch 6 - iter 891/992 - loss 0.03069202 - time (sec): 457.55 - samples/sec: 324.19 - lr: 0.000073 - momentum: 0.000000
2023-10-12 22:57:04,066 epoch 6 - iter 990/992 - loss 0.03019184 - time (sec): 508.56 - samples/sec: 322.03 - lr: 0.000071 - momentum: 0.000000
2023-10-12 22:57:05,223 ----------------------------------------------------------------------------------------------------
2023-10-12 22:57:05,223 EPOCH 6 done: loss 0.0302 - lr: 0.000071
2023-10-12 22:57:31,123 DEV : loss 0.15935450792312622 - f1-score (micro avg) 0.7639
2023-10-12 22:57:31,171 ----------------------------------------------------------------------------------------------------
2023-10-12 22:58:24,920 epoch 7 - iter 99/992 - loss 0.01651467 - time (sec): 53.75 - samples/sec: 300.62 - lr: 0.000069 - momentum: 0.000000
2023-10-12 22:59:19,604 epoch 7 - iter 198/992 - loss 0.01743649 - time (sec): 108.43 - samples/sec: 303.71 - lr: 0.000068 - momentum: 0.000000
2023-10-12 23:00:15,752 epoch 7 - iter 297/992 - loss 0.01956424 - time (sec): 164.58 - samples/sec: 297.51 - lr: 0.000066 - momentum: 0.000000
2023-10-12 23:01:10,441 epoch 7 - iter 396/992 - loss 0.02209278 - time (sec): 219.27 - samples/sec: 298.43 - lr: 0.000064 - momentum: 0.000000
2023-10-12 23:02:00,796 epoch 7 - iter 495/992 - loss 0.02209311 - time (sec): 269.62 - samples/sec: 302.36 - lr: 0.000062 - momentum: 0.000000
2023-10-12 23:02:53,389 epoch 7 - iter 594/992 - loss 0.02242793 - time (sec): 322.22 - samples/sec: 305.28 - lr: 0.000061 - momentum: 0.000000
2023-10-12 23:03:41,135 epoch 7 - iter 693/992 - loss 0.02205529 - time (sec): 369.96 - samples/sec: 310.06 - lr: 0.000059 - momentum: 0.000000
2023-10-12 23:04:29,920 epoch 7 - iter 792/992 - loss 0.02172005 - time (sec): 418.75 - samples/sec: 313.47 - lr: 0.000057 - momentum: 0.000000
2023-10-12 23:05:18,911 epoch 7 - iter 891/992 - loss 0.02166572 - time (sec): 467.74 - samples/sec: 314.91 - lr: 0.000055 - momentum: 0.000000
2023-10-12 23:06:07,894 epoch 7 - iter 990/992 - loss 0.02245169 - time (sec): 516.72 - samples/sec: 316.80 - lr: 0.000053 - momentum: 0.000000
2023-10-12 23:06:08,942 ----------------------------------------------------------------------------------------------------
2023-10-12 23:06:08,943 EPOCH 7 done: loss 0.0224 - lr: 0.000053
2023-10-12 23:06:35,694 DEV : loss 0.1707056164741516 - f1-score (micro avg) 0.7589
2023-10-12 23:06:35,736 ----------------------------------------------------------------------------------------------------
2023-10-12 23:07:27,949 epoch 8 - iter 99/992 - loss 0.01962247 - time (sec): 52.21 - samples/sec: 316.83 - lr: 0.000052 - momentum: 0.000000
2023-10-12 23:08:19,108 epoch 8 - iter 198/992 - loss 0.01859288 - time (sec): 103.37 - samples/sec: 308.87 - lr: 0.000050 - momentum: 0.000000
2023-10-12 23:09:10,193 epoch 8 - iter 297/992 - loss 0.01679401 - time (sec): 154.46 - samples/sec: 315.72 - lr: 0.000048 - momentum: 0.000000
2023-10-12 23:09:59,592 epoch 8 - iter 396/992 - loss 0.01831256 - time (sec): 203.85 - samples/sec: 321.87 - lr: 0.000046 - momentum: 0.000000
2023-10-12 23:10:50,216 epoch 8 - iter 495/992 - loss 0.01858703 - time (sec): 254.48 - samples/sec: 322.29 - lr: 0.000045 - momentum: 0.000000
2023-10-12 23:11:39,881 epoch 8 - iter 594/992 - loss 0.01869986 - time (sec): 304.14 - samples/sec: 322.59 - lr: 0.000043 - momentum: 0.000000
2023-10-12 23:12:32,934 epoch 8 - iter 693/992 - loss 0.01788096 - time (sec): 357.20 - samples/sec: 319.54 - lr: 0.000041 - momentum: 0.000000
2023-10-12 23:13:25,150 epoch 8 - iter 792/992 - loss 0.01720215 - time (sec): 409.41 - samples/sec: 320.00 - lr: 0.000039 - momentum: 0.000000
2023-10-12 23:14:15,414 epoch 8 - iter 891/992 - loss 0.01721473 - time (sec): 459.68 - samples/sec: 319.64 - lr: 0.000037 - momentum: 0.000000
2023-10-12 23:15:07,027 epoch 8 - iter 990/992 - loss 0.01805657 - time (sec): 511.29 - samples/sec: 320.02 - lr: 0.000036 - momentum: 0.000000
2023-10-12 23:15:07,988 ----------------------------------------------------------------------------------------------------
2023-10-12 23:15:07,988 EPOCH 8 done: loss 0.0181 - lr: 0.000036
2023-10-12 23:15:33,788 DEV : loss 0.1880449652671814 - f1-score (micro avg) 0.7603
2023-10-12 23:15:33,833 ----------------------------------------------------------------------------------------------------
2023-10-12 23:16:24,732 epoch 9 - iter 99/992 - loss 0.01209024 - time (sec): 50.90 - samples/sec: 303.76 - lr: 0.000034 - momentum: 0.000000
2023-10-12 23:17:12,430 epoch 9 - iter 198/992 - loss 0.01145554 - time (sec): 98.59 - samples/sec: 313.15 - lr: 0.000032 - momentum: 0.000000
2023-10-12 23:18:00,671 epoch 9 - iter 297/992 - loss 0.01364701 - time (sec): 146.83 - samples/sec: 321.91 - lr: 0.000030 - momentum: 0.000000
2023-10-12 23:18:51,788 epoch 9 - iter 396/992 - loss 0.01407126 - time (sec): 197.95 - samples/sec: 324.82 - lr: 0.000029 - momentum: 0.000000
2023-10-12 23:19:45,370 epoch 9 - iter 495/992 - loss 0.01342117 - time (sec): 251.53 - samples/sec: 321.54 - lr: 0.000027 - momentum: 0.000000
2023-10-12 23:20:34,936 epoch 9 - iter 594/992 - loss 0.01418408 - time (sec): 301.10 - samples/sec: 327.44 - lr: 0.000025 - momentum: 0.000000
2023-10-12 23:21:24,448 epoch 9 - iter 693/992 - loss 0.01478215 - time (sec): 350.61 - samples/sec: 329.28 - lr: 0.000023 - momentum: 0.000000
2023-10-12 23:22:12,712 epoch 9 - iter 792/992 - loss 0.01519139 - time (sec): 398.88 - samples/sec: 331.22 - lr: 0.000022 - momentum: 0.000000
2023-10-12 23:23:00,954 epoch 9 - iter 891/992 - loss 0.01476678 - time (sec): 447.12 - samples/sec: 332.32 - lr: 0.000020 - momentum: 0.000000
2023-10-12 23:23:47,907 epoch 9 - iter 990/992 - loss 0.01486460 - time (sec): 494.07 - samples/sec: 331.20 - lr: 0.000018 - momentum: 0.000000
2023-10-12 23:23:48,866 ----------------------------------------------------------------------------------------------------
2023-10-12 23:23:48,866 EPOCH 9 done: loss 0.0148 - lr: 0.000018
2023-10-12 23:24:14,572 DEV : loss 0.18750455975532532 - f1-score (micro avg) 0.7678
2023-10-12 23:24:14,613 saving best model
2023-10-12 23:24:17,232 ----------------------------------------------------------------------------------------------------
2023-10-12 23:25:06,168 epoch 10 - iter 99/992 - loss 0.00876544 - time (sec): 48.93 - samples/sec: 337.18 - lr: 0.000016 - momentum: 0.000000
2023-10-12 23:25:52,988 epoch 10 - iter 198/992 - loss 0.00939953 - time (sec): 95.75 - samples/sec: 347.04 - lr: 0.000014 - momentum: 0.000000
2023-10-12 23:26:40,053 epoch 10 - iter 297/992 - loss 0.01057757 - time (sec): 142.82 - samples/sec: 350.83 - lr: 0.000013 - momentum: 0.000000
2023-10-12 23:27:28,303 epoch 10 - iter 396/992 - loss 0.01038951 - time (sec): 191.07 - samples/sec: 345.97 - lr: 0.000011 - momentum: 0.000000
2023-10-12 23:28:16,166 epoch 10 - iter 495/992 - loss 0.01021717 - time (sec): 238.93 - samples/sec: 344.94 - lr: 0.000009 - momentum: 0.000000
2023-10-12 23:29:08,104 epoch 10 - iter 594/992 - loss 0.01086773 - time (sec): 290.87 - samples/sec: 337.98 - lr: 0.000007 - momentum: 0.000000
2023-10-12 23:29:57,166 epoch 10 - iter 693/992 - loss 0.01145449 - time (sec): 339.93 - samples/sec: 338.36 - lr: 0.000006 - momentum: 0.000000
2023-10-12 23:30:45,548 epoch 10 - iter 792/992 - loss 0.01111800 - time (sec): 388.31 - samples/sec: 336.93 - lr: 0.000004 - momentum: 0.000000
2023-10-12 23:31:36,159 epoch 10 - iter 891/992 - loss 0.01070743 - time (sec): 438.92 - samples/sec: 336.74 - lr: 0.000002 - momentum: 0.000000
2023-10-12 23:32:29,104 epoch 10 - iter 990/992 - loss 0.01101518 - time (sec): 491.87 - samples/sec: 332.77 - lr: 0.000000 - momentum: 0.000000
2023-10-12 23:32:30,054 ----------------------------------------------------------------------------------------------------
2023-10-12 23:32:30,054 EPOCH 10 done: loss 0.0111 - lr: 0.000000
2023-10-12 23:32:55,364 DEV : loss 0.19627229869365692 - f1-score (micro avg) 0.7659
2023-10-12 23:32:56,332 ----------------------------------------------------------------------------------------------------
2023-10-12 23:32:56,334 Loading model from best epoch ...
2023-10-12 23:32:59,958 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-12 23:33:23,960
Results:
- F-score (micro) 0.7784
- F-score (macro) 0.6876
- Accuracy 0.6589
By class:
              precision    recall  f1-score   support

         LOC     0.8260    0.8550    0.8402       655
         PER     0.7236    0.7982    0.7591       223
         ORG     0.5094    0.4252    0.4635       127

   micro avg     0.7689    0.7881    0.7784      1005
   macro avg     0.6863    0.6928    0.6876      1005
weighted avg     0.7632    0.7881    0.7746      1005
2023-10-12 23:33:23,960 ----------------------------------------------------------------------------------------------------
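Note on the averages: the summary rows in the final report can be re-derived from the per-class rows. The sketch below is a sanity check written for this note, not part of the training run; it uses only the rounded numbers printed above, so it recomputes F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of class F1 scores, and weighted F1 as the support-weighted mean. The variable names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "LOC": (0.8260, 0.8550, 0.8402, 655),
    "PER": (0.7236, 0.7982, 0.7591, 223),
    "ORG": (0.5094, 0.4252, 0.4635, 127),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted average: each class contributes in proportion to its support.
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Micro F1 from the micro-averaged precision and recall in the report.
micro_f1 = f1(0.7689, 0.7881)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The gap between micro F1 (0.7784) and macro F1 (0.6876) comes from ORG, which has the lowest F1 (0.4635) but only 127 of the 1005 supporting entities.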