2023-10-13 08:56:28,382 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,384 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 08:56:28,384 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,385 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 08:56:28,385 ----------------------------------------------------------------------------------------------------
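The corpus and tagger printed above can be approximated with a short Flair script. This is a minimal reconstruction sketch, not the original training code: the NER_ICDAR_EUROPEANA loader and its language argument are inferred from the dataset path above, and the checkpoint name, pooling, and layer settings are taken from the model training base path logged further down.

# Minimal reconstruction sketch (assumptions noted inline), not the original script.
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# 7936 train / 992 dev / 992 test sentences, French split (argument name assumed)
corpus = NER_ICDAR_EUROPEANA(language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")  # 13 BIOES tags for PER/LOC/ORG

# hmByT5 checkpoint, last layer only, first-subtoken pooling (from the base path below)
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Linear head without CRF or RNN, matching the LockedDropout + Linear(1472 -> 13) above
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)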
2023-10-13 08:56:28,385 Train: 7936 sentences
2023-10-13 08:56:28,385 (train_with_dev=False, train_with_test=False)
2023-10-13 08:56:28,385 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,385 Training Params:
2023-10-13 08:56:28,385 - learning_rate: "0.00015"
2023-10-13 08:56:28,385 - mini_batch_size: "8"
2023-10-13 08:56:28,385 - max_epochs: "10"
2023-10-13 08:56:28,385 - shuffle: "True"
2023-10-13 08:56:28,385 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 Plugins:
2023-10-13 08:56:28,386 - TensorboardLogger
2023-10-13 08:56:28,386 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
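These parameters map onto Flair's fine-tuning entry point. A minimal sketch, assuming the tagger and corpus objects from the sketch above; ModelTrainer.fine_tune sets up a linear warmup schedule by default, which appears to correspond to the LinearScheduler plugin with warmup_fraction 0.1 logged here (the TensorboardLogger plugin is left out):

# Minimal sketch of the fine-tuning call matching the logged parameters.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00015,  # matches "learning_rate" above
    mini_batch_size=8,      # matches "mini_batch_size" above
    max_epochs=10,          # matches "max_epochs" above
)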
2023-10-13 08:56:28,386 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:56:28,386 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 Computation:
2023-10-13 08:56:28,386 - compute on device: cuda:0
2023-10-13 08:56:28,386 - embedding storage: none
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
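Flair places the model and batches on a globally configured device; a two-line sketch of pinning it to the GPU reported above:

import torch
import flair

flair.device = torch.device("cuda:0")  # matches "compute on device: cuda:0"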
2023-10-13 08:56:28,386 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,387 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 08:57:20,355 epoch 1 - iter 99/992 - loss 2.54696046 - time (sec): 51.97 - samples/sec: 340.20 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:58:09,788 epoch 1 - iter 198/992 - loss 2.46415056 - time (sec): 101.40 - samples/sec: 329.09 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:58:59,863 epoch 1 - iter 297/992 - loss 2.24201712 - time (sec): 151.47 - samples/sec: 334.02 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:59:50,195 epoch 1 - iter 396/992 - loss 2.02428447 - time (sec): 201.81 - samples/sec: 325.11 - lr: 0.000060 - momentum: 0.000000
2023-10-13 09:00:44,266 epoch 1 - iter 495/992 - loss 1.78340987 - time (sec): 255.88 - samples/sec: 318.86 - lr: 0.000075 - momentum: 0.000000
2023-10-13 09:01:39,162 epoch 1 - iter 594/992 - loss 1.56772073 - time (sec): 310.77 - samples/sec: 313.87 - lr: 0.000090 - momentum: 0.000000
2023-10-13 09:02:32,798 epoch 1 - iter 693/992 - loss 1.39606142 - time (sec): 364.41 - samples/sec: 313.73 - lr: 0.000105 - momentum: 0.000000
2023-10-13 09:03:22,875 epoch 1 - iter 792/992 - loss 1.25709517 - time (sec): 414.49 - samples/sec: 314.37 - lr: 0.000120 - momentum: 0.000000
2023-10-13 09:04:16,323 epoch 1 - iter 891/992 - loss 1.13234168 - time (sec): 467.93 - samples/sec: 316.29 - lr: 0.000135 - momentum: 0.000000
2023-10-13 09:05:06,784 epoch 1 - iter 990/992 - loss 1.04362881 - time (sec): 518.40 - samples/sec: 315.88 - lr: 0.000150 - momentum: 0.000000
2023-10-13 09:05:07,750 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:07,750 EPOCH 1 done: loss 1.0424 - lr: 0.000150
2023-10-13 09:05:34,003 DEV : loss 0.16410574316978455 - f1-score (micro avg) 0.6208
2023-10-13 09:05:34,054 saving best model
2023-10-13 09:05:35,073 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:26,766 epoch 2 - iter 99/992 - loss 0.20154135 - time (sec): 51.69 - samples/sec: 320.17 - lr: 0.000148 - momentum: 0.000000
2023-10-13 09:07:18,258 epoch 2 - iter 198/992 - loss 0.17954279 - time (sec): 103.18 - samples/sec: 322.07 - lr: 0.000147 - momentum: 0.000000
2023-10-13 09:08:11,018 epoch 2 - iter 297/992 - loss 0.16837201 - time (sec): 155.94 - samples/sec: 321.91 - lr: 0.000145 - momentum: 0.000000
2023-10-13 09:09:01,780 epoch 2 - iter 396/992 - loss 0.16409268 - time (sec): 206.70 - samples/sec: 318.47 - lr: 0.000143 - momentum: 0.000000
2023-10-13 09:09:56,347 epoch 2 - iter 495/992 - loss 0.15724817 - time (sec): 261.27 - samples/sec: 315.00 - lr: 0.000142 - momentum: 0.000000
2023-10-13 09:10:46,563 epoch 2 - iter 594/992 - loss 0.15241512 - time (sec): 311.49 - samples/sec: 316.46 - lr: 0.000140 - momentum: 0.000000
2023-10-13 09:11:38,477 epoch 2 - iter 693/992 - loss 0.14811317 - time (sec): 363.40 - samples/sec: 316.06 - lr: 0.000138 - momentum: 0.000000
2023-10-13 09:12:30,098 epoch 2 - iter 792/992 - loss 0.14356440 - time (sec): 415.02 - samples/sec: 315.19 - lr: 0.000137 - momentum: 0.000000
2023-10-13 09:13:19,473 epoch 2 - iter 891/992 - loss 0.14085816 - time (sec): 464.40 - samples/sec: 314.87 - lr: 0.000135 - momentum: 0.000000
2023-10-13 09:14:10,344 epoch 2 - iter 990/992 - loss 0.13664423 - time (sec): 515.27 - samples/sec: 317.74 - lr: 0.000133 - momentum: 0.000000
2023-10-13 09:14:11,319 ----------------------------------------------------------------------------------------------------
2023-10-13 09:14:11,320 EPOCH 2 done: loss 0.1365 - lr: 0.000133
2023-10-13 09:14:37,108 DEV : loss 0.0895773395895958 - f1-score (micro avg) 0.7289
2023-10-13 09:14:37,159 saving best model
2023-10-13 09:14:39,893 ----------------------------------------------------------------------------------------------------
2023-10-13 09:15:31,008 epoch 3 - iter 99/992 - loss 0.08180511 - time (sec): 51.11 - samples/sec: 315.07 - lr: 0.000132 - momentum: 0.000000
2023-10-13 09:16:21,529 epoch 3 - iter 198/992 - loss 0.08521021 - time (sec): 101.63 - samples/sec: 318.92 - lr: 0.000130 - momentum: 0.000000
2023-10-13 09:17:12,339 epoch 3 - iter 297/992 - loss 0.08522855 - time (sec): 152.44 - samples/sec: 318.21 - lr: 0.000128 - momentum: 0.000000
2023-10-13 09:18:02,275 epoch 3 - iter 396/992 - loss 0.08470432 - time (sec): 202.38 - samples/sec: 320.72 - lr: 0.000127 - momentum: 0.000000
2023-10-13 09:18:52,304 epoch 3 - iter 495/992 - loss 0.08346708 - time (sec): 252.41 - samples/sec: 320.75 - lr: 0.000125 - momentum: 0.000000
2023-10-13 09:19:48,106 epoch 3 - iter 594/992 - loss 0.08109664 - time (sec): 308.21 - samples/sec: 317.13 - lr: 0.000123 - momentum: 0.000000
2023-10-13 09:20:39,889 epoch 3 - iter 693/992 - loss 0.08032019 - time (sec): 359.99 - samples/sec: 317.35 - lr: 0.000122 - momentum: 0.000000
2023-10-13 09:21:30,881 epoch 3 - iter 792/992 - loss 0.07847918 - time (sec): 410.98 - samples/sec: 317.73 - lr: 0.000120 - momentum: 0.000000
2023-10-13 09:22:23,159 epoch 3 - iter 891/992 - loss 0.07639249 - time (sec): 463.26 - samples/sec: 317.68 - lr: 0.000118 - momentum: 0.000000
2023-10-13 09:23:13,066 epoch 3 - iter 990/992 - loss 0.07671320 - time (sec): 513.17 - samples/sec: 318.90 - lr: 0.000117 - momentum: 0.000000
2023-10-13 09:23:14,036 ----------------------------------------------------------------------------------------------------
2023-10-13 09:23:14,037 EPOCH 3 done: loss 0.0766 - lr: 0.000117
2023-10-13 09:23:41,303 DEV : loss 0.0838015154004097 - f1-score (micro avg) 0.7484
2023-10-13 09:23:41,354 saving best model
2023-10-13 09:23:44,052 ----------------------------------------------------------------------------------------------------
2023-10-13 09:24:33,521 epoch 4 - iter 99/992 - loss 0.05700303 - time (sec): 49.47 - samples/sec: 333.14 - lr: 0.000115 - momentum: 0.000000
2023-10-13 09:25:24,566 epoch 4 - iter 198/992 - loss 0.05545520 - time (sec): 100.51 - samples/sec: 325.25 - lr: 0.000113 - momentum: 0.000000
2023-10-13 09:26:15,304 epoch 4 - iter 297/992 - loss 0.05114430 - time (sec): 151.25 - samples/sec: 323.41 - lr: 0.000112 - momentum: 0.000000
2023-10-13 09:27:12,162 epoch 4 - iter 396/992 - loss 0.05100050 - time (sec): 208.11 - samples/sec: 313.70 - lr: 0.000110 - momentum: 0.000000
2023-10-13 09:28:06,465 epoch 4 - iter 495/992 - loss 0.05124164 - time (sec): 262.41 - samples/sec: 314.02 - lr: 0.000108 - momentum: 0.000000
2023-10-13 09:28:58,097 epoch 4 - iter 594/992 - loss 0.05017759 - time (sec): 314.04 - samples/sec: 315.81 - lr: 0.000107 - momentum: 0.000000
2023-10-13 09:29:48,568 epoch 4 - iter 693/992 - loss 0.04993850 - time (sec): 364.51 - samples/sec: 315.32 - lr: 0.000105 - momentum: 0.000000
2023-10-13 09:30:39,746 epoch 4 - iter 792/992 - loss 0.05082169 - time (sec): 415.69 - samples/sec: 315.46 - lr: 0.000103 - momentum: 0.000000
2023-10-13 09:31:30,560 epoch 4 - iter 891/992 - loss 0.05205071 - time (sec): 466.50 - samples/sec: 316.03 - lr: 0.000102 - momentum: 0.000000
2023-10-13 09:32:22,698 epoch 4 - iter 990/992 - loss 0.05310253 - time (sec): 518.64 - samples/sec: 315.62 - lr: 0.000100 - momentum: 0.000000
2023-10-13 09:32:23,868 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:23,868 EPOCH 4 done: loss 0.0531 - lr: 0.000100
2023-10-13 09:32:51,988 DEV : loss 0.10059013962745667 - f1-score (micro avg) 0.7736
2023-10-13 09:32:52,043 saving best model
2023-10-13 09:32:54,775 ----------------------------------------------------------------------------------------------------
2023-10-13 09:33:48,250 epoch 5 - iter 99/992 - loss 0.03161792 - time (sec): 53.47 - samples/sec: 314.00 - lr: 0.000098 - momentum: 0.000000
2023-10-13 09:34:42,408 epoch 5 - iter 198/992 - loss 0.03331366 - time (sec): 107.63 - samples/sec: 309.72 - lr: 0.000097 - momentum: 0.000000
2023-10-13 09:35:32,419 epoch 5 - iter 297/992 - loss 0.04028897 - time (sec): 157.64 - samples/sec: 317.47 - lr: 0.000095 - momentum: 0.000000
2023-10-13 09:36:20,792 epoch 5 - iter 396/992 - loss 0.03982061 - time (sec): 206.01 - samples/sec: 322.52 - lr: 0.000093 - momentum: 0.000000
2023-10-13 09:37:11,255 epoch 5 - iter 495/992 - loss 0.03948044 - time (sec): 256.48 - samples/sec: 321.65 - lr: 0.000092 - momentum: 0.000000
2023-10-13 09:38:01,748 epoch 5 - iter 594/992 - loss 0.04077289 - time (sec): 306.97 - samples/sec: 320.15 - lr: 0.000090 - momentum: 0.000000
2023-10-13 09:38:51,355 epoch 5 - iter 693/992 - loss 0.04061989 - time (sec): 356.58 - samples/sec: 320.32 - lr: 0.000088 - momentum: 0.000000
2023-10-13 09:39:41,209 epoch 5 - iter 792/992 - loss 0.04020643 - time (sec): 406.43 - samples/sec: 322.94 - lr: 0.000087 - momentum: 0.000000
2023-10-13 09:40:33,064 epoch 5 - iter 891/992 - loss 0.03991271 - time (sec): 458.29 - samples/sec: 322.18 - lr: 0.000085 - momentum: 0.000000
2023-10-13 09:41:20,751 epoch 5 - iter 990/992 - loss 0.03965766 - time (sec): 505.97 - samples/sec: 323.69 - lr: 0.000083 - momentum: 0.000000
2023-10-13 09:41:21,708 ----------------------------------------------------------------------------------------------------
2023-10-13 09:41:21,709 EPOCH 5 done: loss 0.0396 - lr: 0.000083
2023-10-13 09:41:48,133 DEV : loss 0.12499061226844788 - f1-score (micro avg) 0.7649
2023-10-13 09:41:48,176 ----------------------------------------------------------------------------------------------------
2023-10-13 09:42:37,843 epoch 6 - iter 99/992 - loss 0.02652638 - time (sec): 49.66 - samples/sec: 344.01 - lr: 0.000082 - momentum: 0.000000
2023-10-13 09:43:31,747 epoch 6 - iter 198/992 - loss 0.02574886 - time (sec): 103.57 - samples/sec: 323.74 - lr: 0.000080 - momentum: 0.000000
2023-10-13 09:44:24,410 epoch 6 - iter 297/992 - loss 0.02509660 - time (sec): 156.23 - samples/sec: 315.76 - lr: 0.000078 - momentum: 0.000000
2023-10-13 09:45:14,297 epoch 6 - iter 396/992 - loss 0.02835529 - time (sec): 206.12 - samples/sec: 319.87 - lr: 0.000077 - momentum: 0.000000
2023-10-13 09:46:03,625 epoch 6 - iter 495/992 - loss 0.02739146 - time (sec): 255.45 - samples/sec: 322.02 - lr: 0.000075 - momentum: 0.000000
2023-10-13 09:46:54,143 epoch 6 - iter 594/992 - loss 0.02719080 - time (sec): 305.96 - samples/sec: 322.15 - lr: 0.000073 - momentum: 0.000000
2023-10-13 09:47:43,296 epoch 6 - iter 693/992 - loss 0.02772867 - time (sec): 355.12 - samples/sec: 323.78 - lr: 0.000072 - momentum: 0.000000
2023-10-13 09:48:35,741 epoch 6 - iter 792/992 - loss 0.02870254 - time (sec): 407.56 - samples/sec: 321.24 - lr: 0.000070 - momentum: 0.000000
2023-10-13 09:49:27,562 epoch 6 - iter 891/992 - loss 0.02796614 - time (sec): 459.38 - samples/sec: 320.36 - lr: 0.000068 - momentum: 0.000000
2023-10-13 09:50:16,821 epoch 6 - iter 990/992 - loss 0.02903213 - time (sec): 508.64 - samples/sec: 321.85 - lr: 0.000067 - momentum: 0.000000
2023-10-13 09:50:17,859 ----------------------------------------------------------------------------------------------------
2023-10-13 09:50:17,859 EPOCH 6 done: loss 0.0292 - lr: 0.000067
2023-10-13 09:50:44,310 DEV : loss 0.14741767942905426 - f1-score (micro avg) 0.7613
2023-10-13 09:50:44,359 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:34,378 epoch 7 - iter 99/992 - loss 0.02402973 - time (sec): 50.02 - samples/sec: 329.63 - lr: 0.000065 - momentum: 0.000000
2023-10-13 09:52:24,501 epoch 7 - iter 198/992 - loss 0.01979590 - time (sec): 100.14 - samples/sec: 322.00 - lr: 0.000063 - momentum: 0.000000
2023-10-13 09:53:15,073 epoch 7 - iter 297/992 - loss 0.01995906 - time (sec): 150.71 - samples/sec: 326.38 - lr: 0.000062 - momentum: 0.000000
2023-10-13 09:54:05,819 epoch 7 - iter 396/992 - loss 0.02009440 - time (sec): 201.46 - samples/sec: 323.25 - lr: 0.000060 - momentum: 0.000000
2023-10-13 09:55:00,768 epoch 7 - iter 495/992 - loss 0.01994798 - time (sec): 256.41 - samples/sec: 317.85 - lr: 0.000058 - momentum: 0.000000
2023-10-13 09:55:54,048 epoch 7 - iter 594/992 - loss 0.02029700 - time (sec): 309.69 - samples/sec: 315.76 - lr: 0.000057 - momentum: 0.000000
2023-10-13 09:56:48,087 epoch 7 - iter 693/992 - loss 0.02118011 - time (sec): 363.73 - samples/sec: 313.64 - lr: 0.000055 - momentum: 0.000000
2023-10-13 09:57:44,144 epoch 7 - iter 792/992 - loss 0.02111159 - time (sec): 419.78 - samples/sec: 309.70 - lr: 0.000053 - momentum: 0.000000
2023-10-13 09:58:33,502 epoch 7 - iter 891/992 - loss 0.02190440 - time (sec): 469.14 - samples/sec: 313.48 - lr: 0.000052 - momentum: 0.000000
2023-10-13 09:59:22,237 epoch 7 - iter 990/992 - loss 0.02284813 - time (sec): 517.88 - samples/sec: 316.22 - lr: 0.000050 - momentum: 0.000000
2023-10-13 09:59:23,164 ----------------------------------------------------------------------------------------------------
2023-10-13 09:59:23,164 EPOCH 7 done: loss 0.0228 - lr: 0.000050
2023-10-13 09:59:49,598 DEV : loss 0.1585685759782791 - f1-score (micro avg) 0.7641
2023-10-13 09:59:49,641 ----------------------------------------------------------------------------------------------------
2023-10-13 10:00:40,528 epoch 8 - iter 99/992 - loss 0.01976251 - time (sec): 50.88 - samples/sec: 323.83 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:01:31,660 epoch 8 - iter 198/992 - loss 0.01626721 - time (sec): 102.02 - samples/sec: 325.37 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:02:24,535 epoch 8 - iter 297/992 - loss 0.01648077 - time (sec): 154.89 - samples/sec: 318.34 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:03:15,699 epoch 8 - iter 396/992 - loss 0.01817637 - time (sec): 206.06 - samples/sec: 320.26 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:04:05,995 epoch 8 - iter 495/992 - loss 0.01706704 - time (sec): 256.35 - samples/sec: 320.84 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:04:55,572 epoch 8 - iter 594/992 - loss 0.01717630 - time (sec): 305.93 - samples/sec: 322.26 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:05:46,143 epoch 8 - iter 693/992 - loss 0.01728220 - time (sec): 356.50 - samples/sec: 322.32 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:06:34,978 epoch 8 - iter 792/992 - loss 0.01702323 - time (sec): 405.34 - samples/sec: 322.08 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:07:27,092 epoch 8 - iter 891/992 - loss 0.01749045 - time (sec): 457.45 - samples/sec: 321.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:08:20,655 epoch 8 - iter 990/992 - loss 0.01730552 - time (sec): 511.01 - samples/sec: 320.44 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:08:21,730 ----------------------------------------------------------------------------------------------------
2023-10-13 10:08:21,730 EPOCH 8 done: loss 0.0173 - lr: 0.000033
2023-10-13 10:08:48,431 DEV : loss 0.18116213381290436 - f1-score (micro avg) 0.7622
2023-10-13 10:08:48,483 ----------------------------------------------------------------------------------------------------
2023-10-13 10:09:40,279 epoch 9 - iter 99/992 - loss 0.01131004 - time (sec): 51.79 - samples/sec: 304.50 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:10:33,101 epoch 9 - iter 198/992 - loss 0.01056116 - time (sec): 104.62 - samples/sec: 304.04 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:11:24,837 epoch 9 - iter 297/992 - loss 0.01170672 - time (sec): 156.35 - samples/sec: 309.56 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:12:17,122 epoch 9 - iter 396/992 - loss 0.01247187 - time (sec): 208.64 - samples/sec: 311.83 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:13:11,366 epoch 9 - iter 495/992 - loss 0.01243560 - time (sec): 262.88 - samples/sec: 309.06 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:14:02,044 epoch 9 - iter 594/992 - loss 0.01178669 - time (sec): 313.56 - samples/sec: 305.75 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:14:55,017 epoch 9 - iter 693/992 - loss 0.01196996 - time (sec): 366.53 - samples/sec: 309.15 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:15:49,028 epoch 9 - iter 792/992 - loss 0.01293566 - time (sec): 420.54 - samples/sec: 308.65 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:16:42,422 epoch 9 - iter 891/992 - loss 0.01341924 - time (sec): 473.94 - samples/sec: 310.41 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:17:35,746 epoch 9 - iter 990/992 - loss 0.01327578 - time (sec): 527.26 - samples/sec: 310.26 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:17:36,942 ----------------------------------------------------------------------------------------------------
2023-10-13 10:17:36,942 EPOCH 9 done: loss 0.0132 - lr: 0.000017
2023-10-13 10:18:03,364 DEV : loss 0.19423788785934448 - f1-score (micro avg) 0.7604
2023-10-13 10:18:03,410 ----------------------------------------------------------------------------------------------------
2023-10-13 10:18:54,068 epoch 10 - iter 99/992 - loss 0.01113977 - time (sec): 50.66 - samples/sec: 334.16 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:19:43,518 epoch 10 - iter 198/992 - loss 0.01142677 - time (sec): 100.11 - samples/sec: 325.76 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:20:34,362 epoch 10 - iter 297/992 - loss 0.01151712 - time (sec): 150.95 - samples/sec: 321.42 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:21:27,132 epoch 10 - iter 396/992 - loss 0.01138982 - time (sec): 203.72 - samples/sec: 316.12 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:22:21,103 epoch 10 - iter 495/992 - loss 0.01077407 - time (sec): 257.69 - samples/sec: 315.05 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:23:14,002 epoch 10 - iter 594/992 - loss 0.01139220 - time (sec): 310.59 - samples/sec: 315.49 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:24:06,650 epoch 10 - iter 693/992 - loss 0.01127794 - time (sec): 363.24 - samples/sec: 315.70 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:24:57,601 epoch 10 - iter 792/992 - loss 0.01139651 - time (sec): 414.19 - samples/sec: 317.34 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:25:48,997 epoch 10 - iter 891/992 - loss 0.01099854 - time (sec): 465.58 - samples/sec: 317.57 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:26:39,438 epoch 10 - iter 990/992 - loss 0.01091343 - time (sec): 516.03 - samples/sec: 317.05 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:26:40,487 ----------------------------------------------------------------------------------------------------
2023-10-13 10:26:40,487 EPOCH 10 done: loss 0.0109 - lr: 0.000000
2023-10-13 10:27:06,508 DEV : loss 0.19711482524871826 - f1-score (micro avg) 0.7634
2023-10-13 10:27:07,582 ----------------------------------------------------------------------------------------------------
2023-10-13 10:27:07,584 Loading model from best epoch ...
2023-10-13 10:27:11,260 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 10:27:35,913
Results:
- F-score (micro) 0.7745
- F-score (macro) 0.6954
- Accuracy 0.6581
By class:
              precision    recall  f1-score   support

         LOC     0.8152    0.8489    0.8317       655
         PER     0.6818    0.8072    0.7392       223
         ORG     0.5784    0.4646    0.5153       127

   micro avg     0.7586    0.7910    0.7745      1005
   macro avg     0.6918    0.7069    0.6954      1005
weighted avg     0.7557    0.7910    0.7712      1005
2023-10-13 10:27:35,913 ----------------------------------------------------------------------------------------------------
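To reuse the checkpoint evaluated above, the saved best model can be loaded back and applied to new text. A minimal sketch, with the path taken from the model training base path and a made-up example sentence:

# Minimal inference sketch; the example sentence is hypothetical.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Le maire de Paris a rencontré des représentants de la Croix-Rouge .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):  # PER / LOC / ORG spans decoded from the BIOES tags
    print(span)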