2023-10-13 08:56:28,382 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,384 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
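The projection shapes above imply 6 attention heads of size 64: the relative_attention_bias Embedding(32, 6) has one column per head, so q/k/v map d_model = 1472 down to 6 x 64 = 384, and o maps back up. A minimal numpy sketch of those shapes (illustrative only, random weights, not the trained model; T5 also omits the usual 1/sqrt(d) score scaling, which does not affect the shapes):

```python
import numpy as np

d_model, n_heads, d_kv = 1472, 6, 64   # byt5-small encoder dimensions from the repr above
inner = n_heads * d_kv                 # 384, the out_features of q/k/v

rng = np.random.default_rng(0)
x = rng.standard_normal((10, d_model))        # a sequence of 10 byte tokens
W_q = rng.standard_normal((d_model, inner))
W_k = rng.standard_normal((d_model, inner))
W_v = rng.standard_normal((d_model, inner))
W_o = rng.standard_normal((inner, d_model))

# project, then split the 384-dim inner vector into 6 heads of 64
q = (x @ W_q).reshape(10, n_heads, d_kv)
k = (x @ W_k).reshape(10, n_heads, d_kv)
v = (x @ W_v).reshape(10, n_heads, d_kv)

# per-head dot-product attention with a softmax over key positions
scores = np.einsum('qhd,khd->hqk', q, k)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)

# concatenate heads and project back to d_model
out = np.einsum('hqk,khd->qhd', weights, v).reshape(10, inner) @ W_o
assert out.shape == (10, d_model)
```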
2023-10-13 08:56:28,384 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,385 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 08:56:28,385 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,385 Train: 7936 sentences
2023-10-13 08:56:28,385 (train_with_dev=False, train_with_test=False)
2023-10-13 08:56:28,385 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,385 Training Params:
2023-10-13 08:56:28,385 - learning_rate: "0.00015"
2023-10-13 08:56:28,385 - mini_batch_size: "8"
2023-10-13 08:56:28,385 - max_epochs: "10"
2023-10-13 08:56:28,385 - shuffle: "True"
2023-10-13 08:56:28,385 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 Plugins:
2023-10-13 08:56:28,386 - TensorboardLogger
2023-10-13 08:56:28,386 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
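The LinearScheduler with warmup_fraction '0.1' ramps the learning rate linearly from 0 to the peak 0.00015 over the first 10% of all batch steps (992 iterations x 10 epochs = 9920 steps, so warmup ends exactly at the end of epoch 1), then decays linearly back to 0. A small sketch of that schedule, reconstructed from the lr values logged below (my own reconstruction, not Flair's code):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 992 * 10   # 992 mini-batches per epoch, 10 epochs
peak = 0.00015

# matches the logged values: lr 0.000015 at iter 99 of epoch 1,
# lr 0.000150 at the end of epoch 1, lr 0.000000 at the end of epoch 10
print(round(linear_schedule_lr(99, total, peak), 6))    # 0.000015
print(round(linear_schedule_lr(992, total, peak), 6))   # 0.00015
print(round(linear_schedule_lr(total, total, peak), 6)) # 0.0
```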
2023-10-13 08:56:28,386 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:56:28,386 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 Computation:
2023-10-13 08:56:28,386 - compute on device: cuda:0
2023-10-13 08:56:28,386 - embedding storage: none
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5"
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,386 ----------------------------------------------------------------------------------------------------
2023-10-13 08:56:28,387 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 08:57:20,355 epoch 1 - iter 99/992 - loss 2.54696046 - time (sec): 51.97 - samples/sec: 340.20 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:58:09,788 epoch 1 - iter 198/992 - loss 2.46415056 - time (sec): 101.40 - samples/sec: 329.09 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:58:59,863 epoch 1 - iter 297/992 - loss 2.24201712 - time (sec): 151.47 - samples/sec: 334.02 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:59:50,195 epoch 1 - iter 396/992 - loss 2.02428447 - time (sec): 201.81 - samples/sec: 325.11 - lr: 0.000060 - momentum: 0.000000
2023-10-13 09:00:44,266 epoch 1 - iter 495/992 - loss 1.78340987 - time (sec): 255.88 - samples/sec: 318.86 - lr: 0.000075 - momentum: 0.000000
2023-10-13 09:01:39,162 epoch 1 - iter 594/992 - loss 1.56772073 - time (sec): 310.77 - samples/sec: 313.87 - lr: 0.000090 - momentum: 0.000000
2023-10-13 09:02:32,798 epoch 1 - iter 693/992 - loss 1.39606142 - time (sec): 364.41 - samples/sec: 313.73 - lr: 0.000105 - momentum: 0.000000
2023-10-13 09:03:22,875 epoch 1 - iter 792/992 - loss 1.25709517 - time (sec): 414.49 - samples/sec: 314.37 - lr: 0.000120 - momentum: 0.000000
2023-10-13 09:04:16,323 epoch 1 - iter 891/992 - loss 1.13234168 - time (sec): 467.93 - samples/sec: 316.29 - lr: 0.000135 - momentum: 0.000000
2023-10-13 09:05:06,784 epoch 1 - iter 990/992 - loss 1.04362881 - time (sec): 518.40 - samples/sec: 315.88 - lr: 0.000150 - momentum: 0.000000
2023-10-13 09:05:07,750 ----------------------------------------------------------------------------------------------------
2023-10-13 09:05:07,750 EPOCH 1 done: loss 1.0424 - lr: 0.000150
2023-10-13 09:05:34,003 DEV : loss 0.16410574316978455 - f1-score (micro avg) 0.6208
2023-10-13 09:05:34,054 saving best model
2023-10-13 09:05:35,073 ----------------------------------------------------------------------------------------------------
2023-10-13 09:06:26,766 epoch 2 - iter 99/992 - loss 0.20154135 - time (sec): 51.69 - samples/sec: 320.17 - lr: 0.000148 - momentum: 0.000000
2023-10-13 09:07:18,258 epoch 2 - iter 198/992 - loss 0.17954279 - time (sec): 103.18 - samples/sec: 322.07 - lr: 0.000147 - momentum: 0.000000
2023-10-13 09:08:11,018 epoch 2 - iter 297/992 - loss 0.16837201 - time (sec): 155.94 - samples/sec: 321.91 - lr: 0.000145 - momentum: 0.000000
2023-10-13 09:09:01,780 epoch 2 - iter 396/992 - loss 0.16409268 - time (sec): 206.70 - samples/sec: 318.47 - lr: 0.000143 - momentum: 0.000000
2023-10-13 09:09:56,347 epoch 2 - iter 495/992 - loss 0.15724817 - time (sec): 261.27 - samples/sec: 315.00 - lr: 0.000142 - momentum: 0.000000
2023-10-13 09:10:46,563 epoch 2 - iter 594/992 - loss 0.15241512 - time (sec): 311.49 - samples/sec: 316.46 - lr: 0.000140 - momentum: 0.000000
2023-10-13 09:11:38,477 epoch 2 - iter 693/992 - loss 0.14811317 - time (sec): 363.40 - samples/sec: 316.06 - lr: 0.000138 - momentum: 0.000000
2023-10-13 09:12:30,098 epoch 2 - iter 792/992 - loss 0.14356440 - time (sec): 415.02 - samples/sec: 315.19 - lr: 0.000137 - momentum: 0.000000
2023-10-13 09:13:19,473 epoch 2 - iter 891/992 - loss 0.14085816 - time (sec): 464.40 - samples/sec: 314.87 - lr: 0.000135 - momentum: 0.000000
2023-10-13 09:14:10,344 epoch 2 - iter 990/992 - loss 0.13664423 - time (sec): 515.27 - samples/sec: 317.74 - lr: 0.000133 - momentum: 0.000000
2023-10-13 09:14:11,319 ----------------------------------------------------------------------------------------------------
2023-10-13 09:14:11,320 EPOCH 2 done: loss 0.1365 - lr: 0.000133
2023-10-13 09:14:37,108 DEV : loss 0.0895773395895958 - f1-score (micro avg) 0.7289
2023-10-13 09:14:37,159 saving best model
2023-10-13 09:14:39,893 ----------------------------------------------------------------------------------------------------
2023-10-13 09:15:31,008 epoch 3 - iter 99/992 - loss 0.08180511 - time (sec): 51.11 - samples/sec: 315.07 - lr: 0.000132 - momentum: 0.000000
2023-10-13 09:16:21,529 epoch 3 - iter 198/992 - loss 0.08521021 - time (sec): 101.63 - samples/sec: 318.92 - lr: 0.000130 - momentum: 0.000000
2023-10-13 09:17:12,339 epoch 3 - iter 297/992 - loss 0.08522855 - time (sec): 152.44 - samples/sec: 318.21 - lr: 0.000128 - momentum: 0.000000
2023-10-13 09:18:02,275 epoch 3 - iter 396/992 - loss 0.08470432 - time (sec): 202.38 - samples/sec: 320.72 - lr: 0.000127 - momentum: 0.000000
2023-10-13 09:18:52,304 epoch 3 - iter 495/992 - loss 0.08346708 - time (sec): 252.41 - samples/sec: 320.75 - lr: 0.000125 - momentum: 0.000000
2023-10-13 09:19:48,106 epoch 3 - iter 594/992 - loss 0.08109664 - time (sec): 308.21 - samples/sec: 317.13 - lr: 0.000123 - momentum: 0.000000
2023-10-13 09:20:39,889 epoch 3 - iter 693/992 - loss 0.08032019 - time (sec): 359.99 - samples/sec: 317.35 - lr: 0.000122 - momentum: 0.000000
2023-10-13 09:21:30,881 epoch 3 - iter 792/992 - loss 0.07847918 - time (sec): 410.98 - samples/sec: 317.73 - lr: 0.000120 - momentum: 0.000000
2023-10-13 09:22:23,159 epoch 3 - iter 891/992 - loss 0.07639249 - time (sec): 463.26 - samples/sec: 317.68 - lr: 0.000118 - momentum: 0.000000
2023-10-13 09:23:13,066 epoch 3 - iter 990/992 - loss 0.07671320 - time (sec): 513.17 - samples/sec: 318.90 - lr: 0.000117 - momentum: 0.000000
2023-10-13 09:23:14,036 ----------------------------------------------------------------------------------------------------
2023-10-13 09:23:14,037 EPOCH 3 done: loss 0.0766 - lr: 0.000117
2023-10-13 09:23:41,303 DEV : loss 0.0838015154004097 - f1-score (micro avg) 0.7484
2023-10-13 09:23:41,354 saving best model
2023-10-13 09:23:44,052 ----------------------------------------------------------------------------------------------------
2023-10-13 09:24:33,521 epoch 4 - iter 99/992 - loss 0.05700303 - time (sec): 49.47 - samples/sec: 333.14 - lr: 0.000115 - momentum: 0.000000
2023-10-13 09:25:24,566 epoch 4 - iter 198/992 - loss 0.05545520 - time (sec): 100.51 - samples/sec: 325.25 - lr: 0.000113 - momentum: 0.000000
2023-10-13 09:26:15,304 epoch 4 - iter 297/992 - loss 0.05114430 - time (sec): 151.25 - samples/sec: 323.41 - lr: 0.000112 - momentum: 0.000000
2023-10-13 09:27:12,162 epoch 4 - iter 396/992 - loss 0.05100050 - time (sec): 208.11 - samples/sec: 313.70 - lr: 0.000110 - momentum: 0.000000
2023-10-13 09:28:06,465 epoch 4 - iter 495/992 - loss 0.05124164 - time (sec): 262.41 - samples/sec: 314.02 - lr: 0.000108 - momentum: 0.000000
2023-10-13 09:28:58,097 epoch 4 - iter 594/992 - loss 0.05017759 - time (sec): 314.04 - samples/sec: 315.81 - lr: 0.000107 - momentum: 0.000000
2023-10-13 09:29:48,568 epoch 4 - iter 693/992 - loss 0.04993850 - time (sec): 364.51 - samples/sec: 315.32 - lr: 0.000105 - momentum: 0.000000
2023-10-13 09:30:39,746 epoch 4 - iter 792/992 - loss 0.05082169 - time (sec): 415.69 - samples/sec: 315.46 - lr: 0.000103 - momentum: 0.000000
2023-10-13 09:31:30,560 epoch 4 - iter 891/992 - loss 0.05205071 - time (sec): 466.50 - samples/sec: 316.03 - lr: 0.000102 - momentum: 0.000000
2023-10-13 09:32:22,698 epoch 4 - iter 990/992 - loss 0.05310253 - time (sec): 518.64 - samples/sec: 315.62 - lr: 0.000100 - momentum: 0.000000
2023-10-13 09:32:23,868 ----------------------------------------------------------------------------------------------------
2023-10-13 09:32:23,868 EPOCH 4 done: loss 0.0531 - lr: 0.000100
2023-10-13 09:32:51,988 DEV : loss 0.10059013962745667 - f1-score (micro avg) 0.7736
2023-10-13 09:32:52,043 saving best model
2023-10-13 09:32:54,775 ----------------------------------------------------------------------------------------------------
2023-10-13 09:33:48,250 epoch 5 - iter 99/992 - loss 0.03161792 - time (sec): 53.47 - samples/sec: 314.00 - lr: 0.000098 - momentum: 0.000000
2023-10-13 09:34:42,408 epoch 5 - iter 198/992 - loss 0.03331366 - time (sec): 107.63 - samples/sec: 309.72 - lr: 0.000097 - momentum: 0.000000
2023-10-13 09:35:32,419 epoch 5 - iter 297/992 - loss 0.04028897 - time (sec): 157.64 - samples/sec: 317.47 - lr: 0.000095 - momentum: 0.000000
2023-10-13 09:36:20,792 epoch 5 - iter 396/992 - loss 0.03982061 - time (sec): 206.01 - samples/sec: 322.52 - lr: 0.000093 - momentum: 0.000000
2023-10-13 09:37:11,255 epoch 5 - iter 495/992 - loss 0.03948044 - time (sec): 256.48 - samples/sec: 321.65 - lr: 0.000092 - momentum: 0.000000
2023-10-13 09:38:01,748 epoch 5 - iter 594/992 - loss 0.04077289 - time (sec): 306.97 - samples/sec: 320.15 - lr: 0.000090 - momentum: 0.000000
2023-10-13 09:38:51,355 epoch 5 - iter 693/992 - loss 0.04061989 - time (sec): 356.58 - samples/sec: 320.32 - lr: 0.000088 - momentum: 0.000000
2023-10-13 09:39:41,209 epoch 5 - iter 792/992 - loss 0.04020643 - time (sec): 406.43 - samples/sec: 322.94 - lr: 0.000087 - momentum: 0.000000
2023-10-13 09:40:33,064 epoch 5 - iter 891/992 - loss 0.03991271 - time (sec): 458.29 - samples/sec: 322.18 - lr: 0.000085 - momentum: 0.000000
2023-10-13 09:41:20,751 epoch 5 - iter 990/992 - loss 0.03965766 - time (sec): 505.97 - samples/sec: 323.69 - lr: 0.000083 - momentum: 0.000000
2023-10-13 09:41:21,708 ----------------------------------------------------------------------------------------------------
2023-10-13 09:41:21,709 EPOCH 5 done: loss 0.0396 - lr: 0.000083
2023-10-13 09:41:48,133 DEV : loss 0.12499061226844788 - f1-score (micro avg) 0.7649
2023-10-13 09:41:48,176 ----------------------------------------------------------------------------------------------------
2023-10-13 09:42:37,843 epoch 6 - iter 99/992 - loss 0.02652638 - time (sec): 49.66 - samples/sec: 344.01 - lr: 0.000082 - momentum: 0.000000
2023-10-13 09:43:31,747 epoch 6 - iter 198/992 - loss 0.02574886 - time (sec): 103.57 - samples/sec: 323.74 - lr: 0.000080 - momentum: 0.000000
2023-10-13 09:44:24,410 epoch 6 - iter 297/992 - loss 0.02509660 - time (sec): 156.23 - samples/sec: 315.76 - lr: 0.000078 - momentum: 0.000000
2023-10-13 09:45:14,297 epoch 6 - iter 396/992 - loss 0.02835529 - time (sec): 206.12 - samples/sec: 319.87 - lr: 0.000077 - momentum: 0.000000
2023-10-13 09:46:03,625 epoch 6 - iter 495/992 - loss 0.02739146 - time (sec): 255.45 - samples/sec: 322.02 - lr: 0.000075 - momentum: 0.000000
2023-10-13 09:46:54,143 epoch 6 - iter 594/992 - loss 0.02719080 - time (sec): 305.96 - samples/sec: 322.15 - lr: 0.000073 - momentum: 0.000000
2023-10-13 09:47:43,296 epoch 6 - iter 693/992 - loss 0.02772867 - time (sec): 355.12 - samples/sec: 323.78 - lr: 0.000072 - momentum: 0.000000
2023-10-13 09:48:35,741 epoch 6 - iter 792/992 - loss 0.02870254 - time (sec): 407.56 - samples/sec: 321.24 - lr: 0.000070 - momentum: 0.000000
2023-10-13 09:49:27,562 epoch 6 - iter 891/992 - loss 0.02796614 - time (sec): 459.38 - samples/sec: 320.36 - lr: 0.000068 - momentum: 0.000000
2023-10-13 09:50:16,821 epoch 6 - iter 990/992 - loss 0.02903213 - time (sec): 508.64 - samples/sec: 321.85 - lr: 0.000067 - momentum: 0.000000
2023-10-13 09:50:17,859 ----------------------------------------------------------------------------------------------------
2023-10-13 09:50:17,859 EPOCH 6 done: loss 0.0292 - lr: 0.000067
2023-10-13 09:50:44,310 DEV : loss 0.14741767942905426 - f1-score (micro avg) 0.7613
2023-10-13 09:50:44,359 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:34,378 epoch 7 - iter 99/992 - loss 0.02402973 - time (sec): 50.02 - samples/sec: 329.63 - lr: 0.000065 - momentum: 0.000000
2023-10-13 09:52:24,501 epoch 7 - iter 198/992 - loss 0.01979590 - time (sec): 100.14 - samples/sec: 322.00 - lr: 0.000063 - momentum: 0.000000
2023-10-13 09:53:15,073 epoch 7 - iter 297/992 - loss 0.01995906 - time (sec): 150.71 - samples/sec: 326.38 - lr: 0.000062 - momentum: 0.000000
2023-10-13 09:54:05,819 epoch 7 - iter 396/992 - loss 0.02009440 - time (sec): 201.46 - samples/sec: 323.25 - lr: 0.000060 - momentum: 0.000000
2023-10-13 09:55:00,768 epoch 7 - iter 495/992 - loss 0.01994798 - time (sec): 256.41 - samples/sec: 317.85 - lr: 0.000058 - momentum: 0.000000
2023-10-13 09:55:54,048 epoch 7 - iter 594/992 - loss 0.02029700 - time (sec): 309.69 - samples/sec: 315.76 - lr: 0.000057 - momentum: 0.000000
2023-10-13 09:56:48,087 epoch 7 - iter 693/992 - loss 0.02118011 - time (sec): 363.73 - samples/sec: 313.64 - lr: 0.000055 - momentum: 0.000000
2023-10-13 09:57:44,144 epoch 7 - iter 792/992 - loss 0.02111159 - time (sec): 419.78 - samples/sec: 309.70 - lr: 0.000053 - momentum: 0.000000
2023-10-13 09:58:33,502 epoch 7 - iter 891/992 - loss 0.02190440 - time (sec): 469.14 - samples/sec: 313.48 - lr: 0.000052 - momentum: 0.000000
2023-10-13 09:59:22,237 epoch 7 - iter 990/992 - loss 0.02284813 - time (sec): 517.88 - samples/sec: 316.22 - lr: 0.000050 - momentum: 0.000000
2023-10-13 09:59:23,164 ----------------------------------------------------------------------------------------------------
2023-10-13 09:59:23,164 EPOCH 7 done: loss 0.0228 - lr: 0.000050
2023-10-13 09:59:49,598 DEV : loss 0.1585685759782791 - f1-score (micro avg) 0.7641
2023-10-13 09:59:49,641 ----------------------------------------------------------------------------------------------------
2023-10-13 10:00:40,528 epoch 8 - iter 99/992 - loss 0.01976251 - time (sec): 50.88 - samples/sec: 323.83 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:01:31,660 epoch 8 - iter 198/992 - loss 0.01626721 - time (sec): 102.02 - samples/sec: 325.37 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:02:24,535 epoch 8 - iter 297/992 - loss 0.01648077 - time (sec): 154.89 - samples/sec: 318.34 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:03:15,699 epoch 8 - iter 396/992 - loss 0.01817637 - time (sec): 206.06 - samples/sec: 320.26 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:04:05,995 epoch 8 - iter 495/992 - loss 0.01706704 - time (sec): 256.35 - samples/sec: 320.84 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:04:55,572 epoch 8 - iter 594/992 - loss 0.01717630 - time (sec): 305.93 - samples/sec: 322.26 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:05:46,143 epoch 8 - iter 693/992 - loss 0.01728220 - time (sec): 356.50 - samples/sec: 322.32 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:06:34,978 epoch 8 - iter 792/992 - loss 0.01702323 - time (sec): 405.34 - samples/sec: 322.08 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:07:27,092 epoch 8 - iter 891/992 - loss 0.01749045 - time (sec): 457.45 - samples/sec: 321.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:08:20,655 epoch 8 - iter 990/992 - loss 0.01730552 - time (sec): 511.01 - samples/sec: 320.44 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:08:21,730 ----------------------------------------------------------------------------------------------------
2023-10-13 10:08:21,730 EPOCH 8 done: loss 0.0173 - lr: 0.000033
2023-10-13 10:08:48,431 DEV : loss 0.18116213381290436 - f1-score (micro avg) 0.7622
2023-10-13 10:08:48,483 ----------------------------------------------------------------------------------------------------
2023-10-13 10:09:40,279 epoch 9 - iter 99/992 - loss 0.01131004 - time (sec): 51.79 - samples/sec: 304.50 - lr: 0.000032 - momentum: 0.000000
2023-10-13 10:10:33,101 epoch 9 - iter 198/992 - loss 0.01056116 - time (sec): 104.62 - samples/sec: 304.04 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:11:24,837 epoch 9 - iter 297/992 - loss 0.01170672 - time (sec): 156.35 - samples/sec: 309.56 - lr: 0.000028 - momentum: 0.000000
2023-10-13 10:12:17,122 epoch 9 - iter 396/992 - loss 0.01247187 - time (sec): 208.64 - samples/sec: 311.83 - lr: 0.000027 - momentum: 0.000000
2023-10-13 10:13:11,366 epoch 9 - iter 495/992 - loss 0.01243560 - time (sec): 262.88 - samples/sec: 309.06 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:14:02,044 epoch 9 - iter 594/992 - loss 0.01178669 - time (sec): 313.56 - samples/sec: 305.75 - lr: 0.000023 - momentum: 0.000000
2023-10-13 10:14:55,017 epoch 9 - iter 693/992 - loss 0.01196996 - time (sec): 366.53 - samples/sec: 309.15 - lr: 0.000022 - momentum: 0.000000
2023-10-13 10:15:49,028 epoch 9 - iter 792/992 - loss 0.01293566 - time (sec): 420.54 - samples/sec: 308.65 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:16:42,422 epoch 9 - iter 891/992 - loss 0.01341924 - time (sec): 473.94 - samples/sec: 310.41 - lr: 0.000018 - momentum: 0.000000
2023-10-13 10:17:35,746 epoch 9 - iter 990/992 - loss 0.01327578 - time (sec): 527.26 - samples/sec: 310.26 - lr: 0.000017 - momentum: 0.000000
2023-10-13 10:17:36,942 ----------------------------------------------------------------------------------------------------
2023-10-13 10:17:36,942 EPOCH 9 done: loss 0.0132 - lr: 0.000017
2023-10-13 10:18:03,364 DEV : loss 0.19423788785934448 - f1-score (micro avg) 0.7604
2023-10-13 10:18:03,410 ----------------------------------------------------------------------------------------------------
2023-10-13 10:18:54,068 epoch 10 - iter 99/992 - loss 0.01113977 - time (sec): 50.66 - samples/sec: 334.16 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:19:43,518 epoch 10 - iter 198/992 - loss 0.01142677 - time (sec): 100.11 - samples/sec: 325.76 - lr: 0.000013 - momentum: 0.000000
2023-10-13 10:20:34,362 epoch 10 - iter 297/992 - loss 0.01151712 - time (sec): 150.95 - samples/sec: 321.42 - lr: 0.000012 - momentum: 0.000000
2023-10-13 10:21:27,132 epoch 10 - iter 396/992 - loss 0.01138982 - time (sec): 203.72 - samples/sec: 316.12 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:22:21,103 epoch 10 - iter 495/992 - loss 0.01077407 - time (sec): 257.69 - samples/sec: 315.05 - lr: 0.000008 - momentum: 0.000000
2023-10-13 10:23:14,002 epoch 10 - iter 594/992 - loss 0.01139220 - time (sec): 310.59 - samples/sec: 315.49 - lr: 0.000007 - momentum: 0.000000
2023-10-13 10:24:06,650 epoch 10 - iter 693/992 - loss 0.01127794 - time (sec): 363.24 - samples/sec: 315.70 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:24:57,601 epoch 10 - iter 792/992 - loss 0.01139651 - time (sec): 414.19 - samples/sec: 317.34 - lr: 0.000004 - momentum: 0.000000
2023-10-13 10:25:48,997 epoch 10 - iter 891/992 - loss 0.01099854 - time (sec): 465.58 - samples/sec: 317.57 - lr: 0.000002 - momentum: 0.000000
2023-10-13 10:26:39,438 epoch 10 - iter 990/992 - loss 0.01091343 - time (sec): 516.03 - samples/sec: 317.05 - lr: 0.000000 - momentum: 0.000000
2023-10-13 10:26:40,487 ----------------------------------------------------------------------------------------------------
2023-10-13 10:26:40,487 EPOCH 10 done: loss 0.0109 - lr: 0.000000
2023-10-13 10:27:06,508 DEV : loss 0.19711482524871826 - f1-score (micro avg) 0.7634
2023-10-13 10:27:07,582 ----------------------------------------------------------------------------------------------------
2023-10-13 10:27:07,584 Loading model from best epoch ...
2023-10-13 10:27:11,260 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
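The 13 tags are the BIOES encoding of the three entity types (PER, LOC, ORG): for each type, S- marks a single-token entity and B-/I-/E- the beginning, inside, and end of a multi-token one, plus the O tag for non-entity tokens (3 x 4 + 1 = 13). A minimal decoder for that scheme (an illustrative sketch, not Flair's implementation):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition('-')
        if prefix == 'S':                          # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == 'B':                        # entity opens here
            start = i
        elif prefix == 'E' and start is not None:  # entity closes here
            spans.append((label, start, i + 1))
            start = None
    return spans

print(bioes_to_spans(['B-PER', 'E-PER', 'O', 'S-LOC']))
# [('PER', 0, 2), ('LOC', 3, 4)]
```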
2023-10-13 10:27:35,913
Results:
- F-score (micro) 0.7745
- F-score (macro) 0.6954
- Accuracy 0.6581

By class:
              precision    recall  f1-score   support

         LOC     0.8152    0.8489    0.8317       655
         PER     0.6818    0.8072    0.7392       223
         ORG     0.5784    0.4646    0.5153       127

   micro avg     0.7586    0.7910    0.7745      1005
   macro avg     0.6918    0.7069    0.6954      1005
weighted avg     0.7557    0.7910    0.7712      1005

2023-10-13 10:27:35,913 ----------------------------------------------------------------------------------------------------
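For reference, the micro average above pools all 1005 gold spans before computing F1, the macro average is the unweighted mean of the three class F1 scores, and the weighted average weights each class F1 by its support. All three can be reproduced from the table (the table's figures are rounded, so checks hold to 4 decimals):

```python
# per-class (precision, recall, f1, support) from the table above
by_class = {
    'LOC': (0.8152, 0.8489, 0.8317, 655),
    'PER': (0.6818, 0.8072, 0.7392, 223),
    'ORG': (0.5784, 0.4646, 0.5153, 127),
}

# macro F1: unweighted mean of class F1 scores
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# micro F1: harmonic mean of the pooled micro precision/recall reported above
p, r = 0.7586, 0.7910
micro_f1 = 2 * p * r / (p + r)

# weighted F1: class F1 scores weighted by support
weighted_f1 = sum(f1 * n for _, _, f1, n in by_class.values()) / 1005

print(round(macro_f1, 4), round(micro_f1, 4), round(weighted_f1, 4))
# 0.6954 0.7745 0.7712
```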