2023-10-14 02:56:52,966 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,968 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-14 02:56:52,968 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,969 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 02:56:52,969 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,969 Train: 14465 sentences
2023-10-14 02:56:52,969 (train_with_dev=False, train_with_test=False)
2023-10-14 02:56:52,969 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,969 Training Params:
2023-10-14 02:56:52,969 - learning_rate: "0.00015"
2023-10-14 02:56:52,969 - mini_batch_size: "4"
2023-10-14 02:56:52,969 - max_epochs: "10"
2023-10-14 02:56:52,969 - shuffle: "True"
2023-10-14 02:56:52,969 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,970 Plugins:
2023-10-14 02:56:52,970 - TensorboardLogger
2023-10-14 02:56:52,970 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,970 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 02:56:52,970 - metric: "('micro avg', 'f1-score')"
2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,970 Computation:
2023-10-14 02:56:52,970 - compute on device: cuda:0
2023-10-14 02:56:52,970 - embedding storage: none
2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,970 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,970 ----------------------------------------------------------------------------------------------------
2023-10-14 02:56:52,971 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 02:58:32,206 epoch 1 - iter 361/3617 - loss 2.48047885 - time (sec): 99.23 - samples/sec: 380.07 - lr: 0.000015 - momentum: 0.000000
2023-10-14 03:00:10,564 epoch 1 - iter 722/3617 - loss 2.10182133 - time (sec): 197.59 - samples/sec: 376.99 - lr: 0.000030 - momentum: 0.000000
2023-10-14 03:01:51,336 epoch 1 - iter 1083/3617 - loss 1.65225929 - time (sec): 298.36 - samples/sec: 378.51 - lr: 0.000045 - momentum: 0.000000
2023-10-14 03:03:32,329 epoch 1 - iter 1444/3617 - loss 1.30928949 - time (sec): 399.36 - samples/sec: 378.96 - lr: 0.000060 - momentum: 0.000000
2023-10-14 03:05:11,749 epoch 1 - iter 1805/3617 - loss 1.08702017 - time (sec): 498.78 - samples/sec: 378.85 - lr: 0.000075 - momentum: 0.000000
2023-10-14 03:06:53,731 epoch 1 - iter 2166/3617 - loss 0.93505989 - time (sec): 600.76 - samples/sec: 377.19 - lr: 0.000090 - momentum: 0.000000
2023-10-14 03:08:31,230 epoch 1 - iter 2527/3617 - loss 0.82602626 - time (sec): 698.26 - samples/sec: 376.94 - lr: 0.000105 - momentum: 0.000000
2023-10-14 03:10:07,444 epoch 1 - iter 2888/3617 - loss 0.74069780 - time (sec): 794.47 - samples/sec: 379.05 - lr: 0.000120 - momentum: 0.000000
2023-10-14 03:11:45,292 epoch 1 - iter 3249/3617 - loss 0.66755548 - time (sec): 892.32 - samples/sec: 381.80 - lr: 0.000135 - momentum: 0.000000
2023-10-14 03:13:23,530 epoch 1 - iter 3610/3617 - loss 0.61237021 - time (sec): 990.56 - samples/sec: 382.89 - lr: 0.000150 - momentum: 0.000000
2023-10-14 03:13:25,225 ----------------------------------------------------------------------------------------------------
2023-10-14 03:13:25,225 EPOCH 1 done: loss 0.6115 - lr: 0.000150
2023-10-14 03:14:03,125 DEV : loss 0.12874433398246765 - f1-score (micro avg) 0.6138
2023-10-14 03:14:03,190 saving best model
2023-10-14 03:14:04,104 ----------------------------------------------------------------------------------------------------
2023-10-14 03:15:40,905 epoch 2 - iter 361/3617 - loss 0.09418970 - time (sec): 96.80 - samples/sec: 391.51 - lr: 0.000148 - momentum: 0.000000
2023-10-14 03:17:19,253 epoch 2 - iter 722/3617 - loss 0.09519731 - time (sec): 195.15 - samples/sec: 384.46 - lr: 0.000147 - momentum: 0.000000
2023-10-14 03:19:03,584 epoch 2 - iter 1083/3617 - loss 0.09418116 - time (sec): 299.48 - samples/sec: 388.15 - lr: 0.000145 - momentum: 0.000000
2023-10-14 03:20:44,465 epoch 2 - iter 1444/3617 - loss 0.09280081 - time (sec): 400.36 - samples/sec: 387.09 - lr: 0.000143 - momentum: 0.000000
2023-10-14 03:22:24,435 epoch 2 - iter 1805/3617 - loss 0.09367715 - time (sec): 500.33 - samples/sec: 386.16 - lr: 0.000142 - momentum: 0.000000
2023-10-14 03:24:00,186 epoch 2 - iter 2166/3617 - loss 0.09233796 - time (sec): 596.08 - samples/sec: 385.19 - lr: 0.000140 - momentum: 0.000000
2023-10-14 03:25:37,405 epoch 2 - iter 2527/3617 - loss 0.09099030 - time (sec): 693.30 - samples/sec: 385.19 - lr: 0.000138 - momentum: 0.000000
2023-10-14 03:27:18,849 epoch 2 - iter 2888/3617 - loss 0.09098331 - time (sec): 794.74 - samples/sec: 382.89 - lr: 0.000137 - momentum: 0.000000
2023-10-14 03:28:58,925 epoch 2 - iter 3249/3617 - loss 0.09046497 - time (sec): 894.82 - samples/sec: 382.77 - lr: 0.000135 - momentum: 0.000000
2023-10-14 03:30:35,687 epoch 2 - iter 3610/3617 - loss 0.08998216 - time (sec): 991.58 - samples/sec: 382.45 - lr: 0.000133 - momentum: 0.000000
2023-10-14 03:30:37,445 ----------------------------------------------------------------------------------------------------
2023-10-14 03:30:37,445 EPOCH 2 done: loss 0.0899 - lr: 0.000133
2023-10-14 03:31:21,578 DEV : loss 0.11994253098964691 - f1-score (micro avg) 0.6262
2023-10-14 03:31:21,672 saving best model
2023-10-14 03:31:24,476 ----------------------------------------------------------------------------------------------------
2023-10-14 03:33:17,847 epoch 3 - iter 361/3617 - loss 0.06787226 - time (sec): 113.36 - samples/sec: 330.36 - lr: 0.000132 - momentum: 0.000000
2023-10-14 03:35:08,246 epoch 3 - iter 722/3617 - loss 0.06752508 - time (sec): 223.76 - samples/sec: 344.88 - lr: 0.000130 - momentum: 0.000000
2023-10-14 03:36:56,698 epoch 3 - iter 1083/3617 - loss 0.06518068 - time (sec): 332.22 - samples/sec: 346.12 - lr: 0.000128 - momentum: 0.000000
2023-10-14 03:38:44,830 epoch 3 - iter 1444/3617 - loss 0.06371927 - time (sec): 440.35 - samples/sec: 346.81 - lr: 0.000127 - momentum: 0.000000
2023-10-14 03:40:32,739 epoch 3 - iter 1805/3617 - loss 0.06387658 - time (sec): 548.26 - samples/sec: 350.40 - lr: 0.000125 - momentum: 0.000000
2023-10-14 03:42:18,558 epoch 3 - iter 2166/3617 - loss 0.06346408 - time (sec): 654.08 - samples/sec: 352.37 - lr: 0.000123 - momentum: 0.000000
2023-10-14 03:44:05,888 epoch 3 - iter 2527/3617 - loss 0.06459000 - time (sec): 761.41 - samples/sec: 350.21 - lr: 0.000122 - momentum: 0.000000
2023-10-14 03:45:55,184 epoch 3 - iter 2888/3617 - loss 0.06401459 - time (sec): 870.70 - samples/sec: 348.70 - lr: 0.000120 - momentum: 0.000000
2023-10-14 03:47:39,172 epoch 3 - iter 3249/3617 - loss 0.06394856 - time (sec): 974.69 - samples/sec: 349.27 - lr: 0.000118 - momentum: 0.000000
2023-10-14 03:49:25,723 epoch 3 - iter 3610/3617 - loss 0.06409756 - time (sec): 1081.24 - samples/sec: 350.60 - lr: 0.000117 - momentum: 0.000000
2023-10-14 03:49:27,697 ----------------------------------------------------------------------------------------------------
2023-10-14 03:49:27,697 EPOCH 3 done: loss 0.0640 - lr: 0.000117
2023-10-14 03:50:10,253 DEV : loss 0.1677953451871872 - f1-score (micro avg) 0.6461
2023-10-14 03:50:10,312 saving best model
2023-10-14 03:50:13,061 ----------------------------------------------------------------------------------------------------
2023-10-14 03:52:00,514 epoch 4 - iter 361/3617 - loss 0.04549276 - time (sec): 107.45 - samples/sec: 340.97 - lr: 0.000115 - momentum: 0.000000
2023-10-14 03:53:47,270 epoch 4 - iter 722/3617 - loss 0.04285214 - time (sec): 214.20 - samples/sec: 351.22 - lr: 0.000113 - momentum: 0.000000
2023-10-14 03:55:39,687 epoch 4 - iter 1083/3617 - loss 0.04301779 - time (sec): 326.62 - samples/sec: 345.40 - lr: 0.000112 - momentum: 0.000000
2023-10-14 03:57:21,761 epoch 4 - iter 1444/3617 - loss 0.04227254 - time (sec): 428.70 - samples/sec: 349.22 - lr: 0.000110 - momentum: 0.000000
2023-10-14 03:59:08,557 epoch 4 - iter 1805/3617 - loss 0.04271265 - time (sec): 535.49 - samples/sec: 350.63 - lr: 0.000108 - momentum: 0.000000
2023-10-14 04:00:56,545 epoch 4 - iter 2166/3617 - loss 0.04401912 - time (sec): 643.48 - samples/sec: 353.09 - lr: 0.000107 - momentum: 0.000000
2023-10-14 04:02:49,728 epoch 4 - iter 2527/3617 - loss 0.04552488 - time (sec): 756.66 - samples/sec: 351.47 - lr: 0.000105 - momentum: 0.000000
2023-10-14 04:04:38,740 epoch 4 - iter 2888/3617 - loss 0.04620049 - time (sec): 865.67 - samples/sec: 350.11 - lr: 0.000103 - momentum: 0.000000
2023-10-14 04:06:26,364 epoch 4 - iter 3249/3617 - loss 0.04661299 - time (sec): 973.30 - samples/sec: 351.46 - lr: 0.000102 - momentum: 0.000000
2023-10-14 04:08:08,784 epoch 4 - iter 3610/3617 - loss 0.04677504 - time (sec): 1075.72 - samples/sec: 352.63 - lr: 0.000100 - momentum: 0.000000
2023-10-14 04:08:10,581 ----------------------------------------------------------------------------------------------------
2023-10-14 04:08:10,581 EPOCH 4 done: loss 0.0467 - lr: 0.000100
2023-10-14 04:08:52,500 DEV : loss 0.2164839208126068 - f1-score (micro avg) 0.6366
2023-10-14 04:08:52,567 ----------------------------------------------------------------------------------------------------
2023-10-14 04:10:41,619 epoch 5 - iter 361/3617 - loss 0.02826111 - time (sec): 109.05 - samples/sec: 352.78 - lr: 0.000098 - momentum: 0.000000
2023-10-14 04:12:31,812 epoch 5 - iter 722/3617 - loss 0.02975887 - time (sec): 219.24 - samples/sec: 348.59 - lr: 0.000097 - momentum: 0.000000
2023-10-14 04:14:13,673 epoch 5 - iter 1083/3617 - loss 0.03117931 - time (sec): 321.10 - samples/sec: 353.89 - lr: 0.000095 - momentum: 0.000000
2023-10-14 04:15:59,664 epoch 5 - iter 1444/3617 - loss 0.03114051 - time (sec): 427.09 - samples/sec: 351.78 - lr: 0.000093 - momentum: 0.000000
2023-10-14 04:17:51,747 epoch 5 - iter 1805/3617 - loss 0.03157137 - time (sec): 539.18 - samples/sec: 350.38 - lr: 0.000092 - momentum: 0.000000
2023-10-14 04:19:39,699 epoch 5 - iter 2166/3617 - loss 0.03117937 - time (sec): 647.13 - samples/sec: 349.96 - lr: 0.000090 - momentum: 0.000000
2023-10-14 04:21:21,870 epoch 5 - iter 2527/3617 - loss 0.03194759 - time (sec): 749.30 - samples/sec: 350.89 - lr: 0.000088 - momentum: 0.000000
2023-10-14 04:23:09,331 epoch 5 - iter 2888/3617 - loss 0.03227297 - time (sec): 856.76 - samples/sec: 351.11 - lr: 0.000087 - momentum: 0.000000
2023-10-14 04:24:50,326 epoch 5 - iter 3249/3617 - loss 0.03237174 - time (sec): 957.76 - samples/sec: 355.14 - lr: 0.000085 - momentum: 0.000000
2023-10-14 04:26:33,987 epoch 5 - iter 3610/3617 - loss 0.03258639 - time (sec): 1061.42 - samples/sec: 357.40 - lr: 0.000083 - momentum: 0.000000
2023-10-14 04:26:35,865 ----------------------------------------------------------------------------------------------------
2023-10-14 04:26:35,865 EPOCH 5 done: loss 0.0326 - lr: 0.000083
2023-10-14 04:27:17,687 DEV : loss 0.23494853079319 - f1-score (micro avg) 0.641
2023-10-14 04:27:17,752 ----------------------------------------------------------------------------------------------------
2023-10-14 04:29:14,894 epoch 6 - iter 361/3617 - loss 0.01939447 - time (sec): 117.14 - samples/sec: 335.01 - lr: 0.000082 - momentum: 0.000000
2023-10-14 04:30:58,369 epoch 6 - iter 722/3617 - loss 0.01924277 - time (sec): 220.61 - samples/sec: 349.27 - lr: 0.000080 - momentum: 0.000000
2023-10-14 04:32:36,989 epoch 6 - iter 1083/3617 - loss 0.02053705 - time (sec): 319.23 - samples/sec: 358.43 - lr: 0.000078 - momentum: 0.000000
2023-10-14 04:34:18,846 epoch 6 - iter 1444/3617 - loss 0.02164758 - time (sec): 421.09 - samples/sec: 358.61 - lr: 0.000077 - momentum: 0.000000
2023-10-14 04:36:07,097 epoch 6 - iter 1805/3617 - loss 0.02255132 - time (sec): 529.34 - samples/sec: 354.80 - lr: 0.000075 - momentum: 0.000000
2023-10-14 04:37:50,944 epoch 6 - iter 2166/3617 - loss 0.02251730 - time (sec): 633.19 - samples/sec: 355.87 - lr: 0.000073 - momentum: 0.000000
2023-10-14 04:39:38,861 epoch 6 - iter 2527/3617 - loss 0.02245883 - time (sec): 741.11 - samples/sec: 357.56 - lr: 0.000072 - momentum: 0.000000
2023-10-14 04:41:21,087 epoch 6 - iter 2888/3617 - loss 0.02197406 - time (sec): 843.33 - samples/sec: 360.49 - lr: 0.000070 - momentum: 0.000000
2023-10-14 04:43:02,150 epoch 6 - iter 3249/3617 - loss 0.02290527 - time (sec): 944.40 - samples/sec: 360.40 - lr: 0.000068 - momentum: 0.000000
2023-10-14 04:44:46,979 epoch 6 - iter 3610/3617 - loss 0.02272655 - time (sec): 1049.22 - samples/sec: 361.29 - lr: 0.000067 - momentum: 0.000000
2023-10-14 04:44:48,993 ----------------------------------------------------------------------------------------------------
2023-10-14 04:44:48,994 EPOCH 6 done: loss 0.0227 - lr: 0.000067
2023-10-14 04:45:30,501 DEV : loss 0.28848496079444885 - f1-score (micro avg) 0.6514
2023-10-14 04:45:30,570 saving best model
2023-10-14 04:45:35,631 ----------------------------------------------------------------------------------------------------
2023-10-14 04:47:22,333 epoch 7 - iter 361/3617 - loss 0.01165935 - time (sec): 106.69 - samples/sec: 359.92 - lr: 0.000065 - momentum: 0.000000
2023-10-14 04:49:03,875 epoch 7 - iter 722/3617 - loss 0.01137913 - time (sec): 208.23 - samples/sec: 365.84 - lr: 0.000063 - momentum: 0.000000
2023-10-14 04:50:48,226 epoch 7 - iter 1083/3617 - loss 0.01305213 - time (sec): 312.58 - samples/sec: 363.16 - lr: 0.000062 - momentum: 0.000000
2023-10-14 04:52:34,504 epoch 7 - iter 1444/3617 - loss 0.01287060 - time (sec): 418.86 - samples/sec: 365.53 - lr: 0.000060 - momentum: 0.000000
2023-10-14 04:54:21,015 epoch 7 - iter 1805/3617 - loss 0.01312161 - time (sec): 525.37 - samples/sec: 362.68 - lr: 0.000058 - momentum: 0.000000
2023-10-14 04:56:05,373 epoch 7 - iter 2166/3617 - loss 0.01362263 - time (sec): 629.73 - samples/sec: 362.43 - lr: 0.000057 - momentum: 0.000000
2023-10-14 04:57:53,808 epoch 7 - iter 2527/3617 - loss 0.01454342 - time (sec): 738.16 - samples/sec: 361.92 - lr: 0.000055 - momentum: 0.000000
2023-10-14 04:59:36,320 epoch 7 - iter 2888/3617 - loss 0.01455589 - time (sec): 840.68 - samples/sec: 362.15 - lr: 0.000053 - momentum: 0.000000
2023-10-14 05:01:16,947 epoch 7 - iter 3249/3617 - loss 0.01486078 - time (sec): 941.30 - samples/sec: 363.83 - lr: 0.000052 - momentum: 0.000000
2023-10-14 05:03:01,921 epoch 7 - iter 3610/3617 - loss 0.01483858 - time (sec): 1046.28 - samples/sec: 362.57 - lr: 0.000050 - momentum: 0.000000
2023-10-14 05:03:03,721 ----------------------------------------------------------------------------------------------------
2023-10-14 05:03:03,721 EPOCH 7 done: loss 0.0148 - lr: 0.000050
2023-10-14 05:03:46,800 DEV : loss 0.3004520535469055 - f1-score (micro avg) 0.6474
2023-10-14 05:03:46,866 ----------------------------------------------------------------------------------------------------
2023-10-14 05:05:31,450 epoch 8 - iter 361/3617 - loss 0.00618969 - time (sec): 104.58 - samples/sec: 355.17 - lr: 0.000048 - momentum: 0.000000
2023-10-14 05:07:14,817 epoch 8 - iter 722/3617 - loss 0.00964368 - time (sec): 207.95 - samples/sec: 361.93 - lr: 0.000047 - momentum: 0.000000
2023-10-14 05:09:01,458 epoch 8 - iter 1083/3617 - loss 0.01105306 - time (sec): 314.59 - samples/sec: 364.87 - lr: 0.000045 - momentum: 0.000000
2023-10-14 05:10:45,447 epoch 8 - iter 1444/3617 - loss 0.01059372 - time (sec): 418.58 - samples/sec: 364.30 - lr: 0.000043 - momentum: 0.000000
2023-10-14 05:12:30,524 epoch 8 - iter 1805/3617 - loss 0.01012078 - time (sec): 523.66 - samples/sec: 365.47 - lr: 0.000042 - momentum: 0.000000
2023-10-14 05:14:14,804 epoch 8 - iter 2166/3617 - loss 0.00963085 - time (sec): 627.94 - samples/sec: 364.14 - lr: 0.000040 - momentum: 0.000000
2023-10-14 05:16:02,304 epoch 8 - iter 2527/3617 - loss 0.00979590 - time (sec): 735.44 - samples/sec: 362.59 - lr: 0.000038 - momentum: 0.000000
2023-10-14 05:17:45,665 epoch 8 - iter 2888/3617 - loss 0.00961432 - time (sec): 838.80 - samples/sec: 362.84 - lr: 0.000037 - momentum: 0.000000
2023-10-14 05:19:28,394 epoch 8 - iter 3249/3617 - loss 0.00997350 - time (sec): 941.53 - samples/sec: 362.99 - lr: 0.000035 - momentum: 0.000000
2023-10-14 05:21:12,120 epoch 8 - iter 3610/3617 - loss 0.00969534 - time (sec): 1045.25 - samples/sec: 363.06 - lr: 0.000033 - momentum: 0.000000
2023-10-14 05:21:13,905 ----------------------------------------------------------------------------------------------------
2023-10-14 05:21:13,905 EPOCH 8 done: loss 0.0097 - lr: 0.000033
2023-10-14 05:21:55,993 DEV : loss 0.33705711364746094 - f1-score (micro avg) 0.6492
2023-10-14 05:21:56,061 ----------------------------------------------------------------------------------------------------
2023-10-14 05:23:41,737 epoch 9 - iter 361/3617 - loss 0.00768336 - time (sec): 105.67 - samples/sec: 366.27 - lr: 0.000032 - momentum: 0.000000
2023-10-14 05:25:32,318 epoch 9 - iter 722/3617 - loss 0.00809368 - time (sec): 216.25 - samples/sec: 361.32 - lr: 0.000030 - momentum: 0.000000
2023-10-14 05:27:20,923 epoch 9 - iter 1083/3617 - loss 0.00801755 - time (sec): 324.86 - samples/sec: 355.90 - lr: 0.000028 - momentum: 0.000000
2023-10-14 05:29:02,262 epoch 9 - iter 1444/3617 - loss 0.00794549 - time (sec): 426.20 - samples/sec: 361.14 - lr: 0.000027 - momentum: 0.000000
2023-10-14 05:30:41,183 epoch 9 - iter 1805/3617 - loss 0.00771750 - time (sec): 525.12 - samples/sec: 364.12 - lr: 0.000025 - momentum: 0.000000
2023-10-14 05:32:20,956 epoch 9 - iter 2166/3617 - loss 0.00754007 - time (sec): 624.89 - samples/sec: 367.42 - lr: 0.000023 - momentum: 0.000000
2023-10-14 05:34:00,453 epoch 9 - iter 2527/3617 - loss 0.00713992 - time (sec): 724.39 - samples/sec: 369.15 - lr: 0.000022 - momentum: 0.000000
2023-10-14 05:35:41,205 epoch 9 - iter 2888/3617 - loss 0.00704793 - time (sec): 825.14 - samples/sec: 368.97 - lr: 0.000020 - momentum: 0.000000
2023-10-14 05:37:23,105 epoch 9 - iter 3249/3617 - loss 0.00699019 - time (sec): 927.04 - samples/sec: 367.49 - lr: 0.000018 - momentum: 0.000000
2023-10-14 05:39:03,150 epoch 9 - iter 3610/3617 - loss 0.00658969 - time (sec): 1027.09 - samples/sec: 369.22 - lr: 0.000017 - momentum: 0.000000
2023-10-14 05:39:05,141 ----------------------------------------------------------------------------------------------------
2023-10-14 05:39:05,142 EPOCH 9 done: loss 0.0066 - lr: 0.000017
2023-10-14 05:39:46,293 DEV : loss 0.3554496467113495 - f1-score (micro avg) 0.65
2023-10-14 05:39:46,351 ----------------------------------------------------------------------------------------------------
2023-10-14 05:41:28,259 epoch 10 - iter 361/3617 - loss 0.00317370 - time (sec): 101.91 - samples/sec: 370.33 - lr: 0.000015 - momentum: 0.000000
2023-10-14 05:43:12,503 epoch 10 - iter 722/3617 - loss 0.00395930 - time (sec): 206.15 - samples/sec: 358.82 - lr: 0.000013 - momentum: 0.000000
2023-10-14 05:44:56,641 epoch 10 - iter 1083/3617 - loss 0.00484872 - time (sec): 310.29 - samples/sec: 362.49 - lr: 0.000012 - momentum: 0.000000
2023-10-14 05:46:40,943 epoch 10 - iter 1444/3617 - loss 0.00441603 - time (sec): 414.59 - samples/sec: 360.25 - lr: 0.000010 - momentum: 0.000000
2023-10-14 05:48:22,688 epoch 10 - iter 1805/3617 - loss 0.00440111 - time (sec): 516.33 - samples/sec: 364.19 - lr: 0.000008 - momentum: 0.000000
2023-10-14 05:50:03,498 epoch 10 - iter 2166/3617 - loss 0.00463574 - time (sec): 617.14 - samples/sec: 365.23 - lr: 0.000007 - momentum: 0.000000
2023-10-14 05:51:45,733 epoch 10 - iter 2527/3617 - loss 0.00475224 - time (sec): 719.38 - samples/sec: 368.15 - lr: 0.000005 - momentum: 0.000000
2023-10-14 05:53:30,970 epoch 10 - iter 2888/3617 - loss 0.00477992 - time (sec): 824.62 - samples/sec: 366.33 - lr: 0.000003 - momentum: 0.000000
2023-10-14 05:55:21,600 epoch 10 - iter 3249/3617 - loss 0.00469963 - time (sec): 935.25 - samples/sec: 364.85 - lr: 0.000002 - momentum: 0.000000
2023-10-14 05:57:08,911 epoch 10 - iter 3610/3617 - loss 0.00449982 - time (sec): 1042.56 - samples/sec: 363.50 - lr: 0.000000 - momentum: 0.000000
2023-10-14 05:57:11,042 ----------------------------------------------------------------------------------------------------
2023-10-14 05:57:11,043 EPOCH 10 done: loss 0.0045 - lr: 0.000000
2023-10-14 05:57:55,357 DEV : loss 0.3686419725418091 - f1-score (micro avg) 0.654
2023-10-14 05:57:55,427 saving best model
2023-10-14 05:58:03,839 ----------------------------------------------------------------------------------------------------
2023-10-14 05:58:03,841 Loading model from best epoch ...
2023-10-14 05:58:08,061 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 05:59:10,367
Results:
- F-score (micro) 0.6565
- F-score (macro) 0.5195
- Accuracy 0.5017
By class:
              precision    recall  f1-score   support
         loc     0.6609    0.7750    0.7134       591
        pers     0.5807    0.7255    0.6451       357
         org     0.2295    0.1772    0.2000        79
   micro avg     0.6092    0.7118    0.6565      1027
   macro avg     0.4904    0.5592    0.5195      1027
weighted avg     0.5998    0.7118    0.6502      1027
2023-10-14 05:59:10,367 ----------------------------------------------------------------------------------------------------