2023-10-11 13:00:47,816 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,818 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 13:00:47,818 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,818 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-11 13:00:47,818 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,818 Train: 7142 sentences
2023-10-11 13:00:47,818 (train_with_dev=False, train_with_test=False)
2023-10-11 13:00:47,819 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,819 Training Params:
2023-10-11 13:00:47,819 - learning_rate: "0.00016"
2023-10-11 13:00:47,819 - mini_batch_size: "4"
2023-10-11 13:00:47,819 - max_epochs: "10"
2023-10-11 13:00:47,819 - shuffle: "True"
2023-10-11 13:00:47,819 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,819 Plugins:
2023-10-11 13:00:47,819 - TensorboardLogger
2023-10-11 13:00:47,819 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 13:00:47,819 ----------------------------------------------------------------------------------------------------
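
Note: the training parameters and the LinearScheduler plugin above are consistent with a learning rate that rises linearly to 0.00016 over the first 10% of updates and then falls linearly to zero. The following is a minimal sketch of such a schedule, not Flair's actual implementation; the function name `linear_schedule_lr` and the exact step accounting are assumptions made for illustration. With 7142 training sentences and a mini-batch size of 4 it gives 1786 updates per epoch, matching the `iter .../1786` counters in this log.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup completed.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale the peak rate by the fraction of decay steps remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the "Training Params" and "Plugins" sections of the log.
train_sentences = 7142
mini_batch_size = 4
max_epochs = 10
peak_lr = 0.00016
warmup_fraction = 0.1

steps_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = steps_per_epoch * max_epochs

for epoch, iteration in [(1, 178), (1, 1780), (2, 178), (10, 1780)]:
    step = (epoch - 1) * steps_per_epoch + iteration
    lr = linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction)
    print(f"epoch {epoch} - iter {iteration} - lr: {lr:.6f}")
```

Under these assumptions the printed values can be compared against the `lr:` field of the corresponding iteration lines in the log.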
2023-10-11 13:00:47,819 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 13:00:47,819 - metric: "('micro avg', 'f1-score')"
2023-10-11 13:00:47,820 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,820 Computation:
2023-10-11 13:00:47,820 - compute on device: cuda:0
2023-10-11 13:00:47,820 - embedding storage: none
2023-10-11 13:00:47,820 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,820 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-11 13:00:47,820 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,820 ----------------------------------------------------------------------------------------------------
2023-10-11 13:00:47,820 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 13:01:44,208 epoch 1 - iter 178/1786 - loss 2.81170991 - time (sec): 56.38 - samples/sec: 477.56 - lr: 0.000016 - momentum: 0.000000
2023-10-11 13:02:39,090 epoch 1 - iter 356/1786 - loss 2.63745970 - time (sec): 111.27 - samples/sec: 455.58 - lr: 0.000032 - momentum: 0.000000
2023-10-11 13:03:34,033 epoch 1 - iter 534/1786 - loss 2.35715581 - time (sec): 166.21 - samples/sec: 448.32 - lr: 0.000048 - momentum: 0.000000
2023-10-11 13:04:29,519 epoch 1 - iter 712/1786 - loss 2.06074361 - time (sec): 221.70 - samples/sec: 445.36 - lr: 0.000064 - momentum: 0.000000
2023-10-11 13:05:26,211 epoch 1 - iter 890/1786 - loss 1.78883260 - time (sec): 278.39 - samples/sec: 448.55 - lr: 0.000080 - momentum: 0.000000
2023-10-11 13:06:20,110 epoch 1 - iter 1068/1786 - loss 1.58610501 - time (sec): 332.29 - samples/sec: 447.69 - lr: 0.000096 - momentum: 0.000000
2023-10-11 13:07:15,433 epoch 1 - iter 1246/1786 - loss 1.40913339 - time (sec): 387.61 - samples/sec: 449.86 - lr: 0.000112 - momentum: 0.000000
2023-10-11 13:08:13,637 epoch 1 - iter 1424/1786 - loss 1.27901313 - time (sec): 445.81 - samples/sec: 446.05 - lr: 0.000127 - momentum: 0.000000
2023-10-11 13:09:09,748 epoch 1 - iter 1602/1786 - loss 1.16574649 - time (sec): 501.92 - samples/sec: 446.37 - lr: 0.000143 - momentum: 0.000000
2023-10-11 13:10:05,183 epoch 1 - iter 1780/1786 - loss 1.07906676 - time (sec): 557.36 - samples/sec: 444.83 - lr: 0.000159 - momentum: 0.000000
2023-10-11 13:10:06,873 ----------------------------------------------------------------------------------------------------
2023-10-11 13:10:06,874 EPOCH 1 done: loss 1.0763 - lr: 0.000159
2023-10-11 13:10:27,235 DEV : loss 0.17565929889678955 - f1-score (micro avg) 0.6003
2023-10-11 13:10:27,268 saving best model
2023-10-11 13:10:28,252 ----------------------------------------------------------------------------------------------------
2023-10-11 13:11:24,075 epoch 2 - iter 178/1786 - loss 0.16255892 - time (sec): 55.82 - samples/sec: 470.03 - lr: 0.000158 - momentum: 0.000000
2023-10-11 13:12:20,568 epoch 2 - iter 356/1786 - loss 0.16881265 - time (sec): 112.31 - samples/sec: 458.36 - lr: 0.000156 - momentum: 0.000000
2023-10-11 13:13:19,344 epoch 2 - iter 534/1786 - loss 0.15873645 - time (sec): 171.09 - samples/sec: 440.45 - lr: 0.000155 - momentum: 0.000000
2023-10-11 13:14:17,771 epoch 2 - iter 712/1786 - loss 0.14765138 - time (sec): 229.52 - samples/sec: 438.98 - lr: 0.000153 - momentum: 0.000000
2023-10-11 13:15:13,377 epoch 2 - iter 890/1786 - loss 0.14254566 - time (sec): 285.12 - samples/sec: 436.39 - lr: 0.000151 - momentum: 0.000000
2023-10-11 13:16:14,773 epoch 2 - iter 1068/1786 - loss 0.13991538 - time (sec): 346.52 - samples/sec: 433.54 - lr: 0.000149 - momentum: 0.000000
2023-10-11 13:17:16,141 epoch 2 - iter 1246/1786 - loss 0.13851750 - time (sec): 407.89 - samples/sec: 428.23 - lr: 0.000148 - momentum: 0.000000
2023-10-11 13:18:15,089 epoch 2 - iter 1424/1786 - loss 0.13580485 - time (sec): 466.83 - samples/sec: 424.05 - lr: 0.000146 - momentum: 0.000000
2023-10-11 13:19:09,450 epoch 2 - iter 1602/1786 - loss 0.13348564 - time (sec): 521.20 - samples/sec: 426.97 - lr: 0.000144 - momentum: 0.000000
2023-10-11 13:20:06,575 epoch 2 - iter 1780/1786 - loss 0.13055574 - time (sec): 578.32 - samples/sec: 428.70 - lr: 0.000142 - momentum: 0.000000
2023-10-11 13:20:08,493 ----------------------------------------------------------------------------------------------------
2023-10-11 13:20:08,493 EPOCH 2 done: loss 0.1303 - lr: 0.000142
2023-10-11 13:20:30,672 DEV : loss 0.10548630356788635 - f1-score (micro avg) 0.7578
2023-10-11 13:20:30,703 saving best model
2023-10-11 13:20:33,274 ----------------------------------------------------------------------------------------------------
2023-10-11 13:21:25,594 epoch 3 - iter 178/1786 - loss 0.06430020 - time (sec): 52.32 - samples/sec: 455.81 - lr: 0.000140 - momentum: 0.000000
2023-10-11 13:22:18,793 epoch 3 - iter 356/1786 - loss 0.06283356 - time (sec): 105.51 - samples/sec: 463.47 - lr: 0.000139 - momentum: 0.000000
2023-10-11 13:23:10,053 epoch 3 - iter 534/1786 - loss 0.06385061 - time (sec): 156.77 - samples/sec: 466.82 - lr: 0.000137 - momentum: 0.000000
2023-10-11 13:24:04,302 epoch 3 - iter 712/1786 - loss 0.06659712 - time (sec): 211.02 - samples/sec: 465.28 - lr: 0.000135 - momentum: 0.000000
2023-10-11 13:24:56,648 epoch 3 - iter 890/1786 - loss 0.07114886 - time (sec): 263.37 - samples/sec: 466.58 - lr: 0.000133 - momentum: 0.000000
2023-10-11 13:25:48,940 epoch 3 - iter 1068/1786 - loss 0.07352806 - time (sec): 315.66 - samples/sec: 467.30 - lr: 0.000132 - momentum: 0.000000
2023-10-11 13:26:45,407 epoch 3 - iter 1246/1786 - loss 0.07514593 - time (sec): 372.13 - samples/sec: 468.01 - lr: 0.000130 - momentum: 0.000000
2023-10-11 13:27:42,174 epoch 3 - iter 1424/1786 - loss 0.07344231 - time (sec): 428.90 - samples/sec: 461.84 - lr: 0.000128 - momentum: 0.000000
2023-10-11 13:28:39,296 epoch 3 - iter 1602/1786 - loss 0.07210740 - time (sec): 486.02 - samples/sec: 459.31 - lr: 0.000126 - momentum: 0.000000
2023-10-11 13:29:36,366 epoch 3 - iter 1780/1786 - loss 0.07250482 - time (sec): 543.09 - samples/sec: 456.92 - lr: 0.000125 - momentum: 0.000000
2023-10-11 13:29:38,029 ----------------------------------------------------------------------------------------------------
2023-10-11 13:29:38,029 EPOCH 3 done: loss 0.0726 - lr: 0.000125
2023-10-11 13:30:01,230 DEV : loss 0.10552459955215454 - f1-score (micro avg) 0.7957
2023-10-11 13:30:01,262 saving best model
2023-10-11 13:30:03,958 ----------------------------------------------------------------------------------------------------
2023-10-11 13:30:59,030 epoch 4 - iter 178/1786 - loss 0.04332888 - time (sec): 55.07 - samples/sec: 436.30 - lr: 0.000123 - momentum: 0.000000
2023-10-11 13:31:55,067 epoch 4 - iter 356/1786 - loss 0.04883610 - time (sec): 111.10 - samples/sec: 441.19 - lr: 0.000121 - momentum: 0.000000
2023-10-11 13:32:50,737 epoch 4 - iter 534/1786 - loss 0.04847201 - time (sec): 166.77 - samples/sec: 453.36 - lr: 0.000119 - momentum: 0.000000
2023-10-11 13:33:44,707 epoch 4 - iter 712/1786 - loss 0.05062238 - time (sec): 220.74 - samples/sec: 453.88 - lr: 0.000117 - momentum: 0.000000
2023-10-11 13:34:38,564 epoch 4 - iter 890/1786 - loss 0.05112561 - time (sec): 274.60 - samples/sec: 458.03 - lr: 0.000116 - momentum: 0.000000
2023-10-11 13:35:31,790 epoch 4 - iter 1068/1786 - loss 0.05210222 - time (sec): 327.83 - samples/sec: 453.38 - lr: 0.000114 - momentum: 0.000000
2023-10-11 13:36:30,302 epoch 4 - iter 1246/1786 - loss 0.05253056 - time (sec): 386.34 - samples/sec: 449.69 - lr: 0.000112 - momentum: 0.000000
2023-10-11 13:37:25,466 epoch 4 - iter 1424/1786 - loss 0.05198458 - time (sec): 441.50 - samples/sec: 448.36 - lr: 0.000110 - momentum: 0.000000
2023-10-11 13:38:23,731 epoch 4 - iter 1602/1786 - loss 0.05147654 - time (sec): 499.77 - samples/sec: 446.38 - lr: 0.000109 - momentum: 0.000000
2023-10-11 13:39:18,931 epoch 4 - iter 1780/1786 - loss 0.05054278 - time (sec): 554.97 - samples/sec: 446.88 - lr: 0.000107 - momentum: 0.000000
2023-10-11 13:39:20,569 ----------------------------------------------------------------------------------------------------
2023-10-11 13:39:20,569 EPOCH 4 done: loss 0.0505 - lr: 0.000107
2023-10-11 13:39:43,214 DEV : loss 0.156653493642807 - f1-score (micro avg) 0.7909
2023-10-11 13:39:43,255 ----------------------------------------------------------------------------------------------------
2023-10-11 13:40:43,029 epoch 5 - iter 178/1786 - loss 0.03328650 - time (sec): 59.77 - samples/sec: 420.20 - lr: 0.000105 - momentum: 0.000000
2023-10-11 13:41:39,262 epoch 5 - iter 356/1786 - loss 0.04050878 - time (sec): 116.00 - samples/sec: 436.17 - lr: 0.000103 - momentum: 0.000000
2023-10-11 13:42:33,938 epoch 5 - iter 534/1786 - loss 0.03621945 - time (sec): 170.68 - samples/sec: 443.11 - lr: 0.000101 - momentum: 0.000000
2023-10-11 13:43:27,857 epoch 5 - iter 712/1786 - loss 0.03475165 - time (sec): 224.60 - samples/sec: 444.06 - lr: 0.000100 - momentum: 0.000000
2023-10-11 13:44:20,262 epoch 5 - iter 890/1786 - loss 0.03461616 - time (sec): 277.00 - samples/sec: 446.22 - lr: 0.000098 - momentum: 0.000000
2023-10-11 13:45:14,874 epoch 5 - iter 1068/1786 - loss 0.03386874 - time (sec): 331.62 - samples/sec: 445.15 - lr: 0.000096 - momentum: 0.000000
2023-10-11 13:46:12,877 epoch 5 - iter 1246/1786 - loss 0.03430305 - time (sec): 389.62 - samples/sec: 443.88 - lr: 0.000094 - momentum: 0.000000
2023-10-11 13:47:13,910 epoch 5 - iter 1424/1786 - loss 0.03506913 - time (sec): 450.65 - samples/sec: 437.96 - lr: 0.000093 - momentum: 0.000000
2023-10-11 13:48:13,048 epoch 5 - iter 1602/1786 - loss 0.03525813 - time (sec): 509.79 - samples/sec: 436.07 - lr: 0.000091 - momentum: 0.000000
2023-10-11 13:49:06,991 epoch 5 - iter 1780/1786 - loss 0.03807119 - time (sec): 563.73 - samples/sec: 440.04 - lr: 0.000089 - momentum: 0.000000
2023-10-11 13:49:08,616 ----------------------------------------------------------------------------------------------------
2023-10-11 13:49:08,616 EPOCH 5 done: loss 0.0381 - lr: 0.000089
2023-10-11 13:49:30,953 DEV : loss 0.17025883495807648 - f1-score (micro avg) 0.788
2023-10-11 13:49:30,986 ----------------------------------------------------------------------------------------------------
2023-10-11 13:50:30,448 epoch 6 - iter 178/1786 - loss 0.03037012 - time (sec): 59.46 - samples/sec: 437.48 - lr: 0.000087 - momentum: 0.000000
2023-10-11 13:51:23,893 epoch 6 - iter 356/1786 - loss 0.02735029 - time (sec): 112.90 - samples/sec: 439.65 - lr: 0.000085 - momentum: 0.000000
2023-10-11 13:52:18,715 epoch 6 - iter 534/1786 - loss 0.02908184 - time (sec): 167.73 - samples/sec: 439.17 - lr: 0.000084 - momentum: 0.000000
2023-10-11 13:53:17,235 epoch 6 - iter 712/1786 - loss 0.02904645 - time (sec): 226.25 - samples/sec: 438.56 - lr: 0.000082 - momentum: 0.000000
2023-10-11 13:54:11,434 epoch 6 - iter 890/1786 - loss 0.02790749 - time (sec): 280.45 - samples/sec: 441.20 - lr: 0.000080 - momentum: 0.000000
2023-10-11 13:55:03,988 epoch 6 - iter 1068/1786 - loss 0.02701632 - time (sec): 333.00 - samples/sec: 441.61 - lr: 0.000078 - momentum: 0.000000
2023-10-11 13:55:57,385 epoch 6 - iter 1246/1786 - loss 0.02648261 - time (sec): 386.40 - samples/sec: 444.67 - lr: 0.000077 - momentum: 0.000000
2023-10-11 13:56:56,380 epoch 6 - iter 1424/1786 - loss 0.02759848 - time (sec): 445.39 - samples/sec: 444.90 - lr: 0.000075 - momentum: 0.000000
2023-10-11 13:57:51,975 epoch 6 - iter 1602/1786 - loss 0.02741452 - time (sec): 500.99 - samples/sec: 445.51 - lr: 0.000073 - momentum: 0.000000
2023-10-11 13:58:44,618 epoch 6 - iter 1780/1786 - loss 0.02735127 - time (sec): 553.63 - samples/sec: 448.07 - lr: 0.000071 - momentum: 0.000000
2023-10-11 13:58:46,201 ----------------------------------------------------------------------------------------------------
2023-10-11 13:58:46,201 EPOCH 6 done: loss 0.0273 - lr: 0.000071
2023-10-11 13:59:06,894 DEV : loss 0.18334950506687164 - f1-score (micro avg) 0.8033
2023-10-11 13:59:06,940 saving best model
2023-10-11 13:59:09,574 ----------------------------------------------------------------------------------------------------
2023-10-11 14:00:03,722 epoch 7 - iter 178/1786 - loss 0.02116711 - time (sec): 54.14 - samples/sec: 452.76 - lr: 0.000069 - momentum: 0.000000
2023-10-11 14:00:56,097 epoch 7 - iter 356/1786 - loss 0.02358540 - time (sec): 106.52 - samples/sec: 448.60 - lr: 0.000068 - momentum: 0.000000
2023-10-11 14:01:50,382 epoch 7 - iter 534/1786 - loss 0.02380570 - time (sec): 160.80 - samples/sec: 454.36 - lr: 0.000066 - momentum: 0.000000
2023-10-11 14:02:44,369 epoch 7 - iter 712/1786 - loss 0.02389820 - time (sec): 214.79 - samples/sec: 452.33 - lr: 0.000064 - momentum: 0.000000
2023-10-11 14:03:38,758 epoch 7 - iter 890/1786 - loss 0.02394287 - time (sec): 269.18 - samples/sec: 454.52 - lr: 0.000062 - momentum: 0.000000
2023-10-11 14:04:33,458 epoch 7 - iter 1068/1786 - loss 0.02260999 - time (sec): 323.88 - samples/sec: 456.84 - lr: 0.000061 - momentum: 0.000000
2023-10-11 14:05:28,679 epoch 7 - iter 1246/1786 - loss 0.02182956 - time (sec): 379.10 - samples/sec: 456.41 - lr: 0.000059 - momentum: 0.000000
2023-10-11 14:06:22,977 epoch 7 - iter 1424/1786 - loss 0.02105207 - time (sec): 433.40 - samples/sec: 457.13 - lr: 0.000057 - momentum: 0.000000
2023-10-11 14:07:16,767 epoch 7 - iter 1602/1786 - loss 0.02145575 - time (sec): 487.19 - samples/sec: 458.26 - lr: 0.000055 - momentum: 0.000000
2023-10-11 14:08:09,598 epoch 7 - iter 1780/1786 - loss 0.02106435 - time (sec): 540.02 - samples/sec: 459.55 - lr: 0.000053 - momentum: 0.000000
2023-10-11 14:08:11,066 ----------------------------------------------------------------------------------------------------
2023-10-11 14:08:11,067 EPOCH 7 done: loss 0.0211 - lr: 0.000053
2023-10-11 14:08:32,426 DEV : loss 0.194308340549469 - f1-score (micro avg) 0.7928
2023-10-11 14:08:32,455 ----------------------------------------------------------------------------------------------------
2023-10-11 14:09:23,421 epoch 8 - iter 178/1786 - loss 0.01641532 - time (sec): 50.96 - samples/sec: 484.58 - lr: 0.000052 - momentum: 0.000000
2023-10-11 14:10:14,666 epoch 8 - iter 356/1786 - loss 0.01440783 - time (sec): 102.21 - samples/sec: 483.68 - lr: 0.000050 - momentum: 0.000000
2023-10-11 14:11:06,282 epoch 8 - iter 534/1786 - loss 0.01285935 - time (sec): 153.83 - samples/sec: 473.03 - lr: 0.000048 - momentum: 0.000000
2023-10-11 14:11:58,394 epoch 8 - iter 712/1786 - loss 0.01220440 - time (sec): 205.94 - samples/sec: 467.19 - lr: 0.000046 - momentum: 0.000000
2023-10-11 14:12:50,304 epoch 8 - iter 890/1786 - loss 0.01374698 - time (sec): 257.85 - samples/sec: 465.93 - lr: 0.000044 - momentum: 0.000000
2023-10-11 14:13:44,353 epoch 8 - iter 1068/1786 - loss 0.01485753 - time (sec): 311.90 - samples/sec: 469.85 - lr: 0.000043 - momentum: 0.000000
2023-10-11 14:14:37,502 epoch 8 - iter 1246/1786 - loss 0.01514008 - time (sec): 365.05 - samples/sec: 472.32 - lr: 0.000041 - momentum: 0.000000
2023-10-11 14:15:30,709 epoch 8 - iter 1424/1786 - loss 0.01557502 - time (sec): 418.25 - samples/sec: 475.69 - lr: 0.000039 - momentum: 0.000000
2023-10-11 14:16:22,878 epoch 8 - iter 1602/1786 - loss 0.01641034 - time (sec): 470.42 - samples/sec: 477.11 - lr: 0.000037 - momentum: 0.000000
2023-10-11 14:17:13,034 epoch 8 - iter 1780/1786 - loss 0.01592313 - time (sec): 520.58 - samples/sec: 476.58 - lr: 0.000036 - momentum: 0.000000
2023-10-11 14:17:14,558 ----------------------------------------------------------------------------------------------------
2023-10-11 14:17:14,559 EPOCH 8 done: loss 0.0159 - lr: 0.000036
2023-10-11 14:17:36,426 DEV : loss 0.2070261836051941 - f1-score (micro avg) 0.7869
2023-10-11 14:17:36,457 ----------------------------------------------------------------------------------------------------
2023-10-11 14:18:28,783 epoch 9 - iter 178/1786 - loss 0.01279919 - time (sec): 52.32 - samples/sec: 455.83 - lr: 0.000034 - momentum: 0.000000
2023-10-11 14:19:20,233 epoch 9 - iter 356/1786 - loss 0.01083550 - time (sec): 103.77 - samples/sec: 450.34 - lr: 0.000032 - momentum: 0.000000
2023-10-11 14:20:11,070 epoch 9 - iter 534/1786 - loss 0.01073281 - time (sec): 154.61 - samples/sec: 445.00 - lr: 0.000030 - momentum: 0.000000
2023-10-11 14:21:04,072 epoch 9 - iter 712/1786 - loss 0.01005165 - time (sec): 207.61 - samples/sec: 457.10 - lr: 0.000028 - momentum: 0.000000
2023-10-11 14:21:56,520 epoch 9 - iter 890/1786 - loss 0.01008754 - time (sec): 260.06 - samples/sec: 462.46 - lr: 0.000027 - momentum: 0.000000
2023-10-11 14:22:50,929 epoch 9 - iter 1068/1786 - loss 0.01062970 - time (sec): 314.47 - samples/sec: 464.76 - lr: 0.000025 - momentum: 0.000000
2023-10-11 14:23:44,599 epoch 9 - iter 1246/1786 - loss 0.01091575 - time (sec): 368.14 - samples/sec: 468.87 - lr: 0.000023 - momentum: 0.000000
2023-10-11 14:24:38,320 epoch 9 - iter 1424/1786 - loss 0.01047999 - time (sec): 421.86 - samples/sec: 470.87 - lr: 0.000021 - momentum: 0.000000
2023-10-11 14:25:31,677 epoch 9 - iter 1602/1786 - loss 0.01056976 - time (sec): 475.22 - samples/sec: 470.42 - lr: 0.000020 - momentum: 0.000000
2023-10-11 14:26:25,182 epoch 9 - iter 1780/1786 - loss 0.01085308 - time (sec): 528.72 - samples/sec: 468.44 - lr: 0.000018 - momentum: 0.000000
2023-10-11 14:26:27,252 ----------------------------------------------------------------------------------------------------
2023-10-11 14:26:27,252 EPOCH 9 done: loss 0.0109 - lr: 0.000018
2023-10-11 14:26:49,598 DEV : loss 0.21609559655189514 - f1-score (micro avg) 0.8054
2023-10-11 14:26:49,638 saving best model
2023-10-11 14:26:52,383 ----------------------------------------------------------------------------------------------------
2023-10-11 14:27:45,526 epoch 10 - iter 178/1786 - loss 0.00795703 - time (sec): 53.14 - samples/sec: 469.48 - lr: 0.000016 - momentum: 0.000000
2023-10-11 14:28:38,444 epoch 10 - iter 356/1786 - loss 0.00911006 - time (sec): 106.06 - samples/sec: 453.61 - lr: 0.000014 - momentum: 0.000000
2023-10-11 14:29:32,746 epoch 10 - iter 534/1786 - loss 0.00807375 - time (sec): 160.36 - samples/sec: 453.72 - lr: 0.000012 - momentum: 0.000000
2023-10-11 14:30:25,481 epoch 10 - iter 712/1786 - loss 0.00794053 - time (sec): 213.09 - samples/sec: 461.38 - lr: 0.000011 - momentum: 0.000000
2023-10-11 14:31:18,756 epoch 10 - iter 890/1786 - loss 0.00822094 - time (sec): 266.37 - samples/sec: 466.30 - lr: 0.000009 - momentum: 0.000000
2023-10-11 14:32:11,427 epoch 10 - iter 1068/1786 - loss 0.00765007 - time (sec): 319.04 - samples/sec: 466.07 - lr: 0.000007 - momentum: 0.000000
2023-10-11 14:33:02,846 epoch 10 - iter 1246/1786 - loss 0.00782413 - time (sec): 370.46 - samples/sec: 466.16 - lr: 0.000005 - momentum: 0.000000
2023-10-11 14:33:55,039 epoch 10 - iter 1424/1786 - loss 0.00755323 - time (sec): 422.65 - samples/sec: 467.52 - lr: 0.000004 - momentum: 0.000000
2023-10-11 14:34:48,378 epoch 10 - iter 1602/1786 - loss 0.00741746 - time (sec): 475.99 - samples/sec: 467.39 - lr: 0.000002 - momentum: 0.000000
2023-10-11 14:35:41,429 epoch 10 - iter 1780/1786 - loss 0.00748365 - time (sec): 529.04 - samples/sec: 469.23 - lr: 0.000000 - momentum: 0.000000
2023-10-11 14:35:42,901 ----------------------------------------------------------------------------------------------------
2023-10-11 14:35:42,901 EPOCH 10 done: loss 0.0075 - lr: 0.000000
2023-10-11 14:36:04,553 DEV : loss 0.22351031005382538 - f1-score (micro avg) 0.7973
2023-10-11 14:36:05,510 ----------------------------------------------------------------------------------------------------
2023-10-11 14:36:05,512 Loading model from best epoch ...
2023-10-11 14:36:09,777 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
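
Note: the 17 tags listed above follow a BIOES-style layout: one `O` tag plus the four prefixes `S`, `B`, `E`, `I` for each of the four entity types, which is also the output size of the final `Linear(in_features=1472, out_features=17)` layer in the model summary. A minimal sketch of how such a tag list can be enumerated is shown here; the helper name `bioes_tags` is hypothetical and is not a Flair function.

```python
def bioes_tags(entity_types):
    """Hypothetical helper: enumerate 'O' plus S/B/E/I tags for each entity type."""
    tags = ["O"]
    for entity_type in entity_types:
        # Same prefix order as in the tag dictionary printed in the log.
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags


tags = bioes_tags(["PER", "LOC", "ORG", "HumanProd"])
print(len(tags))
print(", ".join(tags))
```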
2023-10-11 14:37:17,218 
Results:
- F-score (micro) 0.7051
- F-score (macro) 0.6521
- Accuracy 0.5611

By class:
              precision    recall  f1-score   support

         LOC     0.6985    0.7342    0.7159      1095
         PER     0.7796    0.7688    0.7741      1012
         ORG     0.4577    0.5910    0.5159       357
   HumanProd     0.5000    0.7576    0.6024        33

   micro avg     0.6835    0.7281    0.7051      2497
   macro avg     0.6089    0.7129    0.6521      2497
weighted avg     0.6943    0.7281    0.7094      2497

2023-10-11 14:37:17,218 ----------------------------------------------------------------------------------------------------
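
Note: the averaged F1 rows in the final report can be cross-checked from the per-class rows. The sketch below assumes the usual definitions (macro = unweighted mean of class F1, weighted = support-weighted mean of class F1, micro F1 = harmonic mean of micro precision and recall) and uses only the rounded numbers printed in this log, so agreement is expected only up to rounding. The variable names are illustrative.

```python
# Per-class rows copied from the "By class" report: (precision, recall, f1, support).
report = {
    "LOC": (0.6985, 0.7342, 0.7159, 1095),
    "PER": (0.7796, 0.7688, 0.7741, 1012),
    "ORG": (0.4577, 0.5910, 0.5159, 357),
    "HumanProd": (0.5000, 0.7576, 0.6024, 33),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in report.values())

# Macro average: every class counts equally.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

# Micro F1 from the micro-averaged precision and recall reported in the log.
micro_f1 = f1_from(0.6835, 0.7281)

print(f"support={total_support} macro={macro_f1:.4f} "
      f"weighted={weighted_f1:.4f} micro={micro_f1:.4f}")
```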