|
2023-10-14 18:28:56,599 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,600 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 18:28:56,600 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,600 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-14 18:28:56,600 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,600 Train: 3575 sentences |
|
2023-10-14 18:28:56,600 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 18:28:56,600 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,600 Training Params: |
|
2023-10-14 18:28:56,600 - learning_rate: "0.00015" |
|
2023-10-14 18:28:56,601 - mini_batch_size: "8" |
|
2023-10-14 18:28:56,601 - max_epochs: "10" |
|
2023-10-14 18:28:56,601 - shuffle: "True" |
|
2023-10-14 18:28:56,601 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,601 Plugins: |
|
2023-10-14 18:28:56,601 - TensorboardLogger |
|
2023-10-14 18:28:56,601 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 18:28:56,601 ---------------------------------------------------------------------------------------------------- |
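The `lr` column in the per-iteration lines below follows from the settings logged above: 3575 training sentences at a mini-batch size of 8 give 447 iterations per epoch, so 10 epochs give 4470 steps, and a `warmup_fraction` of 0.1 covers the first 447 of them (exactly epoch 1). The following is a minimal sketch of such a schedule, assuming a linear ramp from 0 to the peak rate followed by a linear decay to 0. The function name `linear_schedule_lr` is hypothetical and this is not Flair's own implementation; logged values may differ in the last digit because of rounding or step offsets.

```python
import math

# Values taken from the log above.
TRAIN_SENTENCES = 3575
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 0.00015
WARMUP_FRACTION = 0.1

STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)  # 447
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS                      # 4470


def linear_schedule_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR,
                       warmup_fraction=WARMUP_FRACTION):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0.

    This is an assumed schedule shape, not code taken from Flair.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from the peak rate to 0 at the last step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Compare a few (epoch, iteration) points against the lr column in the log.
    for epoch, iteration in [(1, 44), (1, 440), (2, 44), (10, 440)]:
        step = (epoch - 1) * STEPS_PER_EPOCH + iteration
        print(f"epoch {epoch} iter {iteration}: lr ~ {linear_schedule_lr(step):.6f}")
```

Under these assumptions the schedule peaks at the end of epoch 1 and reaches zero at the end of epoch 10, which is consistent with the `lr` values logged for those epochs.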
|
2023-10-14 18:28:56,601 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 18:28:56,601 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 18:28:56,601 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,601 Computation: |
|
2023-10-14 18:28:56,601 - compute on device: cuda:0 |
|
2023-10-14 18:28:56,601 - embedding storage: none |
|
2023-10-14 18:28:56,601 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,601 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-14 18:28:56,601 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,601 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:28:56,601 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-14 18:29:13,026 epoch 1 - iter 44/447 - loss 3.04968209 - time (sec): 16.42 - samples/sec: 558.83 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 18:29:27,978 epoch 1 - iter 88/447 - loss 3.03220833 - time (sec): 31.38 - samples/sec: 550.16 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 18:29:43,363 epoch 1 - iter 132/447 - loss 2.98006060 - time (sec): 46.76 - samples/sec: 545.03 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-14 18:29:58,651 epoch 1 - iter 176/447 - loss 2.85571004 - time (sec): 62.05 - samples/sec: 546.60 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-14 18:30:13,328 epoch 1 - iter 220/447 - loss 2.71064544 - time (sec): 76.73 - samples/sec: 540.41 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-14 18:30:28,322 epoch 1 - iter 264/447 - loss 2.53638057 - time (sec): 91.72 - samples/sec: 539.60 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-14 18:30:44,206 epoch 1 - iter 308/447 - loss 2.32818224 - time (sec): 107.60 - samples/sec: 545.92 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-14 18:30:59,374 epoch 1 - iter 352/447 - loss 2.14903375 - time (sec): 122.77 - samples/sec: 546.91 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-14 18:31:17,104 epoch 1 - iter 396/447 - loss 1.94372981 - time (sec): 140.50 - samples/sec: 551.15 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-14 18:31:32,315 epoch 1 - iter 440/447 - loss 1.81241981 - time (sec): 155.71 - samples/sec: 547.13 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-14 18:31:34,703 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:31:34,703 EPOCH 1 done: loss 1.7931 - lr: 0.000147 |
|
2023-10-14 18:31:57,276 DEV : loss 0.4706111252307892 - f1-score (micro avg) 0.0 |
|
2023-10-14 18:31:57,301 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:32:12,593 epoch 2 - iter 44/447 - loss 0.51221402 - time (sec): 15.29 - samples/sec: 553.41 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-14 18:32:27,785 epoch 2 - iter 88/447 - loss 0.49327832 - time (sec): 30.48 - samples/sec: 552.35 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-14 18:32:43,466 epoch 2 - iter 132/447 - loss 0.45252612 - time (sec): 46.16 - samples/sec: 567.85 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-14 18:33:00,409 epoch 2 - iter 176/447 - loss 0.42537516 - time (sec): 63.11 - samples/sec: 564.16 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-14 18:33:15,852 epoch 2 - iter 220/447 - loss 0.40524817 - time (sec): 78.55 - samples/sec: 562.93 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-14 18:33:31,601 epoch 2 - iter 264/447 - loss 0.38456250 - time (sec): 94.30 - samples/sec: 561.96 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-14 18:33:46,668 epoch 2 - iter 308/447 - loss 0.38166992 - time (sec): 109.37 - samples/sec: 557.13 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-14 18:34:02,008 epoch 2 - iter 352/447 - loss 0.37044403 - time (sec): 124.71 - samples/sec: 557.20 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-14 18:34:17,604 epoch 2 - iter 396/447 - loss 0.36100765 - time (sec): 140.30 - samples/sec: 555.12 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-14 18:34:32,442 epoch 2 - iter 440/447 - loss 0.35295013 - time (sec): 155.14 - samples/sec: 551.20 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-14 18:34:34,704 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:34:34,705 EPOCH 2 done: loss 0.3534 - lr: 0.000134 |
|
2023-10-14 18:34:59,145 DEV : loss 0.24648237228393555 - f1-score (micro avg) 0.4615 |
|
2023-10-14 18:34:59,170 saving best model |
|
2023-10-14 18:34:59,966 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:35:15,514 epoch 3 - iter 44/447 - loss 0.28031449 - time (sec): 15.55 - samples/sec: 529.00 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-14 18:35:30,601 epoch 3 - iter 88/447 - loss 0.24660222 - time (sec): 30.63 - samples/sec: 533.31 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-14 18:35:46,488 epoch 3 - iter 132/447 - loss 0.24011989 - time (sec): 46.52 - samples/sec: 535.83 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-14 18:36:01,757 epoch 3 - iter 176/447 - loss 0.23680025 - time (sec): 61.79 - samples/sec: 538.05 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-14 18:36:19,186 epoch 3 - iter 220/447 - loss 0.22759569 - time (sec): 79.22 - samples/sec: 547.03 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-14 18:36:34,393 epoch 3 - iter 264/447 - loss 0.22429489 - time (sec): 94.43 - samples/sec: 545.33 - lr: 0.000124 - momentum: 0.000000 |
|
2023-10-14 18:36:49,787 epoch 3 - iter 308/447 - loss 0.21942290 - time (sec): 109.82 - samples/sec: 543.54 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-14 18:37:04,772 epoch 3 - iter 352/447 - loss 0.21213137 - time (sec): 124.80 - samples/sec: 541.12 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-14 18:37:20,495 epoch 3 - iter 396/447 - loss 0.20683047 - time (sec): 140.53 - samples/sec: 543.46 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-14 18:37:35,931 epoch 3 - iter 440/447 - loss 0.20185030 - time (sec): 155.96 - samples/sec: 545.25 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-14 18:37:38,436 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:37:38,437 EPOCH 3 done: loss 0.2006 - lr: 0.000117 |
|
2023-10-14 18:38:03,004 DEV : loss 0.17484012246131897 - f1-score (micro avg) 0.6667 |
|
2023-10-14 18:38:03,030 saving best model |
|
2023-10-14 18:38:03,867 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:38:19,427 epoch 4 - iter 44/447 - loss 0.16075713 - time (sec): 15.56 - samples/sec: 535.32 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-14 18:38:34,365 epoch 4 - iter 88/447 - loss 0.15141283 - time (sec): 30.50 - samples/sec: 526.90 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-14 18:38:49,421 epoch 4 - iter 132/447 - loss 0.14651235 - time (sec): 45.55 - samples/sec: 528.07 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-14 18:39:04,810 epoch 4 - iter 176/447 - loss 0.14574267 - time (sec): 60.94 - samples/sec: 528.80 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-14 18:39:19,959 epoch 4 - iter 220/447 - loss 0.13882476 - time (sec): 76.09 - samples/sec: 527.66 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-14 18:39:35,657 epoch 4 - iter 264/447 - loss 0.13265973 - time (sec): 91.79 - samples/sec: 535.86 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-14 18:39:50,775 epoch 4 - iter 308/447 - loss 0.12641890 - time (sec): 106.91 - samples/sec: 535.20 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-14 18:40:05,921 epoch 4 - iter 352/447 - loss 0.12439053 - time (sec): 122.05 - samples/sec: 535.52 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-14 18:40:23,362 epoch 4 - iter 396/447 - loss 0.12260346 - time (sec): 139.49 - samples/sec: 539.84 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-14 18:40:39,750 epoch 4 - iter 440/447 - loss 0.11757667 - time (sec): 155.88 - samples/sec: 544.25 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-14 18:40:42,378 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:40:42,378 EPOCH 4 done: loss 0.1161 - lr: 0.000100 |
|
2023-10-14 18:41:07,153 DEV : loss 0.16356223821640015 - f1-score (micro avg) 0.725 |
|
2023-10-14 18:41:07,179 saving best model |
|
2023-10-14 18:41:11,734 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:41:26,859 epoch 5 - iter 44/447 - loss 0.07023867 - time (sec): 15.12 - samples/sec: 506.11 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-14 18:41:42,294 epoch 5 - iter 88/447 - loss 0.06478001 - time (sec): 30.56 - samples/sec: 523.37 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-14 18:41:58,072 epoch 5 - iter 132/447 - loss 0.06412258 - time (sec): 46.34 - samples/sec: 536.84 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-14 18:42:13,295 epoch 5 - iter 176/447 - loss 0.07024363 - time (sec): 61.56 - samples/sec: 538.35 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-14 18:42:28,441 epoch 5 - iter 220/447 - loss 0.06745593 - time (sec): 76.70 - samples/sec: 541.44 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-14 18:42:45,723 epoch 5 - iter 264/447 - loss 0.07148378 - time (sec): 93.99 - samples/sec: 544.05 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-14 18:43:00,596 epoch 5 - iter 308/447 - loss 0.07209831 - time (sec): 108.86 - samples/sec: 542.81 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-14 18:43:15,817 epoch 5 - iter 352/447 - loss 0.07099360 - time (sec): 124.08 - samples/sec: 544.94 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-14 18:43:31,278 epoch 5 - iter 396/447 - loss 0.07091432 - time (sec): 139.54 - samples/sec: 548.72 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-14 18:43:46,569 epoch 5 - iter 440/447 - loss 0.07134499 - time (sec): 154.83 - samples/sec: 550.12 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-14 18:43:48,967 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:43:48,967 EPOCH 5 done: loss 0.0719 - lr: 0.000084 |
|
2023-10-14 18:44:13,629 DEV : loss 0.1595887392759323 - f1-score (micro avg) 0.7469 |
|
2023-10-14 18:44:13,655 saving best model |
|
2023-10-14 18:44:18,225 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:44:33,929 epoch 6 - iter 44/447 - loss 0.02958638 - time (sec): 15.70 - samples/sec: 542.37 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-14 18:44:49,057 epoch 6 - iter 88/447 - loss 0.03872661 - time (sec): 30.83 - samples/sec: 545.68 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-14 18:45:04,426 epoch 6 - iter 132/447 - loss 0.04324198 - time (sec): 46.20 - samples/sec: 546.14 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-14 18:45:19,712 epoch 6 - iter 176/447 - loss 0.04411436 - time (sec): 61.48 - samples/sec: 549.13 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-14 18:45:34,792 epoch 6 - iter 220/447 - loss 0.04536235 - time (sec): 76.56 - samples/sec: 545.39 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-14 18:45:51,942 epoch 6 - iter 264/447 - loss 0.04612584 - time (sec): 93.71 - samples/sec: 546.82 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-14 18:46:07,599 epoch 6 - iter 308/447 - loss 0.04566728 - time (sec): 109.37 - samples/sec: 551.32 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-14 18:46:23,331 epoch 6 - iter 352/447 - loss 0.04571960 - time (sec): 125.10 - samples/sec: 549.12 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-14 18:46:38,316 epoch 6 - iter 396/447 - loss 0.04798319 - time (sec): 140.09 - samples/sec: 546.83 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-14 18:46:53,874 epoch 6 - iter 440/447 - loss 0.04832325 - time (sec): 155.65 - samples/sec: 547.34 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-14 18:46:56,276 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:46:56,277 EPOCH 6 done: loss 0.0481 - lr: 0.000067 |
|
2023-10-14 18:47:20,963 DEV : loss 0.1803191751241684 - f1-score (micro avg) 0.7481 |
|
2023-10-14 18:47:20,988 saving best model |
|
2023-10-14 18:47:25,327 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:47:42,568 epoch 7 - iter 44/447 - loss 0.04512986 - time (sec): 17.24 - samples/sec: 561.18 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-14 18:47:58,148 epoch 7 - iter 88/447 - loss 0.03940830 - time (sec): 32.82 - samples/sec: 558.82 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-14 18:48:13,106 epoch 7 - iter 132/447 - loss 0.04551032 - time (sec): 47.78 - samples/sec: 547.80 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-14 18:48:28,259 epoch 7 - iter 176/447 - loss 0.04250563 - time (sec): 62.93 - samples/sec: 547.24 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-14 18:48:43,733 epoch 7 - iter 220/447 - loss 0.03900809 - time (sec): 78.40 - samples/sec: 549.82 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-14 18:48:59,886 epoch 7 - iter 264/447 - loss 0.03682204 - time (sec): 94.56 - samples/sec: 549.51 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-14 18:49:15,131 epoch 7 - iter 308/447 - loss 0.03689685 - time (sec): 109.80 - samples/sec: 549.10 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-14 18:49:30,170 epoch 7 - iter 352/447 - loss 0.03473793 - time (sec): 124.84 - samples/sec: 548.47 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-14 18:49:45,552 epoch 7 - iter 396/447 - loss 0.03496683 - time (sec): 140.22 - samples/sec: 550.10 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-14 18:50:00,748 epoch 7 - iter 440/447 - loss 0.03374463 - time (sec): 155.42 - samples/sec: 548.61 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-14 18:50:03,115 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:50:03,116 EPOCH 7 done: loss 0.0338 - lr: 0.000050 |
|
2023-10-14 18:50:27,933 DEV : loss 0.1989319771528244 - f1-score (micro avg) 0.7592 |
|
2023-10-14 18:50:27,958 saving best model |
|
2023-10-14 18:50:32,398 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:50:47,481 epoch 8 - iter 44/447 - loss 0.02816068 - time (sec): 15.08 - samples/sec: 542.31 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-14 18:51:03,151 epoch 8 - iter 88/447 - loss 0.03674012 - time (sec): 30.75 - samples/sec: 546.65 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-14 18:51:18,134 epoch 8 - iter 132/447 - loss 0.03190669 - time (sec): 45.73 - samples/sec: 540.12 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 18:51:33,859 epoch 8 - iter 176/447 - loss 0.02926721 - time (sec): 61.46 - samples/sec: 552.77 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-14 18:51:49,605 epoch 8 - iter 220/447 - loss 0.02779927 - time (sec): 77.21 - samples/sec: 558.46 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-14 18:52:05,003 epoch 8 - iter 264/447 - loss 0.02652766 - time (sec): 92.60 - samples/sec: 551.05 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-14 18:52:21,852 epoch 8 - iter 308/447 - loss 0.02801775 - time (sec): 109.45 - samples/sec: 549.38 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-14 18:52:36,870 epoch 8 - iter 352/447 - loss 0.02681118 - time (sec): 124.47 - samples/sec: 548.52 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-14 18:52:52,122 epoch 8 - iter 396/447 - loss 0.02608349 - time (sec): 139.72 - samples/sec: 548.04 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-14 18:53:07,402 epoch 8 - iter 440/447 - loss 0.02505040 - time (sec): 155.00 - samples/sec: 549.68 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-14 18:53:09,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:53:09,819 EPOCH 8 done: loss 0.0249 - lr: 0.000034 |
|
2023-10-14 18:53:34,610 DEV : loss 0.20397181808948517 - f1-score (micro avg) 0.7593 |
|
2023-10-14 18:53:34,635 saving best model |
|
2023-10-14 18:53:38,825 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:53:56,216 epoch 9 - iter 44/447 - loss 0.03413553 - time (sec): 17.39 - samples/sec: 559.05 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 18:54:12,137 epoch 9 - iter 88/447 - loss 0.02537551 - time (sec): 33.31 - samples/sec: 563.45 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 18:54:27,605 epoch 9 - iter 132/447 - loss 0.02230186 - time (sec): 48.78 - samples/sec: 561.25 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 18:54:43,093 epoch 9 - iter 176/447 - loss 0.02191161 - time (sec): 64.27 - samples/sec: 562.30 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 18:54:57,966 epoch 9 - iter 220/447 - loss 0.02003936 - time (sec): 79.14 - samples/sec: 553.96 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 18:55:13,466 epoch 9 - iter 264/447 - loss 0.02206598 - time (sec): 94.64 - samples/sec: 550.03 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 18:55:28,451 epoch 9 - iter 308/447 - loss 0.02063661 - time (sec): 109.62 - samples/sec: 545.60 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 18:55:43,831 epoch 9 - iter 352/447 - loss 0.02036125 - time (sec): 125.00 - samples/sec: 545.41 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 18:55:59,355 epoch 9 - iter 396/447 - loss 0.01930381 - time (sec): 140.53 - samples/sec: 546.14 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 18:56:14,934 epoch 9 - iter 440/447 - loss 0.02015557 - time (sec): 156.11 - samples/sec: 545.94 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 18:56:17,335 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:56:17,335 EPOCH 9 done: loss 0.0201 - lr: 0.000017 |
|
2023-10-14 18:56:42,347 DEV : loss 0.2131572663784027 - f1-score (micro avg) 0.7503 |
|
2023-10-14 18:56:42,372 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:56:57,773 epoch 10 - iter 44/447 - loss 0.02127213 - time (sec): 15.40 - samples/sec: 568.28 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 18:57:12,525 epoch 10 - iter 88/447 - loss 0.01811628 - time (sec): 30.15 - samples/sec: 544.58 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 18:57:27,600 epoch 10 - iter 132/447 - loss 0.01611064 - time (sec): 45.23 - samples/sec: 545.36 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 18:57:43,311 epoch 10 - iter 176/447 - loss 0.01554251 - time (sec): 60.94 - samples/sec: 551.21 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 18:58:01,072 epoch 10 - iter 220/447 - loss 0.01884512 - time (sec): 78.70 - samples/sec: 556.12 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 18:58:16,786 epoch 10 - iter 264/447 - loss 0.01782274 - time (sec): 94.41 - samples/sec: 551.69 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 18:58:31,821 epoch 10 - iter 308/447 - loss 0.01717610 - time (sec): 109.45 - samples/sec: 547.71 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 18:58:46,654 epoch 10 - iter 352/447 - loss 0.01630284 - time (sec): 124.28 - samples/sec: 543.75 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 18:59:02,118 epoch 10 - iter 396/447 - loss 0.01612965 - time (sec): 139.74 - samples/sec: 545.16 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 18:59:18,081 epoch 10 - iter 440/447 - loss 0.01766576 - time (sec): 155.71 - samples/sec: 547.01 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 18:59:20,506 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 18:59:20,506 EPOCH 10 done: loss 0.0180 - lr: 0.000001 |
|
2023-10-14 18:59:45,871 DEV : loss 0.22141988575458527 - f1-score (micro avg) 0.7508 |
|
2023-10-14 18:59:46,688 ---------------------------------------------------------------------------------------------------- |
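The per-epoch `DEV` lines above can be summarized programmatically. The following is a minimal sketch, assuming one `DEV` line per epoch in chronological order; the regular expression and the helper name `parse_dev_scores` are illustrative and not part of Flair. The input lines are copied from the log above.

```python
import re

# DEV lines copied verbatim from the log, one per epoch, in order.
DEV_LINES = """\
2023-10-14 18:31:57,276 DEV : loss 0.4706111252307892 - f1-score (micro avg) 0.0
2023-10-14 18:34:59,145 DEV : loss 0.24648237228393555 - f1-score (micro avg) 0.4615
2023-10-14 18:38:03,004 DEV : loss 0.17484012246131897 - f1-score (micro avg) 0.6667
2023-10-14 18:41:07,153 DEV : loss 0.16356223821640015 - f1-score (micro avg) 0.725
2023-10-14 18:44:13,629 DEV : loss 0.1595887392759323 - f1-score (micro avg) 0.7469
2023-10-14 18:47:20,963 DEV : loss 0.1803191751241684 - f1-score (micro avg) 0.7481
2023-10-14 18:50:27,933 DEV : loss 0.1989319771528244 - f1-score (micro avg) 0.7592
2023-10-14 18:53:34,610 DEV : loss 0.20397181808948517 - f1-score (micro avg) 0.7593
2023-10-14 18:56:42,347 DEV : loss 0.2131572663784027 - f1-score (micro avg) 0.7503
2023-10-14 18:59:45,871 DEV : loss 0.22141988575458527 - f1-score (micro avg) 0.7508
""".splitlines()

# Illustrative pattern for the DEV line layout seen in this log.
DEV_PATTERN = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)


def parse_dev_scores(lines):
    """Return a list of (dev_loss, dev_f1) tuples, one per matching line."""
    scores = []
    for line in lines:
        match = DEV_PATTERN.search(line)
        if match:
            scores.append((float(match["loss"]), float(match["f1"])))
    return scores


scores = parse_dev_scores(DEV_LINES)
# Epochs are 1-based; the list index is 0-based.
best_f1_epoch = max(range(len(scores)), key=lambda i: scores[i][1]) + 1
lowest_loss_epoch = min(range(len(scores)), key=lambda i: scores[i][0]) + 1

if __name__ == "__main__":
    print(f"best dev F1 {scores[best_f1_epoch - 1][1]} at epoch {best_f1_epoch}")
    print(f"lowest dev loss {scores[lowest_loss_epoch - 1][0]:.4f} at epoch {lowest_loss_epoch}")
```

The highest dev F1 falls in epoch 8, the last epoch followed by a "saving best model" line, while the dev loss is lowest in epoch 5 and rises afterwards. The checkpoint is selected by micro-averaged F1, as stated in the log header, not by dev loss.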
|
2023-10-14 18:59:46,689 Loading model from best epoch ... |
|
2023-10-14 18:59:49,776 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-14 19:00:11,248 |
|
Results: |
|
- F-score (micro) 0.7519 |
|
- F-score (macro) 0.6501 |
|
- Accuracy 0.6163 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         loc     0.8339    0.8674    0.8503       596 |
|
        pers     0.6772    0.7748    0.7227       333 |
|
         org     0.5263    0.5303    0.5283       132 |
|
        prod     0.6296    0.5152    0.5667        66 |
|
        time     0.5556    0.6122    0.5825        49 |
|
|
|
   micro avg     0.7319    0.7730    0.7519      1176 |
|
   macro avg     0.6445    0.6600    0.6501      1176 |
|
weighted avg     0.7319    0.7730    0.7510      1176 |
|
|
|
2023-10-14 19:00:11,249 ---------------------------------------------------------------------------------------------------- |
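The aggregate rows of the test report can be cross-checked against the per-class rows. The following is a minimal sketch using the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean). All numbers are copied from the report above; the variable and function names are illustrative. Because the inputs are rounded to four decimals, the comparison is only expected to hold to about that precision.

```python
# Per-class rows from the report: (precision, recall, f1, support).
PER_CLASS = {
    "loc":  (0.8339, 0.8674, 0.8503, 596),
    "pers": (0.6772, 0.7748, 0.7227, 333),
    "org":  (0.5263, 0.5303, 0.5283, 132),
    "prod": (0.6296, 0.5152, 0.5667, 66),
    "time": (0.5556, 0.6122, 0.5825, 49),
}

# Micro-averaged precision and recall from the "micro avg" row.
MICRO_PRECISION = 0.7319
MICRO_RECALL = 0.7730


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(support for _, _, _, support in PER_CLASS.values())
micro_f1 = f1_score(MICRO_PRECISION, MICRO_RECALL)
macro_f1 = sum(f1 for _, _, f1, _ in PER_CLASS.values()) / len(PER_CLASS)
weighted_f1 = (
    sum(f1 * support for _, _, f1, support in PER_CLASS.values()) / total_support
)

if __name__ == "__main__":
    print(f"support total: {total_support}")
    print(f"micro F1:    {micro_f1:.4f}")
    print(f"macro F1:    {macro_f1:.4f}")
    print(f"weighted F1: {weighted_f1:.4f}")
```

The recomputed values agree with the reported micro (0.7519), macro (0.6501), and weighted (0.7510) F1 scores to within rounding. The gap between micro and macro F1 reflects the weaker scores on the low-support classes `org`, `prod`, and `time`.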
|
|