2023-10-10 23:35:37,071 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,073 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
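The printout above is Flair's SequenceTagger wrapping a fine-tuned ByT5 encoder: the 1472-dimensional encoder output passes through LockedDropout and a single linear layer projecting to the 17 tags, with no RNN or CRF on top. A minimal sketch of how such a tagger could be assembled with the Flair API follows; the checkpoint name, pooling and layer settings are inferred from the corpus line and training base path reported further down in this log, so treat the exact arguments as assumptions rather than the verbatim training script.

```python
# Hedged sketch only: rebuilding a tagger with the shape printed above.
# Checkpoint, pooling and layer choices are inferred from the base path
# reported below ("poolingfirst", "layers-1", "crfFalse").
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# single corpus; the log wraps it in a MultiCorpus (1166 train / 165 dev / 415 test)
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
label_dictionary = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",               # "layers-1" in the base path
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,              # unused here since no RNN is stacked on top
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,                # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,   # linear head maps 1472 -> 17 tags directly
)
```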
2023-10-10 23:35:37,073 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,074 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-10 23:35:37,074 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,074 Train: 1166 sentences
2023-10-10 23:35:37,074 (train_with_dev=False, train_with_test=False)
2023-10-10 23:35:37,074 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,074 Training Params:
2023-10-10 23:35:37,074 - learning_rate: "0.00015"
2023-10-10 23:35:37,074 - mini_batch_size: "4"
2023-10-10 23:35:37,074 - max_epochs: "10"
2023-10-10 23:35:37,074 - shuffle: "True"
2023-10-10 23:35:37,074 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,074 Plugins:
2023-10-10 23:35:37,074 - TensorboardLogger
2023-10-10 23:35:37,075 - LinearScheduler | warmup_fraction: '0.1'
2023-10-10 23:35:37,075 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,075 Final evaluation on model from best epoch (best-model.pt)
2023-10-10 23:35:37,075 - metric: "('micro avg', 'f1-score')"
2023-10-10 23:35:37,075 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,075 Computation:
2023-10-10 23:35:37,075 - compute on device: cuda:0
2023-10-10 23:35:37,075 - embedding storage: none
2023-10-10 23:35:37,075 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,075 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
2023-10-10 23:35:37,075 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,075 ----------------------------------------------------------------------------------------------------
2023-10-10 23:35:37,075 Logging anything other than scalars to TensorBoard is currently not supported.
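Continuing the sketch above, the "Training Params" and "Plugins" sections correspond to Flair's fine-tuning entry point. A hedged sketch of the implied call is shown below; the exact plugin wiring of the original run is an assumption.

```python
# Hedged sketch of the fine-tuning call implied by the hyperparameters above;
# not the verbatim hmbench training script.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2",
    learning_rate=0.00015,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,  # LinearScheduler warmup, per the Plugins section
    main_evaluation_metric=("micro avg", "f1-score"),
    # the original run also attaches a TensorboardLogger plugin (see "Plugins" above)
)
```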
2023-10-10 23:35:46,918 epoch 1 - iter 29/292 - loss 2.85313542 - time (sec): 9.84 - samples/sec: 511.95 - lr: 0.000014 - momentum: 0.000000
2023-10-10 23:35:55,830 epoch 1 - iter 58/292 - loss 2.84372720 - time (sec): 18.75 - samples/sec: 483.82 - lr: 0.000029 - momentum: 0.000000
2023-10-10 23:36:05,558 epoch 1 - iter 87/292 - loss 2.82117874 - time (sec): 28.48 - samples/sec: 484.65 - lr: 0.000044 - momentum: 0.000000
2023-10-10 23:36:14,853 epoch 1 - iter 116/292 - loss 2.77756334 - time (sec): 37.78 - samples/sec: 481.28 - lr: 0.000059 - momentum: 0.000000
2023-10-10 23:36:23,528 epoch 1 - iter 145/292 - loss 2.70096029 - time (sec): 46.45 - samples/sec: 471.66 - lr: 0.000074 - momentum: 0.000000
2023-10-10 23:36:32,493 epoch 1 - iter 174/292 - loss 2.59885515 - time (sec): 55.42 - samples/sec: 466.51 - lr: 0.000089 - momentum: 0.000000
2023-10-10 23:36:42,145 epoch 1 - iter 203/292 - loss 2.47709292 - time (sec): 65.07 - samples/sec: 467.68 - lr: 0.000104 - momentum: 0.000000
2023-10-10 23:36:51,289 epoch 1 - iter 232/292 - loss 2.36240655 - time (sec): 74.21 - samples/sec: 466.99 - lr: 0.000119 - momentum: 0.000000
2023-10-10 23:37:01,345 epoch 1 - iter 261/292 - loss 2.21801160 - time (sec): 84.27 - samples/sec: 470.53 - lr: 0.000134 - momentum: 0.000000
2023-10-10 23:37:11,105 epoch 1 - iter 290/292 - loss 2.08431696 - time (sec): 94.03 - samples/sec: 471.53 - lr: 0.000148 - momentum: 0.000000
2023-10-10 23:37:11,490 ----------------------------------------------------------------------------------------------------
2023-10-10 23:37:11,490 EPOCH 1 done: loss 2.0821 - lr: 0.000148
2023-10-10 23:37:16,630 DEV : loss 0.7319196462631226 - f1-score (micro avg)  0.0
2023-10-10 23:37:16,639 ----------------------------------------------------------------------------------------------------
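The lr column in epoch 1 traces the LinearScheduler announced in the Plugins section: with 292 mini-batches per epoch over 10 epochs (2,920 steps) and warmup_fraction 0.1, the learning rate ramps up over roughly the first 292 steps to the 0.00015 peak (hence lr ≈ 0.000148 at iteration 290 above), then decays linearly toward zero by the end of epoch 10. A standalone sketch of that shape, not Flair's internal implementation:

```python
# Illustrative linear warmup + linear decay schedule matching the lr column
# of this log; the step counts are taken from the run (292 iters x 10 epochs).
def linear_schedule_lr(step, total_steps=2920, warmup_fraction=0.1, peak_lr=0.00015):
    warmup_steps = int(total_steps * warmup_fraction)  # 292 steps = one epoch here
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)  # linear ramp-up
    # linear decay from the peak down to zero at the final step
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

# e.g. step 290 (end of epoch 1) -> ~0.000149; step 2920 (last step) -> 0.0
print(round(linear_schedule_lr(290), 6), round(linear_schedule_lr(2920), 6))
```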
2023-10-10 23:37:25,760 epoch 2 - iter 29/292 - loss 0.78142018 - time (sec): 9.12 - samples/sec: 475.48 - lr: 0.000148 - momentum: 0.000000
2023-10-10 23:37:35,383 epoch 2 - iter 58/292 - loss 0.69086359 - time (sec): 18.74 - samples/sec: 487.15 - lr: 0.000147 - momentum: 0.000000
2023-10-10 23:37:45,245 epoch 2 - iter 87/292 - loss 0.67792000 - time (sec): 28.60 - samples/sec: 493.68 - lr: 0.000145 - momentum: 0.000000
2023-10-10 23:37:53,662 epoch 2 - iter 116/292 - loss 0.65922426 - time (sec): 37.02 - samples/sec: 479.84 - lr: 0.000143 - momentum: 0.000000
2023-10-10 23:38:03,204 epoch 2 - iter 145/292 - loss 0.62436588 - time (sec): 46.56 - samples/sec: 478.37 - lr: 0.000142 - momentum: 0.000000
2023-10-10 23:38:11,407 epoch 2 - iter 174/292 - loss 0.62263495 - time (sec): 54.77 - samples/sec: 465.55 - lr: 0.000140 - momentum: 0.000000
2023-10-10 23:38:21,079 epoch 2 - iter 203/292 - loss 0.59059183 - time (sec): 64.44 - samples/sec: 471.68 - lr: 0.000138 - momentum: 0.000000
2023-10-10 23:38:30,761 epoch 2 - iter 232/292 - loss 0.55458788 - time (sec): 74.12 - samples/sec: 476.41 - lr: 0.000137 - momentum: 0.000000
2023-10-10 23:38:39,702 epoch 2 - iter 261/292 - loss 0.52948949 - time (sec): 83.06 - samples/sec: 475.48 - lr: 0.000135 - momentum: 0.000000
2023-10-10 23:38:49,110 epoch 2 - iter 290/292 - loss 0.54146179 - time (sec): 92.47 - samples/sec: 478.37 - lr: 0.000134 - momentum: 0.000000
2023-10-10 23:38:49,561 ----------------------------------------------------------------------------------------------------
2023-10-10 23:38:49,561 EPOCH 2 done: loss 0.5406 - lr: 0.000134
2023-10-10 23:38:54,981 DEV : loss 0.30725252628326416 - f1-score (micro avg)  0.0
2023-10-10 23:38:54,990 ----------------------------------------------------------------------------------------------------
2023-10-10 23:39:03,937 epoch 3 - iter 29/292 - loss 0.44916497 - time (sec): 8.94 - samples/sec: 410.97 - lr: 0.000132 - momentum: 0.000000
2023-10-10 23:39:13,069 epoch 3 - iter 58/292 - loss 0.37025341 - time (sec): 18.08 - samples/sec: 444.55 - lr: 0.000130 - momentum: 0.000000
2023-10-10 23:39:22,841 epoch 3 - iter 87/292 - loss 0.43563961 - time (sec): 27.85 - samples/sec: 471.68 - lr: 0.000128 - momentum: 0.000000
2023-10-10 23:39:31,276 epoch 3 - iter 116/292 - loss 0.42624209 - time (sec): 36.28 - samples/sec: 460.09 - lr: 0.000127 - momentum: 0.000000
2023-10-10 23:39:41,047 epoch 3 - iter 145/292 - loss 0.40305206 - time (sec): 46.06 - samples/sec: 467.72 - lr: 0.000125 - momentum: 0.000000
2023-10-10 23:39:50,060 epoch 3 - iter 174/292 - loss 0.38723134 - time (sec): 55.07 - samples/sec: 466.78 - lr: 0.000123 - momentum: 0.000000
2023-10-10 23:39:59,322 epoch 3 - iter 203/292 - loss 0.37293211 - time (sec): 64.33 - samples/sec: 469.75 - lr: 0.000122 - momentum: 0.000000
2023-10-10 23:40:08,762 epoch 3 - iter 232/292 - loss 0.36065966 - time (sec): 73.77 - samples/sec: 473.13 - lr: 0.000120 - momentum: 0.000000
2023-10-10 23:40:18,095 epoch 3 - iter 261/292 - loss 0.35244201 - time (sec): 83.10 - samples/sec: 472.58 - lr: 0.000119 - momentum: 0.000000
2023-10-10 23:40:27,867 epoch 3 - iter 290/292 - loss 0.34283270 - time (sec): 92.88 - samples/sec: 476.03 - lr: 0.000117 - momentum: 0.000000
2023-10-10 23:40:28,376 ----------------------------------------------------------------------------------------------------
2023-10-10 23:40:28,376 EPOCH 3 done: loss 0.3487 - lr: 0.000117
2023-10-10 23:40:34,075 DEV : loss 0.25399211049079895 - f1-score (micro avg)  0.2737
2023-10-10 23:40:34,084 saving best model
2023-10-10 23:40:34,996 ----------------------------------------------------------------------------------------------------
2023-10-10 23:40:44,089 epoch 4 - iter 29/292 - loss 0.27665069 - time (sec): 9.09 - samples/sec: 456.67 - lr: 0.000115 - momentum: 0.000000
2023-10-10 23:40:53,537 epoch 4 - iter 58/292 - loss 0.35599118 - time (sec): 18.54 - samples/sec: 465.92 - lr: 0.000113 - momentum: 0.000000
2023-10-10 23:41:03,309 epoch 4 - iter 87/292 - loss 0.29107191 - time (sec): 28.31 - samples/sec: 461.84 - lr: 0.000112 - momentum: 0.000000
2023-10-10 23:41:12,868 epoch 4 - iter 116/292 - loss 0.28109945 - time (sec): 37.87 - samples/sec: 461.68 - lr: 0.000110 - momentum: 0.000000
2023-10-10 23:41:21,797 epoch 4 - iter 145/292 - loss 0.27689495 - time (sec): 46.80 - samples/sec: 455.20 - lr: 0.000108 - momentum: 0.000000
2023-10-10 23:41:31,466 epoch 4 - iter 174/292 - loss 0.27119259 - time (sec): 56.47 - samples/sec: 453.83 - lr: 0.000107 - momentum: 0.000000
2023-10-10 23:41:41,243 epoch 4 - iter 203/292 - loss 0.25991322 - time (sec): 66.25 - samples/sec: 455.98 - lr: 0.000105 - momentum: 0.000000
2023-10-10 23:41:50,603 epoch 4 - iter 232/292 - loss 0.25730647 - time (sec): 75.61 - samples/sec: 455.08 - lr: 0.000104 - momentum: 0.000000
2023-10-10 23:42:00,430 epoch 4 - iter 261/292 - loss 0.26262908 - time (sec): 85.43 - samples/sec: 459.51 - lr: 0.000102 - momentum: 0.000000
2023-10-10 23:42:10,745 epoch 4 - iter 290/292 - loss 0.25695125 - time (sec): 95.75 - samples/sec: 462.90 - lr: 0.000100 - momentum: 0.000000
2023-10-10 23:42:11,163 ----------------------------------------------------------------------------------------------------
2023-10-10 23:42:11,163 EPOCH 4 done: loss 0.2568 - lr: 0.000100
2023-10-10 23:42:16,884 DEV : loss 0.19706253707408905 - f1-score (micro avg)  0.4559
2023-10-10 23:42:16,894 saving best model
2023-10-10 23:42:24,060 ----------------------------------------------------------------------------------------------------
2023-10-10 23:42:33,369 epoch 5 - iter 29/292 - loss 0.21009808 - time (sec): 9.31 - samples/sec: 462.53 - lr: 0.000098 - momentum: 0.000000
2023-10-10 23:42:43,233 epoch 5 - iter 58/292 - loss 0.18064570 - time (sec): 19.17 - samples/sec: 471.90 - lr: 0.000097 - momentum: 0.000000
2023-10-10 23:42:52,542 epoch 5 - iter 87/292 - loss 0.17577573 - time (sec): 28.48 - samples/sec: 464.00 - lr: 0.000095 - momentum: 0.000000
2023-10-10 23:43:02,200 epoch 5 - iter 116/292 - loss 0.17534975 - time (sec): 38.14 - samples/sec: 463.29 - lr: 0.000093 - momentum: 0.000000
2023-10-10 23:43:11,628 epoch 5 - iter 145/292 - loss 0.17692015 - time (sec): 47.56 - samples/sec: 456.12 - lr: 0.000092 - momentum: 0.000000
2023-10-10 23:43:22,538 epoch 5 - iter 174/292 - loss 0.19355367 - time (sec): 58.47 - samples/sec: 468.31 - lr: 0.000090 - momentum: 0.000000
2023-10-10 23:43:32,633 epoch 5 - iter 203/292 - loss 0.18908499 - time (sec): 68.57 - samples/sec: 464.72 - lr: 0.000089 - momentum: 0.000000
2023-10-10 23:43:42,635 epoch 5 - iter 232/292 - loss 0.18645282 - time (sec): 78.57 - samples/sec: 464.25 - lr: 0.000087 - momentum: 0.000000
2023-10-10 23:43:51,591 epoch 5 - iter 261/292 - loss 0.18419102 - time (sec): 87.53 - samples/sec: 458.59 - lr: 0.000085 - momentum: 0.000000
2023-10-10 23:44:01,370 epoch 5 - iter 290/292 - loss 0.18219314 - time (sec): 97.31 - samples/sec: 455.09 - lr: 0.000084 - momentum: 0.000000
2023-10-10 23:44:01,814 ----------------------------------------------------------------------------------------------------
2023-10-10 23:44:01,814 EPOCH 5 done: loss 0.1825 - lr: 0.000084
2023-10-10 23:44:07,667 DEV : loss 0.16620376706123352 - f1-score (micro avg)  0.6582
2023-10-10 23:44:07,676 saving best model
2023-10-10 23:44:14,706 ----------------------------------------------------------------------------------------------------
2023-10-10 23:44:24,827 epoch 6 - iter 29/292 - loss 0.12681655 - time (sec): 10.12 - samples/sec: 476.31 - lr: 0.000082 - momentum: 0.000000
2023-10-10 23:44:34,359 epoch 6 - iter 58/292 - loss 0.13248581 - time (sec): 19.65 - samples/sec: 457.69 - lr: 0.000080 - momentum: 0.000000
2023-10-10 23:44:43,923 epoch 6 - iter 87/292 - loss 0.12617558 - time (sec): 29.21 - samples/sec: 454.60 - lr: 0.000078 - momentum: 0.000000
2023-10-10 23:44:53,324 epoch 6 - iter 116/292 - loss 0.13236752 - time (sec): 38.61 - samples/sec: 453.54 - lr: 0.000077 - momentum: 0.000000
2023-10-10 23:45:02,801 epoch 6 - iter 145/292 - loss 0.13561748 - time (sec): 48.09 - samples/sec: 452.93 - lr: 0.000075 - momentum: 0.000000
2023-10-10 23:45:12,212 epoch 6 - iter 174/292 - loss 0.13666887 - time (sec): 57.50 - samples/sec: 451.10 - lr: 0.000074 - momentum: 0.000000
2023-10-10 23:45:22,723 epoch 6 - iter 203/292 - loss 0.13676891 - time (sec): 68.01 - samples/sec: 460.56 - lr: 0.000072 - momentum: 0.000000
2023-10-10 23:45:31,946 epoch 6 - iter 232/292 - loss 0.13971349 - time (sec): 77.24 - samples/sec: 457.86 - lr: 0.000070 - momentum: 0.000000
2023-10-10 23:45:41,291 epoch 6 - iter 261/292 - loss 0.13570923 - time (sec): 86.58 - samples/sec: 459.15 - lr: 0.000069 - momentum: 0.000000
2023-10-10 23:45:50,981 epoch 6 - iter 290/292 - loss 0.13228776 - time (sec): 96.27 - samples/sec: 457.72 - lr: 0.000067 - momentum: 0.000000
2023-10-10 23:45:51,640 ----------------------------------------------------------------------------------------------------
2023-10-10 23:45:51,640 EPOCH 6 done: loss 0.1315 - lr: 0.000067
2023-10-10 23:45:57,282 DEV : loss 0.14966456592082977 - f1-score (micro avg)  0.7119
2023-10-10 23:45:57,292 saving best model
2023-10-10 23:46:05,725 ----------------------------------------------------------------------------------------------------
2023-10-10 23:46:15,412 epoch 7 - iter 29/292 - loss 0.10699187 - time (sec): 9.68 - samples/sec: 452.78 - lr: 0.000065 - momentum: 0.000000
2023-10-10 23:46:25,634 epoch 7 - iter 58/292 - loss 0.10066351 - time (sec): 19.91 - samples/sec: 474.60 - lr: 0.000063 - momentum: 0.000000
2023-10-10 23:46:34,493 epoch 7 - iter 87/292 - loss 0.09640648 - time (sec): 28.76 - samples/sec: 455.47 - lr: 0.000062 - momentum: 0.000000
2023-10-10 23:46:43,743 epoch 7 - iter 116/292 - loss 0.10508994 - time (sec): 38.01 - samples/sec: 454.73 - lr: 0.000060 - momentum: 0.000000
2023-10-10 23:46:52,292 epoch 7 - iter 145/292 - loss 0.10571912 - time (sec): 46.56 - samples/sec: 444.09 - lr: 0.000059 - momentum: 0.000000
2023-10-10 23:47:01,874 epoch 7 - iter 174/292 - loss 0.10302895 - time (sec): 56.14 - samples/sec: 452.90 - lr: 0.000057 - momentum: 0.000000
2023-10-10 23:47:11,954 epoch 7 - iter 203/292 - loss 0.10265168 - time (sec): 66.22 - samples/sec: 456.99 - lr: 0.000055 - momentum: 0.000000
2023-10-10 23:47:21,832 epoch 7 - iter 232/292 - loss 0.10359819 - time (sec): 76.10 - samples/sec: 461.95 - lr: 0.000054 - momentum: 0.000000
2023-10-10 23:47:31,190 epoch 7 - iter 261/292 - loss 0.10237414 - time (sec): 85.46 - samples/sec: 462.74 - lr: 0.000052 - momentum: 0.000000
2023-10-10 23:47:41,383 epoch 7 - iter 290/292 - loss 0.10022497 - time (sec): 95.65 - samples/sec: 462.90 - lr: 0.000050 - momentum: 0.000000
2023-10-10 23:47:41,885 ----------------------------------------------------------------------------------------------------
2023-10-10 23:47:41,886 EPOCH 7 done: loss 0.1003 - lr: 0.000050
2023-10-10 23:47:48,311 DEV : loss 0.14436788856983185 - f1-score (micro avg)  0.7431
2023-10-10 23:47:48,322 saving best model
2023-10-10 23:47:55,695 ----------------------------------------------------------------------------------------------------
2023-10-10 23:48:05,875 epoch 8 - iter 29/292 - loss 0.08606206 - time (sec): 10.18 - samples/sec: 428.08 - lr: 0.000048 - momentum: 0.000000
2023-10-10 23:48:15,176 epoch 8 - iter 58/292 - loss 0.10042979 - time (sec): 19.48 - samples/sec: 439.24 - lr: 0.000047 - momentum: 0.000000
2023-10-10 23:48:24,658 epoch 8 - iter 87/292 - loss 0.08629826 - time (sec): 28.96 - samples/sec: 451.68 - lr: 0.000045 - momentum: 0.000000
2023-10-10 23:48:33,961 epoch 8 - iter 116/292 - loss 0.09198991 - time (sec): 38.26 - samples/sec: 456.71 - lr: 0.000044 - momentum: 0.000000
2023-10-10 23:48:43,305 epoch 8 - iter 145/292 - loss 0.08720786 - time (sec): 47.61 - samples/sec: 458.40 - lr: 0.000042 - momentum: 0.000000
2023-10-10 23:48:51,893 epoch 8 - iter 174/292 - loss 0.08829853 - time (sec): 56.19 - samples/sec: 450.20 - lr: 0.000040 - momentum: 0.000000
2023-10-10 23:49:02,167 epoch 8 - iter 203/292 - loss 0.08393690 - time (sec): 66.47 - samples/sec: 461.17 - lr: 0.000039 - momentum: 0.000000
2023-10-10 23:49:11,359 epoch 8 - iter 232/292 - loss 0.08287430 - time (sec): 75.66 - samples/sec: 459.05 - lr: 0.000037 - momentum: 0.000000
2023-10-10 23:49:21,542 epoch 8 - iter 261/292 - loss 0.08141757 - time (sec): 85.84 - samples/sec: 466.25 - lr: 0.000035 - momentum: 0.000000
2023-10-10 23:49:30,780 epoch 8 - iter 290/292 - loss 0.08090538 - time (sec): 95.08 - samples/sec: 466.14 - lr: 0.000034 - momentum: 0.000000
2023-10-10 23:49:31,186 ----------------------------------------------------------------------------------------------------
2023-10-10 23:49:31,187 EPOCH 8 done: loss 0.0807 - lr: 0.000034
2023-10-10 23:49:36,774 DEV : loss 0.14600899815559387 - f1-score (micro avg)  0.7373
2023-10-10 23:49:36,784 ----------------------------------------------------------------------------------------------------
2023-10-10 23:49:46,636 epoch 9 - iter 29/292 - loss 0.07929525 - time (sec): 9.85 - samples/sec: 446.43 - lr: 0.000032 - momentum: 0.000000
2023-10-10 23:49:55,786 epoch 9 - iter 58/292 - loss 0.06917974 - time (sec): 19.00 - samples/sec: 446.43 - lr: 0.000030 - momentum: 0.000000
2023-10-10 23:50:04,404 epoch 9 - iter 87/292 - loss 0.07967206 - time (sec): 27.62 - samples/sec: 436.54 - lr: 0.000029 - momentum: 0.000000
2023-10-10 23:50:14,209 epoch 9 - iter 116/292 - loss 0.07544498 - time (sec): 37.42 - samples/sec: 451.67 - lr: 0.000027 - momentum: 0.000000
2023-10-10 23:50:24,534 epoch 9 - iter 145/292 - loss 0.07339838 - time (sec): 47.75 - samples/sec: 464.00 - lr: 0.000025 - momentum: 0.000000
2023-10-10 23:50:33,546 epoch 9 - iter 174/292 - loss 0.07269647 - time (sec): 56.76 - samples/sec: 460.21 - lr: 0.000024 - momentum: 0.000000
2023-10-10 23:50:43,184 epoch 9 - iter 203/292 - loss 0.07186638 - time (sec): 66.40 - samples/sec: 464.40 - lr: 0.000022 - momentum: 0.000000
2023-10-10 23:50:52,306 epoch 9 - iter 232/292 - loss 0.07095228 - time (sec): 75.52 - samples/sec: 461.80 - lr: 0.000020 - momentum: 0.000000
2023-10-10 23:51:02,019 epoch 9 - iter 261/292 - loss 0.07058102 - time (sec): 85.23 - samples/sec: 465.29 - lr: 0.000019 - momentum: 0.000000
2023-10-10 23:51:11,747 epoch 9 - iter 290/292 - loss 0.06922655 - time (sec): 94.96 - samples/sec: 466.40 - lr: 0.000017 - momentum: 0.000000
2023-10-10 23:51:12,183 ----------------------------------------------------------------------------------------------------
2023-10-10 23:51:12,184 EPOCH 9 done: loss 0.0691 - lr: 0.000017
2023-10-10 23:51:17,740 DEV : loss 0.1435050070285797 - f1-score (micro avg)  0.7431
2023-10-10 23:51:17,750 ----------------------------------------------------------------------------------------------------
2023-10-10 23:51:26,849 epoch 10 - iter 29/292 - loss 0.07707015 - time (sec): 9.10 - samples/sec: 469.17 - lr: 0.000015 - momentum: 0.000000
2023-10-10 23:51:36,524 epoch 10 - iter 58/292 - loss 0.07455355 - time (sec): 18.77 - samples/sec: 481.63 - lr: 0.000014 - momentum: 0.000000
2023-10-10 23:51:45,636 epoch 10 - iter 87/292 - loss 0.06383844 - time (sec): 27.88 - samples/sec: 465.36 - lr: 0.000012 - momentum: 0.000000
2023-10-10 23:51:55,848 epoch 10 - iter 116/292 - loss 0.06023946 - time (sec): 38.10 - samples/sec: 454.54 - lr: 0.000010 - momentum: 0.000000
2023-10-10 23:52:06,208 epoch 10 - iter 145/292 - loss 0.05687143 - time (sec): 48.46 - samples/sec: 444.57 - lr: 0.000009 - momentum: 0.000000
2023-10-10 23:52:16,612 epoch 10 - iter 174/292 - loss 0.06003429 - time (sec): 58.86 - samples/sec: 447.08 - lr: 0.000007 - momentum: 0.000000
2023-10-10 23:52:26,689 epoch 10 - iter 203/292 - loss 0.06203918 - time (sec): 68.94 - samples/sec: 454.17 - lr: 0.000005 - momentum: 0.000000
2023-10-10 23:52:35,890 epoch 10 - iter 232/292 - loss 0.06183864 - time (sec): 78.14 - samples/sec: 450.04 - lr: 0.000004 - momentum: 0.000000
2023-10-10 23:52:46,122 epoch 10 - iter 261/292 - loss 0.06243654 - time (sec): 88.37 - samples/sec: 455.81 - lr: 0.000002 - momentum: 0.000000
2023-10-10 23:52:55,079 epoch 10 - iter 290/292 - loss 0.06314414 - time (sec): 97.33 - samples/sec: 454.93 - lr: 0.000000 - momentum: 0.000000
2023-10-10 23:52:55,520 ----------------------------------------------------------------------------------------------------
2023-10-10 23:52:55,520 EPOCH 10 done: loss 0.0629 - lr: 0.000000
2023-10-10 23:53:00,971 DEV : loss 0.144588902592659 - f1-score (micro avg)  0.7553
2023-10-10 23:53:00,980 saving best model
2023-10-10 23:53:07,827 ----------------------------------------------------------------------------------------------------
2023-10-10 23:53:07,829 Loading model from best epoch ...
2023-10-10 23:53:11,365 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-10 23:53:25,880 
Results:
- F-score (micro) 0.7047
- F-score (macro) 0.6351
- Accuracy 0.5597

By class:
              precision    recall  f1-score   support

         PER     0.7429    0.8218    0.7804       348
         LOC     0.5879    0.7816    0.6711       261
         ORG     0.3585    0.3654    0.3619        52
   HumanProd     0.7273    0.7273    0.7273        22

   micro avg     0.6506    0.7687    0.7047       683
   macro avg     0.6041    0.6740    0.6351       683
weighted avg     0.6539    0.7687    0.7050       683

2023-10-10 23:53:25,881 ----------------------------------------------------------------------------------------------------
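For reference, the final micro F1 follows directly from the aggregate precision and recall in the table: 2 · 0.6506 · 0.7687 / (0.6506 + 0.7687) ≈ 0.7047. The checkpoint behind these numbers is the best-model.pt saved under the training base path; a minimal, hedged sketch of loading it for inference is shown below (the example sentence is arbitrary, chosen only because the model tags Finnish text).

```python
# Hedged usage sketch: load the saved checkpoint and tag a Finnish sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

sentence = Sentence("Helsingin Sanomat kertoi eilen Mannerheimin vierailusta Viipurissa.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    # span.tag is the predicted label (PER/LOC/ORG/HumanProd), span.score its confidence
    print(span.text, span.tag, round(span.score, 4))
```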