flair-hipe-2022-ajmc-de / training.log
2023-10-06 22:19:57,613 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,615 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-06 22:19:57,615 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,615 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-06 22:19:57,615 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,615 Train: 1100 sentences
2023-10-06 22:19:57,615 (train_with_dev=False, train_with_test=False)
2023-10-06 22:19:57,615 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,615 Training Params:
2023-10-06 22:19:57,616 - learning_rate: "0.00016"
2023-10-06 22:19:57,616 - mini_batch_size: "4"
2023-10-06 22:19:57,616 - max_epochs: "10"
2023-10-06 22:19:57,616 - shuffle: "True"
2023-10-06 22:19:57,616 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,616 Plugins:
2023-10-06 22:19:57,616 - TensorboardLogger
2023-10-06 22:19:57,616 - LinearScheduler | warmup_fraction: '0.1'
2023-10-06 22:19:57,616 ----------------------------------------------------------------------------------------------------
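The parameters above imply 1100 / 4 = 275 mini-batches per epoch, roughly 2750 optimizer steps over 10 epochs, with the first 10% used for warmup. The following is a minimal sketch of a linear warmup and decay schedule under those assumptions. The function name `linear_lr` and the step accounting are hypothetical and not taken from Flair's LinearScheduler plugin, so the values only approximate the `lr` column logged below.

```python
import math

def linear_lr(step, peak_lr=0.00016, total_steps=2750, warmup_fraction=0.1):
    """Learning rate at a given optimizer step (illustrative sketch only)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)

# 1100 training sentences with mini_batch_size 4.
steps_per_epoch = math.ceil(1100 / 4)

for step in (27, 135, 275, 1375, 2745):
    print(f"step {step:4d}: lr {linear_lr(step):.6f}")
```

Under this sketch the rate peaks at the end of epoch 1 and then falls linearly, which matches the overall shape of the logged values.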
2023-10-06 22:19:57,616 Final evaluation on model from best epoch (best-model.pt)
2023-10-06 22:19:57,616 - metric: "('micro avg', 'f1-score')"
2023-10-06 22:19:57,616 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,616 Computation:
2023-10-06 22:19:57,616 - compute on device: cuda:0
2023-10-06 22:19:57,616 - embedding storage: none
2023-10-06 22:19:57,617 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,617 Model training base path: "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2"
2023-10-06 22:19:57,617 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,617 ----------------------------------------------------------------------------------------------------
2023-10-06 22:19:57,617 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-06 22:20:08,003 epoch 1 - iter 27/275 - loss 3.23046973 - time (sec): 10.38 - samples/sec: 207.90 - lr: 0.000015 - momentum: 0.000000
2023-10-06 22:20:19,207 epoch 1 - iter 54/275 - loss 3.22045630 - time (sec): 21.59 - samples/sec: 207.88 - lr: 0.000031 - momentum: 0.000000
2023-10-06 22:20:30,543 epoch 1 - iter 81/275 - loss 3.20278466 - time (sec): 32.92 - samples/sec: 209.54 - lr: 0.000047 - momentum: 0.000000
2023-10-06 22:20:40,870 epoch 1 - iter 108/275 - loss 3.16363759 - time (sec): 43.25 - samples/sec: 205.15 - lr: 0.000062 - momentum: 0.000000
2023-10-06 22:20:52,093 epoch 1 - iter 135/275 - loss 3.07464330 - time (sec): 54.47 - samples/sec: 205.49 - lr: 0.000078 - momentum: 0.000000
2023-10-06 22:21:01,940 epoch 1 - iter 162/275 - loss 2.98484966 - time (sec): 64.32 - samples/sec: 203.73 - lr: 0.000094 - momentum: 0.000000
2023-10-06 22:21:13,053 epoch 1 - iter 189/275 - loss 2.86837180 - time (sec): 75.43 - samples/sec: 205.24 - lr: 0.000109 - momentum: 0.000000
2023-10-06 22:21:23,565 epoch 1 - iter 216/275 - loss 2.75403892 - time (sec): 85.95 - samples/sec: 205.56 - lr: 0.000125 - momentum: 0.000000
2023-10-06 22:21:34,838 epoch 1 - iter 243/275 - loss 2.61332537 - time (sec): 97.22 - samples/sec: 206.98 - lr: 0.000141 - momentum: 0.000000
2023-10-06 22:21:45,272 epoch 1 - iter 270/275 - loss 2.49139408 - time (sec): 107.65 - samples/sec: 207.32 - lr: 0.000157 - momentum: 0.000000
2023-10-06 22:21:47,401 ----------------------------------------------------------------------------------------------------
2023-10-06 22:21:47,402 EPOCH 1 done: loss 2.4654 - lr: 0.000157
2023-10-06 22:21:53,775 DEV : loss 1.122562289237976 - f1-score (micro avg) 0.0
2023-10-06 22:21:53,781 ----------------------------------------------------------------------------------------------------
2023-10-06 22:22:04,252 epoch 2 - iter 27/275 - loss 1.06316421 - time (sec): 10.47 - samples/sec: 209.84 - lr: 0.000158 - momentum: 0.000000
2023-10-06 22:22:15,114 epoch 2 - iter 54/275 - loss 0.99982815 - time (sec): 21.33 - samples/sec: 210.39 - lr: 0.000157 - momentum: 0.000000
2023-10-06 22:22:25,327 epoch 2 - iter 81/275 - loss 0.95642739 - time (sec): 31.54 - samples/sec: 206.41 - lr: 0.000155 - momentum: 0.000000
2023-10-06 22:22:36,243 epoch 2 - iter 108/275 - loss 0.93981903 - time (sec): 42.46 - samples/sec: 206.47 - lr: 0.000153 - momentum: 0.000000
2023-10-06 22:22:47,493 epoch 2 - iter 135/275 - loss 0.87433260 - time (sec): 53.71 - samples/sec: 207.37 - lr: 0.000151 - momentum: 0.000000
2023-10-06 22:22:58,582 epoch 2 - iter 162/275 - loss 0.82965738 - time (sec): 64.80 - samples/sec: 205.66 - lr: 0.000150 - momentum: 0.000000
2023-10-06 22:23:08,837 epoch 2 - iter 189/275 - loss 0.79093517 - time (sec): 75.05 - samples/sec: 204.31 - lr: 0.000148 - momentum: 0.000000
2023-10-06 22:23:19,870 epoch 2 - iter 216/275 - loss 0.74140812 - time (sec): 86.09 - samples/sec: 204.55 - lr: 0.000146 - momentum: 0.000000
2023-10-06 22:23:30,843 epoch 2 - iter 243/275 - loss 0.70765895 - time (sec): 97.06 - samples/sec: 205.88 - lr: 0.000144 - momentum: 0.000000
2023-10-06 22:23:41,422 epoch 2 - iter 270/275 - loss 0.68311458 - time (sec): 107.64 - samples/sec: 206.89 - lr: 0.000143 - momentum: 0.000000
2023-10-06 22:23:43,598 ----------------------------------------------------------------------------------------------------
2023-10-06 22:23:43,598 EPOCH 2 done: loss 0.6774 - lr: 0.000143
2023-10-06 22:23:50,082 DEV : loss 0.40427157282829285 - f1-score (micro avg) 0.4749
2023-10-06 22:23:50,088 saving best model
2023-10-06 22:23:50,952 ----------------------------------------------------------------------------------------------------
2023-10-06 22:24:01,156 epoch 3 - iter 27/275 - loss 0.38574587 - time (sec): 10.20 - samples/sec: 209.75 - lr: 0.000141 - momentum: 0.000000
2023-10-06 22:24:12,561 epoch 3 - iter 54/275 - loss 0.36307024 - time (sec): 21.61 - samples/sec: 212.24 - lr: 0.000139 - momentum: 0.000000
2023-10-06 22:24:23,860 epoch 3 - iter 81/275 - loss 0.36188610 - time (sec): 32.91 - samples/sec: 210.66 - lr: 0.000137 - momentum: 0.000000
2023-10-06 22:24:34,334 epoch 3 - iter 108/275 - loss 0.34404760 - time (sec): 43.38 - samples/sec: 207.15 - lr: 0.000135 - momentum: 0.000000
2023-10-06 22:24:44,676 epoch 3 - iter 135/275 - loss 0.32394867 - time (sec): 53.72 - samples/sec: 205.00 - lr: 0.000134 - momentum: 0.000000
2023-10-06 22:24:56,051 epoch 3 - iter 162/275 - loss 0.32128496 - time (sec): 65.10 - samples/sec: 206.58 - lr: 0.000132 - momentum: 0.000000
2023-10-06 22:25:07,011 epoch 3 - iter 189/275 - loss 0.31666190 - time (sec): 76.06 - samples/sec: 206.99 - lr: 0.000130 - momentum: 0.000000
2023-10-06 22:25:17,509 epoch 3 - iter 216/275 - loss 0.30596676 - time (sec): 86.56 - samples/sec: 205.75 - lr: 0.000128 - momentum: 0.000000
2023-10-06 22:25:28,963 epoch 3 - iter 243/275 - loss 0.29289012 - time (sec): 98.01 - samples/sec: 206.87 - lr: 0.000127 - momentum: 0.000000
2023-10-06 22:25:39,328 epoch 3 - iter 270/275 - loss 0.28836828 - time (sec): 108.38 - samples/sec: 206.99 - lr: 0.000125 - momentum: 0.000000
2023-10-06 22:25:41,093 ----------------------------------------------------------------------------------------------------
2023-10-06 22:25:41,093 EPOCH 3 done: loss 0.2864 - lr: 0.000125
2023-10-06 22:25:47,636 DEV : loss 0.19967962801456451 - f1-score (micro avg) 0.7842
2023-10-06 22:25:47,642 saving best model
2023-10-06 22:25:48,570 ----------------------------------------------------------------------------------------------------
2023-10-06 22:25:59,737 epoch 4 - iter 27/275 - loss 0.18598406 - time (sec): 11.17 - samples/sec: 212.26 - lr: 0.000123 - momentum: 0.000000
2023-10-06 22:26:10,416 epoch 4 - iter 54/275 - loss 0.17351357 - time (sec): 21.84 - samples/sec: 212.41 - lr: 0.000121 - momentum: 0.000000
2023-10-06 22:26:21,077 epoch 4 - iter 81/275 - loss 0.18311547 - time (sec): 32.51 - samples/sec: 212.45 - lr: 0.000119 - momentum: 0.000000
2023-10-06 22:26:32,048 epoch 4 - iter 108/275 - loss 0.17408439 - time (sec): 43.48 - samples/sec: 212.85 - lr: 0.000118 - momentum: 0.000000
2023-10-06 22:26:42,183 epoch 4 - iter 135/275 - loss 0.17031787 - time (sec): 53.61 - samples/sec: 209.79 - lr: 0.000116 - momentum: 0.000000
2023-10-06 22:26:52,816 epoch 4 - iter 162/275 - loss 0.16848434 - time (sec): 64.24 - samples/sec: 208.98 - lr: 0.000114 - momentum: 0.000000
2023-10-06 22:27:03,836 epoch 4 - iter 189/275 - loss 0.16145777 - time (sec): 75.27 - samples/sec: 208.37 - lr: 0.000112 - momentum: 0.000000
2023-10-06 22:27:14,515 epoch 4 - iter 216/275 - loss 0.15556608 - time (sec): 85.94 - samples/sec: 208.44 - lr: 0.000111 - momentum: 0.000000
2023-10-06 22:27:24,940 epoch 4 - iter 243/275 - loss 0.15447509 - time (sec): 96.37 - samples/sec: 208.45 - lr: 0.000109 - momentum: 0.000000
2023-10-06 22:27:36,076 epoch 4 - iter 270/275 - loss 0.15163452 - time (sec): 107.50 - samples/sec: 208.14 - lr: 0.000107 - momentum: 0.000000
2023-10-06 22:27:38,020 ----------------------------------------------------------------------------------------------------
2023-10-06 22:27:38,020 EPOCH 4 done: loss 0.1501 - lr: 0.000107
2023-10-06 22:27:44,585 DEV : loss 0.1390346884727478 - f1-score (micro avg) 0.8291
2023-10-06 22:27:44,590 saving best model
2023-10-06 22:27:45,511 ----------------------------------------------------------------------------------------------------
2023-10-06 22:27:56,779 epoch 5 - iter 27/275 - loss 0.12408828 - time (sec): 11.27 - samples/sec: 223.95 - lr: 0.000105 - momentum: 0.000000
2023-10-06 22:28:07,327 epoch 5 - iter 54/275 - loss 0.11531775 - time (sec): 21.81 - samples/sec: 212.35 - lr: 0.000103 - momentum: 0.000000
2023-10-06 22:28:18,432 epoch 5 - iter 81/275 - loss 0.10327970 - time (sec): 32.92 - samples/sec: 210.97 - lr: 0.000102 - momentum: 0.000000
2023-10-06 22:28:29,135 epoch 5 - iter 108/275 - loss 0.09930425 - time (sec): 43.62 - samples/sec: 209.99 - lr: 0.000100 - momentum: 0.000000
2023-10-06 22:28:39,069 epoch 5 - iter 135/275 - loss 0.09480377 - time (sec): 53.56 - samples/sec: 206.50 - lr: 0.000098 - momentum: 0.000000
2023-10-06 22:28:50,333 epoch 5 - iter 162/275 - loss 0.08876726 - time (sec): 64.82 - samples/sec: 207.54 - lr: 0.000096 - momentum: 0.000000
2023-10-06 22:29:01,744 epoch 5 - iter 189/275 - loss 0.09497833 - time (sec): 76.23 - samples/sec: 208.37 - lr: 0.000095 - momentum: 0.000000
2023-10-06 22:29:12,196 epoch 5 - iter 216/275 - loss 0.09275557 - time (sec): 86.68 - samples/sec: 206.95 - lr: 0.000093 - momentum: 0.000000
2023-10-06 22:29:22,893 epoch 5 - iter 243/275 - loss 0.09396631 - time (sec): 97.38 - samples/sec: 206.35 - lr: 0.000091 - momentum: 0.000000
2023-10-06 22:29:33,633 epoch 5 - iter 270/275 - loss 0.09534730 - time (sec): 108.12 - samples/sec: 206.59 - lr: 0.000089 - momentum: 0.000000
2023-10-06 22:29:35,734 ----------------------------------------------------------------------------------------------------
2023-10-06 22:29:35,734 EPOCH 5 done: loss 0.0944 - lr: 0.000089
2023-10-06 22:29:42,409 DEV : loss 0.12325194478034973 - f1-score (micro avg) 0.8632
2023-10-06 22:29:42,415 saving best model
2023-10-06 22:29:43,340 ----------------------------------------------------------------------------------------------------
2023-10-06 22:29:54,552 epoch 6 - iter 27/275 - loss 0.05850414 - time (sec): 11.21 - samples/sec: 205.26 - lr: 0.000087 - momentum: 0.000000
2023-10-06 22:30:05,979 epoch 6 - iter 54/275 - loss 0.06612571 - time (sec): 22.64 - samples/sec: 208.69 - lr: 0.000086 - momentum: 0.000000
2023-10-06 22:30:17,098 epoch 6 - iter 81/275 - loss 0.06088849 - time (sec): 33.76 - samples/sec: 210.45 - lr: 0.000084 - momentum: 0.000000
2023-10-06 22:30:28,140 epoch 6 - iter 108/275 - loss 0.06707336 - time (sec): 44.80 - samples/sec: 209.99 - lr: 0.000082 - momentum: 0.000000
2023-10-06 22:30:38,494 epoch 6 - iter 135/275 - loss 0.06978029 - time (sec): 55.15 - samples/sec: 209.21 - lr: 0.000080 - momentum: 0.000000
2023-10-06 22:30:48,317 epoch 6 - iter 162/275 - loss 0.07177387 - time (sec): 64.97 - samples/sec: 207.26 - lr: 0.000079 - momentum: 0.000000
2023-10-06 22:30:59,170 epoch 6 - iter 189/275 - loss 0.06989013 - time (sec): 75.83 - samples/sec: 206.96 - lr: 0.000077 - momentum: 0.000000
2023-10-06 22:31:09,796 epoch 6 - iter 216/275 - loss 0.07138482 - time (sec): 86.45 - samples/sec: 206.94 - lr: 0.000075 - momentum: 0.000000
2023-10-06 22:31:20,268 epoch 6 - iter 243/275 - loss 0.07213276 - time (sec): 96.93 - samples/sec: 206.88 - lr: 0.000073 - momentum: 0.000000
2023-10-06 22:31:31,421 epoch 6 - iter 270/275 - loss 0.07074569 - time (sec): 108.08 - samples/sec: 206.66 - lr: 0.000072 - momentum: 0.000000
2023-10-06 22:31:33,525 ----------------------------------------------------------------------------------------------------
2023-10-06 22:31:33,525 EPOCH 6 done: loss 0.0704 - lr: 0.000072
2023-10-06 22:31:40,161 DEV : loss 0.11782091856002808 - f1-score (micro avg) 0.8678
2023-10-06 22:31:40,167 saving best model
2023-10-06 22:31:41,101 ----------------------------------------------------------------------------------------------------
2023-10-06 22:31:52,232 epoch 7 - iter 27/275 - loss 0.03938381 - time (sec): 11.13 - samples/sec: 217.72 - lr: 0.000070 - momentum: 0.000000
2023-10-06 22:32:03,100 epoch 7 - iter 54/275 - loss 0.05396100 - time (sec): 22.00 - samples/sec: 212.80 - lr: 0.000068 - momentum: 0.000000
2023-10-06 22:32:13,461 epoch 7 - iter 81/275 - loss 0.04534983 - time (sec): 32.36 - samples/sec: 207.33 - lr: 0.000066 - momentum: 0.000000
2023-10-06 22:32:24,012 epoch 7 - iter 108/275 - loss 0.04423814 - time (sec): 42.91 - samples/sec: 205.74 - lr: 0.000064 - momentum: 0.000000
2023-10-06 22:32:34,380 epoch 7 - iter 135/275 - loss 0.04611314 - time (sec): 53.28 - samples/sec: 205.40 - lr: 0.000063 - momentum: 0.000000
2023-10-06 22:32:45,931 epoch 7 - iter 162/275 - loss 0.04893617 - time (sec): 64.83 - samples/sec: 206.47 - lr: 0.000061 - momentum: 0.000000
2023-10-06 22:32:56,333 epoch 7 - iter 189/275 - loss 0.05266054 - time (sec): 75.23 - samples/sec: 205.04 - lr: 0.000059 - momentum: 0.000000
2023-10-06 22:33:06,956 epoch 7 - iter 216/275 - loss 0.05442865 - time (sec): 85.85 - samples/sec: 205.12 - lr: 0.000058 - momentum: 0.000000
2023-10-06 22:33:17,494 epoch 7 - iter 243/275 - loss 0.05907969 - time (sec): 96.39 - samples/sec: 205.58 - lr: 0.000056 - momentum: 0.000000
2023-10-06 22:33:28,918 epoch 7 - iter 270/275 - loss 0.05681461 - time (sec): 107.82 - samples/sec: 206.86 - lr: 0.000054 - momentum: 0.000000
2023-10-06 22:33:31,093 ----------------------------------------------------------------------------------------------------
2023-10-06 22:33:31,093 EPOCH 7 done: loss 0.0578 - lr: 0.000054
2023-10-06 22:33:37,713 DEV : loss 0.11987826228141785 - f1-score (micro avg) 0.882
2023-10-06 22:33:37,719 saving best model
2023-10-06 22:33:38,646 ----------------------------------------------------------------------------------------------------
2023-10-06 22:33:49,334 epoch 8 - iter 27/275 - loss 0.03329545 - time (sec): 10.69 - samples/sec: 201.75 - lr: 0.000052 - momentum: 0.000000
2023-10-06 22:33:59,757 epoch 8 - iter 54/275 - loss 0.04592979 - time (sec): 21.11 - samples/sec: 204.50 - lr: 0.000050 - momentum: 0.000000
2023-10-06 22:34:10,984 epoch 8 - iter 81/275 - loss 0.05586859 - time (sec): 32.34 - samples/sec: 207.82 - lr: 0.000048 - momentum: 0.000000
2023-10-06 22:34:22,267 epoch 8 - iter 108/275 - loss 0.05455353 - time (sec): 43.62 - samples/sec: 208.92 - lr: 0.000047 - momentum: 0.000000
2023-10-06 22:34:33,165 epoch 8 - iter 135/275 - loss 0.04841264 - time (sec): 54.52 - samples/sec: 209.27 - lr: 0.000045 - momentum: 0.000000
2023-10-06 22:34:43,597 epoch 8 - iter 162/275 - loss 0.04875369 - time (sec): 64.95 - samples/sec: 208.67 - lr: 0.000043 - momentum: 0.000000
2023-10-06 22:34:54,186 epoch 8 - iter 189/275 - loss 0.04671139 - time (sec): 75.54 - samples/sec: 207.58 - lr: 0.000042 - momentum: 0.000000
2023-10-06 22:35:05,512 epoch 8 - iter 216/275 - loss 0.04691085 - time (sec): 86.86 - samples/sec: 208.67 - lr: 0.000040 - momentum: 0.000000
2023-10-06 22:35:15,731 epoch 8 - iter 243/275 - loss 0.04408575 - time (sec): 97.08 - samples/sec: 206.68 - lr: 0.000038 - momentum: 0.000000
2023-10-06 22:35:26,627 epoch 8 - iter 270/275 - loss 0.04172358 - time (sec): 107.98 - samples/sec: 206.96 - lr: 0.000036 - momentum: 0.000000
2023-10-06 22:35:28,725 ----------------------------------------------------------------------------------------------------
2023-10-06 22:35:28,725 EPOCH 8 done: loss 0.0429 - lr: 0.000036
2023-10-06 22:35:35,343 DEV : loss 0.12027797102928162 - f1-score (micro avg) 0.8854
2023-10-06 22:35:35,349 saving best model
2023-10-06 22:35:36,284 ----------------------------------------------------------------------------------------------------
2023-10-06 22:35:46,952 epoch 9 - iter 27/275 - loss 0.04986711 - time (sec): 10.67 - samples/sec: 215.71 - lr: 0.000034 - momentum: 0.000000
2023-10-06 22:35:58,646 epoch 9 - iter 54/275 - loss 0.03493711 - time (sec): 22.36 - samples/sec: 215.55 - lr: 0.000032 - momentum: 0.000000
2023-10-06 22:36:08,815 epoch 9 - iter 81/275 - loss 0.03697438 - time (sec): 32.53 - samples/sec: 207.97 - lr: 0.000031 - momentum: 0.000000
2023-10-06 22:36:19,397 epoch 9 - iter 108/275 - loss 0.03623976 - time (sec): 43.11 - samples/sec: 206.28 - lr: 0.000029 - momentum: 0.000000
2023-10-06 22:36:30,579 epoch 9 - iter 135/275 - loss 0.03334176 - time (sec): 54.29 - samples/sec: 207.21 - lr: 0.000027 - momentum: 0.000000
2023-10-06 22:36:41,808 epoch 9 - iter 162/275 - loss 0.03063102 - time (sec): 65.52 - samples/sec: 208.74 - lr: 0.000026 - momentum: 0.000000
2023-10-06 22:36:52,183 epoch 9 - iter 189/275 - loss 0.02832208 - time (sec): 75.90 - samples/sec: 207.31 - lr: 0.000024 - momentum: 0.000000
2023-10-06 22:37:02,502 epoch 9 - iter 216/275 - loss 0.03178635 - time (sec): 86.22 - samples/sec: 206.19 - lr: 0.000022 - momentum: 0.000000
2023-10-06 22:37:12,629 epoch 9 - iter 243/275 - loss 0.03567592 - time (sec): 96.34 - samples/sec: 206.17 - lr: 0.000020 - momentum: 0.000000
2023-10-06 22:37:24,066 epoch 9 - iter 270/275 - loss 0.03589895 - time (sec): 107.78 - samples/sec: 206.74 - lr: 0.000019 - momentum: 0.000000
2023-10-06 22:37:26,287 ----------------------------------------------------------------------------------------------------
2023-10-06 22:37:26,287 EPOCH 9 done: loss 0.0402 - lr: 0.000019
2023-10-06 22:37:32,922 DEV : loss 0.12215113639831543 - f1-score (micro avg) 0.8852
2023-10-06 22:37:32,928 ----------------------------------------------------------------------------------------------------
2023-10-06 22:37:43,513 epoch 10 - iter 27/275 - loss 0.03017664 - time (sec): 10.58 - samples/sec: 207.01 - lr: 0.000017 - momentum: 0.000000
2023-10-06 22:37:54,245 epoch 10 - iter 54/275 - loss 0.03648685 - time (sec): 21.32 - samples/sec: 208.34 - lr: 0.000015 - momentum: 0.000000
2023-10-06 22:38:04,499 epoch 10 - iter 81/275 - loss 0.03264491 - time (sec): 31.57 - samples/sec: 204.69 - lr: 0.000013 - momentum: 0.000000
2023-10-06 22:38:15,205 epoch 10 - iter 108/275 - loss 0.03814732 - time (sec): 42.28 - samples/sec: 203.19 - lr: 0.000011 - momentum: 0.000000
2023-10-06 22:38:27,605 epoch 10 - iter 135/275 - loss 0.04002682 - time (sec): 54.68 - samples/sec: 205.45 - lr: 0.000010 - momentum: 0.000000
2023-10-06 22:38:37,681 epoch 10 - iter 162/275 - loss 0.04109835 - time (sec): 64.75 - samples/sec: 205.91 - lr: 0.000008 - momentum: 0.000000
2023-10-06 22:38:48,711 epoch 10 - iter 189/275 - loss 0.03880280 - time (sec): 75.78 - samples/sec: 206.50 - lr: 0.000006 - momentum: 0.000000
2023-10-06 22:38:59,906 epoch 10 - iter 216/275 - loss 0.03568743 - time (sec): 86.98 - samples/sec: 207.39 - lr: 0.000004 - momentum: 0.000000
2023-10-06 22:39:10,843 epoch 10 - iter 243/275 - loss 0.03599762 - time (sec): 97.91 - samples/sec: 207.55 - lr: 0.000003 - momentum: 0.000000
2023-10-06 22:39:21,179 epoch 10 - iter 270/275 - loss 0.03470141 - time (sec): 108.25 - samples/sec: 206.87 - lr: 0.000001 - momentum: 0.000000
2023-10-06 22:39:22,963 ----------------------------------------------------------------------------------------------------
2023-10-06 22:39:22,963 EPOCH 10 done: loss 0.0351 - lr: 0.000001
2023-10-06 22:39:29,572 DEV : loss 0.12222656607627869 - f1-score (micro avg) 0.8913
2023-10-06 22:39:29,578 saving best model
2023-10-06 22:39:31,470 ----------------------------------------------------------------------------------------------------
2023-10-06 22:39:31,472 Loading model from best epoch ...
2023-10-06 22:39:34,130 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
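The 25 tags listed above are consistent with a BIOES-style scheme over six entity types plus the outside tag `O`, and with the `out_features=25` of the final linear layer. A short sketch that rebuilds the same list is shown here. The variable names are illustrative, and this is not how Flair constructs its tag dictionary internally.

```python
# Entity types in the order they appear in the logged dictionary.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]

# Prefix order as logged: Single, Begin, End, Inside.
prefixes = ["S", "B", "E", "I"]

# "O" first, then four prefixed tags per entity type.
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]

print(len(tags))  # 6 types * 4 prefixes + 1 outside tag
print(tags[:5])
```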
2023-10-06 22:39:41,345
Results:
- F-score (micro) 0.8903
- F-score (macro) 0.7282
- Accuracy 0.8175
By class:
              precision    recall  f1-score   support

       scope     0.8798    0.9148    0.8969       176
        pers     0.9302    0.9375    0.9339       128
        work     0.7848    0.8378    0.8105        74
         loc     1.0000    1.0000    1.0000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.8779    0.9031    0.8903       382
   macro avg     0.7190    0.7380    0.7282       382
weighted avg     0.8743    0.9031    0.8884       382
2023-10-06 22:39:41,346 ----------------------------------------------------------------------------------------------------
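As a sanity check, the aggregate rows of the table above can be approximately recomputed from the per-class rows. This is a sketch working from the rounded values in the log, so results agree only to about three decimal places. The micro-average precision is copied from the log rather than derived, because the per-class prediction counts are not reported.

```python
# (precision, recall, f1, support) per class, copied from the log.
rows = {
    "scope":  (0.8798, 0.9148, 0.8969, 176),
    "pers":   (0.9302, 0.9375, 0.9339, 128),
    "work":   (0.7848, 0.8378, 0.8105, 74),
    "loc":    (1.0000, 1.0000, 1.0000, 2),
    "object": (0.0000, 0.0000, 0.0000, 2),
}

total_support = sum(s for _, _, _, s in rows.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in rows.values()) / len(rows)

# Weighted F1: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total_support

# Micro recall: total true positives (recall * support) over total support.
micro_recall = sum(r * s for _, r, _, s in rows.values()) / total_support

# Micro precision is taken from the log (assumption: not recomputed here).
micro_precision = 0.8779
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The two classes with a support of 2 (`loc` and `object`) contribute as much to the macro average as `scope` with 176 mentions, which is why the macro F-score sits well below the micro F-score.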