2023-10-06 21:40:48,284 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,285 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-06 21:40:48,285 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,285 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-06 21:40:48,286 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,286 Train: 1100 sentences
2023-10-06 21:40:48,286 (train_with_dev=False, train_with_test=False)
2023-10-06 21:40:48,286 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,286 Training Params:
2023-10-06 21:40:48,286 - learning_rate: "0.00016"
2023-10-06 21:40:48,286 - mini_batch_size: "8"
2023-10-06 21:40:48,286 - max_epochs: "10"
2023-10-06 21:40:48,286 - shuffle: "True"
2023-10-06 21:40:48,286 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,286 Plugins:
2023-10-06 21:40:48,286 - TensorboardLogger
2023-10-06 21:40:48,286 - LinearScheduler | warmup_fraction: '0.1'
2023-10-06 21:40:48,286 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,286 Final evaluation on model from best epoch (best-model.pt)
2023-10-06 21:40:48,286 - metric: "('micro avg', 'f1-score')"
2023-10-06 21:40:48,286 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,286 Computation:
2023-10-06 21:40:48,286 - compute on device: cuda:0
2023-10-06 21:40:48,287 - embedding storage: none
2023-10-06 21:40:48,287 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,287 Model training base path: "hmbench-ajmc/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-06 21:40:48,287 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,287 ----------------------------------------------------------------------------------------------------
2023-10-06 21:40:48,287 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-06 21:40:57,559 epoch 1 - iter 13/138 - loss 3.22932241 - time (sec): 9.27 - samples/sec: 230.62 - lr: 0.000014 - momentum: 0.000000
2023-10-06 21:41:07,327 epoch 1 - iter 26/138 - loss 3.22496377 - time (sec): 19.04 - samples/sec: 224.28 - lr: 0.000029 - momentum: 0.000000
2023-10-06 21:41:16,428 epoch 1 - iter 39/138 - loss 3.21594068 - time (sec): 28.14 - samples/sec: 219.65 - lr: 0.000044 - momentum: 0.000000
2023-10-06 21:41:25,744 epoch 1 - iter 52/138 - loss 3.20171907 - time (sec): 37.46 - samples/sec: 218.47 - lr: 0.000059 - momentum: 0.000000
2023-10-06 21:41:36,624 epoch 1 - iter 65/138 - loss 3.16881497 - time (sec): 48.34 - samples/sec: 221.68 - lr: 0.000074 - momentum: 0.000000
2023-10-06 21:41:46,773 epoch 1 - iter 78/138 - loss 3.11163081 - time (sec): 58.48 - samples/sec: 223.37 - lr: 0.000089 - momentum: 0.000000
2023-10-06 21:41:56,659 epoch 1 - iter 91/138 - loss 3.04972079 - time (sec): 68.37 - samples/sec: 223.09 - lr: 0.000104 - momentum: 0.000000
2023-10-06 21:42:06,249 epoch 1 - iter 104/138 - loss 2.97412228 - time (sec): 77.96 - samples/sec: 223.07 - lr: 0.000119 - momentum: 0.000000
2023-10-06 21:42:16,033 epoch 1 - iter 117/138 - loss 2.89301074 - time (sec): 87.74 - samples/sec: 223.03 - lr: 0.000134 - momentum: 0.000000
2023-10-06 21:42:25,040 epoch 1 - iter 130/138 - loss 2.82504112 - time (sec): 96.75 - samples/sec: 222.68 - lr: 0.000150 - momentum: 0.000000
2023-10-06 21:42:30,771 ----------------------------------------------------------------------------------------------------
2023-10-06 21:42:30,772 EPOCH 1 done: loss 2.7769 - lr: 0.000150
2023-10-06 21:42:37,535 DEV : loss 1.7709752321243286 - f1-score (micro avg) 0.0
2023-10-06 21:42:37,541 ----------------------------------------------------------------------------------------------------
2023-10-06 21:42:47,080 epoch 2 - iter 13/138 - loss 1.70465968 - time (sec): 9.54 - samples/sec: 224.57 - lr: 0.000158 - momentum: 0.000000
2023-10-06 21:42:57,077 epoch 2 - iter 26/138 - loss 1.57513086 - time (sec): 19.53 - samples/sec: 222.02 - lr: 0.000157 - momentum: 0.000000
2023-10-06 21:43:06,413 epoch 2 - iter 39/138 - loss 1.51682688 - time (sec): 28.87 - samples/sec: 221.16 - lr: 0.000155 - momentum: 0.000000
2023-10-06 21:43:15,371 epoch 2 - iter 52/138 - loss 1.43139706 - time (sec): 37.83 - samples/sec: 217.53 - lr: 0.000153 - momentum: 0.000000
2023-10-06 21:43:24,766 epoch 2 - iter 65/138 - loss 1.35600385 - time (sec): 47.22 - samples/sec: 217.26 - lr: 0.000152 - momentum: 0.000000
2023-10-06 21:43:34,661 epoch 2 - iter 78/138 - loss 1.29510452 - time (sec): 57.12 - samples/sec: 217.62 - lr: 0.000150 - momentum: 0.000000
2023-10-06 21:43:44,696 epoch 2 - iter 91/138 - loss 1.23627642 - time (sec): 67.15 - samples/sec: 219.26 - lr: 0.000148 - momentum: 0.000000
2023-10-06 21:43:54,956 epoch 2 - iter 104/138 - loss 1.18744407 - time (sec): 77.41 - samples/sec: 220.17 - lr: 0.000147 - momentum: 0.000000
2023-10-06 21:44:04,455 epoch 2 - iter 117/138 - loss 1.15075822 - time (sec): 86.91 - samples/sec: 220.82 - lr: 0.000145 - momentum: 0.000000
2023-10-06 21:44:14,750 epoch 2 - iter 130/138 - loss 1.11339905 - time (sec): 97.21 - samples/sec: 222.22 - lr: 0.000143 - momentum: 0.000000
2023-10-06 21:44:20,016 ----------------------------------------------------------------------------------------------------
2023-10-06 21:44:20,016 EPOCH 2 done: loss 1.0983 - lr: 0.000143
2023-10-06 21:44:26,738 DEV : loss 0.7268592715263367 - f1-score (micro avg) 0.0
2023-10-06 21:44:26,744 ----------------------------------------------------------------------------------------------------
2023-10-06 21:44:36,396 epoch 3 - iter 13/138 - loss 0.70907147 - time (sec): 9.65 - samples/sec: 223.08 - lr: 0.000141 - momentum: 0.000000
2023-10-06 21:44:46,642 epoch 3 - iter 26/138 - loss 0.64621440 - time (sec): 19.90 - samples/sec: 226.52 - lr: 0.000139 - momentum: 0.000000
2023-10-06 21:44:56,414 epoch 3 - iter 39/138 - loss 0.63420933 - time (sec): 29.67 - samples/sec: 226.60 - lr: 0.000137 - momentum: 0.000000
2023-10-06 21:45:06,826 epoch 3 - iter 52/138 - loss 0.61482481 - time (sec): 40.08 - samples/sec: 227.07 - lr: 0.000136 - momentum: 0.000000
2023-10-06 21:45:15,817 epoch 3 - iter 65/138 - loss 0.60997011 - time (sec): 49.07 - samples/sec: 226.14 - lr: 0.000134 - momentum: 0.000000
2023-10-06 21:45:26,124 epoch 3 - iter 78/138 - loss 0.59932463 - time (sec): 59.38 - samples/sec: 226.83 - lr: 0.000132 - momentum: 0.000000
2023-10-06 21:45:35,137 epoch 3 - iter 91/138 - loss 0.58378814 - time (sec): 68.39 - samples/sec: 224.50 - lr: 0.000131 - momentum: 0.000000
2023-10-06 21:45:44,232 epoch 3 - iter 104/138 - loss 0.56431660 - time (sec): 77.49 - samples/sec: 223.01 - lr: 0.000129 - momentum: 0.000000
2023-10-06 21:45:53,869 epoch 3 - iter 117/138 - loss 0.54913062 - time (sec): 87.12 - samples/sec: 222.36 - lr: 0.000127 - momentum: 0.000000
2023-10-06 21:46:03,803 epoch 3 - iter 130/138 - loss 0.52734165 - time (sec): 97.06 - samples/sec: 222.20 - lr: 0.000126 - momentum: 0.000000
2023-10-06 21:46:09,423 ----------------------------------------------------------------------------------------------------
2023-10-06 21:46:09,423 EPOCH 3 done: loss 0.5189 - lr: 0.000126
2023-10-06 21:46:16,125 DEV : loss 0.38523873686790466 - f1-score (micro avg) 0.489
2023-10-06 21:46:16,131 saving best model
2023-10-06 21:46:17,033 ----------------------------------------------------------------------------------------------------
2023-10-06 21:46:27,861 epoch 4 - iter 13/138 - loss 0.32840705 - time (sec): 10.83 - samples/sec: 229.53 - lr: 0.000123 - momentum: 0.000000
2023-10-06 21:46:37,706 epoch 4 - iter 26/138 - loss 0.31706666 - time (sec): 20.67 - samples/sec: 224.32 - lr: 0.000121 - momentum: 0.000000
2023-10-06 21:46:46,686 epoch 4 - iter 39/138 - loss 0.32419045 - time (sec): 29.65 - samples/sec: 219.45 - lr: 0.000120 - momentum: 0.000000
2023-10-06 21:46:56,167 epoch 4 - iter 52/138 - loss 0.32908592 - time (sec): 39.13 - samples/sec: 221.28 - lr: 0.000118 - momentum: 0.000000
2023-10-06 21:47:05,997 epoch 4 - iter 65/138 - loss 0.32326386 - time (sec): 48.96 - samples/sec: 221.68 - lr: 0.000116 - momentum: 0.000000
2023-10-06 21:47:16,087 epoch 4 - iter 78/138 - loss 0.32849866 - time (sec): 59.05 - samples/sec: 222.94 - lr: 0.000115 - momentum: 0.000000
2023-10-06 21:47:25,799 epoch 4 - iter 91/138 - loss 0.31853506 - time (sec): 68.76 - samples/sec: 222.11 - lr: 0.000113 - momentum: 0.000000
2023-10-06 21:47:35,244 epoch 4 - iter 104/138 - loss 0.31529818 - time (sec): 78.21 - samples/sec: 221.90 - lr: 0.000111 - momentum: 0.000000
2023-10-06 21:47:44,556 epoch 4 - iter 117/138 - loss 0.31233610 - time (sec): 87.52 - samples/sec: 222.39 - lr: 0.000110 - momentum: 0.000000
2023-10-06 21:47:53,872 epoch 4 - iter 130/138 - loss 0.30543028 - time (sec): 96.84 - samples/sec: 221.67 - lr: 0.000108 - momentum: 0.000000
2023-10-06 21:47:59,648 ----------------------------------------------------------------------------------------------------
2023-10-06 21:47:59,649 EPOCH 4 done: loss 0.3000 - lr: 0.000108
2023-10-06 21:48:06,361 DEV : loss 0.24843597412109375 - f1-score (micro avg) 0.6997
2023-10-06 21:48:06,367 saving best model
2023-10-06 21:48:07,281 ----------------------------------------------------------------------------------------------------
2023-10-06 21:48:17,052 epoch 5 - iter 13/138 - loss 0.23847167 - time (sec): 9.77 - samples/sec: 216.80 - lr: 0.000105 - momentum: 0.000000
2023-10-06 21:48:26,728 epoch 5 - iter 26/138 - loss 0.24511983 - time (sec): 19.45 - samples/sec: 218.92 - lr: 0.000104 - momentum: 0.000000
2023-10-06 21:48:36,023 epoch 5 - iter 39/138 - loss 0.24670887 - time (sec): 28.74 - samples/sec: 219.24 - lr: 0.000102 - momentum: 0.000000
2023-10-06 21:48:46,778 epoch 5 - iter 52/138 - loss 0.23014874 - time (sec): 39.49 - samples/sec: 223.98 - lr: 0.000100 - momentum: 0.000000
2023-10-06 21:48:56,955 epoch 5 - iter 65/138 - loss 0.22490150 - time (sec): 49.67 - samples/sec: 224.11 - lr: 0.000099 - momentum: 0.000000
2023-10-06 21:49:06,877 epoch 5 - iter 78/138 - loss 0.21726106 - time (sec): 59.59 - samples/sec: 223.18 - lr: 0.000097 - momentum: 0.000000
2023-10-06 21:49:17,317 epoch 5 - iter 91/138 - loss 0.20425902 - time (sec): 70.03 - samples/sec: 222.68 - lr: 0.000095 - momentum: 0.000000
2023-10-06 21:49:26,535 epoch 5 - iter 104/138 - loss 0.19921248 - time (sec): 79.25 - samples/sec: 223.20 - lr: 0.000094 - momentum: 0.000000
2023-10-06 21:49:35,724 epoch 5 - iter 117/138 - loss 0.19801591 - time (sec): 88.44 - samples/sec: 222.34 - lr: 0.000092 - momentum: 0.000000
2023-10-06 21:49:44,843 epoch 5 - iter 130/138 - loss 0.19298194 - time (sec): 97.56 - samples/sec: 222.42 - lr: 0.000090 - momentum: 0.000000
2023-10-06 21:49:49,977 ----------------------------------------------------------------------------------------------------
2023-10-06 21:49:49,978 EPOCH 5 done: loss 0.1936 - lr: 0.000090
2023-10-06 21:49:56,718 DEV : loss 0.174832284450531 - f1-score (micro avg) 0.8195
2023-10-06 21:49:56,724 saving best model
2023-10-06 21:49:57,626 ----------------------------------------------------------------------------------------------------
2023-10-06 21:50:07,606 epoch 6 - iter 13/138 - loss 0.16131818 - time (sec): 9.98 - samples/sec: 227.71 - lr: 0.000088 - momentum: 0.000000
2023-10-06 21:50:16,783 epoch 6 - iter 26/138 - loss 0.15133612 - time (sec): 19.15 - samples/sec: 222.61 - lr: 0.000086 - momentum: 0.000000
2023-10-06 21:50:26,529 epoch 6 - iter 39/138 - loss 0.14751188 - time (sec): 28.90 - samples/sec: 224.42 - lr: 0.000084 - momentum: 0.000000
2023-10-06 21:50:36,588 epoch 6 - iter 52/138 - loss 0.14004272 - time (sec): 38.96 - samples/sec: 222.56 - lr: 0.000083 - momentum: 0.000000
2023-10-06 21:50:46,407 epoch 6 - iter 65/138 - loss 0.13322328 - time (sec): 48.78 - samples/sec: 222.35 - lr: 0.000081 - momentum: 0.000000
2023-10-06 21:50:56,197 epoch 6 - iter 78/138 - loss 0.12399005 - time (sec): 58.57 - samples/sec: 221.69 - lr: 0.000079 - momentum: 0.000000
2023-10-06 21:51:06,185 epoch 6 - iter 91/138 - loss 0.13276856 - time (sec): 68.56 - samples/sec: 223.70 - lr: 0.000077 - momentum: 0.000000
2023-10-06 21:51:15,657 epoch 6 - iter 104/138 - loss 0.13175238 - time (sec): 78.03 - samples/sec: 224.01 - lr: 0.000076 - momentum: 0.000000
2023-10-06 21:51:25,316 epoch 6 - iter 117/138 - loss 0.12953164 - time (sec): 87.69 - samples/sec: 223.29 - lr: 0.000074 - momentum: 0.000000
2023-10-06 21:51:34,559 epoch 6 - iter 130/138 - loss 0.12830343 - time (sec): 96.93 - samples/sec: 223.10 - lr: 0.000072 - momentum: 0.000000
2023-10-06 21:51:40,136 ----------------------------------------------------------------------------------------------------
2023-10-06 21:51:40,137 EPOCH 6 done: loss 0.1278 - lr: 0.000072
2023-10-06 21:51:46,829 DEV : loss 0.1418776512145996 - f1-score (micro avg) 0.8502
2023-10-06 21:51:46,835 saving best model
2023-10-06 21:51:47,751 ----------------------------------------------------------------------------------------------------
2023-10-06 21:51:56,984 epoch 7 - iter 13/138 - loss 0.09202341 - time (sec): 9.23 - samples/sec: 216.86 - lr: 0.000070 - momentum: 0.000000
2023-10-06 21:52:06,248 epoch 7 - iter 26/138 - loss 0.08962437 - time (sec): 18.50 - samples/sec: 213.72 - lr: 0.000068 - momentum: 0.000000
2023-10-06 21:52:15,978 epoch 7 - iter 39/138 - loss 0.10050212 - time (sec): 28.23 - samples/sec: 217.22 - lr: 0.000066 - momentum: 0.000000
2023-10-06 21:52:25,953 epoch 7 - iter 52/138 - loss 0.08973982 - time (sec): 38.20 - samples/sec: 221.41 - lr: 0.000065 - momentum: 0.000000
2023-10-06 21:52:36,149 epoch 7 - iter 65/138 - loss 0.08788279 - time (sec): 48.40 - samples/sec: 222.91 - lr: 0.000063 - momentum: 0.000000
2023-10-06 21:52:45,504 epoch 7 - iter 78/138 - loss 0.09283149 - time (sec): 57.75 - samples/sec: 221.05 - lr: 0.000061 - momentum: 0.000000
2023-10-06 21:52:55,369 epoch 7 - iter 91/138 - loss 0.09156409 - time (sec): 67.62 - samples/sec: 220.85 - lr: 0.000060 - momentum: 0.000000
2023-10-06 21:53:05,592 epoch 7 - iter 104/138 - loss 0.08895562 - time (sec): 77.84 - samples/sec: 222.67 - lr: 0.000058 - momentum: 0.000000
2023-10-06 21:53:15,214 epoch 7 - iter 117/138 - loss 0.08941310 - time (sec): 87.46 - samples/sec: 223.02 - lr: 0.000056 - momentum: 0.000000
2023-10-06 21:53:24,914 epoch 7 - iter 130/138 - loss 0.09673186 - time (sec): 97.16 - samples/sec: 223.19 - lr: 0.000055 - momentum: 0.000000
2023-10-06 21:53:30,238 ----------------------------------------------------------------------------------------------------
2023-10-06 21:53:30,238 EPOCH 7 done: loss 0.0949 - lr: 0.000055
2023-10-06 21:53:36,915 DEV : loss 0.12556645274162292 - f1-score (micro avg) 0.8589
2023-10-06 21:53:36,921 saving best model
2023-10-06 21:53:37,851 ----------------------------------------------------------------------------------------------------
2023-10-06 21:53:46,858 epoch 8 - iter 13/138 - loss 0.07149320 - time (sec): 9.01 - samples/sec: 208.42 - lr: 0.000052 - momentum: 0.000000
2023-10-06 21:53:56,170 epoch 8 - iter 26/138 - loss 0.09203937 - time (sec): 18.32 - samples/sec: 219.57 - lr: 0.000050 - momentum: 0.000000
2023-10-06 21:54:05,229 epoch 8 - iter 39/138 - loss 0.08711760 - time (sec): 27.38 - samples/sec: 219.31 - lr: 0.000049 - momentum: 0.000000
2023-10-06 21:54:15,492 epoch 8 - iter 52/138 - loss 0.08100056 - time (sec): 37.64 - samples/sec: 223.09 - lr: 0.000047 - momentum: 0.000000
2023-10-06 21:54:24,878 epoch 8 - iter 65/138 - loss 0.08426377 - time (sec): 47.03 - samples/sec: 220.58 - lr: 0.000045 - momentum: 0.000000
2023-10-06 21:54:35,039 epoch 8 - iter 78/138 - loss 0.08500926 - time (sec): 57.19 - samples/sec: 221.35 - lr: 0.000044 - momentum: 0.000000
2023-10-06 21:54:44,080 epoch 8 - iter 91/138 - loss 0.08055090 - time (sec): 66.23 - samples/sec: 221.41 - lr: 0.000042 - momentum: 0.000000
2023-10-06 21:54:54,543 epoch 8 - iter 104/138 - loss 0.07418720 - time (sec): 76.69 - samples/sec: 222.26 - lr: 0.000040 - momentum: 0.000000
2023-10-06 21:55:04,243 epoch 8 - iter 117/138 - loss 0.07711477 - time (sec): 86.39 - samples/sec: 222.77 - lr: 0.000039 - momentum: 0.000000
2023-10-06 21:55:14,378 epoch 8 - iter 130/138 - loss 0.07679258 - time (sec): 96.53 - samples/sec: 223.16 - lr: 0.000037 - momentum: 0.000000
2023-10-06 21:55:19,925 ----------------------------------------------------------------------------------------------------
2023-10-06 21:55:19,925 EPOCH 8 done: loss 0.0748 - lr: 0.000037
2023-10-06 21:55:26,567 DEV : loss 0.12296636402606964 - f1-score (micro avg) 0.8554
2023-10-06 21:55:26,573 ----------------------------------------------------------------------------------------------------
2023-10-06 21:55:36,064 epoch 9 - iter 13/138 - loss 0.08957227 - time (sec): 9.49 - samples/sec: 218.99 - lr: 0.000034 - momentum: 0.000000
2023-10-06 21:55:45,358 epoch 9 - iter 26/138 - loss 0.08908673 - time (sec): 18.78 - samples/sec: 221.26 - lr: 0.000033 - momentum: 0.000000
2023-10-06 21:55:54,919 epoch 9 - iter 39/138 - loss 0.08531472 - time (sec): 28.34 - samples/sec: 220.78 - lr: 0.000031 - momentum: 0.000000
2023-10-06 21:56:04,767 epoch 9 - iter 52/138 - loss 0.08120974 - time (sec): 38.19 - samples/sec: 221.25 - lr: 0.000029 - momentum: 0.000000
2023-10-06 21:56:14,503 epoch 9 - iter 65/138 - loss 0.08248220 - time (sec): 47.93 - samples/sec: 221.23 - lr: 0.000028 - momentum: 0.000000
2023-10-06 21:56:24,005 epoch 9 - iter 78/138 - loss 0.08099284 - time (sec): 57.43 - samples/sec: 222.62 - lr: 0.000026 - momentum: 0.000000
2023-10-06 21:56:34,944 epoch 9 - iter 91/138 - loss 0.07394146 - time (sec): 68.37 - samples/sec: 223.61 - lr: 0.000024 - momentum: 0.000000
2023-10-06 21:56:44,292 epoch 9 - iter 104/138 - loss 0.07115356 - time (sec): 77.72 - samples/sec: 222.87 - lr: 0.000023 - momentum: 0.000000
2023-10-06 21:56:53,548 epoch 9 - iter 117/138 - loss 0.06724952 - time (sec): 86.97 - samples/sec: 222.18 - lr: 0.000021 - momentum: 0.000000
2023-10-06 21:57:03,367 epoch 9 - iter 130/138 - loss 0.06453547 - time (sec): 96.79 - samples/sec: 222.79 - lr: 0.000019 - momentum: 0.000000
2023-10-06 21:57:08,812 ----------------------------------------------------------------------------------------------------
2023-10-06 21:57:08,813 EPOCH 9 done: loss 0.0652 - lr: 0.000019
2023-10-06 21:57:15,461 DEV : loss 0.11786513775587082 - f1-score (micro avg) 0.8575
2023-10-06 21:57:15,467 ----------------------------------------------------------------------------------------------------
2023-10-06 21:57:25,317 epoch 10 - iter 13/138 - loss 0.10447337 - time (sec): 9.85 - samples/sec: 221.66 - lr: 0.000017 - momentum: 0.000000
2023-10-06 21:57:34,946 epoch 10 - iter 26/138 - loss 0.08395709 - time (sec): 19.48 - samples/sec: 221.59 - lr: 0.000015 - momentum: 0.000000
2023-10-06 21:57:44,037 epoch 10 - iter 39/138 - loss 0.07476411 - time (sec): 28.57 - samples/sec: 220.35 - lr: 0.000013 - momentum: 0.000000
2023-10-06 21:57:54,257 epoch 10 - iter 52/138 - loss 0.07381145 - time (sec): 38.79 - samples/sec: 223.37 - lr: 0.000012 - momentum: 0.000000
2023-10-06 21:58:03,637 epoch 10 - iter 65/138 - loss 0.06455602 - time (sec): 48.17 - samples/sec: 221.54 - lr: 0.000010 - momentum: 0.000000
2023-10-06 21:58:13,312 epoch 10 - iter 78/138 - loss 0.06474742 - time (sec): 57.84 - samples/sec: 221.77 - lr: 0.000008 - momentum: 0.000000
2023-10-06 21:58:22,926 epoch 10 - iter 91/138 - loss 0.06013250 - time (sec): 67.46 - samples/sec: 220.99 - lr: 0.000007 - momentum: 0.000000
2023-10-06 21:58:32,039 epoch 10 - iter 104/138 - loss 0.06197331 - time (sec): 76.57 - samples/sec: 221.11 - lr: 0.000005 - momentum: 0.000000
2023-10-06 21:58:42,259 epoch 10 - iter 117/138 - loss 0.06140372 - time (sec): 86.79 - samples/sec: 222.49 - lr: 0.000003 - momentum: 0.000000
2023-10-06 21:58:51,604 epoch 10 - iter 130/138 - loss 0.06060334 - time (sec): 96.14 - samples/sec: 222.74 - lr: 0.000002 - momentum: 0.000000
2023-10-06 21:58:57,560 ----------------------------------------------------------------------------------------------------
2023-10-06 21:58:57,560 EPOCH 10 done: loss 0.0608 - lr: 0.000002
2023-10-06 21:59:04,228 DEV : loss 0.1174774318933487 - f1-score (micro avg) 0.8602
2023-10-06 21:59:04,234 saving best model
2023-10-06 21:59:06,124 ----------------------------------------------------------------------------------------------------
2023-10-06 21:59:06,126 Loading model from best epoch ...
2023-10-06 21:59:08,826 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-06 21:59:16,032
Results:
- F-score (micro) 0.8828
- F-score (macro) 0.5272
- Accuracy 0.8129
By class:
              precision    recall  f1-score   support

       scope     0.8827    0.8977    0.8901       176
        pers     0.8955    0.9375    0.9160       128
        work     0.8356    0.8243    0.8299        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.8782    0.8874    0.8828       382
   macro avg     0.5228    0.5319    0.5272       382
weighted avg     0.8686    0.8874    0.8778       382
2023-10-06 21:59:16,032 ----------------------------------------------------------------------------------------------------