2023-10-08 19:26:47,941 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,942 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-08 19:26:47,942 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,942 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 19:26:47,943 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,943 Train:  966 sentences
2023-10-08 19:26:47,943         (train_with_dev=False, train_with_test=False)
2023-10-08 19:26:47,943 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,943 Training Params:
2023-10-08 19:26:47,943  - learning_rate: "0.00015"
2023-10-08 19:26:47,943  - mini_batch_size: "4"
2023-10-08 19:26:47,943  - max_epochs: "10"
2023-10-08 19:26:47,943  - shuffle: "True"
2023-10-08 19:26:47,943 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,943 Plugins:
2023-10-08 19:26:47,943  - TensorboardLogger
2023-10-08 19:26:47,943  - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 19:26:47,943 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,943 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 19:26:47,943  - metric: "('micro avg', 'f1-score')"
2023-10-08 19:26:47,943 ----------------------------------------------------------------------------------------------------
2023-10-08 19:26:47,943 Computation:
2023-10-08 19:26:47,943  - compute on device: cuda:0
2023-10-08 19:26:47,943  - embedding storage: none
2023-10-08 19:26:47,944
---------------------------------------------------------------------------------------------------- 2023-10-08 19:26:47,944 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2" 2023-10-08 19:26:47,944 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:26:47,944 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:26:47,944 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-08 19:26:58,455 epoch 1 - iter 24/242 - loss 3.23018118 - time (sec): 10.51 - samples/sec: 248.15 - lr: 0.000014 - momentum: 0.000000 2023-10-08 19:27:09,462 epoch 1 - iter 48/242 - loss 3.21975537 - time (sec): 21.52 - samples/sec: 251.42 - lr: 0.000029 - momentum: 0.000000 2023-10-08 19:27:19,308 epoch 1 - iter 72/242 - loss 3.20292151 - time (sec): 31.36 - samples/sec: 246.41 - lr: 0.000044 - momentum: 0.000000 2023-10-08 19:27:28,606 epoch 1 - iter 96/242 - loss 3.16766256 - time (sec): 40.66 - samples/sec: 243.55 - lr: 0.000059 - momentum: 0.000000 2023-10-08 19:27:38,504 epoch 1 - iter 120/242 - loss 3.09045996 - time (sec): 50.56 - samples/sec: 242.69 - lr: 0.000074 - momentum: 0.000000 2023-10-08 19:27:48,252 epoch 1 - iter 144/242 - loss 2.99581041 - time (sec): 60.31 - samples/sec: 242.03 - lr: 0.000089 - momentum: 0.000000 2023-10-08 19:27:58,214 epoch 1 - iter 168/242 - loss 2.88863965 - time (sec): 70.27 - samples/sec: 243.22 - lr: 0.000104 - momentum: 0.000000 2023-10-08 19:28:08,332 epoch 1 - iter 192/242 - loss 2.77484047 - time (sec): 80.39 - samples/sec: 243.98 - lr: 0.000118 - momentum: 0.000000 2023-10-08 19:28:18,853 epoch 1 - iter 216/242 - loss 2.64647879 - time (sec): 90.91 - samples/sec: 245.39 - lr: 0.000133 - momentum: 0.000000 2023-10-08 19:28:28,397 epoch 1 - iter 240/242 - loss 2.53515433 - time 
(sec): 100.45 - samples/sec: 244.51 - lr: 0.000148 - momentum: 0.000000 2023-10-08 19:28:29,065 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:28:29,065 EPOCH 1 done: loss 2.5262 - lr: 0.000148 2023-10-08 19:28:35,558 DEV : loss 1.1793770790100098 - f1-score (micro avg) 0.0 2023-10-08 19:28:35,564 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:28:45,269 epoch 2 - iter 24/242 - loss 1.12998825 - time (sec): 9.70 - samples/sec: 239.30 - lr: 0.000148 - momentum: 0.000000 2023-10-08 19:28:55,077 epoch 2 - iter 48/242 - loss 1.02715380 - time (sec): 19.51 - samples/sec: 238.73 - lr: 0.000147 - momentum: 0.000000 2023-10-08 19:29:05,508 epoch 2 - iter 72/242 - loss 0.92096229 - time (sec): 29.94 - samples/sec: 241.96 - lr: 0.000145 - momentum: 0.000000 2023-10-08 19:29:15,624 epoch 2 - iter 96/242 - loss 0.85340030 - time (sec): 40.06 - samples/sec: 242.02 - lr: 0.000143 - momentum: 0.000000 2023-10-08 19:29:25,887 epoch 2 - iter 120/242 - loss 0.79456575 - time (sec): 50.32 - samples/sec: 241.19 - lr: 0.000142 - momentum: 0.000000 2023-10-08 19:29:35,994 epoch 2 - iter 144/242 - loss 0.75737283 - time (sec): 60.43 - samples/sec: 243.11 - lr: 0.000140 - momentum: 0.000000 2023-10-08 19:29:45,897 epoch 2 - iter 168/242 - loss 0.73321781 - time (sec): 70.33 - samples/sec: 243.22 - lr: 0.000139 - momentum: 0.000000 2023-10-08 19:29:55,775 epoch 2 - iter 192/242 - loss 0.70288592 - time (sec): 80.21 - samples/sec: 243.35 - lr: 0.000137 - momentum: 0.000000 2023-10-08 19:30:05,342 epoch 2 - iter 216/242 - loss 0.67740262 - time (sec): 89.78 - samples/sec: 242.26 - lr: 0.000135 - momentum: 0.000000 2023-10-08 19:30:16,100 epoch 2 - iter 240/242 - loss 0.63844289 - time (sec): 100.53 - samples/sec: 243.56 - lr: 0.000134 - momentum: 0.000000 2023-10-08 19:30:17,027 
---------------------------------------------------------------------------------------------------- 2023-10-08 19:30:17,028 EPOCH 2 done: loss 0.6350 - lr: 0.000134 2023-10-08 19:30:23,505 DEV : loss 0.4186439514160156 - f1-score (micro avg) 0.0 2023-10-08 19:30:23,511 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:30:33,891 epoch 3 - iter 24/242 - loss 0.34404676 - time (sec): 10.38 - samples/sec: 256.89 - lr: 0.000132 - momentum: 0.000000 2023-10-08 19:30:44,601 epoch 3 - iter 48/242 - loss 0.34909213 - time (sec): 21.09 - samples/sec: 252.93 - lr: 0.000130 - momentum: 0.000000 2023-10-08 19:30:54,764 epoch 3 - iter 72/242 - loss 0.34216476 - time (sec): 31.25 - samples/sec: 249.40 - lr: 0.000128 - momentum: 0.000000 2023-10-08 19:31:04,335 epoch 3 - iter 96/242 - loss 0.33454166 - time (sec): 40.82 - samples/sec: 245.13 - lr: 0.000127 - momentum: 0.000000 2023-10-08 19:31:13,985 epoch 3 - iter 120/242 - loss 0.31976787 - time (sec): 50.47 - samples/sec: 245.10 - lr: 0.000125 - momentum: 0.000000 2023-10-08 19:31:23,325 epoch 3 - iter 144/242 - loss 0.31813013 - time (sec): 59.81 - samples/sec: 242.72 - lr: 0.000124 - momentum: 0.000000 2023-10-08 19:31:33,554 epoch 3 - iter 168/242 - loss 0.31672677 - time (sec): 70.04 - samples/sec: 243.18 - lr: 0.000122 - momentum: 0.000000 2023-10-08 19:31:44,595 epoch 3 - iter 192/242 - loss 0.30297284 - time (sec): 81.08 - samples/sec: 243.89 - lr: 0.000120 - momentum: 0.000000 2023-10-08 19:31:54,407 epoch 3 - iter 216/242 - loss 0.29839779 - time (sec): 90.89 - samples/sec: 243.92 - lr: 0.000119 - momentum: 0.000000 2023-10-08 19:32:04,662 epoch 3 - iter 240/242 - loss 0.29490268 - time (sec): 101.15 - samples/sec: 243.17 - lr: 0.000117 - momentum: 0.000000 2023-10-08 19:32:05,329 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:32:05,329 EPOCH 3 done: loss 0.2953 - lr: 0.000117 2023-10-08 
19:32:11,891 DEV : loss 0.24679391086101532 - f1-score (micro avg) 0.5088 2023-10-08 19:32:11,897 saving best model 2023-10-08 19:32:12,776 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:32:23,503 epoch 4 - iter 24/242 - loss 0.17282296 - time (sec): 10.72 - samples/sec: 247.74 - lr: 0.000115 - momentum: 0.000000 2023-10-08 19:32:34,252 epoch 4 - iter 48/242 - loss 0.17292848 - time (sec): 21.47 - samples/sec: 245.60 - lr: 0.000113 - momentum: 0.000000 2023-10-08 19:32:43,965 epoch 4 - iter 72/242 - loss 0.18281416 - time (sec): 31.19 - samples/sec: 242.82 - lr: 0.000112 - momentum: 0.000000 2023-10-08 19:32:53,689 epoch 4 - iter 96/242 - loss 0.18598185 - time (sec): 40.91 - samples/sec: 240.67 - lr: 0.000110 - momentum: 0.000000 2023-10-08 19:33:03,446 epoch 4 - iter 120/242 - loss 0.19378349 - time (sec): 50.67 - samples/sec: 241.14 - lr: 0.000109 - momentum: 0.000000 2023-10-08 19:33:13,583 epoch 4 - iter 144/242 - loss 0.19081445 - time (sec): 60.81 - samples/sec: 242.81 - lr: 0.000107 - momentum: 0.000000 2023-10-08 19:33:24,027 epoch 4 - iter 168/242 - loss 0.19299211 - time (sec): 71.25 - samples/sec: 243.83 - lr: 0.000105 - momentum: 0.000000 2023-10-08 19:33:34,232 epoch 4 - iter 192/242 - loss 0.18829297 - time (sec): 81.45 - samples/sec: 244.29 - lr: 0.000104 - momentum: 0.000000 2023-10-08 19:33:43,781 epoch 4 - iter 216/242 - loss 0.18864321 - time (sec): 91.00 - samples/sec: 243.12 - lr: 0.000102 - momentum: 0.000000 2023-10-08 19:33:54,027 epoch 4 - iter 240/242 - loss 0.18335743 - time (sec): 101.25 - samples/sec: 243.20 - lr: 0.000100 - momentum: 0.000000 2023-10-08 19:33:54,562 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:33:54,562 EPOCH 4 done: loss 0.1836 - lr: 0.000100 2023-10-08 19:34:01,066 DEV : loss 0.16814066469669342 - f1-score (micro avg) 0.8 2023-10-08 19:34:01,072 saving best model 2023-10-08 
19:34:05,615 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:34:15,235 epoch 5 - iter 24/242 - loss 0.16839716 - time (sec): 9.62 - samples/sec: 241.31 - lr: 0.000098 - momentum: 0.000000 2023-10-08 19:34:25,705 epoch 5 - iter 48/242 - loss 0.15370578 - time (sec): 20.09 - samples/sec: 250.39 - lr: 0.000097 - momentum: 0.000000 2023-10-08 19:34:35,106 epoch 5 - iter 72/242 - loss 0.13876158 - time (sec): 29.49 - samples/sec: 246.30 - lr: 0.000095 - momentum: 0.000000 2023-10-08 19:34:45,685 epoch 5 - iter 96/242 - loss 0.13675425 - time (sec): 40.07 - samples/sec: 248.52 - lr: 0.000094 - momentum: 0.000000 2023-10-08 19:34:55,753 epoch 5 - iter 120/242 - loss 0.13381720 - time (sec): 50.14 - samples/sec: 245.91 - lr: 0.000092 - momentum: 0.000000 2023-10-08 19:35:06,439 epoch 5 - iter 144/242 - loss 0.12482283 - time (sec): 60.82 - samples/sec: 246.67 - lr: 0.000090 - momentum: 0.000000 2023-10-08 19:35:16,629 epoch 5 - iter 168/242 - loss 0.12576448 - time (sec): 71.01 - samples/sec: 246.30 - lr: 0.000089 - momentum: 0.000000 2023-10-08 19:35:26,601 epoch 5 - iter 192/242 - loss 0.12561961 - time (sec): 80.98 - samples/sec: 244.64 - lr: 0.000087 - momentum: 0.000000 2023-10-08 19:35:36,252 epoch 5 - iter 216/242 - loss 0.12596156 - time (sec): 90.64 - samples/sec: 243.32 - lr: 0.000085 - momentum: 0.000000 2023-10-08 19:35:46,683 epoch 5 - iter 240/242 - loss 0.12429901 - time (sec): 101.07 - samples/sec: 243.79 - lr: 0.000084 - momentum: 0.000000 2023-10-08 19:35:47,237 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:35:47,238 EPOCH 5 done: loss 0.1242 - lr: 0.000084 2023-10-08 19:35:53,752 DEV : loss 0.14256200194358826 - f1-score (micro avg) 0.8099 2023-10-08 19:35:53,758 saving best model 2023-10-08 19:35:58,133 ---------------------------------------------------------------------------------------------------- 2023-10-08 
19:36:07,737 epoch 6 - iter 24/242 - loss 0.07318159 - time (sec): 9.60 - samples/sec: 232.64 - lr: 0.000082 - momentum: 0.000000 2023-10-08 19:36:17,776 epoch 6 - iter 48/242 - loss 0.09547560 - time (sec): 19.64 - samples/sec: 237.75 - lr: 0.000080 - momentum: 0.000000 2023-10-08 19:36:28,058 epoch 6 - iter 72/242 - loss 0.10091463 - time (sec): 29.92 - samples/sec: 241.65 - lr: 0.000079 - momentum: 0.000000 2023-10-08 19:36:37,251 epoch 6 - iter 96/242 - loss 0.09709764 - time (sec): 39.12 - samples/sec: 239.95 - lr: 0.000077 - momentum: 0.000000 2023-10-08 19:36:47,708 epoch 6 - iter 120/242 - loss 0.09869626 - time (sec): 49.57 - samples/sec: 241.46 - lr: 0.000075 - momentum: 0.000000 2023-10-08 19:36:57,206 epoch 6 - iter 144/242 - loss 0.09876091 - time (sec): 59.07 - samples/sec: 241.10 - lr: 0.000074 - momentum: 0.000000 2023-10-08 19:37:07,362 epoch 6 - iter 168/242 - loss 0.09649570 - time (sec): 69.23 - samples/sec: 241.90 - lr: 0.000072 - momentum: 0.000000 2023-10-08 19:37:18,206 epoch 6 - iter 192/242 - loss 0.09659304 - time (sec): 80.07 - samples/sec: 243.96 - lr: 0.000070 - momentum: 0.000000 2023-10-08 19:37:28,339 epoch 6 - iter 216/242 - loss 0.09258049 - time (sec): 90.20 - samples/sec: 243.64 - lr: 0.000069 - momentum: 0.000000 2023-10-08 19:37:39,031 epoch 6 - iter 240/242 - loss 0.09096943 - time (sec): 100.90 - samples/sec: 243.94 - lr: 0.000067 - momentum: 0.000000 2023-10-08 19:37:39,654 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:37:39,654 EPOCH 6 done: loss 0.0910 - lr: 0.000067 2023-10-08 19:37:46,246 DEV : loss 0.1273980289697647 - f1-score (micro avg) 0.8089 2023-10-08 19:37:46,252 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:37:57,076 epoch 7 - iter 24/242 - loss 0.06871926 - time (sec): 10.82 - samples/sec: 245.24 - lr: 0.000065 - momentum: 0.000000 2023-10-08 19:38:07,511 epoch 7 - iter 
48/242 - loss 0.07730926 - time (sec): 21.26 - samples/sec: 248.48 - lr: 0.000064 - momentum: 0.000000 2023-10-08 19:38:17,953 epoch 7 - iter 72/242 - loss 0.07575126 - time (sec): 31.70 - samples/sec: 248.53 - lr: 0.000062 - momentum: 0.000000 2023-10-08 19:38:28,429 epoch 7 - iter 96/242 - loss 0.07265611 - time (sec): 42.17 - samples/sec: 249.94 - lr: 0.000060 - momentum: 0.000000 2023-10-08 19:38:37,550 epoch 7 - iter 120/242 - loss 0.07085807 - time (sec): 51.30 - samples/sec: 250.31 - lr: 0.000059 - momentum: 0.000000 2023-10-08 19:38:47,618 epoch 7 - iter 144/242 - loss 0.07431527 - time (sec): 61.36 - samples/sec: 253.47 - lr: 0.000057 - momentum: 0.000000 2023-10-08 19:38:57,283 epoch 7 - iter 168/242 - loss 0.07243790 - time (sec): 71.03 - samples/sec: 254.16 - lr: 0.000055 - momentum: 0.000000 2023-10-08 19:39:06,063 epoch 7 - iter 192/242 - loss 0.06893922 - time (sec): 79.81 - samples/sec: 253.08 - lr: 0.000054 - momentum: 0.000000 2023-10-08 19:39:15,560 epoch 7 - iter 216/242 - loss 0.06858497 - time (sec): 89.31 - samples/sec: 253.08 - lr: 0.000052 - momentum: 0.000000 2023-10-08 19:39:24,274 epoch 7 - iter 240/242 - loss 0.06623735 - time (sec): 98.02 - samples/sec: 251.23 - lr: 0.000050 - momentum: 0.000000 2023-10-08 19:39:24,809 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:39:24,809 EPOCH 7 done: loss 0.0663 - lr: 0.000050 2023-10-08 19:39:30,675 DEV : loss 0.12801626324653625 - f1-score (micro avg) 0.8209 2023-10-08 19:39:30,681 saving best model 2023-10-08 19:39:35,384 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:39:44,381 epoch 8 - iter 24/242 - loss 0.05398769 - time (sec): 9.00 - samples/sec: 256.90 - lr: 0.000049 - momentum: 0.000000 2023-10-08 19:39:53,342 epoch 8 - iter 48/242 - loss 0.05644280 - time (sec): 17.96 - samples/sec: 257.96 - lr: 0.000047 - momentum: 0.000000 2023-10-08 19:40:02,980 epoch 
8 - iter 72/242 - loss 0.05735985 - time (sec): 27.59 - samples/sec: 261.97 - lr: 0.000045 - momentum: 0.000000 2023-10-08 19:40:12,506 epoch 8 - iter 96/242 - loss 0.05615888 - time (sec): 37.12 - samples/sec: 262.09 - lr: 0.000044 - momentum: 0.000000 2023-10-08 19:40:22,214 epoch 8 - iter 120/242 - loss 0.05155077 - time (sec): 46.83 - samples/sec: 262.58 - lr: 0.000042 - momentum: 0.000000 2023-10-08 19:40:30,895 epoch 8 - iter 144/242 - loss 0.05489311 - time (sec): 55.51 - samples/sec: 260.68 - lr: 0.000040 - momentum: 0.000000 2023-10-08 19:40:40,449 epoch 8 - iter 168/242 - loss 0.05635034 - time (sec): 65.06 - samples/sec: 260.91 - lr: 0.000039 - momentum: 0.000000 2023-10-08 19:40:50,423 epoch 8 - iter 192/242 - loss 0.05629495 - time (sec): 75.04 - samples/sec: 262.65 - lr: 0.000037 - momentum: 0.000000 2023-10-08 19:40:59,910 epoch 8 - iter 216/242 - loss 0.05612159 - time (sec): 84.52 - samples/sec: 262.30 - lr: 0.000035 - momentum: 0.000000 2023-10-08 19:41:09,170 epoch 8 - iter 240/242 - loss 0.05374507 - time (sec): 93.79 - samples/sec: 261.34 - lr: 0.000034 - momentum: 0.000000 2023-10-08 19:41:09,952 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:41:09,952 EPOCH 8 done: loss 0.0535 - lr: 0.000034 2023-10-08 19:41:15,913 DEV : loss 0.13416925072669983 - f1-score (micro avg) 0.8165 2023-10-08 19:41:15,919 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:41:25,146 epoch 9 - iter 24/242 - loss 0.06955313 - time (sec): 9.23 - samples/sec: 262.11 - lr: 0.000032 - momentum: 0.000000 2023-10-08 19:41:34,589 epoch 9 - iter 48/242 - loss 0.05268402 - time (sec): 18.67 - samples/sec: 261.24 - lr: 0.000030 - momentum: 0.000000 2023-10-08 19:41:44,695 epoch 9 - iter 72/242 - loss 0.04529954 - time (sec): 28.77 - samples/sec: 266.66 - lr: 0.000029 - momentum: 0.000000 2023-10-08 19:41:54,189 epoch 9 - iter 96/242 - loss 
0.04194193 - time (sec): 38.27 - samples/sec: 265.41 - lr: 0.000027 - momentum: 0.000000 2023-10-08 19:42:03,055 epoch 9 - iter 120/242 - loss 0.04197999 - time (sec): 47.13 - samples/sec: 263.16 - lr: 0.000025 - momentum: 0.000000 2023-10-08 19:42:12,389 epoch 9 - iter 144/242 - loss 0.04208191 - time (sec): 56.47 - samples/sec: 261.83 - lr: 0.000024 - momentum: 0.000000 2023-10-08 19:42:21,985 epoch 9 - iter 168/242 - loss 0.03968387 - time (sec): 66.06 - samples/sec: 260.79 - lr: 0.000022 - momentum: 0.000000 2023-10-08 19:42:31,126 epoch 9 - iter 192/242 - loss 0.04070292 - time (sec): 75.21 - samples/sec: 260.71 - lr: 0.000020 - momentum: 0.000000 2023-10-08 19:42:40,599 epoch 9 - iter 216/242 - loss 0.04310635 - time (sec): 84.68 - samples/sec: 260.77 - lr: 0.000019 - momentum: 0.000000 2023-10-08 19:42:50,301 epoch 9 - iter 240/242 - loss 0.04357004 - time (sec): 94.38 - samples/sec: 260.87 - lr: 0.000017 - momentum: 0.000000 2023-10-08 19:42:50,830 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:42:50,830 EPOCH 9 done: loss 0.0434 - lr: 0.000017 2023-10-08 19:42:56,751 DEV : loss 0.13480180501937866 - f1-score (micro avg) 0.8331 2023-10-08 19:42:56,758 saving best model 2023-10-08 19:43:01,147 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:43:10,893 epoch 10 - iter 24/242 - loss 0.03780505 - time (sec): 9.74 - samples/sec: 272.37 - lr: 0.000015 - momentum: 0.000000 2023-10-08 19:43:19,978 epoch 10 - iter 48/242 - loss 0.03456896 - time (sec): 18.83 - samples/sec: 265.59 - lr: 0.000014 - momentum: 0.000000 2023-10-08 19:43:29,159 epoch 10 - iter 72/242 - loss 0.03411245 - time (sec): 28.01 - samples/sec: 260.93 - lr: 0.000012 - momentum: 0.000000 2023-10-08 19:43:38,241 epoch 10 - iter 96/242 - loss 0.03540146 - time (sec): 37.09 - samples/sec: 255.58 - lr: 0.000010 - momentum: 0.000000 2023-10-08 19:43:48,044 epoch 10 - iter 
120/242 - loss 0.03421286 - time (sec): 46.90 - samples/sec: 256.51 - lr: 0.000009 - momentum: 0.000000 2023-10-08 19:43:56,972 epoch 10 - iter 144/242 - loss 0.03425299 - time (sec): 55.82 - samples/sec: 255.16 - lr: 0.000007 - momentum: 0.000000 2023-10-08 19:44:06,515 epoch 10 - iter 168/242 - loss 0.03345010 - time (sec): 65.37 - samples/sec: 254.63 - lr: 0.000005 - momentum: 0.000000 2023-10-08 19:44:16,793 epoch 10 - iter 192/242 - loss 0.03486005 - time (sec): 75.64 - samples/sec: 255.06 - lr: 0.000004 - momentum: 0.000000 2023-10-08 19:44:27,150 epoch 10 - iter 216/242 - loss 0.03874650 - time (sec): 86.00 - samples/sec: 256.60 - lr: 0.000002 - momentum: 0.000000 2023-10-08 19:44:37,226 epoch 10 - iter 240/242 - loss 0.03982743 - time (sec): 96.08 - samples/sec: 255.16 - lr: 0.000000 - momentum: 0.000000 2023-10-08 19:44:38,048 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:44:38,048 EPOCH 10 done: loss 0.0397 - lr: 0.000000 2023-10-08 19:44:44,215 DEV : loss 0.1376410871744156 - f1-score (micro avg) 0.8352 2023-10-08 19:44:44,221 saving best model 2023-10-08 19:44:49,445 ---------------------------------------------------------------------------------------------------- 2023-10-08 19:44:49,446 Loading model from best epoch ... 
2023-10-08 19:44:53,145 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 19:44:59,472
Results:
- F-score (micro) 0.7936
- F-score (macro) 0.4008
- Accuracy 0.6866

By class:
              precision    recall  f1-score   support

        pers     0.8207    0.8561    0.8380       139
       scope     0.8357    0.9070    0.8699       129
        work     0.6327    0.7750    0.6966        80
         loc     0.0000    0.0000    0.0000         9
      object     0.0000    0.0000    0.0000         0
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7621    0.8278    0.7936       360
   macro avg     0.3815    0.4230    0.4008       360
weighted avg     0.7569    0.8278    0.7901       360

2023-10-08 19:44:59,472 ----------------------------------------------------------------------------------------------------