2023-10-06 15:14:54,549 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,550 Model: "SequenceTagger( (embeddings): ByT5Embeddings( (model): T5EncoderModel( (shared): Embedding(384, 1472) (encoder): T5Stack( (embed_tokens): Embedding(384, 1472) (block): ModuleList( (0): T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) (relative_attention_bias): Embedding(32, 6) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) (1-11): 11 x T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) 
(final_layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) (locked_dropout): LockedDropout(p=0.5) (linear): Linear(in_features=1472, out_features=25, bias=True) (loss_function): CrossEntropyLoss() )" 2023-10-06 15:14:54,550 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,551 MultiCorpus: 1214 train + 266 dev + 251 test sentences - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator 2023-10-06 15:14:54,551 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,551 Train: 1214 sentences 2023-10-06 15:14:54,551 (train_with_dev=False, train_with_test=False) 2023-10-06 15:14:54,551 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,551 Training Params: 2023-10-06 15:14:54,551 - learning_rate: "0.00015" 2023-10-06 15:14:54,551 - mini_batch_size: "4" 2023-10-06 15:14:54,551 - max_epochs: "10" 2023-10-06 15:14:54,551 - shuffle: "True" 2023-10-06 15:14:54,551 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,551 Plugins: 2023-10-06 15:14:54,551 - TensorboardLogger 2023-10-06 15:14:54,551 - LinearScheduler | warmup_fraction: '0.1' 2023-10-06 15:14:54,551 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,551 Final evaluation on model from best epoch (best-model.pt) 2023-10-06 15:14:54,551 - metric: "('micro avg', 'f1-score')" 2023-10-06 15:14:54,551 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,552 Computation: 2023-10-06 15:14:54,552 - compute on device: cuda:0 2023-10-06 15:14:54,552 - embedding storage: none 2023-10-06 15:14:54,552 
---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,552 Model training base path: "hmbench-ajmc/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5" 2023-10-06 15:14:54,552 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,552 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:14:54,552 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-06 15:15:06,477 epoch 1 - iter 30/304 - loss 3.23343998 - time (sec): 11.92 - samples/sec: 268.11 - lr: 0.000014 - momentum: 0.000000 2023-10-06 15:15:17,847 epoch 1 - iter 60/304 - loss 3.22288360 - time (sec): 23.29 - samples/sec: 264.57 - lr: 0.000029 - momentum: 0.000000 2023-10-06 15:15:29,460 epoch 1 - iter 90/304 - loss 3.19705274 - time (sec): 34.91 - samples/sec: 256.68 - lr: 0.000044 - momentum: 0.000000 2023-10-06 15:15:41,241 epoch 1 - iter 120/304 - loss 3.13184691 - time (sec): 46.69 - samples/sec: 255.61 - lr: 0.000059 - momentum: 0.000000 2023-10-06 15:15:52,963 epoch 1 - iter 150/304 - loss 3.03341121 - time (sec): 58.41 - samples/sec: 254.94 - lr: 0.000074 - momentum: 0.000000 2023-10-06 15:16:05,924 epoch 1 - iter 180/304 - loss 2.90838872 - time (sec): 71.37 - samples/sec: 256.59 - lr: 0.000088 - momentum: 0.000000 2023-10-06 15:16:16,857 epoch 1 - iter 210/304 - loss 2.80542255 - time (sec): 82.30 - samples/sec: 253.58 - lr: 0.000103 - momentum: 0.000000 2023-10-06 15:16:29,245 epoch 1 - iter 240/304 - loss 2.66521471 - time (sec): 94.69 - samples/sec: 253.98 - lr: 0.000118 - momentum: 0.000000 2023-10-06 15:16:41,369 epoch 1 - iter 270/304 - loss 2.51887914 - time (sec): 106.82 - samples/sec: 254.87 - lr: 0.000133 - momentum: 0.000000 2023-10-06 15:16:53,851 epoch 1 - iter 300/304 - loss 2.36748077 - 
time (sec): 119.30 - samples/sec: 256.24 - lr: 0.000148 - momentum: 0.000000 2023-10-06 15:16:55,437 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:16:55,437 EPOCH 1 done: loss 2.3481 - lr: 0.000148 2023-10-06 15:17:03,270 DEV : loss 0.9383919835090637 - f1-score (micro avg) 0.0 2023-10-06 15:17:03,278 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:17:15,100 epoch 2 - iter 30/304 - loss 0.83648571 - time (sec): 11.82 - samples/sec: 259.23 - lr: 0.000148 - momentum: 0.000000 2023-10-06 15:17:26,764 epoch 2 - iter 60/304 - loss 0.76649649 - time (sec): 23.48 - samples/sec: 259.62 - lr: 0.000147 - momentum: 0.000000 2023-10-06 15:17:39,176 epoch 2 - iter 90/304 - loss 0.74619072 - time (sec): 35.90 - samples/sec: 259.50 - lr: 0.000145 - momentum: 0.000000 2023-10-06 15:17:51,335 epoch 2 - iter 120/304 - loss 0.70922269 - time (sec): 48.06 - samples/sec: 260.51 - lr: 0.000143 - momentum: 0.000000 2023-10-06 15:18:02,189 epoch 2 - iter 150/304 - loss 0.66968302 - time (sec): 58.91 - samples/sec: 255.92 - lr: 0.000142 - momentum: 0.000000 2023-10-06 15:18:13,839 epoch 2 - iter 180/304 - loss 0.61907359 - time (sec): 70.56 - samples/sec: 256.62 - lr: 0.000140 - momentum: 0.000000 2023-10-06 15:18:26,033 epoch 2 - iter 210/304 - loss 0.59912542 - time (sec): 82.75 - samples/sec: 255.28 - lr: 0.000139 - momentum: 0.000000 2023-10-06 15:18:38,038 epoch 2 - iter 240/304 - loss 0.56612844 - time (sec): 94.76 - samples/sec: 256.12 - lr: 0.000137 - momentum: 0.000000 2023-10-06 15:18:50,269 epoch 2 - iter 270/304 - loss 0.53713759 - time (sec): 106.99 - samples/sec: 256.87 - lr: 0.000135 - momentum: 0.000000 2023-10-06 15:19:02,163 epoch 2 - iter 300/304 - loss 0.51203772 - time (sec): 118.88 - samples/sec: 257.54 - lr: 0.000134 - momentum: 0.000000 2023-10-06 15:19:03,653 
---------------------------------------------------------------------------------------------------- 2023-10-06 15:19:03,653 EPOCH 2 done: loss 0.5116 - lr: 0.000134 2023-10-06 15:19:11,497 DEV : loss 0.3360927402973175 - f1-score (micro avg) 0.4839 2023-10-06 15:19:11,505 saving best model 2023-10-06 15:19:12,492 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:19:24,035 epoch 3 - iter 30/304 - loss 0.34846495 - time (sec): 11.54 - samples/sec: 250.32 - lr: 0.000132 - momentum: 0.000000 2023-10-06 15:19:35,707 epoch 3 - iter 60/304 - loss 0.29775882 - time (sec): 23.21 - samples/sec: 252.45 - lr: 0.000130 - momentum: 0.000000 2023-10-06 15:19:46,823 epoch 3 - iter 90/304 - loss 0.27016891 - time (sec): 34.33 - samples/sec: 250.66 - lr: 0.000128 - momentum: 0.000000 2023-10-06 15:19:59,541 epoch 3 - iter 120/304 - loss 0.26645099 - time (sec): 47.05 - samples/sec: 254.47 - lr: 0.000127 - momentum: 0.000000 2023-10-06 15:20:11,289 epoch 3 - iter 150/304 - loss 0.25301882 - time (sec): 58.80 - samples/sec: 253.29 - lr: 0.000125 - momentum: 0.000000 2023-10-06 15:20:23,403 epoch 3 - iter 180/304 - loss 0.24885049 - time (sec): 70.91 - samples/sec: 253.23 - lr: 0.000124 - momentum: 0.000000 2023-10-06 15:20:35,584 epoch 3 - iter 210/304 - loss 0.23877391 - time (sec): 83.09 - samples/sec: 253.89 - lr: 0.000122 - momentum: 0.000000 2023-10-06 15:20:47,834 epoch 3 - iter 240/304 - loss 0.23063199 - time (sec): 95.34 - samples/sec: 254.60 - lr: 0.000120 - momentum: 0.000000 2023-10-06 15:20:59,711 epoch 3 - iter 270/304 - loss 0.22124203 - time (sec): 107.22 - samples/sec: 254.68 - lr: 0.000119 - momentum: 0.000000 2023-10-06 15:21:11,967 epoch 3 - iter 300/304 - loss 0.21317194 - time (sec): 119.47 - samples/sec: 255.63 - lr: 0.000117 - momentum: 0.000000 2023-10-06 15:21:13,656 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:21:13,656 
EPOCH 3 done: loss 0.2127 - lr: 0.000117 2023-10-06 15:21:21,677 DEV : loss 0.19596518576145172 - f1-score (micro avg) 0.6897 2023-10-06 15:21:21,685 saving best model 2023-10-06 15:21:25,879 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:21:38,154 epoch 4 - iter 30/304 - loss 0.16356280 - time (sec): 12.27 - samples/sec: 257.95 - lr: 0.000115 - momentum: 0.000000 2023-10-06 15:21:49,440 epoch 4 - iter 60/304 - loss 0.15394459 - time (sec): 23.56 - samples/sec: 251.28 - lr: 0.000113 - momentum: 0.000000 2023-10-06 15:22:01,355 epoch 4 - iter 90/304 - loss 0.14506530 - time (sec): 35.47 - samples/sec: 252.52 - lr: 0.000112 - momentum: 0.000000 2023-10-06 15:22:13,307 epoch 4 - iter 120/304 - loss 0.13973248 - time (sec): 47.43 - samples/sec: 253.42 - lr: 0.000110 - momentum: 0.000000 2023-10-06 15:22:25,445 epoch 4 - iter 150/304 - loss 0.13837821 - time (sec): 59.57 - samples/sec: 253.71 - lr: 0.000109 - momentum: 0.000000 2023-10-06 15:22:37,347 epoch 4 - iter 180/304 - loss 0.13092858 - time (sec): 71.47 - samples/sec: 254.19 - lr: 0.000107 - momentum: 0.000000 2023-10-06 15:22:48,932 epoch 4 - iter 210/304 - loss 0.12807551 - time (sec): 83.05 - samples/sec: 253.80 - lr: 0.000105 - momentum: 0.000000 2023-10-06 15:23:01,772 epoch 4 - iter 240/304 - loss 0.12378548 - time (sec): 95.89 - samples/sec: 256.80 - lr: 0.000104 - momentum: 0.000000 2023-10-06 15:23:14,397 epoch 4 - iter 270/304 - loss 0.12188333 - time (sec): 108.52 - samples/sec: 257.16 - lr: 0.000102 - momentum: 0.000000 2023-10-06 15:23:25,691 epoch 4 - iter 300/304 - loss 0.11711521 - time (sec): 119.81 - samples/sec: 255.83 - lr: 0.000100 - momentum: 0.000000 2023-10-06 15:23:27,076 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:23:27,077 EPOCH 4 done: loss 0.1165 - lr: 0.000100 2023-10-06 15:23:34,950 DEV : loss 0.14813771843910217 - f1-score (micro avg) 0.7958 
2023-10-06 15:23:34,958 saving best model 2023-10-06 15:23:39,268 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:23:51,238 epoch 5 - iter 30/304 - loss 0.05189033 - time (sec): 11.97 - samples/sec: 256.91 - lr: 0.000098 - momentum: 0.000000 2023-10-06 15:24:02,832 epoch 5 - iter 60/304 - loss 0.07692114 - time (sec): 23.56 - samples/sec: 257.49 - lr: 0.000097 - momentum: 0.000000 2023-10-06 15:24:15,080 epoch 5 - iter 90/304 - loss 0.07768443 - time (sec): 35.81 - samples/sec: 258.38 - lr: 0.000095 - momentum: 0.000000 2023-10-06 15:24:27,053 epoch 5 - iter 120/304 - loss 0.07680150 - time (sec): 47.78 - samples/sec: 259.83 - lr: 0.000094 - momentum: 0.000000 2023-10-06 15:24:39,247 epoch 5 - iter 150/304 - loss 0.08172290 - time (sec): 59.98 - samples/sec: 260.31 - lr: 0.000092 - momentum: 0.000000 2023-10-06 15:24:50,853 epoch 5 - iter 180/304 - loss 0.07660559 - time (sec): 71.58 - samples/sec: 259.08 - lr: 0.000090 - momentum: 0.000000 2023-10-06 15:25:03,175 epoch 5 - iter 210/304 - loss 0.07248568 - time (sec): 83.91 - samples/sec: 259.14 - lr: 0.000089 - momentum: 0.000000 2023-10-06 15:25:15,136 epoch 5 - iter 240/304 - loss 0.07334292 - time (sec): 95.87 - samples/sec: 259.80 - lr: 0.000087 - momentum: 0.000000 2023-10-06 15:25:26,971 epoch 5 - iter 270/304 - loss 0.07069816 - time (sec): 107.70 - samples/sec: 258.59 - lr: 0.000085 - momentum: 0.000000 2023-10-06 15:25:38,568 epoch 5 - iter 300/304 - loss 0.07232378 - time (sec): 119.30 - samples/sec: 257.86 - lr: 0.000084 - momentum: 0.000000 2023-10-06 15:25:39,696 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:25:39,697 EPOCH 5 done: loss 0.0719 - lr: 0.000084 2023-10-06 15:25:47,674 DEV : loss 0.14675813913345337 - f1-score (micro avg) 0.7995 2023-10-06 15:25:47,683 saving best model 2023-10-06 15:25:52,000 
---------------------------------------------------------------------------------------------------- 2023-10-06 15:26:03,984 epoch 6 - iter 30/304 - loss 0.08138966 - time (sec): 11.98 - samples/sec: 265.72 - lr: 0.000082 - momentum: 0.000000 2023-10-06 15:26:16,058 epoch 6 - iter 60/304 - loss 0.06957171 - time (sec): 24.06 - samples/sec: 267.62 - lr: 0.000080 - momentum: 0.000000 2023-10-06 15:26:27,991 epoch 6 - iter 90/304 - loss 0.06031100 - time (sec): 35.99 - samples/sec: 270.50 - lr: 0.000079 - momentum: 0.000000 2023-10-06 15:26:39,420 epoch 6 - iter 120/304 - loss 0.05820154 - time (sec): 47.42 - samples/sec: 270.86 - lr: 0.000077 - momentum: 0.000000 2023-10-06 15:26:50,563 epoch 6 - iter 150/304 - loss 0.05286555 - time (sec): 58.56 - samples/sec: 271.87 - lr: 0.000075 - momentum: 0.000000 2023-10-06 15:27:00,994 epoch 6 - iter 180/304 - loss 0.04973384 - time (sec): 68.99 - samples/sec: 269.38 - lr: 0.000074 - momentum: 0.000000 2023-10-06 15:27:12,623 epoch 6 - iter 210/304 - loss 0.05049495 - time (sec): 80.62 - samples/sec: 270.82 - lr: 0.000072 - momentum: 0.000000 2023-10-06 15:27:23,331 epoch 6 - iter 240/304 - loss 0.05100559 - time (sec): 91.33 - samples/sec: 270.37 - lr: 0.000070 - momentum: 0.000000 2023-10-06 15:27:34,415 epoch 6 - iter 270/304 - loss 0.04940749 - time (sec): 102.41 - samples/sec: 269.52 - lr: 0.000069 - momentum: 0.000000 2023-10-06 15:27:45,754 epoch 6 - iter 300/304 - loss 0.05137662 - time (sec): 113.75 - samples/sec: 269.68 - lr: 0.000067 - momentum: 0.000000 2023-10-06 15:27:46,967 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:27:46,968 EPOCH 6 done: loss 0.0525 - lr: 0.000067 2023-10-06 15:27:54,072 DEV : loss 0.1478758007287979 - f1-score (micro avg) 0.8151 2023-10-06 15:27:54,080 saving best model 2023-10-06 15:27:58,409 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:28:09,207 
epoch 7 - iter 30/304 - loss 0.02944793 - time (sec): 10.80 - samples/sec: 265.64 - lr: 0.000065 - momentum: 0.000000 2023-10-06 15:28:19,973 epoch 7 - iter 60/304 - loss 0.03347190 - time (sec): 21.56 - samples/sec: 266.85 - lr: 0.000063 - momentum: 0.000000 2023-10-06 15:28:31,176 epoch 7 - iter 90/304 - loss 0.02834258 - time (sec): 32.77 - samples/sec: 267.36 - lr: 0.000062 - momentum: 0.000000 2023-10-06 15:28:42,314 epoch 7 - iter 120/304 - loss 0.03031835 - time (sec): 43.90 - samples/sec: 269.43 - lr: 0.000060 - momentum: 0.000000 2023-10-06 15:28:53,456 epoch 7 - iter 150/304 - loss 0.02974656 - time (sec): 55.04 - samples/sec: 269.16 - lr: 0.000059 - momentum: 0.000000 2023-10-06 15:29:05,155 epoch 7 - iter 180/304 - loss 0.03371158 - time (sec): 66.74 - samples/sec: 271.20 - lr: 0.000057 - momentum: 0.000000 2023-10-06 15:29:16,354 epoch 7 - iter 210/304 - loss 0.03208932 - time (sec): 77.94 - samples/sec: 272.03 - lr: 0.000055 - momentum: 0.000000 2023-10-06 15:29:27,622 epoch 7 - iter 240/304 - loss 0.03349765 - time (sec): 89.21 - samples/sec: 270.88 - lr: 0.000054 - momentum: 0.000000 2023-10-06 15:29:39,306 epoch 7 - iter 270/304 - loss 0.03330430 - time (sec): 100.90 - samples/sec: 272.61 - lr: 0.000052 - momentum: 0.000000 2023-10-06 15:29:50,864 epoch 7 - iter 300/304 - loss 0.04026473 - time (sec): 112.45 - samples/sec: 272.22 - lr: 0.000050 - momentum: 0.000000 2023-10-06 15:29:52,226 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:29:52,226 EPOCH 7 done: loss 0.0398 - lr: 0.000050 2023-10-06 15:29:59,692 DEV : loss 0.15869873762130737 - f1-score (micro avg) 0.8254 2023-10-06 15:29:59,700 saving best model 2023-10-06 15:30:04,025 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:30:15,233 epoch 8 - iter 30/304 - loss 0.04093775 - time (sec): 11.21 - samples/sec: 270.27 - lr: 0.000048 - momentum: 0.000000 2023-10-06 
15:30:26,471 epoch 8 - iter 60/304 - loss 0.04151933 - time (sec): 22.44 - samples/sec: 266.16 - lr: 0.000047 - momentum: 0.000000 2023-10-06 15:30:37,321 epoch 8 - iter 90/304 - loss 0.03716224 - time (sec): 33.30 - samples/sec: 261.21 - lr: 0.000045 - momentum: 0.000000 2023-10-06 15:30:48,960 epoch 8 - iter 120/304 - loss 0.03713580 - time (sec): 44.93 - samples/sec: 262.01 - lr: 0.000044 - momentum: 0.000000 2023-10-06 15:31:01,047 epoch 8 - iter 150/304 - loss 0.03630827 - time (sec): 57.02 - samples/sec: 262.85 - lr: 0.000042 - momentum: 0.000000 2023-10-06 15:31:12,792 epoch 8 - iter 180/304 - loss 0.03460352 - time (sec): 68.77 - samples/sec: 261.06 - lr: 0.000040 - momentum: 0.000000 2023-10-06 15:31:25,105 epoch 8 - iter 210/304 - loss 0.03214098 - time (sec): 81.08 - samples/sec: 261.36 - lr: 0.000039 - momentum: 0.000000 2023-10-06 15:31:37,569 epoch 8 - iter 240/304 - loss 0.03134757 - time (sec): 93.54 - samples/sec: 262.97 - lr: 0.000037 - momentum: 0.000000 2023-10-06 15:31:49,741 epoch 8 - iter 270/304 - loss 0.03276907 - time (sec): 105.72 - samples/sec: 262.73 - lr: 0.000035 - momentum: 0.000000 2023-10-06 15:32:00,861 epoch 8 - iter 300/304 - loss 0.03405317 - time (sec): 116.83 - samples/sec: 261.69 - lr: 0.000034 - momentum: 0.000000 2023-10-06 15:32:02,424 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:32:02,424 EPOCH 8 done: loss 0.0339 - lr: 0.000034 2023-10-06 15:32:09,779 DEV : loss 0.1572294384241104 - f1-score (micro avg) 0.837 2023-10-06 15:32:09,791 saving best model 2023-10-06 15:32:14,497 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:32:26,418 epoch 9 - iter 30/304 - loss 0.03926648 - time (sec): 11.92 - samples/sec: 283.50 - lr: 0.000032 - momentum: 0.000000 2023-10-06 15:32:37,728 epoch 9 - iter 60/304 - loss 0.03292966 - time (sec): 23.23 - samples/sec: 282.53 - lr: 0.000030 - momentum: 0.000000 
2023-10-06 15:32:48,745 epoch 9 - iter 90/304 - loss 0.03106427 - time (sec): 34.25 - samples/sec: 278.04 - lr: 0.000029 - momentum: 0.000000 2023-10-06 15:32:59,485 epoch 9 - iter 120/304 - loss 0.03029571 - time (sec): 44.99 - samples/sec: 275.64 - lr: 0.000027 - momentum: 0.000000 2023-10-06 15:33:10,627 epoch 9 - iter 150/304 - loss 0.02641133 - time (sec): 56.13 - samples/sec: 273.14 - lr: 0.000025 - momentum: 0.000000 2023-10-06 15:33:22,017 epoch 9 - iter 180/304 - loss 0.02607911 - time (sec): 67.52 - samples/sec: 273.51 - lr: 0.000024 - momentum: 0.000000 2023-10-06 15:33:33,639 epoch 9 - iter 210/304 - loss 0.02502963 - time (sec): 79.14 - samples/sec: 273.97 - lr: 0.000022 - momentum: 0.000000 2023-10-06 15:33:44,669 epoch 9 - iter 240/304 - loss 0.02840878 - time (sec): 90.17 - samples/sec: 272.65 - lr: 0.000020 - momentum: 0.000000 2023-10-06 15:33:55,629 epoch 9 - iter 270/304 - loss 0.02606547 - time (sec): 101.13 - samples/sec: 272.82 - lr: 0.000019 - momentum: 0.000000 2023-10-06 15:34:06,757 epoch 9 - iter 300/304 - loss 0.02663716 - time (sec): 112.26 - samples/sec: 272.69 - lr: 0.000017 - momentum: 0.000000 2023-10-06 15:34:08,047 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:34:08,047 EPOCH 9 done: loss 0.0270 - lr: 0.000017 2023-10-06 15:34:15,070 DEV : loss 0.16326496005058289 - f1-score (micro avg) 0.8359 2023-10-06 15:34:15,077 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:34:25,865 epoch 10 - iter 30/304 - loss 0.03134249 - time (sec): 10.79 - samples/sec: 266.42 - lr: 0.000015 - momentum: 0.000000 2023-10-06 15:34:38,025 epoch 10 - iter 60/304 - loss 0.02757941 - time (sec): 22.95 - samples/sec: 277.85 - lr: 0.000014 - momentum: 0.000000 2023-10-06 15:34:49,214 epoch 10 - iter 90/304 - loss 0.02547701 - time (sec): 34.14 - samples/sec: 274.40 - lr: 0.000012 - momentum: 0.000000 2023-10-06 15:35:00,419 
epoch 10 - iter 120/304 - loss 0.02355033 - time (sec): 45.34 - samples/sec: 272.55 - lr: 0.000010 - momentum: 0.000000 2023-10-06 15:35:11,177 epoch 10 - iter 150/304 - loss 0.02498782 - time (sec): 56.10 - samples/sec: 269.65 - lr: 0.000009 - momentum: 0.000000 2023-10-06 15:35:22,614 epoch 10 - iter 180/304 - loss 0.02443102 - time (sec): 67.54 - samples/sec: 270.51 - lr: 0.000007 - momentum: 0.000000 2023-10-06 15:35:33,818 epoch 10 - iter 210/304 - loss 0.02205130 - time (sec): 78.74 - samples/sec: 267.78 - lr: 0.000005 - momentum: 0.000000 2023-10-06 15:35:45,708 epoch 10 - iter 240/304 - loss 0.02222944 - time (sec): 90.63 - samples/sec: 268.27 - lr: 0.000004 - momentum: 0.000000 2023-10-06 15:35:57,393 epoch 10 - iter 270/304 - loss 0.02155449 - time (sec): 102.31 - samples/sec: 268.74 - lr: 0.000002 - momentum: 0.000000 2023-10-06 15:36:09,165 epoch 10 - iter 300/304 - loss 0.02313283 - time (sec): 114.09 - samples/sec: 269.33 - lr: 0.000000 - momentum: 0.000000 2023-10-06 15:36:10,383 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:36:10,383 EPOCH 10 done: loss 0.0237 - lr: 0.000000 2023-10-06 15:36:18,002 DEV : loss 0.16279256343841553 - f1-score (micro avg) 0.8302 2023-10-06 15:36:18,831 ---------------------------------------------------------------------------------------------------- 2023-10-06 15:36:18,832 Loading model from best epoch ... 
2023-10-06 15:36:21,516 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-06 15:36:28,563
Results:
- F-score (micro) 0.8104
- F-score (macro) 0.6512
- Accuracy 0.686

By class:
              precision    recall  f1-score   support

       scope     0.7707    0.8013    0.7857       151
        work     0.7321    0.8632    0.7923        95
        pers     0.8257    0.9375    0.8780        96
         loc     1.0000    0.6667    0.8000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7763    0.8477    0.8104       348
   macro avg     0.6657    0.6537    0.6512       348
weighted avg     0.7707    0.8477    0.8063       348

2023-10-06 15:36:28,563 ----------------------------------------------------------------------------------------------------
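
Note on the averages reported above: the following is a minimal sketch, not part of the logged run, that recomputes the micro, macro, and weighted F1 scores from the per-class figures in the final report. The variable and function names (`rows`, `f1`, and so on) are illustrative assumptions. The micro F1 is taken here as the harmonic mean of the logged micro-averaged precision and recall, since the underlying true-positive counts are not in the log.

```python
# Per-class figures copied from the "By class" report:
# label -> (precision, recall, f1, support)
rows = {
    "scope": (0.7707, 0.8013, 0.7857, 151),
    "work":  (0.7321, 0.8632, 0.7923, 95),
    "pers":  (0.8257, 0.9375, 0.8780, 96),
    "loc":   (1.0000, 0.6667, 0.8000, 3),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

# Micro-averaged precision and recall as logged (raw counts are not available).
micro_p, micro_r = 0.7763, 0.8477


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_f1 = f1(micro_p, micro_r)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in rows.values()) / len(rows)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in rows.values())
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The macro score is pulled well below the micro score because `loc` and `date` each have only 3 test mentions, and `date` scores 0.0 while counting as a full fifth of the macro average.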