2023-10-06 13:08:08,875 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,877 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-06 13:08:08,877 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,877 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-06 13:08:08,877 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,877 Train: 1214 sentences
2023-10-06 13:08:08,877 (train_with_dev=False, train_with_test=False)
2023-10-06 13:08:08,877 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,877 Training Params:
2023-10-06 13:08:08,877 - learning_rate: "0.00015"
2023-10-06 13:08:08,877 - mini_batch_size: "8"
2023-10-06 13:08:08,877 - max_epochs: "10"
2023-10-06 13:08:08,877 - shuffle: "True"
2023-10-06 13:08:08,877 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,878 Plugins:
2023-10-06 13:08:08,878 - TensorboardLogger
2023-10-06 13:08:08,878 - LinearScheduler | warmup_fraction: '0.1'
2023-10-06 13:08:08,878 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,878 Final evaluation on model from best epoch (best-model.pt)
2023-10-06 13:08:08,878 - metric: "('micro avg', 'f1-score')"
2023-10-06 13:08:08,878 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,878 Computation:
2023-10-06 13:08:08,878 - compute on device: cuda:0
2023-10-06 13:08:08,878 - embedding storage: none
2023-10-06 13:08:08,878 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,878 Model training base path: "hmbench-ajmc/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
2023-10-06 13:08:08,878 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,878 ----------------------------------------------------------------------------------------------------
2023-10-06 13:08:08,878 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-06 13:08:19,434 epoch 1 - iter 15/152 - loss 3.21438031 - time (sec): 10.55 - samples/sec: 279.61 - lr: 0.000014 - momentum: 0.000000
2023-10-06 13:08:30,150 epoch 1 - iter 30/152 - loss 3.20540351 - time (sec): 21.27 - samples/sec: 280.91 - lr: 0.000029 - momentum: 0.000000
2023-10-06 13:08:40,815 epoch 1 - iter 45/152 - loss 3.19559217 - time (sec): 31.94 - samples/sec: 283.98 - lr: 0.000043 - momentum: 0.000000
2023-10-06 13:08:51,120 epoch 1 - iter 60/152 - loss 3.17725347 - time (sec): 42.24 - samples/sec: 282.24 - lr: 0.000058 - momentum: 0.000000
2023-10-06 13:09:01,354 epoch 1 - iter 75/152 - loss 3.14091169 - time (sec): 52.47 - samples/sec: 285.15 - lr: 0.000073 - momentum: 0.000000
2023-10-06 13:09:11,547 epoch 1 - iter 90/152 - loss 3.07834090 - time (sec): 62.67 - samples/sec: 286.56 - lr: 0.000088 - momentum: 0.000000
2023-10-06 13:09:21,906 epoch 1 - iter 105/152 - loss 2.99906148 - time (sec): 73.03 - samples/sec: 287.96 - lr: 0.000103 - momentum: 0.000000
2023-10-06 13:09:32,764 epoch 1 - iter 120/152 - loss 2.90338465 - time (sec): 83.88 - samples/sec: 290.77 - lr: 0.000117 - momentum: 0.000000
2023-10-06 13:09:43,670 epoch 1 - iter 135/152 - loss 2.80969823 - time (sec): 94.79 - samples/sec: 289.70 - lr: 0.000132 - momentum: 0.000000
2023-10-06 13:09:53,984 epoch 1 - iter 150/152 - loss 2.70892700 - time (sec): 105.10 - samples/sec: 291.40 - lr: 0.000147 - momentum: 0.000000
2023-10-06 13:09:55,194 ----------------------------------------------------------------------------------------------------
2023-10-06 13:09:55,194 EPOCH 1 done: loss 2.6974 - lr: 0.000147
2023-10-06 13:10:02,399 DEV : loss 1.6305921077728271 - f1-score (micro avg) 0.0
2023-10-06 13:10:02,412 ----------------------------------------------------------------------------------------------------
2023-10-06 13:10:12,934 epoch 2 - iter 15/152 - loss 1.57320305 - time (sec): 10.52 - samples/sec: 300.00 - lr: 0.000148 - momentum: 0.000000
2023-10-06 13:10:23,466 epoch 2 - iter 30/152 - loss 1.43312210 - time (sec): 21.05 - samples/sec: 300.59 - lr: 0.000147 - momentum: 0.000000
2023-10-06 13:10:33,164 epoch 2 - iter 45/152 - loss 1.30679236 - time (sec): 30.75 - samples/sec: 293.75 - lr: 0.000145 - momentum: 0.000000
2023-10-06 13:10:43,685 epoch 2 - iter 60/152 - loss 1.22453391 - time (sec): 41.27 - samples/sec: 292.66 - lr: 0.000144 - momentum: 0.000000
2023-10-06 13:10:54,229 epoch 2 - iter 75/152 - loss 1.15899750 - time (sec): 51.81 - samples/sec: 291.77 - lr: 0.000142 - momentum: 0.000000
2023-10-06 13:11:04,548 epoch 2 - iter 90/152 - loss 1.10009267 - time (sec): 62.13 - samples/sec: 292.05 - lr: 0.000140 - momentum: 0.000000
2023-10-06 13:11:14,863 epoch 2 - iter 105/152 - loss 1.03295088 - time (sec): 72.45 - samples/sec: 292.01 - lr: 0.000139 - momentum: 0.000000
2023-10-06 13:11:25,393 epoch 2 - iter 120/152 - loss 0.98652304 - time (sec): 82.98 - samples/sec: 291.54 - lr: 0.000137 - momentum: 0.000000
2023-10-06 13:11:36,515 epoch 2 - iter 135/152 - loss 0.93594003 - time (sec): 94.10 - samples/sec: 291.46 - lr: 0.000135 - momentum: 0.000000
2023-10-06 13:11:47,543 epoch 2 - iter 150/152 - loss 0.89467702 - time (sec): 105.13 - samples/sec: 292.02 - lr: 0.000134 - momentum: 0.000000
2023-10-06 13:11:48,621 ----------------------------------------------------------------------------------------------------
2023-10-06 13:11:48,621 EPOCH 2 done: loss 0.8913 - lr: 0.000134
2023-10-06 13:11:55,927 DEV : loss 0.5647168755531311 - f1-score (micro avg) 0.0
2023-10-06 13:11:55,934 ----------------------------------------------------------------------------------------------------
2023-10-06 13:12:06,278 epoch 3 - iter 15/152 - loss 0.55208629 - time (sec): 10.34 - samples/sec: 282.72 - lr: 0.000132 - momentum: 0.000000
2023-10-06 13:12:16,529 epoch 3 - iter 30/152 - loss 0.49683238 - time (sec): 20.59 - samples/sec: 280.97 - lr: 0.000130 - momentum: 0.000000
2023-10-06 13:12:27,371 epoch 3 - iter 45/152 - loss 0.46337786 - time (sec): 31.44 - samples/sec: 282.49 - lr: 0.000129 - momentum: 0.000000
2023-10-06 13:12:38,640 epoch 3 - iter 60/152 - loss 0.44837662 - time (sec): 42.70 - samples/sec: 285.22 - lr: 0.000127 - momentum: 0.000000
2023-10-06 13:12:49,081 epoch 3 - iter 75/152 - loss 0.41717800 - time (sec): 53.15 - samples/sec: 283.47 - lr: 0.000125 - momentum: 0.000000
2023-10-06 13:13:00,259 epoch 3 - iter 90/152 - loss 0.41097854 - time (sec): 64.32 - samples/sec: 282.62 - lr: 0.000124 - momentum: 0.000000
2023-10-06 13:13:11,316 epoch 3 - iter 105/152 - loss 0.40805842 - time (sec): 75.38 - samples/sec: 283.03 - lr: 0.000122 - momentum: 0.000000
2023-10-06 13:13:22,175 epoch 3 - iter 120/152 - loss 0.39498496 - time (sec): 86.24 - samples/sec: 282.03 - lr: 0.000120 - momentum: 0.000000
2023-10-06 13:13:32,934 epoch 3 - iter 135/152 - loss 0.38573774 - time (sec): 97.00 - samples/sec: 280.27 - lr: 0.000119 - momentum: 0.000000
2023-10-06 13:13:44,662 epoch 3 - iter 150/152 - loss 0.37384951 - time (sec): 108.73 - samples/sec: 281.55 - lr: 0.000117 - momentum: 0.000000
2023-10-06 13:13:46,031 ----------------------------------------------------------------------------------------------------
2023-10-06 13:13:46,032 EPOCH 3 done: loss 0.3753 - lr: 0.000117
2023-10-06 13:13:54,094 DEV : loss 0.32833757996559143 - f1-score (micro avg) 0.4878
2023-10-06 13:13:54,102 saving best model
2023-10-06 13:13:54,964 ----------------------------------------------------------------------------------------------------
2023-10-06 13:14:06,353 epoch 4 - iter 15/152 - loss 0.28006593 - time (sec): 11.39 - samples/sec: 274.59 - lr: 0.000115 - momentum: 0.000000
2023-10-06 13:14:17,205 epoch 4 - iter 30/152 - loss 0.28639218 - time (sec): 22.24 - samples/sec: 274.51 - lr: 0.000114 - momentum: 0.000000
2023-10-06 13:14:28,850 epoch 4 - iter 45/152 - loss 0.27015973 - time (sec): 33.88 - samples/sec: 278.86 - lr: 0.000112 - momentum: 0.000000
2023-10-06 13:14:40,557 epoch 4 - iter 60/152 - loss 0.26084485 - time (sec): 45.59 - samples/sec: 278.63 - lr: 0.000110 - momentum: 0.000000
2023-10-06 13:14:51,743 epoch 4 - iter 75/152 - loss 0.25334043 - time (sec): 56.78 - samples/sec: 279.25 - lr: 0.000109 - momentum: 0.000000
2023-10-06 13:15:03,188 epoch 4 - iter 90/152 - loss 0.24462008 - time (sec): 68.22 - samples/sec: 278.78 - lr: 0.000107 - momentum: 0.000000
2023-10-06 13:15:13,402 epoch 4 - iter 105/152 - loss 0.24376650 - time (sec): 78.44 - samples/sec: 277.79 - lr: 0.000105 - momentum: 0.000000
2023-10-06 13:15:24,629 epoch 4 - iter 120/152 - loss 0.24370889 - time (sec): 89.66 - samples/sec: 276.29 - lr: 0.000104 - momentum: 0.000000
2023-10-06 13:15:35,373 epoch 4 - iter 135/152 - loss 0.23999875 - time (sec): 100.41 - samples/sec: 275.42 - lr: 0.000102 - momentum: 0.000000
2023-10-06 13:15:46,458 epoch 4 - iter 150/152 - loss 0.23025240 - time (sec): 111.49 - samples/sec: 274.23 - lr: 0.000101 - momentum: 0.000000
2023-10-06 13:15:47,939 ----------------------------------------------------------------------------------------------------
2023-10-06 13:15:47,940 EPOCH 4 done: loss 0.2287 - lr: 0.000101
2023-10-06 13:15:56,008 DEV : loss 0.2277979999780655 - f1-score (micro avg) 0.7074
2023-10-06 13:15:56,016 saving best model
2023-10-06 13:16:00,368 ----------------------------------------------------------------------------------------------------
2023-10-06 13:16:11,445 epoch 5 - iter 15/152 - loss 0.12246111 - time (sec): 11.08 - samples/sec: 276.56 - lr: 0.000099 - momentum: 0.000000
2023-10-06 13:16:22,448 epoch 5 - iter 30/152 - loss 0.15322928 - time (sec): 22.08 - samples/sec: 273.43 - lr: 0.000097 - momentum: 0.000000
2023-10-06 13:16:34,190 epoch 5 - iter 45/152 - loss 0.15417556 - time (sec): 33.82 - samples/sec: 273.18 - lr: 0.000095 - momentum: 0.000000
2023-10-06 13:16:45,527 epoch 5 - iter 60/152 - loss 0.16403261 - time (sec): 45.16 - samples/sec: 274.29 - lr: 0.000094 - momentum: 0.000000
2023-10-06 13:16:56,764 epoch 5 - iter 75/152 - loss 0.16365678 - time (sec): 56.39 - samples/sec: 274.74 - lr: 0.000092 - momentum: 0.000000
2023-10-06 13:17:07,818 epoch 5 - iter 90/152 - loss 0.16139091 - time (sec): 67.45 - samples/sec: 273.62 - lr: 0.000091 - momentum: 0.000000
2023-10-06 13:17:18,564 epoch 5 - iter 105/152 - loss 0.15382459 - time (sec): 78.19 - samples/sec: 271.81 - lr: 0.000089 - momentum: 0.000000
2023-10-06 13:17:30,211 epoch 5 - iter 120/152 - loss 0.15240417 - time (sec): 89.84 - samples/sec: 273.78 - lr: 0.000087 - momentum: 0.000000
2023-10-06 13:17:41,001 epoch 5 - iter 135/152 - loss 0.15079292 - time (sec): 100.63 - samples/sec: 273.95 - lr: 0.000086 - momentum: 0.000000
2023-10-06 13:17:51,812 epoch 5 - iter 150/152 - loss 0.15298419 - time (sec): 111.44 - samples/sec: 274.10 - lr: 0.000084 - momentum: 0.000000
2023-10-06 13:17:53,395 ----------------------------------------------------------------------------------------------------
2023-10-06 13:17:53,395 EPOCH 5 done: loss 0.1519 - lr: 0.000084
2023-10-06 13:18:01,435 DEV : loss 0.17695870995521545 - f1-score (micro avg) 0.7258
2023-10-06 13:18:01,444 saving best model
2023-10-06 13:18:05,779 ----------------------------------------------------------------------------------------------------
2023-10-06 13:18:16,697 epoch 6 - iter 15/152 - loss 0.09270192 - time (sec): 10.92 - samples/sec: 271.43 - lr: 0.000082 - momentum: 0.000000
2023-10-06 13:18:27,716 epoch 6 - iter 30/152 - loss 0.11720667 - time (sec): 21.94 - samples/sec: 267.47 - lr: 0.000080 - momentum: 0.000000
2023-10-06 13:18:39,303 epoch 6 - iter 45/152 - loss 0.11175077 - time (sec): 33.52 - samples/sec: 270.35 - lr: 0.000079 - momentum: 0.000000
2023-10-06 13:18:50,781 epoch 6 - iter 60/152 - loss 0.11436783 - time (sec): 45.00 - samples/sec: 272.19 - lr: 0.000077 - momentum: 0.000000
2023-10-06 13:19:01,772 epoch 6 - iter 75/152 - loss 0.11082759 - time (sec): 55.99 - samples/sec: 271.26 - lr: 0.000076 - momentum: 0.000000
2023-10-06 13:19:13,070 epoch 6 - iter 90/152 - loss 0.10598764 - time (sec): 67.29 - samples/sec: 271.48 - lr: 0.000074 - momentum: 0.000000
2023-10-06 13:19:24,104 epoch 6 - iter 105/152 - loss 0.10720574 - time (sec): 78.32 - samples/sec: 270.72 - lr: 0.000072 - momentum: 0.000000
2023-10-06 13:19:35,155 epoch 6 - iter 120/152 - loss 0.11087016 - time (sec): 89.37 - samples/sec: 272.16 - lr: 0.000071 - momentum: 0.000000
2023-10-06 13:19:46,330 epoch 6 - iter 135/152 - loss 0.10716799 - time (sec): 100.55 - samples/sec: 273.19 - lr: 0.000069 - momentum: 0.000000
2023-10-06 13:19:57,472 epoch 6 - iter 150/152 - loss 0.10886342 - time (sec): 111.69 - samples/sec: 273.88 - lr: 0.000067 - momentum: 0.000000
2023-10-06 13:19:58,850 ----------------------------------------------------------------------------------------------------
2023-10-06 13:19:58,850 EPOCH 6 done: loss 0.1091 - lr: 0.000067
2023-10-06 13:20:06,860 DEV : loss 0.1589815467596054 - f1-score (micro avg) 0.7943
2023-10-06 13:20:06,868 saving best model
2023-10-06 13:20:11,225 ----------------------------------------------------------------------------------------------------
2023-10-06 13:20:22,447 epoch 7 - iter 15/152 - loss 0.12776069 - time (sec): 11.22 - samples/sec: 265.52 - lr: 0.000066 - momentum: 0.000000
2023-10-06 13:20:34,020 epoch 7 - iter 30/152 - loss 0.10877098 - time (sec): 22.79 - samples/sec: 273.51 - lr: 0.000064 - momentum: 0.000000
2023-10-06 13:20:44,843 epoch 7 - iter 45/152 - loss 0.09444140 - time (sec): 33.62 - samples/sec: 273.80 - lr: 0.000062 - momentum: 0.000000
2023-10-06 13:20:56,426 epoch 7 - iter 60/152 - loss 0.09188632 - time (sec): 45.20 - samples/sec: 275.76 - lr: 0.000061 - momentum: 0.000000
2023-10-06 13:21:07,589 epoch 7 - iter 75/152 - loss 0.09200131 - time (sec): 56.36 - samples/sec: 275.34 - lr: 0.000059 - momentum: 0.000000
2023-10-06 13:21:18,216 epoch 7 - iter 90/152 - loss 0.08825078 - time (sec): 66.99 - samples/sec: 272.46 - lr: 0.000057 - momentum: 0.000000
2023-10-06 13:21:28,988 epoch 7 - iter 105/152 - loss 0.08636856 - time (sec): 77.76 - samples/sec: 272.23 - lr: 0.000056 - momentum: 0.000000
2023-10-06 13:21:40,417 epoch 7 - iter 120/152 - loss 0.08420198 - time (sec): 89.19 - samples/sec: 273.10 - lr: 0.000054 - momentum: 0.000000
2023-10-06 13:21:51,929 epoch 7 - iter 135/152 - loss 0.08667895 - time (sec): 100.70 - samples/sec: 274.30 - lr: 0.000052 - momentum: 0.000000
2023-10-06 13:22:02,903 epoch 7 - iter 150/152 - loss 0.08327537 - time (sec): 111.68 - samples/sec: 274.30 - lr: 0.000051 - momentum: 0.000000
2023-10-06 13:22:04,222 ----------------------------------------------------------------------------------------------------
2023-10-06 13:22:04,223 EPOCH 7 done: loss 0.0830 - lr: 0.000051
2023-10-06 13:22:12,223 DEV : loss 0.14461469650268555 - f1-score (micro avg) 0.8189
2023-10-06 13:22:12,231 saving best model
2023-10-06 13:22:16,577 ----------------------------------------------------------------------------------------------------
2023-10-06 13:22:27,756 epoch 8 - iter 15/152 - loss 0.06828884 - time (sec): 11.18 - samples/sec: 281.09 - lr: 0.000049 - momentum: 0.000000
2023-10-06 13:22:39,406 epoch 8 - iter 30/152 - loss 0.07361348 - time (sec): 22.83 - samples/sec: 283.65 - lr: 0.000047 - momentum: 0.000000
2023-10-06 13:22:51,204 epoch 8 - iter 45/152 - loss 0.08124757 - time (sec): 34.63 - samples/sec: 283.46 - lr: 0.000046 - momentum: 0.000000
2023-10-06 13:23:02,401 epoch 8 - iter 60/152 - loss 0.07871064 - time (sec): 45.82 - samples/sec: 281.54 - lr: 0.000044 - momentum: 0.000000
2023-10-06 13:23:13,854 epoch 8 - iter 75/152 - loss 0.07422431 - time (sec): 57.28 - samples/sec: 280.31 - lr: 0.000042 - momentum: 0.000000
2023-10-06 13:23:24,913 epoch 8 - iter 90/152 - loss 0.07333499 - time (sec): 68.33 - samples/sec: 278.69 - lr: 0.000041 - momentum: 0.000000
2023-10-06 13:23:35,270 epoch 8 - iter 105/152 - loss 0.07143430 - time (sec): 78.69 - samples/sec: 276.28 - lr: 0.000039 - momentum: 0.000000
2023-10-06 13:23:46,331 epoch 8 - iter 120/152 - loss 0.06850039 - time (sec): 89.75 - samples/sec: 276.17 - lr: 0.000037 - momentum: 0.000000
2023-10-06 13:23:57,183 epoch 8 - iter 135/152 - loss 0.06792511 - time (sec): 100.60 - samples/sec: 275.50 - lr: 0.000036 - momentum: 0.000000
2023-10-06 13:24:07,826 epoch 8 - iter 150/152 - loss 0.06593987 - time (sec): 111.25 - samples/sec: 274.54 - lr: 0.000034 - momentum: 0.000000
2023-10-06 13:24:09,273 ----------------------------------------------------------------------------------------------------
2023-10-06 13:24:09,273 EPOCH 8 done: loss 0.0681 - lr: 0.000034
2023-10-06 13:24:17,123 DEV : loss 0.14291059970855713 - f1-score (micro avg) 0.807
2023-10-06 13:24:17,132 ----------------------------------------------------------------------------------------------------
2023-10-06 13:24:28,361 epoch 9 - iter 15/152 - loss 0.05127601 - time (sec): 11.23 - samples/sec: 282.41 - lr: 0.000032 - momentum: 0.000000
2023-10-06 13:24:38,888 epoch 9 - iter 30/152 - loss 0.05020165 - time (sec): 21.75 - samples/sec: 274.51 - lr: 0.000031 - momentum: 0.000000
2023-10-06 13:24:49,557 epoch 9 - iter 45/152 - loss 0.04825827 - time (sec): 32.42 - samples/sec: 272.33 - lr: 0.000029 - momentum: 0.000000
2023-10-06 13:25:01,312 epoch 9 - iter 60/152 - loss 0.05048857 - time (sec): 44.18 - samples/sec: 276.71 - lr: 0.000027 - momentum: 0.000000
2023-10-06 13:25:12,244 epoch 9 - iter 75/152 - loss 0.05688874 - time (sec): 55.11 - samples/sec: 275.16 - lr: 0.000026 - momentum: 0.000000
2023-10-06 13:25:23,129 epoch 9 - iter 90/152 - loss 0.05601915 - time (sec): 66.00 - samples/sec: 274.97 - lr: 0.000024 - momentum: 0.000000
2023-10-06 13:25:34,216 epoch 9 - iter 105/152 - loss 0.05737830 - time (sec): 77.08 - samples/sec: 276.08 - lr: 0.000022 - momentum: 0.000000
2023-10-06 13:25:45,044 epoch 9 - iter 120/152 - loss 0.05834607 - time (sec): 87.91 - samples/sec: 276.19 - lr: 0.000021 - momentum: 0.000000
2023-10-06 13:25:56,736 epoch 9 - iter 135/152 - loss 0.05753999 - time (sec): 99.60 - samples/sec: 276.94 - lr: 0.000019 - momentum: 0.000000
2023-10-06 13:26:07,630 epoch 9 - iter 150/152 - loss 0.05804039 - time (sec): 110.50 - samples/sec: 276.72 - lr: 0.000018 - momentum: 0.000000
2023-10-06 13:26:09,098 ----------------------------------------------------------------------------------------------------
2023-10-06 13:26:09,098 EPOCH 9 done: loss 0.0574 - lr: 0.000018
2023-10-06 13:26:16,886 DEV : loss 0.14050251245498657 - f1-score (micro avg) 0.8132
2023-10-06 13:26:16,894 ----------------------------------------------------------------------------------------------------
2023-10-06 13:26:28,153 epoch 10 - iter 15/152 - loss 0.04766949 - time (sec): 11.26 - samples/sec: 265.79 - lr: 0.000016 - momentum: 0.000000
2023-10-06 13:26:39,619 epoch 10 - iter 30/152 - loss 0.06175718 - time (sec): 22.72 - samples/sec: 269.68 - lr: 0.000014 - momentum: 0.000000
2023-10-06 13:26:50,158 epoch 10 - iter 45/152 - loss 0.05547825 - time (sec): 33.26 - samples/sec: 268.72 - lr: 0.000012 - momentum: 0.000000
2023-10-06 13:27:01,057 epoch 10 - iter 60/152 - loss 0.05417530 - time (sec): 44.16 - samples/sec: 268.78 - lr: 0.000011 - momentum: 0.000000
2023-10-06 13:27:12,421 epoch 10 - iter 75/152 - loss 0.05152617 - time (sec): 55.53 - samples/sec: 270.74 - lr: 0.000009 - momentum: 0.000000
2023-10-06 13:27:23,494 epoch 10 - iter 90/152 - loss 0.05355677 - time (sec): 66.60 - samples/sec: 272.15 - lr: 0.000008 - momentum: 0.000000
2023-10-06 13:27:34,374 epoch 10 - iter 105/152 - loss 0.05240993 - time (sec): 77.48 - samples/sec: 271.92 - lr: 0.000006 - momentum: 0.000000
2023-10-06 13:27:45,675 epoch 10 - iter 120/152 - loss 0.05321465 - time (sec): 88.78 - samples/sec: 273.51 - lr: 0.000004 - momentum: 0.000000
2023-10-06 13:27:57,530 epoch 10 - iter 135/152 - loss 0.05326204 - time (sec): 100.63 - samples/sec: 274.63 - lr: 0.000003 - momentum: 0.000000
2023-10-06 13:28:08,225 epoch 10 - iter 150/152 - loss 0.05277666 - time (sec): 111.33 - samples/sec: 274.78 - lr: 0.000001 - momentum: 0.000000
2023-10-06 13:28:09,594 ----------------------------------------------------------------------------------------------------
2023-10-06 13:28:09,594 EPOCH 10 done: loss 0.0527 - lr: 0.000001
2023-10-06 13:28:17,372 DEV : loss 0.13832153379917145 - f1-score (micro avg) 0.8217
2023-10-06 13:28:17,379 saving best model
2023-10-06 13:28:22,740 ----------------------------------------------------------------------------------------------------
2023-10-06 13:28:22,742 Loading model from best epoch ...
2023-10-06 13:28:25,333 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-06 13:28:32,461
Results:
- F-score (micro) 0.7891
- F-score (macro) 0.4828
- Accuracy 0.6561

By class:
              precision    recall  f1-score   support

       scope     0.7205    0.7682    0.7436       151
        work     0.7155    0.8737    0.7867        95
        pers     0.8273    0.9479    0.8835        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7494    0.8333    0.7891       348
   macro avg     0.4527    0.5180    0.4828       348
weighted avg     0.7362    0.8333    0.7811       348

2023-10-06 13:28:32,462 ----------------------------------------------------------------------------------------------------