2023-10-08 23:34:23,888 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:34:23,890 Model: "SequenceTagger( (embeddings): ByT5Embeddings( (model): T5EncoderModel( (shared): Embedding(384, 1472) (encoder): T5Stack( (embed_tokens): Embedding(384, 1472) (block): ModuleList( (0): T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) (relative_attention_bias): Embedding(32, 6) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) (1-11): 11 x T5Block( (layer): ModuleList( (0): T5LayerSelfAttention( (SelfAttention): T5Attention( (q): Linear(in_features=1472, out_features=384, bias=False) (k): Linear(in_features=1472, out_features=384, bias=False) (v): Linear(in_features=1472, out_features=384, bias=False) (o): Linear(in_features=384, out_features=1472, bias=False) ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) (1): T5LayerFF( (DenseReluDense): T5DenseGatedActDense( (wi_0): Linear(in_features=1472, out_features=3584, bias=False) (wi_1): Linear(in_features=1472, out_features=3584, bias=False) (wo): Linear(in_features=3584, out_features=1472, bias=False) (dropout): Dropout(p=0.1, inplace=False) (act): NewGELUActivation() ) (layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) 
(final_layer_norm): T5LayerNorm() (dropout): Dropout(p=0.1, inplace=False) ) ) ) (locked_dropout): LockedDropout(p=0.5) (linear): Linear(in_features=1472, out_features=25, bias=True) (loss_function): CrossEntropyLoss() )"
2023-10-08 23:34:23,890 ----------------------------------------------------------------------------------------------------
2023-10-08 23:34:23,890 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 23:34:23,890 ----------------------------------------------------------------------------------------------------
2023-10-08 23:34:23,890 Train: 966 sentences
2023-10-08 23:34:23,890 (train_with_dev=False, train_with_test=False)
2023-10-08 23:34:23,890 ----------------------------------------------------------------------------------------------------
2023-10-08 23:34:23,890 Training Params:
2023-10-08 23:34:23,890 - learning_rate: "0.00015"
2023-10-08 23:34:23,890 - mini_batch_size: "8"
2023-10-08 23:34:23,890 - max_epochs: "10"
2023-10-08 23:34:23,890 - shuffle: "True"
2023-10-08 23:34:23,890 ----------------------------------------------------------------------------------------------------
2023-10-08 23:34:23,890 Plugins:
2023-10-08 23:34:23,891 - TensorboardLogger
2023-10-08 23:34:23,891 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 23:34:23,891 ----------------------------------------------------------------------------------------------------
2023-10-08 23:34:23,891 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 23:34:23,891 - metric: "('micro avg', 'f1-score')"
2023-10-08 23:34:23,891 ----------------------------------------------------------------------------------------------------
2023-10-08 23:34:23,891 Computation:
2023-10-08 23:34:23,891 - compute on device: cuda:0
2023-10-08 23:34:23,891 - embedding storage: none
2023-10-08 23:34:23,891
---------------------------------------------------------------------------------------------------- 2023-10-08 23:34:23,891 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-5" 2023-10-08 23:34:23,891 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:34:23,891 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:34:23,891 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-08 23:34:32,778 epoch 1 - iter 12/121 - loss 3.24017889 - time (sec): 8.89 - samples/sec: 258.16 - lr: 0.000014 - momentum: 0.000000 2023-10-08 23:34:42,632 epoch 1 - iter 24/121 - loss 3.23366481 - time (sec): 18.74 - samples/sec: 269.53 - lr: 0.000029 - momentum: 0.000000 2023-10-08 23:34:52,336 epoch 1 - iter 36/121 - loss 3.22367203 - time (sec): 28.44 - samples/sec: 271.42 - lr: 0.000043 - momentum: 0.000000 2023-10-08 23:35:01,746 epoch 1 - iter 48/121 - loss 3.20730354 - time (sec): 37.85 - samples/sec: 266.79 - lr: 0.000058 - momentum: 0.000000 2023-10-08 23:35:11,211 epoch 1 - iter 60/121 - loss 3.17435023 - time (sec): 47.32 - samples/sec: 268.14 - lr: 0.000073 - momentum: 0.000000 2023-10-08 23:35:20,140 epoch 1 - iter 72/121 - loss 3.12585276 - time (sec): 56.25 - samples/sec: 266.78 - lr: 0.000088 - momentum: 0.000000 2023-10-08 23:35:29,472 epoch 1 - iter 84/121 - loss 3.05911175 - time (sec): 65.58 - samples/sec: 268.07 - lr: 0.000103 - momentum: 0.000000 2023-10-08 23:35:38,423 epoch 1 - iter 96/121 - loss 2.98937366 - time (sec): 74.53 - samples/sec: 267.14 - lr: 0.000118 - momentum: 0.000000 2023-10-08 23:35:47,528 epoch 1 - iter 108/121 - loss 2.91543096 - time (sec): 83.64 - samples/sec: 265.16 - lr: 0.000133 - momentum: 0.000000 2023-10-08 23:35:56,942 epoch 1 - iter 120/121 - loss 2.83441714 - time 
(sec): 93.05 - samples/sec: 264.28 - lr: 0.000148 - momentum: 0.000000 2023-10-08 23:35:57,594 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:35:57,594 EPOCH 1 done: loss 2.8287 - lr: 0.000148 2023-10-08 23:36:03,894 DEV : loss 1.865909457206726 - f1-score (micro avg) 0.0 2023-10-08 23:36:03,899 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:36:13,707 epoch 2 - iter 12/121 - loss 1.82988856 - time (sec): 9.81 - samples/sec: 252.40 - lr: 0.000148 - momentum: 0.000000 2023-10-08 23:36:23,287 epoch 2 - iter 24/121 - loss 1.70061483 - time (sec): 19.39 - samples/sec: 260.65 - lr: 0.000147 - momentum: 0.000000 2023-10-08 23:36:32,140 epoch 2 - iter 36/121 - loss 1.62437798 - time (sec): 28.24 - samples/sec: 257.05 - lr: 0.000145 - momentum: 0.000000 2023-10-08 23:36:41,051 epoch 2 - iter 48/121 - loss 1.52599380 - time (sec): 37.15 - samples/sec: 257.31 - lr: 0.000144 - momentum: 0.000000 2023-10-08 23:36:50,363 epoch 2 - iter 60/121 - loss 1.43660909 - time (sec): 46.46 - samples/sec: 255.80 - lr: 0.000142 - momentum: 0.000000 2023-10-08 23:36:59,592 epoch 2 - iter 72/121 - loss 1.37023102 - time (sec): 55.69 - samples/sec: 255.46 - lr: 0.000140 - momentum: 0.000000 2023-10-08 23:37:08,394 epoch 2 - iter 84/121 - loss 1.30900136 - time (sec): 64.49 - samples/sec: 255.16 - lr: 0.000139 - momentum: 0.000000 2023-10-08 23:37:17,797 epoch 2 - iter 96/121 - loss 1.23447583 - time (sec): 73.90 - samples/sec: 256.89 - lr: 0.000137 - momentum: 0.000000 2023-10-08 23:37:27,844 epoch 2 - iter 108/121 - loss 1.16765337 - time (sec): 83.94 - samples/sec: 258.99 - lr: 0.000135 - momentum: 0.000000 2023-10-08 23:37:37,761 epoch 2 - iter 120/121 - loss 1.11224056 - time (sec): 93.86 - samples/sec: 262.29 - lr: 0.000134 - momentum: 0.000000 2023-10-08 23:37:38,273 
---------------------------------------------------------------------------------------------------- 2023-10-08 23:37:38,273 EPOCH 2 done: loss 1.1093 - lr: 0.000134 2023-10-08 23:37:44,716 DEV : loss 0.6599521636962891 - f1-score (micro avg) 0.0 2023-10-08 23:37:44,723 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:37:53,805 epoch 3 - iter 12/121 - loss 0.66795242 - time (sec): 9.08 - samples/sec: 258.00 - lr: 0.000132 - momentum: 0.000000 2023-10-08 23:38:03,763 epoch 3 - iter 24/121 - loss 0.57765494 - time (sec): 19.04 - samples/sec: 267.13 - lr: 0.000130 - momentum: 0.000000 2023-10-08 23:38:13,177 epoch 3 - iter 36/121 - loss 0.58939285 - time (sec): 28.45 - samples/sec: 265.13 - lr: 0.000129 - momentum: 0.000000 2023-10-08 23:38:22,673 epoch 3 - iter 48/121 - loss 0.59008148 - time (sec): 37.95 - samples/sec: 266.23 - lr: 0.000127 - momentum: 0.000000 2023-10-08 23:38:32,149 epoch 3 - iter 60/121 - loss 0.59041910 - time (sec): 47.43 - samples/sec: 265.96 - lr: 0.000125 - momentum: 0.000000 2023-10-08 23:38:41,186 epoch 3 - iter 72/121 - loss 0.59060120 - time (sec): 56.46 - samples/sec: 263.95 - lr: 0.000124 - momentum: 0.000000 2023-10-08 23:38:50,402 epoch 3 - iter 84/121 - loss 0.56699050 - time (sec): 65.68 - samples/sec: 262.80 - lr: 0.000122 - momentum: 0.000000 2023-10-08 23:39:00,139 epoch 3 - iter 96/121 - loss 0.54734552 - time (sec): 75.42 - samples/sec: 263.58 - lr: 0.000120 - momentum: 0.000000 2023-10-08 23:39:08,891 epoch 3 - iter 108/121 - loss 0.53838536 - time (sec): 84.17 - samples/sec: 262.33 - lr: 0.000119 - momentum: 0.000000 2023-10-08 23:39:18,361 epoch 3 - iter 120/121 - loss 0.52998391 - time (sec): 93.64 - samples/sec: 263.15 - lr: 0.000117 - momentum: 0.000000 2023-10-08 23:39:18,899 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:39:18,899 EPOCH 3 done: loss 0.5291 - lr: 0.000117 2023-10-08 
23:39:25,550 DEV : loss 0.3969781696796417 - f1-score (micro avg) 0.0 2023-10-08 23:39:25,557 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:39:35,472 epoch 4 - iter 12/121 - loss 0.38272330 - time (sec): 9.91 - samples/sec: 274.49 - lr: 0.000115 - momentum: 0.000000 2023-10-08 23:39:45,398 epoch 4 - iter 24/121 - loss 0.39630310 - time (sec): 19.84 - samples/sec: 273.45 - lr: 0.000114 - momentum: 0.000000 2023-10-08 23:39:54,512 epoch 4 - iter 36/121 - loss 0.36093487 - time (sec): 28.95 - samples/sec: 268.09 - lr: 0.000112 - momentum: 0.000000 2023-10-08 23:40:04,530 epoch 4 - iter 48/121 - loss 0.35342633 - time (sec): 38.97 - samples/sec: 266.02 - lr: 0.000110 - momentum: 0.000000 2023-10-08 23:40:14,229 epoch 4 - iter 60/121 - loss 0.34983137 - time (sec): 48.67 - samples/sec: 265.22 - lr: 0.000109 - momentum: 0.000000 2023-10-08 23:40:23,213 epoch 4 - iter 72/121 - loss 0.35500506 - time (sec): 57.65 - samples/sec: 263.88 - lr: 0.000107 - momentum: 0.000000 2023-10-08 23:40:31,803 epoch 4 - iter 84/121 - loss 0.34761176 - time (sec): 66.24 - samples/sec: 262.39 - lr: 0.000105 - momentum: 0.000000 2023-10-08 23:40:41,187 epoch 4 - iter 96/121 - loss 0.34047462 - time (sec): 75.63 - samples/sec: 262.83 - lr: 0.000104 - momentum: 0.000000 2023-10-08 23:40:50,246 epoch 4 - iter 108/121 - loss 0.33251375 - time (sec): 84.69 - samples/sec: 262.11 - lr: 0.000102 - momentum: 0.000000 2023-10-08 23:40:59,213 epoch 4 - iter 120/121 - loss 0.32254317 - time (sec): 93.65 - samples/sec: 262.27 - lr: 0.000101 - momentum: 0.000000 2023-10-08 23:40:59,856 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:40:59,857 EPOCH 4 done: loss 0.3228 - lr: 0.000101 2023-10-08 23:41:06,377 DEV : loss 0.2733049690723419 - f1-score (micro avg) 0.4662 2023-10-08 23:41:06,385 saving best model 2023-10-08 23:41:07,265 
---------------------------------------------------------------------------------------------------- 2023-10-08 23:41:16,917 epoch 5 - iter 12/121 - loss 0.28853168 - time (sec): 9.65 - samples/sec: 263.94 - lr: 0.000099 - momentum: 0.000000 2023-10-08 23:41:25,961 epoch 5 - iter 24/121 - loss 0.23595621 - time (sec): 18.69 - samples/sec: 260.84 - lr: 0.000097 - momentum: 0.000000 2023-10-08 23:41:35,553 epoch 5 - iter 36/121 - loss 0.23653503 - time (sec): 28.29 - samples/sec: 258.97 - lr: 0.000095 - momentum: 0.000000 2023-10-08 23:41:44,553 epoch 5 - iter 48/121 - loss 0.22403941 - time (sec): 37.29 - samples/sec: 259.64 - lr: 0.000094 - momentum: 0.000000 2023-10-08 23:41:53,779 epoch 5 - iter 60/121 - loss 0.22717697 - time (sec): 46.51 - samples/sec: 260.49 - lr: 0.000092 - momentum: 0.000000 2023-10-08 23:42:02,805 epoch 5 - iter 72/121 - loss 0.23024853 - time (sec): 55.54 - samples/sec: 259.80 - lr: 0.000091 - momentum: 0.000000 2023-10-08 23:42:12,086 epoch 5 - iter 84/121 - loss 0.23713059 - time (sec): 64.82 - samples/sec: 259.99 - lr: 0.000089 - momentum: 0.000000 2023-10-08 23:42:21,895 epoch 5 - iter 96/121 - loss 0.23487341 - time (sec): 74.63 - samples/sec: 261.15 - lr: 0.000087 - momentum: 0.000000 2023-10-08 23:42:31,641 epoch 5 - iter 108/121 - loss 0.23401748 - time (sec): 84.37 - samples/sec: 261.88 - lr: 0.000086 - momentum: 0.000000 2023-10-08 23:42:40,945 epoch 5 - iter 120/121 - loss 0.22746030 - time (sec): 93.68 - samples/sec: 261.81 - lr: 0.000084 - momentum: 0.000000 2023-10-08 23:42:41,671 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:42:41,671 EPOCH 5 done: loss 0.2270 - lr: 0.000084 2023-10-08 23:42:48,196 DEV : loss 0.21299438178539276 - f1-score (micro avg) 0.6084 2023-10-08 23:42:48,202 saving best model 2023-10-08 23:42:52,580 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:43:01,768 epoch 6 - 
iter 12/121 - loss 0.20543145 - time (sec): 9.19 - samples/sec: 266.58 - lr: 0.000082 - momentum: 0.000000 2023-10-08 23:43:11,706 epoch 6 - iter 24/121 - loss 0.19076741 - time (sec): 19.12 - samples/sec: 271.32 - lr: 0.000081 - momentum: 0.000000 2023-10-08 23:43:20,980 epoch 6 - iter 36/121 - loss 0.19650215 - time (sec): 28.40 - samples/sec: 269.73 - lr: 0.000079 - momentum: 0.000000 2023-10-08 23:43:30,235 epoch 6 - iter 48/121 - loss 0.18904959 - time (sec): 37.65 - samples/sec: 267.23 - lr: 0.000077 - momentum: 0.000000 2023-10-08 23:43:40,393 epoch 6 - iter 60/121 - loss 0.18069900 - time (sec): 47.81 - samples/sec: 264.75 - lr: 0.000076 - momentum: 0.000000 2023-10-08 23:43:49,509 epoch 6 - iter 72/121 - loss 0.17923457 - time (sec): 56.93 - samples/sec: 267.01 - lr: 0.000074 - momentum: 0.000000 2023-10-08 23:43:59,118 epoch 6 - iter 84/121 - loss 0.17895962 - time (sec): 66.54 - samples/sec: 266.92 - lr: 0.000072 - momentum: 0.000000 2023-10-08 23:44:08,097 epoch 6 - iter 96/121 - loss 0.17558585 - time (sec): 75.52 - samples/sec: 266.34 - lr: 0.000071 - momentum: 0.000000 2023-10-08 23:44:16,813 epoch 6 - iter 108/121 - loss 0.17561514 - time (sec): 84.23 - samples/sec: 264.48 - lr: 0.000069 - momentum: 0.000000 2023-10-08 23:44:26,016 epoch 6 - iter 120/121 - loss 0.17486140 - time (sec): 93.43 - samples/sec: 263.55 - lr: 0.000067 - momentum: 0.000000 2023-10-08 23:44:26,515 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:44:26,516 EPOCH 6 done: loss 0.1750 - lr: 0.000067 2023-10-08 23:44:33,002 DEV : loss 0.1774875372648239 - f1-score (micro avg) 0.7266 2023-10-08 23:44:33,008 saving best model 2023-10-08 23:44:37,887 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:44:46,853 epoch 7 - iter 12/121 - loss 0.15820431 - time (sec): 8.97 - samples/sec: 262.34 - lr: 0.000066 - momentum: 0.000000 2023-10-08 23:44:55,909 epoch 
7 - iter 24/121 - loss 0.16596847 - time (sec): 18.02 - samples/sec: 259.25 - lr: 0.000064 - momentum: 0.000000 2023-10-08 23:45:04,694 epoch 7 - iter 36/121 - loss 0.15533406 - time (sec): 26.81 - samples/sec: 258.67 - lr: 0.000062 - momentum: 0.000000 2023-10-08 23:45:14,414 epoch 7 - iter 48/121 - loss 0.14849612 - time (sec): 36.53 - samples/sec: 262.25 - lr: 0.000061 - momentum: 0.000000 2023-10-08 23:45:23,620 epoch 7 - iter 60/121 - loss 0.14917696 - time (sec): 45.73 - samples/sec: 264.06 - lr: 0.000059 - momentum: 0.000000 2023-10-08 23:45:33,084 epoch 7 - iter 72/121 - loss 0.14147734 - time (sec): 55.20 - samples/sec: 263.79 - lr: 0.000057 - momentum: 0.000000 2023-10-08 23:45:42,561 epoch 7 - iter 84/121 - loss 0.13644284 - time (sec): 64.67 - samples/sec: 262.09 - lr: 0.000056 - momentum: 0.000000 2023-10-08 23:45:52,135 epoch 7 - iter 96/121 - loss 0.13506282 - time (sec): 74.25 - samples/sec: 263.18 - lr: 0.000054 - momentum: 0.000000 2023-10-08 23:46:02,002 epoch 7 - iter 108/121 - loss 0.13473976 - time (sec): 84.11 - samples/sec: 262.94 - lr: 0.000052 - momentum: 0.000000 2023-10-08 23:46:11,175 epoch 7 - iter 120/121 - loss 0.13772286 - time (sec): 93.29 - samples/sec: 263.34 - lr: 0.000051 - momentum: 0.000000 2023-10-08 23:46:11,785 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:46:11,786 EPOCH 7 done: loss 0.1373 - lr: 0.000051 2023-10-08 23:46:18,296 DEV : loss 0.15338008105754852 - f1-score (micro avg) 0.842 2023-10-08 23:46:18,302 saving best model 2023-10-08 23:46:22,671 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:46:32,037 epoch 8 - iter 12/121 - loss 0.11148768 - time (sec): 9.36 - samples/sec: 262.38 - lr: 0.000049 - momentum: 0.000000 2023-10-08 23:46:41,192 epoch 8 - iter 24/121 - loss 0.11910156 - time (sec): 18.52 - samples/sec: 262.97 - lr: 0.000047 - momentum: 0.000000 2023-10-08 23:46:50,425 
epoch 8 - iter 36/121 - loss 0.12923502 - time (sec): 27.75 - samples/sec: 261.49 - lr: 0.000046 - momentum: 0.000000 2023-10-08 23:46:59,623 epoch 8 - iter 48/121 - loss 0.12796695 - time (sec): 36.95 - samples/sec: 262.16 - lr: 0.000044 - momentum: 0.000000 2023-10-08 23:47:09,242 epoch 8 - iter 60/121 - loss 0.12207759 - time (sec): 46.57 - samples/sec: 263.01 - lr: 0.000042 - momentum: 0.000000 2023-10-08 23:47:18,465 epoch 8 - iter 72/121 - loss 0.11942277 - time (sec): 55.79 - samples/sec: 263.50 - lr: 0.000041 - momentum: 0.000000 2023-10-08 23:47:28,252 epoch 8 - iter 84/121 - loss 0.11435017 - time (sec): 65.58 - samples/sec: 263.90 - lr: 0.000039 - momentum: 0.000000 2023-10-08 23:47:38,022 epoch 8 - iter 96/121 - loss 0.11382818 - time (sec): 75.35 - samples/sec: 264.10 - lr: 0.000038 - momentum: 0.000000 2023-10-08 23:47:47,763 epoch 8 - iter 108/121 - loss 0.11198754 - time (sec): 85.09 - samples/sec: 263.20 - lr: 0.000036 - momentum: 0.000000 2023-10-08 23:47:56,645 epoch 8 - iter 120/121 - loss 0.11242524 - time (sec): 93.97 - samples/sec: 261.61 - lr: 0.000034 - momentum: 0.000000 2023-10-08 23:47:57,247 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:47:57,248 EPOCH 8 done: loss 0.1128 - lr: 0.000034 2023-10-08 23:48:03,852 DEV : loss 0.14283686876296997 - f1-score (micro avg) 0.8227 2023-10-08 23:48:03,858 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:48:13,211 epoch 9 - iter 12/121 - loss 0.10396974 - time (sec): 9.35 - samples/sec: 244.25 - lr: 0.000032 - momentum: 0.000000 2023-10-08 23:48:23,568 epoch 9 - iter 24/121 - loss 0.09710433 - time (sec): 19.71 - samples/sec: 261.11 - lr: 0.000031 - momentum: 0.000000 2023-10-08 23:48:32,954 epoch 9 - iter 36/121 - loss 0.09513445 - time (sec): 29.09 - samples/sec: 262.70 - lr: 0.000029 - momentum: 0.000000 2023-10-08 23:48:42,580 epoch 9 - iter 48/121 - loss 
0.09058182 - time (sec): 38.72 - samples/sec: 261.16 - lr: 0.000028 - momentum: 0.000000 2023-10-08 23:48:52,022 epoch 9 - iter 60/121 - loss 0.09108372 - time (sec): 48.16 - samples/sec: 260.37 - lr: 0.000026 - momentum: 0.000000 2023-10-08 23:49:01,360 epoch 9 - iter 72/121 - loss 0.09190391 - time (sec): 57.50 - samples/sec: 261.56 - lr: 0.000024 - momentum: 0.000000 2023-10-08 23:49:10,170 epoch 9 - iter 84/121 - loss 0.09588537 - time (sec): 66.31 - samples/sec: 260.53 - lr: 0.000023 - momentum: 0.000000 2023-10-08 23:49:19,362 epoch 9 - iter 96/121 - loss 0.09700172 - time (sec): 75.50 - samples/sec: 259.33 - lr: 0.000021 - momentum: 0.000000 2023-10-08 23:49:28,627 epoch 9 - iter 108/121 - loss 0.09918562 - time (sec): 84.77 - samples/sec: 259.28 - lr: 0.000019 - momentum: 0.000000 2023-10-08 23:49:38,372 epoch 9 - iter 120/121 - loss 0.10076070 - time (sec): 94.51 - samples/sec: 260.41 - lr: 0.000018 - momentum: 0.000000 2023-10-08 23:49:38,893 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:49:38,893 EPOCH 9 done: loss 0.1003 - lr: 0.000018 2023-10-08 23:49:45,492 DEV : loss 0.1380492001771927 - f1-score (micro avg) 0.8354 2023-10-08 23:49:45,498 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:49:54,373 epoch 10 - iter 12/121 - loss 0.10280380 - time (sec): 8.87 - samples/sec: 260.00 - lr: 0.000016 - momentum: 0.000000 2023-10-08 23:50:03,805 epoch 10 - iter 24/121 - loss 0.09500688 - time (sec): 18.30 - samples/sec: 262.66 - lr: 0.000014 - momentum: 0.000000 2023-10-08 23:50:13,553 epoch 10 - iter 36/121 - loss 0.09699027 - time (sec): 28.05 - samples/sec: 263.07 - lr: 0.000013 - momentum: 0.000000 2023-10-08 23:50:23,171 epoch 10 - iter 48/121 - loss 0.09566332 - time (sec): 37.67 - samples/sec: 261.36 - lr: 0.000011 - momentum: 0.000000 2023-10-08 23:50:32,652 epoch 10 - iter 60/121 - loss 0.09384200 - time (sec): 47.15 - 
samples/sec: 262.26 - lr: 0.000009 - momentum: 0.000000 2023-10-08 23:50:42,319 epoch 10 - iter 72/121 - loss 0.09581268 - time (sec): 56.82 - samples/sec: 263.40 - lr: 0.000008 - momentum: 0.000000 2023-10-08 23:50:51,898 epoch 10 - iter 84/121 - loss 0.09664381 - time (sec): 66.40 - samples/sec: 263.82 - lr: 0.000006 - momentum: 0.000000 2023-10-08 23:51:00,278 epoch 10 - iter 96/121 - loss 0.09499539 - time (sec): 74.78 - samples/sec: 261.96 - lr: 0.000004 - momentum: 0.000000 2023-10-08 23:51:09,871 epoch 10 - iter 108/121 - loss 0.09536038 - time (sec): 84.37 - samples/sec: 260.50 - lr: 0.000003 - momentum: 0.000000 2023-10-08 23:51:19,558 epoch 10 - iter 120/121 - loss 0.09302861 - time (sec): 94.06 - samples/sec: 261.15 - lr: 0.000001 - momentum: 0.000000 2023-10-08 23:51:20,206 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:51:20,207 EPOCH 10 done: loss 0.0931 - lr: 0.000001 2023-10-08 23:51:26,721 DEV : loss 0.1366628259420395 - f1-score (micro avg) 0.8307 2023-10-08 23:51:27,596 ---------------------------------------------------------------------------------------------------- 2023-10-08 23:51:27,597 Loading model from best epoch ... 
2023-10-08 23:51:30,275 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 23:51:36,836
Results:
- F-score (micro) 0.7862
- F-score (macro) 0.4704
- Accuracy 0.682

By class:
              precision    recall  f1-score   support

        pers     0.8069    0.8417    0.8239       139
       scope     0.7881    0.9225    0.8500       129
        work     0.6186    0.7500    0.6780        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7532    0.8222    0.7862       360
   macro avg     0.4427    0.5028    0.4704       360
weighted avg     0.7314    0.8222    0.7734       360

2023-10-08 23:51:36,836 ----------------------------------------------------------------------------------------------------
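
Note on the summary rows: the sketch below is a sanity check, not part of the logged run. It recomputes the macro, micro, and weighted F1 scores from the per-class rows reported above. The dictionary and function names (`per_class`, `f1`, `macro_f1`, and so on) are illustrative, and the arithmetic assumes the usual definitions: macro is the unweighted mean over the five listed classes, weighted is the support-weighted mean, and micro F1 is the harmonic mean of the reported micro precision and recall.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
per_class = {
    "pers":  (0.8069, 0.8417, 0.8239, 139),
    "scope": (0.7881, 0.9225, 0.8500, 129),
    "work":  (0.6186, 0.7500, 0.6780, 80),
    "loc":   (0.0000, 0.0000, 0.0000, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro average: every class counts equally, so loc and date (both 0.0)
# pull the score down even though they have only 12 test spans between them.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.7532, 0.8222)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The recomputed values should land close to the logged 0.4704, 0.7734, and 0.7862. The gap between micro and macro F1 comes almost entirely from the two low-support classes (loc, date), for which the model predicted no correct spans.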