2023-10-08 20:56:01,348 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,349 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-08 20:56:01,349 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,349 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 20:56:01,349 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,349 Train: 966 sentences
2023-10-08 20:56:01,349 (train_with_dev=False, train_with_test=False)
2023-10-08 20:56:01,349 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,349 Training Params:
2023-10-08 20:56:01,350 - learning_rate: "0.00016"
2023-10-08 20:56:01,350 - mini_batch_size: "4"
2023-10-08 20:56:01,350 - max_epochs: "10"
2023-10-08 20:56:01,350 - shuffle: "True"
2023-10-08 20:56:01,350 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,350 Plugins:
2023-10-08 20:56:01,350 - TensorboardLogger
2023-10-08 20:56:01,350 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 20:56:01,350 ----------------------------------------------------------------------------------------------------
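The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the iteration lines below: over the 10 epochs x 242 batches = 2420 steps, the learning rate ramps linearly from 0 to the peak of 0.00016 during the first ~242 steps, then decays linearly back to 0. A minimal sketch of that schedule (my own formulation, not Flair's exact implementation, which may differ in off-by-one details):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Sketch of the schedule implied by the log's lr column; not Flair's
    internal LinearScheduler code.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup ramp
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

# Values from this run: 2420 total steps, peak lr 0.00016
total, peak = 2420, 0.00016
print(linear_schedule_lr(24, total, peak))    # small lr early in warmup
print(linear_schedule_lr(2420, total, peak))  # 0.0 at the final step
```

This matches the logged values qualitatively: lr rises through epoch 1 (0.000015 → 0.000158) and falls steadily afterwards, hitting 0.000000 at the last iteration of epoch 10.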
2023-10-08 20:56:01,350 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 20:56:01,350 - metric: "('micro avg', 'f1-score')"
2023-10-08 20:56:01,350 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,350 Computation:
2023-10-08 20:56:01,350 - compute on device: cuda:0
2023-10-08 20:56:01,350 - embedding storage: none
2023-10-08 20:56:01,350 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,350 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-08 20:56:01,350 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,350 ----------------------------------------------------------------------------------------------------
2023-10-08 20:56:01,351 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 20:56:10,066 epoch 1 - iter 24/242 - loss 3.21248576 - time (sec): 8.71 - samples/sec: 254.62 - lr: 0.000015 - momentum: 0.000000
2023-10-08 20:56:19,536 epoch 1 - iter 48/242 - loss 3.20217130 - time (sec): 18.18 - samples/sec: 261.87 - lr: 0.000031 - momentum: 0.000000
2023-10-08 20:56:29,197 epoch 1 - iter 72/242 - loss 3.18153929 - time (sec): 27.85 - samples/sec: 263.53 - lr: 0.000047 - momentum: 0.000000
2023-10-08 20:56:38,099 epoch 1 - iter 96/242 - loss 3.14553573 - time (sec): 36.75 - samples/sec: 261.00 - lr: 0.000063 - momentum: 0.000000
2023-10-08 20:56:47,311 epoch 1 - iter 120/242 - loss 3.06559143 - time (sec): 45.96 - samples/sec: 262.84 - lr: 0.000079 - momentum: 0.000000
2023-10-08 20:56:57,061 epoch 1 - iter 144/242 - loss 2.95086512 - time (sec): 55.71 - samples/sec: 265.22 - lr: 0.000095 - momentum: 0.000000
2023-10-08 20:57:06,365 epoch 1 - iter 168/242 - loss 2.84762397 - time (sec): 65.01 - samples/sec: 264.25 - lr: 0.000110 - momentum: 0.000000
2023-10-08 20:57:15,837 epoch 1 - iter 192/242 - loss 2.73008905 - time (sec): 74.49 - samples/sec: 264.02 - lr: 0.000126 - momentum: 0.000000
2023-10-08 20:57:25,343 epoch 1 - iter 216/242 - loss 2.60051482 - time (sec): 83.99 - samples/sec: 264.38 - lr: 0.000142 - momentum: 0.000000
2023-10-08 20:57:34,612 epoch 1 - iter 240/242 - loss 2.47662353 - time (sec): 93.26 - samples/sec: 262.92 - lr: 0.000158 - momentum: 0.000000
2023-10-08 20:57:35,455 ----------------------------------------------------------------------------------------------------
2023-10-08 20:57:35,455 EPOCH 1 done: loss 2.4643 - lr: 0.000158
2023-10-08 20:57:41,303 DEV : loss 1.0756269693374634 - f1-score (micro avg) 0.0
2023-10-08 20:57:41,309 ----------------------------------------------------------------------------------------------------
2023-10-08 20:57:50,782 epoch 2 - iter 24/242 - loss 1.04440907 - time (sec): 9.47 - samples/sec: 273.34 - lr: 0.000158 - momentum: 0.000000
2023-10-08 20:58:00,603 epoch 2 - iter 48/242 - loss 0.91809550 - time (sec): 19.29 - samples/sec: 274.51 - lr: 0.000157 - momentum: 0.000000
2023-10-08 20:58:09,572 epoch 2 - iter 72/242 - loss 0.83319993 - time (sec): 28.26 - samples/sec: 269.73 - lr: 0.000155 - momentum: 0.000000
2023-10-08 20:58:18,913 epoch 2 - iter 96/242 - loss 0.77856262 - time (sec): 37.60 - samples/sec: 265.65 - lr: 0.000153 - momentum: 0.000000
2023-10-08 20:58:28,300 epoch 2 - iter 120/242 - loss 0.73849171 - time (sec): 46.99 - samples/sec: 264.14 - lr: 0.000151 - momentum: 0.000000
2023-10-08 20:58:38,063 epoch 2 - iter 144/242 - loss 0.71039644 - time (sec): 56.75 - samples/sec: 262.01 - lr: 0.000150 - momentum: 0.000000
2023-10-08 20:58:47,722 epoch 2 - iter 168/242 - loss 0.68492635 - time (sec): 66.41 - samples/sec: 262.46 - lr: 0.000148 - momentum: 0.000000
2023-10-08 20:58:56,669 epoch 2 - iter 192/242 - loss 0.66008899 - time (sec): 75.36 - samples/sec: 261.42 - lr: 0.000146 - momentum: 0.000000
2023-10-08 20:59:05,847 epoch 2 - iter 216/242 - loss 0.62381105 - time (sec): 84.54 - samples/sec: 260.81 - lr: 0.000144 - momentum: 0.000000
2023-10-08 20:59:15,150 epoch 2 - iter 240/242 - loss 0.59453970 - time (sec): 93.84 - samples/sec: 261.37 - lr: 0.000142 - momentum: 0.000000
2023-10-08 20:59:15,852 ----------------------------------------------------------------------------------------------------
2023-10-08 20:59:15,853 EPOCH 2 done: loss 0.5916 - lr: 0.000142
2023-10-08 20:59:21,652 DEV : loss 0.3565257787704468 - f1-score (micro avg) 0.397
2023-10-08 20:59:21,658 saving best model
2023-10-08 20:59:22,541 ----------------------------------------------------------------------------------------------------
2023-10-08 20:59:31,690 epoch 3 - iter 24/242 - loss 0.31870845 - time (sec): 9.15 - samples/sec: 249.14 - lr: 0.000141 - momentum: 0.000000
2023-10-08 20:59:41,292 epoch 3 - iter 48/242 - loss 0.31431169 - time (sec): 18.75 - samples/sec: 260.05 - lr: 0.000139 - momentum: 0.000000
2023-10-08 20:59:50,694 epoch 3 - iter 72/242 - loss 0.30615624 - time (sec): 28.15 - samples/sec: 259.76 - lr: 0.000137 - momentum: 0.000000
2023-10-08 21:00:00,074 epoch 3 - iter 96/242 - loss 0.30552584 - time (sec): 37.53 - samples/sec: 260.76 - lr: 0.000135 - momentum: 0.000000
2023-10-08 21:00:09,563 epoch 3 - iter 120/242 - loss 0.30424885 - time (sec): 47.02 - samples/sec: 262.82 - lr: 0.000134 - momentum: 0.000000
2023-10-08 21:00:18,480 epoch 3 - iter 144/242 - loss 0.29174531 - time (sec): 55.94 - samples/sec: 261.20 - lr: 0.000132 - momentum: 0.000000
2023-10-08 21:00:27,777 epoch 3 - iter 168/242 - loss 0.28538621 - time (sec): 65.24 - samples/sec: 261.76 - lr: 0.000130 - momentum: 0.000000
2023-10-08 21:00:37,324 epoch 3 - iter 192/242 - loss 0.27737288 - time (sec): 74.78 - samples/sec: 261.32 - lr: 0.000128 - momentum: 0.000000
2023-10-08 21:00:47,019 epoch 3 - iter 216/242 - loss 0.26684156 - time (sec): 84.48 - samples/sec: 263.34 - lr: 0.000126 - momentum: 0.000000
2023-10-08 21:00:56,227 epoch 3 - iter 240/242 - loss 0.25808320 - time (sec): 93.69 - samples/sec: 262.18 - lr: 0.000125 - momentum: 0.000000
2023-10-08 21:00:56,860 ----------------------------------------------------------------------------------------------------
2023-10-08 21:00:56,861 EPOCH 3 done: loss 0.2588 - lr: 0.000125
2023-10-08 21:01:02,683 DEV : loss 0.21175621449947357 - f1-score (micro avg) 0.659
2023-10-08 21:01:02,689 saving best model
2023-10-08 21:01:07,049 ----------------------------------------------------------------------------------------------------
2023-10-08 21:01:16,136 epoch 4 - iter 24/242 - loss 0.15157997 - time (sec): 9.09 - samples/sec: 247.11 - lr: 0.000123 - momentum: 0.000000
2023-10-08 21:01:25,532 epoch 4 - iter 48/242 - loss 0.15954180 - time (sec): 18.48 - samples/sec: 255.23 - lr: 0.000121 - momentum: 0.000000
2023-10-08 21:01:34,168 epoch 4 - iter 72/242 - loss 0.16996318 - time (sec): 27.12 - samples/sec: 252.05 - lr: 0.000119 - momentum: 0.000000
2023-10-08 21:01:43,284 epoch 4 - iter 96/242 - loss 0.17211199 - time (sec): 36.23 - samples/sec: 254.90 - lr: 0.000118 - momentum: 0.000000
2023-10-08 21:01:53,138 epoch 4 - iter 120/242 - loss 0.16036017 - time (sec): 46.09 - samples/sec: 258.47 - lr: 0.000116 - momentum: 0.000000
2023-10-08 21:02:03,097 epoch 4 - iter 144/242 - loss 0.16267112 - time (sec): 56.05 - samples/sec: 259.57 - lr: 0.000114 - momentum: 0.000000
2023-10-08 21:02:12,258 epoch 4 - iter 168/242 - loss 0.16306635 - time (sec): 65.21 - samples/sec: 259.39 - lr: 0.000112 - momentum: 0.000000
2023-10-08 21:02:22,379 epoch 4 - iter 192/242 - loss 0.15970730 - time (sec): 75.33 - samples/sec: 261.39 - lr: 0.000110 - momentum: 0.000000
2023-10-08 21:02:31,674 epoch 4 - iter 216/242 - loss 0.15643909 - time (sec): 84.62 - samples/sec: 261.29 - lr: 0.000109 - momentum: 0.000000
2023-10-08 21:02:41,366 epoch 4 - iter 240/242 - loss 0.15293647 - time (sec): 94.32 - samples/sec: 260.77 - lr: 0.000107 - momentum: 0.000000
2023-10-08 21:02:41,976 ----------------------------------------------------------------------------------------------------
2023-10-08 21:02:41,976 EPOCH 4 done: loss 0.1530 - lr: 0.000107
2023-10-08 21:02:47,953 DEV : loss 0.14897313714027405 - f1-score (micro avg) 0.831
2023-10-08 21:02:47,959 saving best model
2023-10-08 21:02:52,374 ----------------------------------------------------------------------------------------------------
2023-10-08 21:03:01,612 epoch 5 - iter 24/242 - loss 0.12429477 - time (sec): 9.24 - samples/sec: 246.94 - lr: 0.000105 - momentum: 0.000000
2023-10-08 21:03:11,750 epoch 5 - iter 48/242 - loss 0.11383623 - time (sec): 19.37 - samples/sec: 249.71 - lr: 0.000103 - momentum: 0.000000
2023-10-08 21:03:21,688 epoch 5 - iter 72/242 - loss 0.11256443 - time (sec): 29.31 - samples/sec: 256.71 - lr: 0.000102 - momentum: 0.000000
2023-10-08 21:03:31,142 epoch 5 - iter 96/242 - loss 0.11357129 - time (sec): 38.77 - samples/sec: 255.81 - lr: 0.000100 - momentum: 0.000000
2023-10-08 21:03:41,343 epoch 5 - iter 120/242 - loss 0.11427482 - time (sec): 48.97 - samples/sec: 254.54 - lr: 0.000098 - momentum: 0.000000
2023-10-08 21:03:51,426 epoch 5 - iter 144/242 - loss 0.10792581 - time (sec): 59.05 - samples/sec: 253.49 - lr: 0.000096 - momentum: 0.000000
2023-10-08 21:04:01,108 epoch 5 - iter 168/242 - loss 0.10715487 - time (sec): 68.73 - samples/sec: 251.83 - lr: 0.000094 - momentum: 0.000000
2023-10-08 21:04:11,449 epoch 5 - iter 192/242 - loss 0.10331860 - time (sec): 79.07 - samples/sec: 253.65 - lr: 0.000093 - momentum: 0.000000
2023-10-08 21:04:20,874 epoch 5 - iter 216/242 - loss 0.10229074 - time (sec): 88.50 - samples/sec: 251.80 - lr: 0.000091 - momentum: 0.000000
2023-10-08 21:04:30,342 epoch 5 - iter 240/242 - loss 0.09987971 - time (sec): 97.97 - samples/sec: 250.66 - lr: 0.000089 - momentum: 0.000000
2023-10-08 21:04:31,029 ----------------------------------------------------------------------------------------------------
2023-10-08 21:04:31,029 EPOCH 5 done: loss 0.1010 - lr: 0.000089
2023-10-08 21:04:37,493 DEV : loss 0.13600760698318481 - f1-score (micro avg) 0.8286
2023-10-08 21:04:37,499 ----------------------------------------------------------------------------------------------------
2023-10-08 21:04:47,893 epoch 6 - iter 24/242 - loss 0.08121927 - time (sec): 10.39 - samples/sec: 253.73 - lr: 0.000087 - momentum: 0.000000
2023-10-08 21:04:58,067 epoch 6 - iter 48/242 - loss 0.07627296 - time (sec): 20.57 - samples/sec: 248.90 - lr: 0.000086 - momentum: 0.000000
2023-10-08 21:05:07,748 epoch 6 - iter 72/242 - loss 0.07814236 - time (sec): 30.25 - samples/sec: 246.00 - lr: 0.000084 - momentum: 0.000000
2023-10-08 21:05:17,421 epoch 6 - iter 96/242 - loss 0.07937287 - time (sec): 39.92 - samples/sec: 240.23 - lr: 0.000082 - momentum: 0.000000
2023-10-08 21:05:27,198 epoch 6 - iter 120/242 - loss 0.07633702 - time (sec): 49.70 - samples/sec: 238.86 - lr: 0.000080 - momentum: 0.000000
2023-10-08 21:05:37,318 epoch 6 - iter 144/242 - loss 0.07488017 - time (sec): 59.82 - samples/sec: 239.66 - lr: 0.000078 - momentum: 0.000000
2023-10-08 21:05:47,520 epoch 6 - iter 168/242 - loss 0.07461197 - time (sec): 70.02 - samples/sec: 241.35 - lr: 0.000077 - momentum: 0.000000
2023-10-08 21:05:58,199 epoch 6 - iter 192/242 - loss 0.07164879 - time (sec): 80.70 - samples/sec: 242.47 - lr: 0.000075 - momentum: 0.000000
2023-10-08 21:06:08,645 epoch 6 - iter 216/242 - loss 0.07129952 - time (sec): 91.15 - samples/sec: 242.13 - lr: 0.000073 - momentum: 0.000000
2023-10-08 21:06:19,125 epoch 6 - iter 240/242 - loss 0.07177037 - time (sec): 101.62 - samples/sec: 241.81 - lr: 0.000071 - momentum: 0.000000
2023-10-08 21:06:19,867 ----------------------------------------------------------------------------------------------------
2023-10-08 21:06:19,867 EPOCH 6 done: loss 0.0715 - lr: 0.000071
2023-10-08 21:06:26,467 DEV : loss 0.13159669935703278 - f1-score (micro avg) 0.8323
2023-10-08 21:06:26,473 saving best model
2023-10-08 21:06:30,860 ----------------------------------------------------------------------------------------------------
2023-10-08 21:06:40,816 epoch 7 - iter 24/242 - loss 0.04880875 - time (sec): 9.95 - samples/sec: 233.77 - lr: 0.000070 - momentum: 0.000000
2023-10-08 21:06:50,799 epoch 7 - iter 48/242 - loss 0.06119329 - time (sec): 19.94 - samples/sec: 233.03 - lr: 0.000068 - momentum: 0.000000
2023-10-08 21:07:01,128 epoch 7 - iter 72/242 - loss 0.05398906 - time (sec): 30.27 - samples/sec: 238.28 - lr: 0.000066 - momentum: 0.000000
2023-10-08 21:07:11,020 epoch 7 - iter 96/242 - loss 0.05055426 - time (sec): 40.16 - samples/sec: 238.63 - lr: 0.000064 - momentum: 0.000000
2023-10-08 21:07:21,005 epoch 7 - iter 120/242 - loss 0.05528542 - time (sec): 50.14 - samples/sec: 238.78 - lr: 0.000062 - momentum: 0.000000
2023-10-08 21:07:31,375 epoch 7 - iter 144/242 - loss 0.05610316 - time (sec): 60.51 - samples/sec: 240.69 - lr: 0.000061 - momentum: 0.000000
2023-10-08 21:07:42,084 epoch 7 - iter 168/242 - loss 0.05305806 - time (sec): 71.22 - samples/sec: 240.78 - lr: 0.000059 - momentum: 0.000000
2023-10-08 21:07:52,693 epoch 7 - iter 192/242 - loss 0.05445471 - time (sec): 81.83 - samples/sec: 241.64 - lr: 0.000057 - momentum: 0.000000
2023-10-08 21:08:02,913 epoch 7 - iter 216/242 - loss 0.05621495 - time (sec): 92.05 - samples/sec: 242.21 - lr: 0.000055 - momentum: 0.000000
2023-10-08 21:08:12,736 epoch 7 - iter 240/242 - loss 0.05582534 - time (sec): 101.87 - samples/sec: 241.18 - lr: 0.000054 - momentum: 0.000000
2023-10-08 21:08:13,425 ----------------------------------------------------------------------------------------------------
2023-10-08 21:08:13,425 EPOCH 7 done: loss 0.0559 - lr: 0.000054
2023-10-08 21:08:20,015 DEV : loss 0.14115837216377258 - f1-score (micro avg) 0.8218
2023-10-08 21:08:20,021 ----------------------------------------------------------------------------------------------------
2023-10-08 21:08:29,855 epoch 8 - iter 24/242 - loss 0.03435208 - time (sec): 9.83 - samples/sec: 237.19 - lr: 0.000052 - momentum: 0.000000
2023-10-08 21:08:39,915 epoch 8 - iter 48/242 - loss 0.03890898 - time (sec): 19.89 - samples/sec: 245.23 - lr: 0.000050 - momentum: 0.000000
2023-10-08 21:08:50,130 epoch 8 - iter 72/242 - loss 0.03289446 - time (sec): 30.11 - samples/sec: 244.76 - lr: 0.000048 - momentum: 0.000000
2023-10-08 21:09:00,441 epoch 8 - iter 96/242 - loss 0.03429030 - time (sec): 40.42 - samples/sec: 245.78 - lr: 0.000046 - momentum: 0.000000
2023-10-08 21:09:11,335 epoch 8 - iter 120/242 - loss 0.03396519 - time (sec): 51.31 - samples/sec: 245.58 - lr: 0.000045 - momentum: 0.000000
2023-10-08 21:09:21,613 epoch 8 - iter 144/242 - loss 0.03748909 - time (sec): 61.59 - samples/sec: 243.87 - lr: 0.000043 - momentum: 0.000000
2023-10-08 21:09:31,490 epoch 8 - iter 168/242 - loss 0.03726164 - time (sec): 71.47 - samples/sec: 241.71 - lr: 0.000041 - momentum: 0.000000
2023-10-08 21:09:41,675 epoch 8 - iter 192/242 - loss 0.04265657 - time (sec): 81.65 - samples/sec: 241.05 - lr: 0.000039 - momentum: 0.000000
2023-10-08 21:09:52,187 epoch 8 - iter 216/242 - loss 0.04150746 - time (sec): 92.16 - samples/sec: 242.33 - lr: 0.000038 - momentum: 0.000000
2023-10-08 21:10:02,056 epoch 8 - iter 240/242 - loss 0.04187872 - time (sec): 102.03 - samples/sec: 241.46 - lr: 0.000036 - momentum: 0.000000
2023-10-08 21:10:02,609 ----------------------------------------------------------------------------------------------------
2023-10-08 21:10:02,610 EPOCH 8 done: loss 0.0417 - lr: 0.000036
2023-10-08 21:10:09,227 DEV : loss 0.14380428194999695 - f1-score (micro avg) 0.8414
2023-10-08 21:10:09,233 saving best model
2023-10-08 21:10:14,054 ----------------------------------------------------------------------------------------------------
2023-10-08 21:10:24,583 epoch 9 - iter 24/242 - loss 0.02692928 - time (sec): 10.53 - samples/sec: 229.77 - lr: 0.000034 - momentum: 0.000000
2023-10-08 21:10:34,714 epoch 9 - iter 48/242 - loss 0.03328689 - time (sec): 20.66 - samples/sec: 236.80 - lr: 0.000032 - momentum: 0.000000
2023-10-08 21:10:44,821 epoch 9 - iter 72/242 - loss 0.03273916 - time (sec): 30.77 - samples/sec: 237.57 - lr: 0.000030 - momentum: 0.000000
2023-10-08 21:10:55,410 epoch 9 - iter 96/242 - loss 0.03231675 - time (sec): 41.35 - samples/sec: 241.93 - lr: 0.000029 - momentum: 0.000000
2023-10-08 21:11:05,223 epoch 9 - iter 120/242 - loss 0.03154966 - time (sec): 51.17 - samples/sec: 241.29 - lr: 0.000027 - momentum: 0.000000
2023-10-08 21:11:15,601 epoch 9 - iter 144/242 - loss 0.03518646 - time (sec): 61.55 - samples/sec: 241.96 - lr: 0.000025 - momentum: 0.000000
2023-10-08 21:11:25,838 epoch 9 - iter 168/242 - loss 0.03564032 - time (sec): 71.78 - samples/sec: 243.06 - lr: 0.000023 - momentum: 0.000000
2023-10-08 21:11:35,558 epoch 9 - iter 192/242 - loss 0.03533551 - time (sec): 81.50 - samples/sec: 241.50 - lr: 0.000022 - momentum: 0.000000
2023-10-08 21:11:45,728 epoch 9 - iter 216/242 - loss 0.03445013 - time (sec): 91.67 - samples/sec: 241.24 - lr: 0.000020 - momentum: 0.000000
2023-10-08 21:11:55,922 epoch 9 - iter 240/242 - loss 0.03733405 - time (sec): 101.87 - samples/sec: 241.75 - lr: 0.000018 - momentum: 0.000000
2023-10-08 21:11:56,487 ----------------------------------------------------------------------------------------------------
2023-10-08 21:11:56,487 EPOCH 9 done: loss 0.0372 - lr: 0.000018
2023-10-08 21:12:03,065 DEV : loss 0.15464326739311218 - f1-score (micro avg) 0.8159
2023-10-08 21:12:03,071 ----------------------------------------------------------------------------------------------------
2023-10-08 21:12:14,366 epoch 10 - iter 24/242 - loss 0.04280404 - time (sec): 11.29 - samples/sec: 252.09 - lr: 0.000016 - momentum: 0.000000
2023-10-08 21:12:24,462 epoch 10 - iter 48/242 - loss 0.03845794 - time (sec): 21.39 - samples/sec: 248.82 - lr: 0.000014 - momentum: 0.000000
2023-10-08 21:12:34,552 epoch 10 - iter 72/242 - loss 0.03631523 - time (sec): 31.48 - samples/sec: 248.19 - lr: 0.000013 - momentum: 0.000000
2023-10-08 21:12:44,058 epoch 10 - iter 96/242 - loss 0.03423702 - time (sec): 40.99 - samples/sec: 244.21 - lr: 0.000011 - momentum: 0.000000
2023-10-08 21:12:54,777 epoch 10 - iter 120/242 - loss 0.03302895 - time (sec): 51.70 - samples/sec: 244.26 - lr: 0.000009 - momentum: 0.000000
2023-10-08 21:13:05,431 epoch 10 - iter 144/242 - loss 0.03606201 - time (sec): 62.36 - samples/sec: 244.15 - lr: 0.000007 - momentum: 0.000000
2023-10-08 21:13:14,961 epoch 10 - iter 168/242 - loss 0.03411320 - time (sec): 71.89 - samples/sec: 242.50 - lr: 0.000006 - momentum: 0.000000
2023-10-08 21:13:24,916 epoch 10 - iter 192/242 - loss 0.03522787 - time (sec): 81.84 - samples/sec: 243.10 - lr: 0.000004 - momentum: 0.000000
2023-10-08 21:13:35,201 epoch 10 - iter 216/242 - loss 0.03484203 - time (sec): 92.13 - samples/sec: 242.98 - lr: 0.000002 - momentum: 0.000000
2023-10-08 21:13:44,464 epoch 10 - iter 240/242 - loss 0.03343421 - time (sec): 101.39 - samples/sec: 241.60 - lr: 0.000000 - momentum: 0.000000
2023-10-08 21:13:45,303 ----------------------------------------------------------------------------------------------------
2023-10-08 21:13:45,303 EPOCH 10 done: loss 0.0331 - lr: 0.000000
2023-10-08 21:13:51,911 DEV : loss 0.15395969152450562 - f1-score (micro avg) 0.8323
2023-10-08 21:13:52,775 ----------------------------------------------------------------------------------------------------
2023-10-08 21:13:52,777 Loading model from best epoch ...
2023-10-08 21:13:55,558 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
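The 25-tag dictionary above follows the BIOES scheme: for each of the six entity types (scope, pers, work, loc, object, date) there is a Single, Begin, End, and Inside tag, plus one shared O (outside) tag. A minimal decoder sketch showing how such a tag sequence maps back to entity spans (a hypothetical helper, not Flair's internal decoding, which handles malformed sequences more leniently):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                    # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
    return spans

print(bioes_spans(["S-pers", "O", "B-loc", "I-loc", "E-loc"]))
# [('pers', 0, 1), ('loc', 2, 5)]
```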
2023-10-08 21:14:02,010
Results:
- F-score (micro) 0.8065
- F-score (macro) 0.4043
- Accuracy 0.7075
By class:
              precision    recall  f1-score   support

        pers     0.8194    0.8489    0.8339       139
       scope     0.8489    0.9147    0.8806       129
        work     0.6400    0.8000    0.7111        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3
      object     0.0000    0.0000    0.0000         0

   micro avg     0.7812    0.8333    0.8065       360
   macro avg     0.3847    0.4273    0.4043       360
weighted avg     0.7628    0.8333    0.7956       360
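The reported micro F1 is the harmonic mean of the micro precision and recall pooled over all 360 gold spans, and the macro F1 is the unweighted mean of the six per-class F1 scores. A quick sanity check against the table above:

```python
# Micro F1 from the pooled precision/recall reported in the table.
p, r = 0.7812, 0.8333
micro_f1 = 2 * p * r / (p + r)  # harmonic mean, ~0.8064 (rounding noise vs 0.8065)

# Macro F1: unweighted mean over the six classes (three of them scored 0).
per_class_f1 = [0.8339, 0.8806, 0.7111, 0.0, 0.0, 0.0]
macro_f1 = sum(per_class_f1) / len(per_class_f1)

print(round(micro_f1, 4), round(macro_f1, 4))
```

The macro average is pulled down sharply by the three rare classes (loc, date, object), which together cover only 12 of the 360 test spans; the weighted average, by contrast, stays close to the micro score.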
2023-10-08 21:14:02,010 ----------------------------------------------------------------------------------------------------