2023-10-08 23:16:47,704 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,705 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
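The shapes in the printout pin down the encoder's size: d_model = 1472, feed-forward width 3584, and a 384-wide attention projection (consistent with 6 heads of 64, matching the Embedding(32, 6) relative-attention bias). A quick stdlib-only sanity check of the per-block parameter count, read directly off the module shapes (all Linear layers above are bias-free; layer norms are ignored here for brevity):

```python
# Per-block parameter count of the T5 encoder printed above.
d_model, d_ff, d_proj = 1472, 3584, 384

attn = 3 * d_model * d_proj + d_proj * d_model  # q, k, v projections + output o
ff = 2 * d_model * d_ff + d_ff * d_model        # gated FF: wi_0, wi_1, wo
per_block = attn + ff

print(per_block)        # 18087936 parameters per T5Block
print(12 * per_block)   # 217055232 across the 12 blocks
print(1472 * 25 + 25)   # 36825 in the tagger head: Linear(1472 -> 25) + bias
```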
2023-10-08 23:16:47,705 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 Train: 966 sentences
2023-10-08 23:16:47,706 (train_with_dev=False, train_with_test=False)
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 Training Params:
2023-10-08 23:16:47,706 - learning_rate: "0.00016"
2023-10-08 23:16:47,706 - mini_batch_size: "4"
2023-10-08 23:16:47,706 - max_epochs: "10"
2023-10-08 23:16:47,706 - shuffle: "True"
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 Plugins:
2023-10-08 23:16:47,706 - TensorboardLogger
2023-10-08 23:16:47,706 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
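The LinearScheduler with warmup_fraction 0.1 explains the lr column in the iteration logs below: with 242 mini-batches per epoch over 10 epochs, the rate ramps up during the first epoch to the peak of 0.00016 and then decays linearly toward zero. A minimal sketch of such a schedule (an assumption about LinearScheduler's behaviour inferred from the logged values, not Flair's actual implementation):

```python
# Linear warmup + linear decay, matching the Training Params above
# (peak lr 0.00016, warmup_fraction 0.1). Assumed behaviour, not Flair code.
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Learning rate at optimizer step `step` (0-indexed)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up from 0
    # linear decay from the peak down to 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 242 * 10  # 242 mini-batches per epoch, 10 epochs
for step in (0, 121, 242, 1331, 2420):
    print(step, round(linear_schedule_lr(step, total_steps, 0.00016), 6))
```

The printed values (0 at the start, the 0.00016 peak after one epoch, back to 0 at the last step) trace the same shape as the lr column in the epoch logs.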
2023-10-08 23:16:47,706 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 23:16:47,706 - metric: "('micro avg', 'f1-score')"
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 Computation:
2023-10-08 23:16:47,707 - compute on device: cuda:0
2023-10-08 23:16:47,707 - embedding storage: none
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 23:16:56,686 epoch 1 - iter 24/242 - loss 3.23915124 - time (sec): 8.98 - samples/sec: 255.50 - lr: 0.000015 - momentum: 0.000000
2023-10-08 23:17:06,554 epoch 1 - iter 48/242 - loss 3.22546569 - time (sec): 18.85 - samples/sec: 268.01 - lr: 0.000031 - momentum: 0.000000
2023-10-08 23:17:16,352 epoch 1 - iter 72/242 - loss 3.20262177 - time (sec): 28.64 - samples/sec: 269.51 - lr: 0.000047 - momentum: 0.000000
2023-10-08 23:17:25,768 epoch 1 - iter 96/242 - loss 3.15888174 - time (sec): 38.06 - samples/sec: 265.34 - lr: 0.000063 - momentum: 0.000000
2023-10-08 23:17:35,279 epoch 1 - iter 120/242 - loss 3.07509975 - time (sec): 47.57 - samples/sec: 266.72 - lr: 0.000079 - momentum: 0.000000
2023-10-08 23:17:44,356 epoch 1 - iter 144/242 - loss 2.97812763 - time (sec): 56.65 - samples/sec: 264.90 - lr: 0.000095 - momentum: 0.000000
2023-10-08 23:17:53,600 epoch 1 - iter 168/242 - loss 2.86385060 - time (sec): 65.89 - samples/sec: 266.80 - lr: 0.000110 - momentum: 0.000000
2023-10-08 23:18:02,645 epoch 1 - iter 192/242 - loss 2.75171464 - time (sec): 74.94 - samples/sec: 265.69 - lr: 0.000126 - momentum: 0.000000
2023-10-08 23:18:11,837 epoch 1 - iter 216/242 - loss 2.63706419 - time (sec): 84.13 - samples/sec: 263.61 - lr: 0.000142 - momentum: 0.000000
2023-10-08 23:18:21,299 epoch 1 - iter 240/242 - loss 2.51038471 - time (sec): 93.59 - samples/sec: 262.75 - lr: 0.000158 - momentum: 0.000000
2023-10-08 23:18:21,957 ----------------------------------------------------------------------------------------------------
2023-10-08 23:18:21,957 EPOCH 1 done: loss 2.5016 - lr: 0.000158
2023-10-08 23:18:27,818 DEV : loss 1.079971194267273 - f1-score (micro avg) 0.0
2023-10-08 23:18:27,824 ----------------------------------------------------------------------------------------------------
2023-10-08 23:18:37,431 epoch 2 - iter 24/242 - loss 1.01756734 - time (sec): 9.61 - samples/sec: 257.67 - lr: 0.000158 - momentum: 0.000000
2023-10-08 23:18:47,040 epoch 2 - iter 48/242 - loss 0.84586341 - time (sec): 19.21 - samples/sec: 262.99 - lr: 0.000157 - momentum: 0.000000
2023-10-08 23:18:55,942 epoch 2 - iter 72/242 - loss 0.80030243 - time (sec): 28.12 - samples/sec: 258.18 - lr: 0.000155 - momentum: 0.000000
2023-10-08 23:19:04,952 epoch 2 - iter 96/242 - loss 0.73412891 - time (sec): 37.13 - samples/sec: 257.47 - lr: 0.000153 - momentum: 0.000000
2023-10-08 23:19:14,107 epoch 2 - iter 120/242 - loss 0.70354425 - time (sec): 46.28 - samples/sec: 256.80 - lr: 0.000151 - momentum: 0.000000
2023-10-08 23:19:23,347 epoch 2 - iter 144/242 - loss 0.69760704 - time (sec): 55.52 - samples/sec: 256.25 - lr: 0.000150 - momentum: 0.000000
2023-10-08 23:19:32,249 epoch 2 - iter 168/242 - loss 0.68240983 - time (sec): 64.42 - samples/sec: 255.44 - lr: 0.000148 - momentum: 0.000000
2023-10-08 23:19:41,636 epoch 2 - iter 192/242 - loss 0.65096524 - time (sec): 73.81 - samples/sec: 257.19 - lr: 0.000146 - momentum: 0.000000
2023-10-08 23:19:51,662 epoch 2 - iter 216/242 - loss 0.62153433 - time (sec): 83.84 - samples/sec: 259.32 - lr: 0.000144 - momentum: 0.000000
2023-10-08 23:20:01,577 epoch 2 - iter 240/242 - loss 0.59708481 - time (sec): 93.75 - samples/sec: 262.60 - lr: 0.000142 - momentum: 0.000000
2023-10-08 23:20:02,119 ----------------------------------------------------------------------------------------------------
2023-10-08 23:20:02,119 EPOCH 2 done: loss 0.5954 - lr: 0.000142
2023-10-08 23:20:07,885 DEV : loss 0.36621710658073425 - f1-score (micro avg) 0.185
2023-10-08 23:20:07,891 saving best model
2023-10-08 23:20:08,756 ----------------------------------------------------------------------------------------------------
2023-10-08 23:20:18,050 epoch 3 - iter 24/242 - loss 0.35950188 - time (sec): 9.29 - samples/sec: 252.16 - lr: 0.000141 - momentum: 0.000000
2023-10-08 23:20:27,966 epoch 3 - iter 48/242 - loss 0.29823828 - time (sec): 19.21 - samples/sec: 264.79 - lr: 0.000139 - momentum: 0.000000
2023-10-08 23:20:37,393 epoch 3 - iter 72/242 - loss 0.29752598 - time (sec): 28.63 - samples/sec: 263.46 - lr: 0.000137 - momentum: 0.000000
2023-10-08 23:20:46,914 epoch 3 - iter 96/242 - loss 0.29599846 - time (sec): 38.16 - samples/sec: 264.78 - lr: 0.000135 - momentum: 0.000000
2023-10-08 23:20:56,382 epoch 3 - iter 120/242 - loss 0.29052825 - time (sec): 47.62 - samples/sec: 264.84 - lr: 0.000134 - momentum: 0.000000
2023-10-08 23:21:05,437 epoch 3 - iter 144/242 - loss 0.29051562 - time (sec): 56.68 - samples/sec: 262.94 - lr: 0.000132 - momentum: 0.000000
2023-10-08 23:21:14,707 epoch 3 - iter 168/242 - loss 0.28108395 - time (sec): 65.95 - samples/sec: 261.72 - lr: 0.000130 - momentum: 0.000000
2023-10-08 23:21:24,422 epoch 3 - iter 192/242 - loss 0.27070285 - time (sec): 75.66 - samples/sec: 262.71 - lr: 0.000128 - momentum: 0.000000
2023-10-08 23:21:33,244 epoch 3 - iter 216/242 - loss 0.26836798 - time (sec): 84.49 - samples/sec: 261.35 - lr: 0.000126 - momentum: 0.000000
2023-10-08 23:21:42,731 epoch 3 - iter 240/242 - loss 0.26309788 - time (sec): 93.97 - samples/sec: 262.21 - lr: 0.000125 - momentum: 0.000000
2023-10-08 23:21:43,283 ----------------------------------------------------------------------------------------------------
2023-10-08 23:21:43,283 EPOCH 3 done: loss 0.2626 - lr: 0.000125
2023-10-08 23:21:49,085 DEV : loss 0.20921547710895538 - f1-score (micro avg) 0.6216
2023-10-08 23:21:49,091 saving best model
2023-10-08 23:21:49,986 ----------------------------------------------------------------------------------------------------
2023-10-08 23:21:59,848 epoch 4 - iter 24/242 - loss 0.19937278 - time (sec): 9.86 - samples/sec: 275.96 - lr: 0.000123 - momentum: 0.000000
2023-10-08 23:22:09,820 epoch 4 - iter 48/242 - loss 0.20959392 - time (sec): 19.83 - samples/sec: 273.55 - lr: 0.000121 - momentum: 0.000000
2023-10-08 23:22:19,075 epoch 4 - iter 72/242 - loss 0.18266325 - time (sec): 29.09 - samples/sec: 266.85 - lr: 0.000119 - momentum: 0.000000
2023-10-08 23:22:29,185 epoch 4 - iter 96/242 - loss 0.17559324 - time (sec): 39.20 - samples/sec: 264.48 - lr: 0.000118 - momentum: 0.000000
2023-10-08 23:22:38,943 epoch 4 - iter 120/242 - loss 0.17056546 - time (sec): 48.96 - samples/sec: 263.67 - lr: 0.000116 - momentum: 0.000000
2023-10-08 23:22:48,102 epoch 4 - iter 144/242 - loss 0.17146465 - time (sec): 58.11 - samples/sec: 261.79 - lr: 0.000114 - momentum: 0.000000
2023-10-08 23:22:56,854 epoch 4 - iter 168/242 - loss 0.16628588 - time (sec): 66.87 - samples/sec: 259.95 - lr: 0.000112 - momentum: 0.000000
2023-10-08 23:23:06,396 epoch 4 - iter 192/242 - loss 0.16339488 - time (sec): 76.41 - samples/sec: 260.14 - lr: 0.000110 - momentum: 0.000000
2023-10-08 23:23:15,662 epoch 4 - iter 216/242 - loss 0.15709292 - time (sec): 85.67 - samples/sec: 259.09 - lr: 0.000109 - momentum: 0.000000
2023-10-08 23:23:24,937 epoch 4 - iter 240/242 - loss 0.15254959 - time (sec): 94.95 - samples/sec: 258.70 - lr: 0.000107 - momentum: 0.000000
2023-10-08 23:23:25,605 ----------------------------------------------------------------------------------------------------
2023-10-08 23:23:25,605 EPOCH 4 done: loss 0.1526 - lr: 0.000107
2023-10-08 23:23:31,733 DEV : loss 0.1503736525774002 - f1-score (micro avg) 0.8296
2023-10-08 23:23:31,738 saving best model
2023-10-08 23:23:32,825 ----------------------------------------------------------------------------------------------------
2023-10-08 23:23:42,613 epoch 5 - iter 24/242 - loss 0.13336685 - time (sec): 9.79 - samples/sec: 260.25 - lr: 0.000105 - momentum: 0.000000
2023-10-08 23:23:52,065 epoch 5 - iter 48/242 - loss 0.10805870 - time (sec): 19.24 - samples/sec: 253.46 - lr: 0.000103 - momentum: 0.000000
2023-10-08 23:24:01,993 epoch 5 - iter 72/242 - loss 0.10522684 - time (sec): 29.17 - samples/sec: 251.14 - lr: 0.000102 - momentum: 0.000000
2023-10-08 23:24:11,456 epoch 5 - iter 96/242 - loss 0.09751095 - time (sec): 38.63 - samples/sec: 250.61 - lr: 0.000100 - momentum: 0.000000
2023-10-08 23:24:21,194 epoch 5 - iter 120/242 - loss 0.10180240 - time (sec): 48.37 - samples/sec: 250.50 - lr: 0.000098 - momentum: 0.000000
2023-10-08 23:24:30,785 epoch 5 - iter 144/242 - loss 0.10278967 - time (sec): 57.96 - samples/sec: 248.95 - lr: 0.000096 - momentum: 0.000000
2023-10-08 23:24:40,680 epoch 5 - iter 168/242 - loss 0.10716189 - time (sec): 67.85 - samples/sec: 248.36 - lr: 0.000094 - momentum: 0.000000
2023-10-08 23:24:51,090 epoch 5 - iter 192/242 - loss 0.10576312 - time (sec): 78.26 - samples/sec: 249.02 - lr: 0.000093 - momentum: 0.000000
2023-10-08 23:25:01,502 epoch 5 - iter 216/242 - loss 0.10513146 - time (sec): 88.68 - samples/sec: 249.18 - lr: 0.000091 - momentum: 0.000000
2023-10-08 23:25:11,442 epoch 5 - iter 240/242 - loss 0.10084590 - time (sec): 98.62 - samples/sec: 248.70 - lr: 0.000089 - momentum: 0.000000
2023-10-08 23:25:12,224 ----------------------------------------------------------------------------------------------------
2023-10-08 23:25:12,225 EPOCH 5 done: loss 0.1003 - lr: 0.000089
2023-10-08 23:25:18,599 DEV : loss 0.13417156040668488 - f1-score (micro avg) 0.8175
2023-10-08 23:25:18,604 ----------------------------------------------------------------------------------------------------
2023-10-08 23:25:28,360 epoch 6 - iter 24/242 - loss 0.08491531 - time (sec): 9.75 - samples/sec: 251.08 - lr: 0.000087 - momentum: 0.000000
2023-10-08 23:25:38,991 epoch 6 - iter 48/242 - loss 0.07418088 - time (sec): 20.39 - samples/sec: 254.54 - lr: 0.000086 - momentum: 0.000000
2023-10-08 23:25:49,035 epoch 6 - iter 72/242 - loss 0.07920203 - time (sec): 30.43 - samples/sec: 251.73 - lr: 0.000084 - momentum: 0.000000
2023-10-08 23:25:59,078 epoch 6 - iter 96/242 - loss 0.07319054 - time (sec): 40.47 - samples/sec: 248.62 - lr: 0.000082 - momentum: 0.000000
2023-10-08 23:26:09,795 epoch 6 - iter 120/242 - loss 0.06921767 - time (sec): 51.19 - samples/sec: 247.28 - lr: 0.000080 - momentum: 0.000000
2023-10-08 23:26:19,241 epoch 6 - iter 144/242 - loss 0.06870799 - time (sec): 60.64 - samples/sec: 250.68 - lr: 0.000078 - momentum: 0.000000
2023-10-08 23:26:29,070 epoch 6 - iter 168/242 - loss 0.06861629 - time (sec): 70.46 - samples/sec: 252.04 - lr: 0.000077 - momentum: 0.000000
2023-10-08 23:26:38,252 epoch 6 - iter 192/242 - loss 0.06870233 - time (sec): 79.65 - samples/sec: 252.53 - lr: 0.000075 - momentum: 0.000000
2023-10-08 23:26:47,202 epoch 6 - iter 216/242 - loss 0.06991614 - time (sec): 88.60 - samples/sec: 251.46 - lr: 0.000073 - momentum: 0.000000
2023-10-08 23:26:56,516 epoch 6 - iter 240/242 - loss 0.06934364 - time (sec): 97.91 - samples/sec: 251.51 - lr: 0.000071 - momentum: 0.000000
2023-10-08 23:26:57,042 ----------------------------------------------------------------------------------------------------
2023-10-08 23:26:57,043 EPOCH 6 done: loss 0.0693 - lr: 0.000071
2023-10-08 23:27:02,928 DEV : loss 0.132174551486969 - f1-score (micro avg) 0.8375
2023-10-08 23:27:02,934 saving best model
2023-10-08 23:27:03,847 ----------------------------------------------------------------------------------------------------
2023-10-08 23:27:13,106 epoch 7 - iter 24/242 - loss 0.07042745 - time (sec): 9.26 - samples/sec: 254.06 - lr: 0.000070 - momentum: 0.000000
2023-10-08 23:27:22,314 epoch 7 - iter 48/242 - loss 0.07049752 - time (sec): 18.47 - samples/sec: 253.00 - lr: 0.000068 - momentum: 0.000000
2023-10-08 23:27:31,268 epoch 7 - iter 72/242 - loss 0.06167805 - time (sec): 27.42 - samples/sec: 252.88 - lr: 0.000066 - momentum: 0.000000
2023-10-08 23:27:41,041 epoch 7 - iter 96/242 - loss 0.05608674 - time (sec): 37.19 - samples/sec: 257.55 - lr: 0.000064 - momentum: 0.000000
2023-10-08 23:27:50,332 epoch 7 - iter 120/242 - loss 0.05525729 - time (sec): 46.48 - samples/sec: 259.79 - lr: 0.000062 - momentum: 0.000000
2023-10-08 23:27:59,952 epoch 7 - iter 144/242 - loss 0.04968835 - time (sec): 56.10 - samples/sec: 259.52 - lr: 0.000061 - momentum: 0.000000
2023-10-08 23:28:09,362 epoch 7 - iter 168/242 - loss 0.04954527 - time (sec): 65.51 - samples/sec: 258.72 - lr: 0.000059 - momentum: 0.000000
2023-10-08 23:28:18,978 epoch 7 - iter 192/242 - loss 0.05136589 - time (sec): 75.13 - samples/sec: 260.08 - lr: 0.000057 - momentum: 0.000000
2023-10-08 23:28:28,800 epoch 7 - iter 216/242 - loss 0.05068956 - time (sec): 84.95 - samples/sec: 260.35 - lr: 0.000055 - momentum: 0.000000
2023-10-08 23:28:38,041 epoch 7 - iter 240/242 - loss 0.05158986 - time (sec): 94.19 - samples/sec: 260.80 - lr: 0.000054 - momentum: 0.000000
2023-10-08 23:28:38,662 ----------------------------------------------------------------------------------------------------
2023-10-08 23:28:38,663 EPOCH 7 done: loss 0.0513 - lr: 0.000054
2023-10-08 23:28:44,423 DEV : loss 0.13093389570713043 - f1-score (micro avg) 0.8201
2023-10-08 23:28:44,429 ----------------------------------------------------------------------------------------------------
2023-10-08 23:28:53,805 epoch 8 - iter 24/242 - loss 0.03824659 - time (sec): 9.37 - samples/sec: 262.09 - lr: 0.000052 - momentum: 0.000000
2023-10-08 23:29:03,033 epoch 8 - iter 48/242 - loss 0.04184396 - time (sec): 18.60 - samples/sec: 261.79 - lr: 0.000050 - momentum: 0.000000
2023-10-08 23:29:12,274 epoch 8 - iter 72/242 - loss 0.05415154 - time (sec): 27.84 - samples/sec: 260.63 - lr: 0.000048 - momentum: 0.000000
2023-10-08 23:29:21,478 epoch 8 - iter 96/242 - loss 0.04767935 - time (sec): 37.05 - samples/sec: 261.47 - lr: 0.000046 - momentum: 0.000000
2023-10-08 23:29:31,081 epoch 8 - iter 120/242 - loss 0.04619751 - time (sec): 46.65 - samples/sec: 262.54 - lr: 0.000045 - momentum: 0.000000
2023-10-08 23:29:40,300 epoch 8 - iter 144/242 - loss 0.04396737 - time (sec): 55.87 - samples/sec: 263.13 - lr: 0.000043 - momentum: 0.000000
2023-10-08 23:29:50,002 epoch 8 - iter 168/242 - loss 0.04174366 - time (sec): 65.57 - samples/sec: 263.92 - lr: 0.000041 - momentum: 0.000000
2023-10-08 23:29:59,702 epoch 8 - iter 192/242 - loss 0.04237328 - time (sec): 75.27 - samples/sec: 264.37 - lr: 0.000039 - momentum: 0.000000
2023-10-08 23:30:09,215 epoch 8 - iter 216/242 - loss 0.03993499 - time (sec): 84.79 - samples/sec: 264.15 - lr: 0.000038 - momentum: 0.000000
2023-10-08 23:30:18,111 epoch 8 - iter 240/242 - loss 0.04045820 - time (sec): 93.68 - samples/sec: 262.42 - lr: 0.000036 - momentum: 0.000000
2023-10-08 23:30:18,731 ----------------------------------------------------------------------------------------------------
2023-10-08 23:30:18,731 EPOCH 8 done: loss 0.0405 - lr: 0.000036
2023-10-08 23:30:24,546 DEV : loss 0.14306265115737915 - f1-score (micro avg) 0.8296
2023-10-08 23:30:24,552 ----------------------------------------------------------------------------------------------------
2023-10-08 23:30:34,013 epoch 9 - iter 24/242 - loss 0.03298364 - time (sec): 9.46 - samples/sec: 241.43 - lr: 0.000034 - momentum: 0.000000
2023-10-08 23:30:44,240 epoch 9 - iter 48/242 - loss 0.03143181 - time (sec): 19.69 - samples/sec: 261.40 - lr: 0.000032 - momentum: 0.000000
2023-10-08 23:30:53,583 epoch 9 - iter 72/242 - loss 0.03109793 - time (sec): 29.03 - samples/sec: 263.28 - lr: 0.000030 - momentum: 0.000000
2023-10-08 23:31:03,165 epoch 9 - iter 96/242 - loss 0.02776311 - time (sec): 38.61 - samples/sec: 261.89 - lr: 0.000029 - momentum: 0.000000
2023-10-08 23:31:12,620 epoch 9 - iter 120/242 - loss 0.02893952 - time (sec): 48.07 - samples/sec: 260.89 - lr: 0.000027 - momentum: 0.000000
2023-10-08 23:31:22,035 epoch 9 - iter 144/242 - loss 0.02808601 - time (sec): 57.48 - samples/sec: 261.65 - lr: 0.000025 - momentum: 0.000000
2023-10-08 23:31:31,002 epoch 9 - iter 168/242 - loss 0.03101637 - time (sec): 66.45 - samples/sec: 259.99 - lr: 0.000023 - momentum: 0.000000
2023-10-08 23:31:40,261 epoch 9 - iter 192/242 - loss 0.03087273 - time (sec): 75.71 - samples/sec: 258.62 - lr: 0.000022 - momentum: 0.000000
2023-10-08 23:31:49,537 epoch 9 - iter 216/242 - loss 0.03304254 - time (sec): 84.98 - samples/sec: 258.63 - lr: 0.000020 - momentum: 0.000000
2023-10-08 23:31:59,418 epoch 9 - iter 240/242 - loss 0.03419530 - time (sec): 94.87 - samples/sec: 259.44 - lr: 0.000018 - momentum: 0.000000
2023-10-08 23:31:59,968 ----------------------------------------------------------------------------------------------------
2023-10-08 23:31:59,969 EPOCH 9 done: loss 0.0340 - lr: 0.000018
2023-10-08 23:32:05,952 DEV : loss 0.15046241879463196 - f1-score (micro avg) 0.812
2023-10-08 23:32:05,958 ----------------------------------------------------------------------------------------------------
2023-10-08 23:32:15,038 epoch 10 - iter 24/242 - loss 0.03676048 - time (sec): 9.08 - samples/sec: 254.11 - lr: 0.000016 - momentum: 0.000000
2023-10-08 23:32:24,674 epoch 10 - iter 48/242 - loss 0.02715581 - time (sec): 18.71 - samples/sec: 256.92 - lr: 0.000014 - momentum: 0.000000
2023-10-08 23:32:34,672 epoch 10 - iter 72/242 - loss 0.03106681 - time (sec): 28.71 - samples/sec: 257.03 - lr: 0.000013 - momentum: 0.000000
2023-10-08 23:32:44,579 epoch 10 - iter 96/242 - loss 0.03478316 - time (sec): 38.62 - samples/sec: 254.95 - lr: 0.000011 - momentum: 0.000000
2023-10-08 23:32:54,405 epoch 10 - iter 120/242 - loss 0.03385505 - time (sec): 48.44 - samples/sec: 255.26 - lr: 0.000009 - momentum: 0.000000
2023-10-08 23:33:04,455 epoch 10 - iter 144/242 - loss 0.03265163 - time (sec): 58.50 - samples/sec: 255.85 - lr: 0.000007 - momentum: 0.000000
2023-10-08 23:33:14,449 epoch 10 - iter 168/242 - loss 0.03299652 - time (sec): 68.49 - samples/sec: 255.76 - lr: 0.000006 - momentum: 0.000000
2023-10-08 23:33:23,340 epoch 10 - iter 192/242 - loss 0.03125589 - time (sec): 77.38 - samples/sec: 253.15 - lr: 0.000004 - momentum: 0.000000
2023-10-08 23:33:33,278 epoch 10 - iter 216/242 - loss 0.03156172 - time (sec): 87.32 - samples/sec: 251.71 - lr: 0.000002 - momentum: 0.000000
2023-10-08 23:33:43,528 epoch 10 - iter 240/242 - loss 0.03115999 - time (sec): 97.57 - samples/sec: 251.75 - lr: 0.000000 - momentum: 0.000000
2023-10-08 23:33:44,227 ----------------------------------------------------------------------------------------------------
2023-10-08 23:33:44,228 EPOCH 10 done: loss 0.0312 - lr: 0.000000
2023-10-08 23:33:50,694 DEV : loss 0.15445925295352936 - f1-score (micro avg) 0.8175
2023-10-08 23:33:51,726 ----------------------------------------------------------------------------------------------------
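The checkpoint loaded next is the epoch with the highest DEV micro-F1. Collecting the ten DEV scores from the log above and taking the argmax confirms it is epoch 6 (the last "saving best model" event):

```python
# DEV micro-F1 per epoch, copied from the DEV lines above.
dev_f1 = [0.0, 0.185, 0.6216, 0.8296, 0.8175,
          0.8375, 0.8201, 0.8296, 0.812, 0.8175]

best_epoch = max(range(1, len(dev_f1) + 1), key=lambda e: dev_f1[e - 1])
print(best_epoch, dev_f1[best_epoch - 1])  # 6 0.8375
```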
2023-10-08 23:33:51,727 Loading model from best epoch ...
2023-10-08 23:33:54,397 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 23:34:00,723
Results:
- F-score (micro) 0.806
- F-score (macro) 0.4034
- Accuracy 0.6956
By class:
              precision    recall  f1-score   support

        pers     0.8369    0.8489    0.8429       139
       scope     0.8417    0.9070    0.8731       129
        work     0.6458    0.7750    0.7045        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3
      object     0.0000    0.0000    0.0000         0

   micro avg     0.7878    0.8250    0.8060       360
   macro avg     0.3874    0.4218    0.4034       360
weighted avg     0.7683    0.8250    0.7949       360
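The averaged rows follow the standard definitions and can be re-derived from the per-class rows: micro-F1 is the harmonic mean of the micro precision and recall, and macro-F1 is the unweighted mean over all six classes, including `object` despite its zero support:

```python
# Recomputing the averaged scores from the per-class rows above.
p, r = 0.7878, 0.8250          # micro-avg precision / recall
micro_f1 = 2 * p * r / (p + r)  # harmonic mean
print(round(micro_f1, 3))       # 0.806

# Macro average is unweighted over all 6 classes, zero-support object included.
class_f1 = [0.8429, 0.8731, 0.7045, 0.0, 0.0, 0.0]  # pers, scope, work, loc, date, object
macro_f1 = sum(class_f1) / len(class_f1)
print(round(macro_f1, 4))       # 0.4034
```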
2023-10-08 23:34:00,723 ----------------------------------------------------------------------------------------------------