2023-10-08 23:16:47,704 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,705 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-08 23:16:47,705 ----------------------------------------------------------------------------------------------------
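The architecture above is a ByT5 encoder (12 T5 blocks, hidden size 1472) feeding a single 25-way linear tag head through locked dropout. As orientation only, the embedding stack could be instantiated with Flair's generic transformer wrapper as sketched below; the checkpoint name is inferred from the model base path logged further down, and the `ByT5Embeddings` class in the summary is project-specific, so treat this as an approximation rather than the exact setup.

```python
# Sketch only: a stand-in for the ByT5Embeddings module shown above, using Flair's
# generic TransformerWordEmbeddings. The checkpoint name is an assumption inferred
# from the base path logged below.
from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

byt5 = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",               # last encoder layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,            # the encoder is fine-tuned, not frozen
)

sentence = Sentence("Le commentaire de Wecklein sur l'Agamemnon.")  # hypothetical example
byt5.embed(sentence)
print(sentence[0].embedding.shape)  # 1472-dim token vectors, matching the linear head above
```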
2023-10-08 23:16:47,706 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
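The corpus is the French AJMC subset of HIPE-2022, loaded through Flair's built-in reader. A minimal sketch, assuming the stock NER_HIPE_2022 reader with document separators enabled (the cached path above ends in with_doc_seperator):

```python
# Sketch: loading the HIPE-2022 AJMC French data behind the MultiCorpus line above.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(
    dataset_name="ajmc",
    language="fr",
    version="v2.1",               # matches the cached dataset path above
    add_document_separator=True,  # assumption, based on the "with_doc_seperator" cache folder
)
print(corpus)  # expected: 966 train + 219 dev + 204 test sentences

# Label space behind the 25-tag BIOES dictionary reported at the end of this log.
label_dict = corpus.make_label_dictionary(label_type="ner")
print(label_dict)  # pers, scope, work, loc, object, date
```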
2023-10-08 23:16:47,706 Train: 966 sentences
2023-10-08 23:16:47,706 (train_with_dev=False, train_with_test=False)
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 Training Params:
2023-10-08 23:16:47,706 - learning_rate: "0.00016"
2023-10-08 23:16:47,706 - mini_batch_size: "4"
2023-10-08 23:16:47,706 - max_epochs: "10"
2023-10-08 23:16:47,706 - shuffle: "True"
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 Plugins:
2023-10-08 23:16:47,706 - TensorboardLogger
2023-10-08 23:16:47,706 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 23:16:47,706 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,706 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 23:16:47,706 - metric: "('micro avg', 'f1-score')"
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 Computation:
2023-10-08 23:16:47,707 - compute on device: cuda:0
2023-10-08 23:16:47,707 - embedding storage: none
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 ----------------------------------------------------------------------------------------------------
2023-10-08 23:16:47,707 Logging anything other than scalars to TensorBoard is currently not supported.
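Taken together, the training parameters, plugins, and base path above correspond to a standard Flair fine-tuning run. The sketch below reconstructs it from the logged values only and is not the original hmBench script; ModelTrainer.fine_tune uses a linear learning-rate schedule with a 0.1 warmup fraction by default, which matches the LinearScheduler plugin listed above.

```python
# Sketch reconstructed from the logged hyper-parameters. The checkpoint name and
# corpus arguments follow the earlier sketches and are assumptions.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="ajmc", language="fr", version="v2.1",
                       add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,             # unused without an RNN, kept for the constructor
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,               # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,  # linear classifier directly on the 1472-dim output
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax"
    "-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00016,  # Training Params above
    mini_batch_size=4,
    max_epochs=10,
)
```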
2023-10-08 23:16:56,686 epoch 1 - iter 24/242 - loss 3.23915124 - time (sec): 8.98 - samples/sec: 255.50 - lr: 0.000015 - momentum: 0.000000
2023-10-08 23:17:06,554 epoch 1 - iter 48/242 - loss 3.22546569 - time (sec): 18.85 - samples/sec: 268.01 - lr: 0.000031 - momentum: 0.000000
2023-10-08 23:17:16,352 epoch 1 - iter 72/242 - loss 3.20262177 - time (sec): 28.64 - samples/sec: 269.51 - lr: 0.000047 - momentum: 0.000000
2023-10-08 23:17:25,768 epoch 1 - iter 96/242 - loss 3.15888174 - time (sec): 38.06 - samples/sec: 265.34 - lr: 0.000063 - momentum: 0.000000
2023-10-08 23:17:35,279 epoch 1 - iter 120/242 - loss 3.07509975 - time (sec): 47.57 - samples/sec: 266.72 - lr: 0.000079 - momentum: 0.000000
2023-10-08 23:17:44,356 epoch 1 - iter 144/242 - loss 2.97812763 - time (sec): 56.65 - samples/sec: 264.90 - lr: 0.000095 - momentum: 0.000000
2023-10-08 23:17:53,600 epoch 1 - iter 168/242 - loss 2.86385060 - time (sec): 65.89 - samples/sec: 266.80 - lr: 0.000110 - momentum: 0.000000
2023-10-08 23:18:02,645 epoch 1 - iter 192/242 - loss 2.75171464 - time (sec): 74.94 - samples/sec: 265.69 - lr: 0.000126 - momentum: 0.000000
2023-10-08 23:18:11,837 epoch 1 - iter 216/242 - loss 2.63706419 - time (sec): 84.13 - samples/sec: 263.61 - lr: 0.000142 - momentum: 0.000000
2023-10-08 23:18:21,299 epoch 1 - iter 240/242 - loss 2.51038471 - time (sec): 93.59 - samples/sec: 262.75 - lr: 0.000158 - momentum: 0.000000
2023-10-08 23:18:21,957 ----------------------------------------------------------------------------------------------------
2023-10-08 23:18:21,957 EPOCH 1 done: loss 2.5016 - lr: 0.000158
2023-10-08 23:18:27,818 DEV : loss 1.079971194267273 - f1-score (micro avg) 0.0
2023-10-08 23:18:27,824 ----------------------------------------------------------------------------------------------------
2023-10-08 23:18:37,431 epoch 2 - iter 24/242 - loss 1.01756734 - time (sec): 9.61 - samples/sec: 257.67 - lr: 0.000158 - momentum: 0.000000
2023-10-08 23:18:47,040 epoch 2 - iter 48/242 - loss 0.84586341 - time (sec): 19.21 - samples/sec: 262.99 - lr: 0.000157 - momentum: 0.000000
2023-10-08 23:18:55,942 epoch 2 - iter 72/242 - loss 0.80030243 - time (sec): 28.12 - samples/sec: 258.18 - lr: 0.000155 - momentum: 0.000000
2023-10-08 23:19:04,952 epoch 2 - iter 96/242 - loss 0.73412891 - time (sec): 37.13 - samples/sec: 257.47 - lr: 0.000153 - momentum: 0.000000
2023-10-08 23:19:14,107 epoch 2 - iter 120/242 - loss 0.70354425 - time (sec): 46.28 - samples/sec: 256.80 - lr: 0.000151 - momentum: 0.000000
2023-10-08 23:19:23,347 epoch 2 - iter 144/242 - loss 0.69760704 - time (sec): 55.52 - samples/sec: 256.25 - lr: 0.000150 - momentum: 0.000000
2023-10-08 23:19:32,249 epoch 2 - iter 168/242 - loss 0.68240983 - time (sec): 64.42 - samples/sec: 255.44 - lr: 0.000148 - momentum: 0.000000
2023-10-08 23:19:41,636 epoch 2 - iter 192/242 - loss 0.65096524 - time (sec): 73.81 - samples/sec: 257.19 - lr: 0.000146 - momentum: 0.000000
2023-10-08 23:19:51,662 epoch 2 - iter 216/242 - loss 0.62153433 - time (sec): 83.84 - samples/sec: 259.32 - lr: 0.000144 - momentum: 0.000000
2023-10-08 23:20:01,577 epoch 2 - iter 240/242 - loss 0.59708481 - time (sec): 93.75 - samples/sec: 262.60 - lr: 0.000142 - momentum: 0.000000
2023-10-08 23:20:02,119 ----------------------------------------------------------------------------------------------------
2023-10-08 23:20:02,119 EPOCH 2 done: loss 0.5954 - lr: 0.000142
2023-10-08 23:20:07,885 DEV : loss 0.36621710658073425 - f1-score (micro avg) 0.185
2023-10-08 23:20:07,891 saving best model
2023-10-08 23:20:08,756 ----------------------------------------------------------------------------------------------------
2023-10-08 23:20:18,050 epoch 3 - iter 24/242 - loss 0.35950188 - time (sec): 9.29 - samples/sec: 252.16 - lr: 0.000141 - momentum: 0.000000
2023-10-08 23:20:27,966 epoch 3 - iter 48/242 - loss 0.29823828 - time (sec): 19.21 - samples/sec: 264.79 - lr: 0.000139 - momentum: 0.000000
2023-10-08 23:20:37,393 epoch 3 - iter 72/242 - loss 0.29752598 - time (sec): 28.63 - samples/sec: 263.46 - lr: 0.000137 - momentum: 0.000000
2023-10-08 23:20:46,914 epoch 3 - iter 96/242 - loss 0.29599846 - time (sec): 38.16 - samples/sec: 264.78 - lr: 0.000135 - momentum: 0.000000
2023-10-08 23:20:56,382 epoch 3 - iter 120/242 - loss 0.29052825 - time (sec): 47.62 - samples/sec: 264.84 - lr: 0.000134 - momentum: 0.000000
2023-10-08 23:21:05,437 epoch 3 - iter 144/242 - loss 0.29051562 - time (sec): 56.68 - samples/sec: 262.94 - lr: 0.000132 - momentum: 0.000000
2023-10-08 23:21:14,707 epoch 3 - iter 168/242 - loss 0.28108395 - time (sec): 65.95 - samples/sec: 261.72 - lr: 0.000130 - momentum: 0.000000
2023-10-08 23:21:24,422 epoch 3 - iter 192/242 - loss 0.27070285 - time (sec): 75.66 - samples/sec: 262.71 - lr: 0.000128 - momentum: 0.000000
2023-10-08 23:21:33,244 epoch 3 - iter 216/242 - loss 0.26836798 - time (sec): 84.49 - samples/sec: 261.35 - lr: 0.000126 - momentum: 0.000000
2023-10-08 23:21:42,731 epoch 3 - iter 240/242 - loss 0.26309788 - time (sec): 93.97 - samples/sec: 262.21 - lr: 0.000125 - momentum: 0.000000
2023-10-08 23:21:43,283 ----------------------------------------------------------------------------------------------------
2023-10-08 23:21:43,283 EPOCH 3 done: loss 0.2626 - lr: 0.000125
2023-10-08 23:21:49,085 DEV : loss 0.20921547710895538 - f1-score (micro avg) 0.6216
2023-10-08 23:21:49,091 saving best model
2023-10-08 23:21:49,986 ----------------------------------------------------------------------------------------------------
2023-10-08 23:21:59,848 epoch 4 - iter 24/242 - loss 0.19937278 - time (sec): 9.86 - samples/sec: 275.96 - lr: 0.000123 - momentum: 0.000000
2023-10-08 23:22:09,820 epoch 4 - iter 48/242 - loss 0.20959392 - time (sec): 19.83 - samples/sec: 273.55 - lr: 0.000121 - momentum: 0.000000
2023-10-08 23:22:19,075 epoch 4 - iter 72/242 - loss 0.18266325 - time (sec): 29.09 - samples/sec: 266.85 - lr: 0.000119 - momentum: 0.000000
2023-10-08 23:22:29,185 epoch 4 - iter 96/242 - loss 0.17559324 - time (sec): 39.20 - samples/sec: 264.48 - lr: 0.000118 - momentum: 0.000000
2023-10-08 23:22:38,943 epoch 4 - iter 120/242 - loss 0.17056546 - time (sec): 48.96 - samples/sec: 263.67 - lr: 0.000116 - momentum: 0.000000
2023-10-08 23:22:48,102 epoch 4 - iter 144/242 - loss 0.17146465 - time (sec): 58.11 - samples/sec: 261.79 - lr: 0.000114 - momentum: 0.000000
2023-10-08 23:22:56,854 epoch 4 - iter 168/242 - loss 0.16628588 - time (sec): 66.87 - samples/sec: 259.95 - lr: 0.000112 - momentum: 0.000000
2023-10-08 23:23:06,396 epoch 4 - iter 192/242 - loss 0.16339488 - time (sec): 76.41 - samples/sec: 260.14 - lr: 0.000110 - momentum: 0.000000
2023-10-08 23:23:15,662 epoch 4 - iter 216/242 - loss 0.15709292 - time (sec): 85.67 - samples/sec: 259.09 - lr: 0.000109 - momentum: 0.000000
2023-10-08 23:23:24,937 epoch 4 - iter 240/242 - loss 0.15254959 - time (sec): 94.95 - samples/sec: 258.70 - lr: 0.000107 - momentum: 0.000000
2023-10-08 23:23:25,605 ----------------------------------------------------------------------------------------------------
2023-10-08 23:23:25,605 EPOCH 4 done: loss 0.1526 - lr: 0.000107
2023-10-08 23:23:31,733 DEV : loss 0.1503736525774002 - f1-score (micro avg) 0.8296
2023-10-08 23:23:31,738 saving best model
2023-10-08 23:23:32,825 ----------------------------------------------------------------------------------------------------
2023-10-08 23:23:42,613 epoch 5 - iter 24/242 - loss 0.13336685 - time (sec): 9.79 - samples/sec: 260.25 - lr: 0.000105 - momentum: 0.000000
2023-10-08 23:23:52,065 epoch 5 - iter 48/242 - loss 0.10805870 - time (sec): 19.24 - samples/sec: 253.46 - lr: 0.000103 - momentum: 0.000000
2023-10-08 23:24:01,993 epoch 5 - iter 72/242 - loss 0.10522684 - time (sec): 29.17 - samples/sec: 251.14 - lr: 0.000102 - momentum: 0.000000
2023-10-08 23:24:11,456 epoch 5 - iter 96/242 - loss 0.09751095 - time (sec): 38.63 - samples/sec: 250.61 - lr: 0.000100 - momentum: 0.000000
2023-10-08 23:24:21,194 epoch 5 - iter 120/242 - loss 0.10180240 - time (sec): 48.37 - samples/sec: 250.50 - lr: 0.000098 - momentum: 0.000000
2023-10-08 23:24:30,785 epoch 5 - iter 144/242 - loss 0.10278967 - time (sec): 57.96 - samples/sec: 248.95 - lr: 0.000096 - momentum: 0.000000
2023-10-08 23:24:40,680 epoch 5 - iter 168/242 - loss 0.10716189 - time (sec): 67.85 - samples/sec: 248.36 - lr: 0.000094 - momentum: 0.000000
2023-10-08 23:24:51,090 epoch 5 - iter 192/242 - loss 0.10576312 - time (sec): 78.26 - samples/sec: 249.02 - lr: 0.000093 - momentum: 0.000000
2023-10-08 23:25:01,502 epoch 5 - iter 216/242 - loss 0.10513146 - time (sec): 88.68 - samples/sec: 249.18 - lr: 0.000091 - momentum: 0.000000
2023-10-08 23:25:11,442 epoch 5 - iter 240/242 - loss 0.10084590 - time (sec): 98.62 - samples/sec: 248.70 - lr: 0.000089 - momentum: 0.000000
2023-10-08 23:25:12,224 ----------------------------------------------------------------------------------------------------
2023-10-08 23:25:12,225 EPOCH 5 done: loss 0.1003 - lr: 0.000089
2023-10-08 23:25:18,599 DEV : loss 0.13417156040668488 - f1-score (micro avg) 0.8175
2023-10-08 23:25:18,604 ----------------------------------------------------------------------------------------------------
2023-10-08 23:25:28,360 epoch 6 - iter 24/242 - loss 0.08491531 - time (sec): 9.75 - samples/sec: 251.08 - lr: 0.000087 - momentum: 0.000000
2023-10-08 23:25:38,991 epoch 6 - iter 48/242 - loss 0.07418088 - time (sec): 20.39 - samples/sec: 254.54 - lr: 0.000086 - momentum: 0.000000
2023-10-08 23:25:49,035 epoch 6 - iter 72/242 - loss 0.07920203 - time (sec): 30.43 - samples/sec: 251.73 - lr: 0.000084 - momentum: 0.000000
2023-10-08 23:25:59,078 epoch 6 - iter 96/242 - loss 0.07319054 - time (sec): 40.47 - samples/sec: 248.62 - lr: 0.000082 - momentum: 0.000000
2023-10-08 23:26:09,795 epoch 6 - iter 120/242 - loss 0.06921767 - time (sec): 51.19 - samples/sec: 247.28 - lr: 0.000080 - momentum: 0.000000
2023-10-08 23:26:19,241 epoch 6 - iter 144/242 - loss 0.06870799 - time (sec): 60.64 - samples/sec: 250.68 - lr: 0.000078 - momentum: 0.000000
2023-10-08 23:26:29,070 epoch 6 - iter 168/242 - loss 0.06861629 - time (sec): 70.46 - samples/sec: 252.04 - lr: 0.000077 - momentum: 0.000000
2023-10-08 23:26:38,252 epoch 6 - iter 192/242 - loss 0.06870233 - time (sec): 79.65 - samples/sec: 252.53 - lr: 0.000075 - momentum: 0.000000
2023-10-08 23:26:47,202 epoch 6 - iter 216/242 - loss 0.06991614 - time (sec): 88.60 - samples/sec: 251.46 - lr: 0.000073 - momentum: 0.000000
2023-10-08 23:26:56,516 epoch 6 - iter 240/242 - loss 0.06934364 - time (sec): 97.91 - samples/sec: 251.51 - lr: 0.000071 - momentum: 0.000000
2023-10-08 23:26:57,042 ----------------------------------------------------------------------------------------------------
2023-10-08 23:26:57,043 EPOCH 6 done: loss 0.0693 - lr: 0.000071
2023-10-08 23:27:02,928 DEV : loss 0.132174551486969 - f1-score (micro avg) 0.8375
2023-10-08 23:27:02,934 saving best model
2023-10-08 23:27:03,847 ----------------------------------------------------------------------------------------------------
2023-10-08 23:27:13,106 epoch 7 - iter 24/242 - loss 0.07042745 - time (sec): 9.26 - samples/sec: 254.06 - lr: 0.000070 - momentum: 0.000000
2023-10-08 23:27:22,314 epoch 7 - iter 48/242 - loss 0.07049752 - time (sec): 18.47 - samples/sec: 253.00 - lr: 0.000068 - momentum: 0.000000
2023-10-08 23:27:31,268 epoch 7 - iter 72/242 - loss 0.06167805 - time (sec): 27.42 - samples/sec: 252.88 - lr: 0.000066 - momentum: 0.000000
2023-10-08 23:27:41,041 epoch 7 - iter 96/242 - loss 0.05608674 - time (sec): 37.19 - samples/sec: 257.55 - lr: 0.000064 - momentum: 0.000000
2023-10-08 23:27:50,332 epoch 7 - iter 120/242 - loss 0.05525729 - time (sec): 46.48 - samples/sec: 259.79 - lr: 0.000062 - momentum: 0.000000
2023-10-08 23:27:59,952 epoch 7 - iter 144/242 - loss 0.04968835 - time (sec): 56.10 - samples/sec: 259.52 - lr: 0.000061 - momentum: 0.000000
2023-10-08 23:28:09,362 epoch 7 - iter 168/242 - loss 0.04954527 - time (sec): 65.51 - samples/sec: 258.72 - lr: 0.000059 - momentum: 0.000000
2023-10-08 23:28:18,978 epoch 7 - iter 192/242 - loss 0.05136589 - time (sec): 75.13 - samples/sec: 260.08 - lr: 0.000057 - momentum: 0.000000
2023-10-08 23:28:28,800 epoch 7 - iter 216/242 - loss 0.05068956 - time (sec): 84.95 - samples/sec: 260.35 - lr: 0.000055 - momentum: 0.000000
2023-10-08 23:28:38,041 epoch 7 - iter 240/242 - loss 0.05158986 - time (sec): 94.19 - samples/sec: 260.80 - lr: 0.000054 - momentum: 0.000000
2023-10-08 23:28:38,662 ----------------------------------------------------------------------------------------------------
2023-10-08 23:28:38,663 EPOCH 7 done: loss 0.0513 - lr: 0.000054
2023-10-08 23:28:44,423 DEV : loss 0.13093389570713043 - f1-score (micro avg) 0.8201
2023-10-08 23:28:44,429 ----------------------------------------------------------------------------------------------------
2023-10-08 23:28:53,805 epoch 8 - iter 24/242 - loss 0.03824659 - time (sec): 9.37 - samples/sec: 262.09 - lr: 0.000052 - momentum: 0.000000
2023-10-08 23:29:03,033 epoch 8 - iter 48/242 - loss 0.04184396 - time (sec): 18.60 - samples/sec: 261.79 - lr: 0.000050 - momentum: 0.000000
2023-10-08 23:29:12,274 epoch 8 - iter 72/242 - loss 0.05415154 - time (sec): 27.84 - samples/sec: 260.63 - lr: 0.000048 - momentum: 0.000000
2023-10-08 23:29:21,478 epoch 8 - iter 96/242 - loss 0.04767935 - time (sec): 37.05 - samples/sec: 261.47 - lr: 0.000046 - momentum: 0.000000
2023-10-08 23:29:31,081 epoch 8 - iter 120/242 - loss 0.04619751 - time (sec): 46.65 - samples/sec: 262.54 - lr: 0.000045 - momentum: 0.000000
2023-10-08 23:29:40,300 epoch 8 - iter 144/242 - loss 0.04396737 - time (sec): 55.87 - samples/sec: 263.13 - lr: 0.000043 - momentum: 0.000000
2023-10-08 23:29:50,002 epoch 8 - iter 168/242 - loss 0.04174366 - time (sec): 65.57 - samples/sec: 263.92 - lr: 0.000041 - momentum: 0.000000
2023-10-08 23:29:59,702 epoch 8 - iter 192/242 - loss 0.04237328 - time (sec): 75.27 - samples/sec: 264.37 - lr: 0.000039 - momentum: 0.000000
2023-10-08 23:30:09,215 epoch 8 - iter 216/242 - loss 0.03993499 - time (sec): 84.79 - samples/sec: 264.15 - lr: 0.000038 - momentum: 0.000000
2023-10-08 23:30:18,111 epoch 8 - iter 240/242 - loss 0.04045820 - time (sec): 93.68 - samples/sec: 262.42 - lr: 0.000036 - momentum: 0.000000
2023-10-08 23:30:18,731 ----------------------------------------------------------------------------------------------------
2023-10-08 23:30:18,731 EPOCH 8 done: loss 0.0405 - lr: 0.000036
2023-10-08 23:30:24,546 DEV : loss 0.14306265115737915 - f1-score (micro avg) 0.8296
2023-10-08 23:30:24,552 ----------------------------------------------------------------------------------------------------
2023-10-08 23:30:34,013 epoch 9 - iter 24/242 - loss 0.03298364 - time (sec): 9.46 - samples/sec: 241.43 - lr: 0.000034 - momentum: 0.000000
2023-10-08 23:30:44,240 epoch 9 - iter 48/242 - loss 0.03143181 - time (sec): 19.69 - samples/sec: 261.40 - lr: 0.000032 - momentum: 0.000000
2023-10-08 23:30:53,583 epoch 9 - iter 72/242 - loss 0.03109793 - time (sec): 29.03 - samples/sec: 263.28 - lr: 0.000030 - momentum: 0.000000
2023-10-08 23:31:03,165 epoch 9 - iter 96/242 - loss 0.02776311 - time (sec): 38.61 - samples/sec: 261.89 - lr: 0.000029 - momentum: 0.000000
2023-10-08 23:31:12,620 epoch 9 - iter 120/242 - loss 0.02893952 - time (sec): 48.07 - samples/sec: 260.89 - lr: 0.000027 - momentum: 0.000000
2023-10-08 23:31:22,035 epoch 9 - iter 144/242 - loss 0.02808601 - time (sec): 57.48 - samples/sec: 261.65 - lr: 0.000025 - momentum: 0.000000
2023-10-08 23:31:31,002 epoch 9 - iter 168/242 - loss 0.03101637 - time (sec): 66.45 - samples/sec: 259.99 - lr: 0.000023 - momentum: 0.000000
2023-10-08 23:31:40,261 epoch 9 - iter 192/242 - loss 0.03087273 - time (sec): 75.71 - samples/sec: 258.62 - lr: 0.000022 - momentum: 0.000000
2023-10-08 23:31:49,537 epoch 9 - iter 216/242 - loss 0.03304254 - time (sec): 84.98 - samples/sec: 258.63 - lr: 0.000020 - momentum: 0.000000
2023-10-08 23:31:59,418 epoch 9 - iter 240/242 - loss 0.03419530 - time (sec): 94.87 - samples/sec: 259.44 - lr: 0.000018 - momentum: 0.000000
2023-10-08 23:31:59,968 ----------------------------------------------------------------------------------------------------
2023-10-08 23:31:59,969 EPOCH 9 done: loss 0.0340 - lr: 0.000018
2023-10-08 23:32:05,952 DEV : loss 0.15046241879463196 - f1-score (micro avg) 0.812
2023-10-08 23:32:05,958 ----------------------------------------------------------------------------------------------------
2023-10-08 23:32:15,038 epoch 10 - iter 24/242 - loss 0.03676048 - time (sec): 9.08 - samples/sec: 254.11 - lr: 0.000016 - momentum: 0.000000
2023-10-08 23:32:24,674 epoch 10 - iter 48/242 - loss 0.02715581 - time (sec): 18.71 - samples/sec: 256.92 - lr: 0.000014 - momentum: 0.000000
2023-10-08 23:32:34,672 epoch 10 - iter 72/242 - loss 0.03106681 - time (sec): 28.71 - samples/sec: 257.03 - lr: 0.000013 - momentum: 0.000000
2023-10-08 23:32:44,579 epoch 10 - iter 96/242 - loss 0.03478316 - time (sec): 38.62 - samples/sec: 254.95 - lr: 0.000011 - momentum: 0.000000
2023-10-08 23:32:54,405 epoch 10 - iter 120/242 - loss 0.03385505 - time (sec): 48.44 - samples/sec: 255.26 - lr: 0.000009 - momentum: 0.000000
2023-10-08 23:33:04,455 epoch 10 - iter 144/242 - loss 0.03265163 - time (sec): 58.50 - samples/sec: 255.85 - lr: 0.000007 - momentum: 0.000000
2023-10-08 23:33:14,449 epoch 10 - iter 168/242 - loss 0.03299652 - time (sec): 68.49 - samples/sec: 255.76 - lr: 0.000006 - momentum: 0.000000
2023-10-08 23:33:23,340 epoch 10 - iter 192/242 - loss 0.03125589 - time (sec): 77.38 - samples/sec: 253.15 - lr: 0.000004 - momentum: 0.000000
2023-10-08 23:33:33,278 epoch 10 - iter 216/242 - loss 0.03156172 - time (sec): 87.32 - samples/sec: 251.71 - lr: 0.000002 - momentum: 0.000000
2023-10-08 23:33:43,528 epoch 10 - iter 240/242 - loss 0.03115999 - time (sec): 97.57 - samples/sec: 251.75 - lr: 0.000000 - momentum: 0.000000
2023-10-08 23:33:44,227 ----------------------------------------------------------------------------------------------------
2023-10-08 23:33:44,228 EPOCH 10 done: loss 0.0312 - lr: 0.000000
2023-10-08 23:33:50,694 DEV : loss 0.15445925295352936 - f1-score (micro avg) 0.8175
2023-10-08 23:33:51,726 ----------------------------------------------------------------------------------------------------
2023-10-08 23:33:51,727 Loading model from best epoch ...
2023-10-08 23:33:54,397 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-08 23:34:00,723
Results:
- F-score (micro) 0.806
- F-score (macro) 0.4034
- Accuracy 0.6956

By class:
              precision    recall  f1-score   support

        pers     0.8369    0.8489    0.8429       139
       scope     0.8417    0.9070    0.8731       129
        work     0.6458    0.7750    0.7045        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3
      object     0.0000    0.0000    0.0000         0

   micro avg     0.7878    0.8250    0.8060       360
   macro avg     0.3874    0.4218    0.4034       360
weighted avg     0.7683    0.8250    0.7949       360

2023-10-08 23:34:00,723 ----------------------------------------------------------------------------------------------------
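For reference, the report above (micro-F1 0.806 on the 204 test sentences, with the score carried by the frequent pers, scope, and work classes while loc, date, and object have little or no test support) is what Flair's evaluate call prints for the saved best checkpoint. A minimal sketch, reusing the assumed corpus arguments from the earlier sketches:

```python
# Sketch: re-running the final evaluation on the saved best-model.pt.
from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(dataset_name="ajmc", language="fr", version="v2.1",
                       add_document_separator=True)

tagger = SequenceTagger.load(
    "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax"
    "-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

result = tagger.evaluate(corpus.test, gold_label_type="ner", mini_batch_size=4)
print(result.detailed_results)  # per-class precision/recall/F1 table, as above
print(result.main_score)        # micro-avg F1 (0.806 in this run)
```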
|