2023-10-13 17:47:04,163 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Train: 5901 sentences
2023-10-13 17:47:04,164 (train_with_dev=False, train_with_test=False)
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Training Params:
2023-10-13 17:47:04,164 - learning_rate: "5e-05"
2023-10-13 17:47:04,164 - mini_batch_size: "8"
2023-10-13 17:47:04,164 - max_epochs: "10"
2023-10-13 17:47:04,164 - shuffle: "True"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Plugins:
2023-10-13 17:47:04,164 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 17:47:04,164 - metric: "('micro avg', 'f1-score')"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Computation:
2023-10-13 17:47:04,164 - compute on device: cuda:0
2023-10-13 17:47:04,164 - embedding storage: none
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
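The parameters logged above correspond to a standard Flair fine-tuning run. As a minimal sketch (not the exact script that produced this log), the configuration could be reproduced roughly as follows; the embedding arguments are inferred from the base path (`poolingfirst`, `layers-1`, `crfFalse`), and the exact `SequenceTagger` options are assumptions:

```python
# Sketch of a Flair fine-tuning setup matching the logged hyperparameters.
# Assumes a recent flair release; identifiers are inferred from the log above.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2020 French subset of the HIPE-2022 shared-task data (see corpus line above)
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="fr")

embeddings = TransformerWordEmbeddings(
    "dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",               # last layer only ("layers-1" in the base path)
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
    tag_type="ner",
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=5e-5,        # matches "learning_rate: 5e-05"
    mini_batch_size=8,         # matches "mini_batch_size: 8"
    max_epochs=10,             # matches "max_epochs: 10"
)
```

`ModelTrainer.fine_tune` uses a linear warmup schedule by default, which matches the `LinearScheduler | warmup_fraction: '0.1'` plugin logged above.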
2023-10-13 17:47:04,165 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:09,303 epoch 1 - iter 73/738 - loss 2.61704957 - time (sec): 5.14 - samples/sec: 3329.07 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:47:14,890 epoch 1 - iter 146/738 - loss 1.63900339 - time (sec): 10.72 - samples/sec: 3355.14 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:47:19,561 epoch 1 - iter 219/738 - loss 1.25224248 - time (sec): 15.40 - samples/sec: 3383.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:47:24,159 epoch 1 - iter 292/738 - loss 1.03925094 - time (sec): 19.99 - samples/sec: 3392.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:47:28,728 epoch 1 - iter 365/738 - loss 0.89541308 - time (sec): 24.56 - samples/sec: 3399.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:47:33,630 epoch 1 - iter 438/738 - loss 0.79056877 - time (sec): 29.46 - samples/sec: 3393.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:47:37,911 epoch 1 - iter 511/738 - loss 0.71979193 - time (sec): 33.75 - samples/sec: 3392.11 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:47:42,768 epoch 1 - iter 584/738 - loss 0.65785367 - time (sec): 38.60 - samples/sec: 3381.31 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:47:47,675 epoch 1 - iter 657/738 - loss 0.60430136 - time (sec): 43.51 - samples/sec: 3373.75 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:47:53,061 epoch 1 - iter 730/738 - loss 0.55420001 - time (sec): 48.90 - samples/sec: 3371.78 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:47:53,539 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:53,540 EPOCH 1 done: loss 0.5507 - lr: 0.000049
2023-10-13 17:47:59,706 DEV : loss 0.12785974144935608 - f1-score (micro avg) 0.7131
2023-10-13 17:47:59,734 saving best model
2023-10-13 17:48:00,205 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:04,704 epoch 2 - iter 73/738 - loss 0.14031504 - time (sec): 4.50 - samples/sec: 3262.14 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:09,192 epoch 2 - iter 146/738 - loss 0.13796547 - time (sec): 8.99 - samples/sec: 3313.17 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:14,464 epoch 2 - iter 219/738 - loss 0.13466397 - time (sec): 14.26 - samples/sec: 3360.71 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:19,296 epoch 2 - iter 292/738 - loss 0.13225824 - time (sec): 19.09 - samples/sec: 3355.88 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:24,137 epoch 2 - iter 365/738 - loss 0.13022111 - time (sec): 23.93 - samples/sec: 3350.86 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:29,160 epoch 2 - iter 438/738 - loss 0.12720644 - time (sec): 28.95 - samples/sec: 3364.48 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:34,186 epoch 2 - iter 511/738 - loss 0.12457501 - time (sec): 33.98 - samples/sec: 3340.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:38,951 epoch 2 - iter 584/738 - loss 0.12421710 - time (sec): 38.74 - samples/sec: 3349.70 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:44,401 epoch 2 - iter 657/738 - loss 0.12162126 - time (sec): 44.19 - samples/sec: 3350.95 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,387 epoch 2 - iter 730/738 - loss 0.11931305 - time (sec): 49.18 - samples/sec: 3349.23 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,874 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:49,874 EPOCH 2 done: loss 0.1189 - lr: 0.000045
2023-10-13 17:49:01,090 DEV : loss 0.13197503983974457 - f1-score (micro avg) 0.7308
2023-10-13 17:49:01,119 saving best model
2023-10-13 17:49:01,598 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:06,572 epoch 3 - iter 73/738 - loss 0.07614814 - time (sec): 4.97 - samples/sec: 3274.66 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:49:11,815 epoch 3 - iter 146/738 - loss 0.07670962 - time (sec): 10.21 - samples/sec: 3311.41 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:16,339 epoch 3 - iter 219/738 - loss 0.07615619 - time (sec): 14.74 - samples/sec: 3339.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:21,598 epoch 3 - iter 292/738 - loss 0.08533494 - time (sec): 20.00 - samples/sec: 3355.66 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:26,329 epoch 3 - iter 365/738 - loss 0.08052962 - time (sec): 24.73 - samples/sec: 3350.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:31,257 epoch 3 - iter 438/738 - loss 0.07707434 - time (sec): 29.65 - samples/sec: 3330.19 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:36,060 epoch 3 - iter 511/738 - loss 0.07639684 - time (sec): 34.46 - samples/sec: 3344.43 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:41,408 epoch 3 - iter 584/738 - loss 0.07393004 - time (sec): 39.80 - samples/sec: 3333.01 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:46,311 epoch 3 - iter 657/738 - loss 0.07260392 - time (sec): 44.71 - samples/sec: 3315.95 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:51,524 epoch 3 - iter 730/738 - loss 0.07247522 - time (sec): 49.92 - samples/sec: 3305.85 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:49:51,967 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:51,967 EPOCH 3 done: loss 0.0725 - lr: 0.000039
2023-10-13 17:50:03,343 DEV : loss 0.1486140638589859 - f1-score (micro avg) 0.7833
2023-10-13 17:50:03,372 saving best model
2023-10-13 17:50:03,852 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:09,135 epoch 4 - iter 73/738 - loss 0.05135216 - time (sec): 5.28 - samples/sec: 3383.66 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:13,787 epoch 4 - iter 146/738 - loss 0.05140785 - time (sec): 9.93 - samples/sec: 3335.68 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:19,538 epoch 4 - iter 219/738 - loss 0.04837867 - time (sec): 15.68 - samples/sec: 3374.63 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:24,656 epoch 4 - iter 292/738 - loss 0.05298887 - time (sec): 20.80 - samples/sec: 3349.51 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:29,288 epoch 4 - iter 365/738 - loss 0.05192526 - time (sec): 25.43 - samples/sec: 3358.42 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:34,582 epoch 4 - iter 438/738 - loss 0.05297484 - time (sec): 30.72 - samples/sec: 3368.28 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:39,204 epoch 4 - iter 511/738 - loss 0.05272183 - time (sec): 35.34 - samples/sec: 3368.87 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:43,882 epoch 4 - iter 584/738 - loss 0.05406746 - time (sec): 40.02 - samples/sec: 3349.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:48,214 epoch 4 - iter 657/738 - loss 0.05441271 - time (sec): 44.35 - samples/sec: 3352.26 - lr: 0.000034 - momentum: 0.000000
2023-10-13 17:50:52,924 epoch 4 - iter 730/738 - loss 0.05358667 - time (sec): 49.06 - samples/sec: 3359.88 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:50:53,379 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:53,379 EPOCH 4 done: loss 0.0533 - lr: 0.000033
2023-10-13 17:51:04,589 DEV : loss 0.1737738847732544 - f1-score (micro avg) 0.8049
2023-10-13 17:51:04,619 saving best model
2023-10-13 17:51:05,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:10,235 epoch 5 - iter 73/738 - loss 0.03032376 - time (sec): 5.09 - samples/sec: 3312.13 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:51:14,799 epoch 5 - iter 146/738 - loss 0.03581647 - time (sec): 9.65 - samples/sec: 3328.98 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:19,339 epoch 5 - iter 219/738 - loss 0.03520240 - time (sec): 14.19 - samples/sec: 3390.73 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:24,352 epoch 5 - iter 292/738 - loss 0.03677044 - time (sec): 19.21 - samples/sec: 3407.81 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:29,378 epoch 5 - iter 365/738 - loss 0.03507287 - time (sec): 24.23 - samples/sec: 3366.45 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:34,241 epoch 5 - iter 438/738 - loss 0.03455833 - time (sec): 29.10 - samples/sec: 3355.44 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:39,901 epoch 5 - iter 511/738 - loss 0.03549297 - time (sec): 34.76 - samples/sec: 3357.51 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:44,117 epoch 5 - iter 584/738 - loss 0.03649335 - time (sec): 38.97 - samples/sec: 3377.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 17:51:49,172 epoch 5 - iter 657/738 - loss 0.03572356 - time (sec): 44.03 - samples/sec: 3376.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:53,851 epoch 5 - iter 730/738 - loss 0.03593262 - time (sec): 48.71 - samples/sec: 3383.72 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:54,287 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:54,287 EPOCH 5 done: loss 0.0360 - lr: 0.000028
2023-10-13 17:52:05,499 DEV : loss 0.1812753677368164 - f1-score (micro avg) 0.8177
2023-10-13 17:52:05,531 saving best model
2023-10-13 17:52:06,113 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:11,732 epoch 6 - iter 73/738 - loss 0.01621660 - time (sec): 5.61 - samples/sec: 3000.04 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:16,775 epoch 6 - iter 146/738 - loss 0.02080977 - time (sec): 10.66 - samples/sec: 3100.09 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:21,263 epoch 6 - iter 219/738 - loss 0.01876135 - time (sec): 15.14 - samples/sec: 3144.78 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:25,848 epoch 6 - iter 292/738 - loss 0.02138464 - time (sec): 19.73 - samples/sec: 3183.99 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:31,015 epoch 6 - iter 365/738 - loss 0.01978816 - time (sec): 24.90 - samples/sec: 3209.75 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:35,258 epoch 6 - iter 438/738 - loss 0.01938038 - time (sec): 29.14 - samples/sec: 3225.51 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:40,274 epoch 6 - iter 511/738 - loss 0.01928215 - time (sec): 34.16 - samples/sec: 3254.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 17:52:45,579 epoch 6 - iter 584/738 - loss 0.01987479 - time (sec): 39.46 - samples/sec: 3280.26 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:51,330 epoch 6 - iter 657/738 - loss 0.02178292 - time (sec): 45.21 - samples/sec: 3297.22 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:56,145 epoch 6 - iter 730/738 - loss 0.02282192 - time (sec): 50.03 - samples/sec: 3300.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:52:56,552 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:56,552 EPOCH 6 done: loss 0.0228 - lr: 0.000022
2023-10-13 17:53:07,779 DEV : loss 0.21827659010887146 - f1-score (micro avg) 0.7988
2023-10-13 17:53:07,809 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:12,402 epoch 7 - iter 73/738 - loss 0.01465345 - time (sec): 4.59 - samples/sec: 3353.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:53:16,861 epoch 7 - iter 146/738 - loss 0.01573536 - time (sec): 9.05 - samples/sec: 3298.73 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:21,855 epoch 7 - iter 219/738 - loss 0.01844721 - time (sec): 14.04 - samples/sec: 3360.00 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:26,583 epoch 7 - iter 292/738 - loss 0.01755483 - time (sec): 18.77 - samples/sec: 3348.38 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:31,556 epoch 7 - iter 365/738 - loss 0.01732233 - time (sec): 23.75 - samples/sec: 3348.92 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:36,455 epoch 7 - iter 438/738 - loss 0.01872450 - time (sec): 28.64 - samples/sec: 3348.83 - lr: 0.000019 - momentum: 0.000000
2023-10-13 17:53:41,246 epoch 7 - iter 511/738 - loss 0.01801696 - time (sec): 33.44 - samples/sec: 3353.96 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:46,182 epoch 7 - iter 584/738 - loss 0.01924045 - time (sec): 38.37 - samples/sec: 3350.46 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:51,818 epoch 7 - iter 657/738 - loss 0.01897246 - time (sec): 44.01 - samples/sec: 3363.15 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:56,775 epoch 7 - iter 730/738 - loss 0.01878918 - time (sec): 48.96 - samples/sec: 3357.79 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:57,373 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:57,373 EPOCH 7 done: loss 0.0186 - lr: 0.000017
2023-10-13 17:54:08,578 DEV : loss 0.20159663259983063 - f1-score (micro avg) 0.8255
2023-10-13 17:54:08,607 saving best model
2023-10-13 17:54:09,182 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:14,384 epoch 8 - iter 73/738 - loss 0.00867283 - time (sec): 5.20 - samples/sec: 3376.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:18,924 epoch 8 - iter 146/738 - loss 0.00929264 - time (sec): 9.74 - samples/sec: 3338.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:23,900 epoch 8 - iter 219/738 - loss 0.00988928 - time (sec): 14.71 - samples/sec: 3360.24 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:28,467 epoch 8 - iter 292/738 - loss 0.01078611 - time (sec): 19.28 - samples/sec: 3357.48 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:33,531 epoch 8 - iter 365/738 - loss 0.01188262 - time (sec): 24.34 - samples/sec: 3332.34 - lr: 0.000014 - momentum: 0.000000
2023-10-13 17:54:39,068 epoch 8 - iter 438/738 - loss 0.01168210 - time (sec): 29.88 - samples/sec: 3313.89 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:43,328 epoch 8 - iter 511/738 - loss 0.01108505 - time (sec): 34.14 - samples/sec: 3334.40 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:48,539 epoch 8 - iter 584/738 - loss 0.01139473 - time (sec): 39.35 - samples/sec: 3327.81 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:53,175 epoch 8 - iter 657/738 - loss 0.01059925 - time (sec): 43.99 - samples/sec: 3333.14 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:58,385 epoch 8 - iter 730/738 - loss 0.01213062 - time (sec): 49.20 - samples/sec: 3351.79 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:54:58,847 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:58,847 EPOCH 8 done: loss 0.0120 - lr: 0.000011
2023-10-13 17:55:10,116 DEV : loss 0.2121274471282959 - f1-score (micro avg) 0.8167
2023-10-13 17:55:10,146 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:15,100 epoch 9 - iter 73/738 - loss 0.00785780 - time (sec): 4.95 - samples/sec: 3384.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:55:20,217 epoch 9 - iter 146/738 - loss 0.00968859 - time (sec): 10.07 - samples/sec: 3323.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:24,538 epoch 9 - iter 219/738 - loss 0.00766936 - time (sec): 14.39 - samples/sec: 3355.87 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:29,150 epoch 9 - iter 292/738 - loss 0.00776847 - time (sec): 19.00 - samples/sec: 3343.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 17:55:34,169 epoch 9 - iter 365/738 - loss 0.00786568 - time (sec): 24.02 - samples/sec: 3303.61 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:39,513 epoch 9 - iter 438/738 - loss 0.00805319 - time (sec): 29.37 - samples/sec: 3304.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:44,830 epoch 9 - iter 511/738 - loss 0.00739363 - time (sec): 34.68 - samples/sec: 3308.13 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:49,338 epoch 9 - iter 584/738 - loss 0.00731685 - time (sec): 39.19 - samples/sec: 3324.15 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:54,064 epoch 9 - iter 657/738 - loss 0.00765869 - time (sec): 43.92 - samples/sec: 3324.65 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,126 epoch 9 - iter 730/738 - loss 0.00759718 - time (sec): 48.98 - samples/sec: 3359.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,614 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:59,614 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 17:56:10,875 DEV : loss 0.22374621033668518 - f1-score (micro avg) 0.8242
2023-10-13 17:56:10,904 ----------------------------------------------------------------------------------------------------
2023-10-13 17:56:16,195 epoch 10 - iter 73/738 - loss 0.00432633 - time (sec): 5.29 - samples/sec: 3017.62 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:56:21,075 epoch 10 - iter 146/738 - loss 0.00341698 - time (sec): 10.17 - samples/sec: 3203.85 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:25,457 epoch 10 - iter 219/738 - loss 0.00467938 - time (sec): 14.55 - samples/sec: 3261.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:30,710 epoch 10 - iter 292/738 - loss 0.00480875 - time (sec): 19.81 - samples/sec: 3313.14 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:36,255 epoch 10 - iter 365/738 - loss 0.00575497 - time (sec): 25.35 - samples/sec: 3311.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:40,976 epoch 10 - iter 438/738 - loss 0.00574872 - time (sec): 30.07 - samples/sec: 3312.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:45,946 epoch 10 - iter 511/738 - loss 0.00539595 - time (sec): 35.04 - samples/sec: 3329.58 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:51,370 epoch 10 - iter 584/738 - loss 0.00518260 - time (sec): 40.47 - samples/sec: 3321.26 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:56:56,106 epoch 10 - iter 657/738 - loss 0.00512442 - time (sec): 45.20 - samples/sec: 3321.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:57:00,525 epoch 10 - iter 730/738 - loss 0.00499521 - time (sec): 49.62 - samples/sec: 3319.82 - lr: 0.000000 - momentum: 0.000000
2023-10-13 17:57:00,998 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:00,999 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-13 17:57:12,280 DEV : loss 0.22519326210021973 - f1-score (micro avg) 0.8266
2023-10-13 17:57:12,310 saving best model
2023-10-13 17:57:13,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:13,141 Loading model from best epoch ...
2023-10-13 17:57:14,542 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 17:57:20,591
Results:
- F-score (micro) 0.8013
- F-score (macro) 0.7071
- Accuracy 0.6949

By class:
              precision    recall  f1-score   support

         loc     0.8622    0.8823    0.8721       858
        pers     0.7549    0.7970    0.7754       537
         org     0.5652    0.5909    0.5778       132
        time     0.5484    0.6296    0.5862        54
        prod     0.7636    0.6885    0.7241        61

   micro avg     0.7876    0.8155    0.8013      1642
   macro avg     0.6989    0.7177    0.7071      1642
weighted avg     0.7892    0.8155    0.8019      1642
2023-10-13 17:57:20,591 ----------------------------------------------------------------------------------------------------