2023-10-13 17:47:04,163 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
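The learning-rate column in the iteration lines below first climbs to the 5e-05 peak across epoch 1 and then decays linearly to zero by the end of epoch 10, consistent with the LinearScheduler (warmup_fraction 0.1) listed in the training parameters. A minimal sketch of such a warmup/decay schedule (hypothetical helper, not Flair's actual implementation):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0.

    With 738 iterations/epoch * 10 epochs = 7380 steps, a 0.1 warmup
    fraction covers exactly epoch 1, matching the lr values in the log.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)
```

At step 73 of 7380 this gives roughly 0.000005 and at step 738 exactly 0.00005, in line with the epoch-1 entries below.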
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Train: 5901 sentences
2023-10-13 17:47:04,164 (train_with_dev=False, train_with_test=False)
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Training Params:
2023-10-13 17:47:04,164 - learning_rate: "5e-05"
2023-10-13 17:47:04,164 - mini_batch_size: "8"
2023-10-13 17:47:04,164 - max_epochs: "10"
2023-10-13 17:47:04,164 - shuffle: "True"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Plugins:
2023-10-13 17:47:04,164 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 17:47:04,164 - metric: "('micro avg', 'f1-score')"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Computation:
2023-10-13 17:47:04,164 - compute on device: cuda:0
2023-10-13 17:47:04,164 - embedding storage: none
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,164 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 17:47:04,164 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:04,165 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:09,303 epoch 1 - iter 73/738 - loss 2.61704957 - time (sec): 5.14 - samples/sec: 3329.07 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:47:14,890 epoch 1 - iter 146/738 - loss 1.63900339 - time (sec): 10.72 - samples/sec: 3355.14 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:47:19,561 epoch 1 - iter 219/738 - loss 1.25224248 - time (sec): 15.40 - samples/sec: 3383.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:47:24,159 epoch 1 - iter 292/738 - loss 1.03925094 - time (sec): 19.99 - samples/sec: 3392.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:47:28,728 epoch 1 - iter 365/738 - loss 0.89541308 - time (sec): 24.56 - samples/sec: 3399.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:47:33,630 epoch 1 - iter 438/738 - loss 0.79056877 - time (sec): 29.46 - samples/sec: 3393.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:47:37,911 epoch 1 - iter 511/738 - loss 0.71979193 - time (sec): 33.75 - samples/sec: 3392.11 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:47:42,768 epoch 1 - iter 584/738 - loss 0.65785367 - time (sec): 38.60 - samples/sec: 3381.31 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:47:47,675 epoch 1 - iter 657/738 - loss 0.60430136 - time (sec): 43.51 - samples/sec: 3373.75 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:47:53,061 epoch 1 - iter 730/738 - loss 0.55420001 - time (sec): 48.90 - samples/sec: 3371.78 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:47:53,539 ----------------------------------------------------------------------------------------------------
2023-10-13 17:47:53,540 EPOCH 1 done: loss 0.5507 - lr: 0.000049
2023-10-13 17:47:59,706 DEV : loss 0.12785974144935608 - f1-score (micro avg)  0.7131
2023-10-13 17:47:59,734 saving best model
2023-10-13 17:48:00,205 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:04,704 epoch 2 - iter 73/738 - loss 0.14031504 - time (sec): 4.50 - samples/sec: 3262.14 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:09,192 epoch 2 - iter 146/738 - loss 0.13796547 - time (sec): 8.99 - samples/sec: 3313.17 - lr: 0.000049 - momentum: 0.000000
2023-10-13 17:48:14,464 epoch 2 - iter 219/738 - loss 0.13466397 - time (sec): 14.26 - samples/sec: 3360.71 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:19,296 epoch 2 - iter 292/738 - loss 0.13225824 - time (sec): 19.09 - samples/sec: 3355.88 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:48:24,137 epoch 2 - iter 365/738 - loss 0.13022111 - time (sec): 23.93 - samples/sec: 3350.86 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:29,160 epoch 2 - iter 438/738 - loss 0.12720644 - time (sec): 28.95 - samples/sec: 3364.48 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:48:34,186 epoch 2 - iter 511/738 - loss 0.12457501 - time (sec): 33.98 - samples/sec: 3340.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:38,951 epoch 2 - iter 584/738 - loss 0.12421710 - time (sec): 38.74 - samples/sec: 3349.70 - lr: 0.000046 - momentum: 0.000000
2023-10-13 17:48:44,401 epoch 2 - iter 657/738 - loss 0.12162126 - time (sec): 44.19 - samples/sec: 3350.95 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,387 epoch 2 - iter 730/738 - loss 0.11931305 - time (sec): 49.18 - samples/sec: 3349.23 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:48:49,874 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:49,874 EPOCH 2 done: loss 0.1189 - lr: 0.000045
2023-10-13 17:49:01,090 DEV : loss 0.13197503983974457 - f1-score (micro avg)  0.7308
2023-10-13 17:49:01,119 saving best model
2023-10-13 17:49:01,598 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:06,572 epoch 3 - iter 73/738 - loss 0.07614814 - time (sec): 4.97 - samples/sec: 3274.66 - lr: 0.000044 - momentum: 0.000000
2023-10-13 17:49:11,815 epoch 3 - iter 146/738 - loss 0.07670962 - time (sec): 10.21 - samples/sec: 3311.41 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:16,339 epoch 3 - iter 219/738 - loss 0.07615619 - time (sec): 14.74 - samples/sec: 3339.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:49:21,598 epoch 3 - iter 292/738 - loss 0.08533494 - time (sec): 20.00 - samples/sec: 3355.66 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:26,329 epoch 3 - iter 365/738 - loss 0.08052962 - time (sec): 24.73 - samples/sec: 3350.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:49:31,257 epoch 3 - iter 438/738 - loss 0.07707434 - time (sec): 29.65 - samples/sec: 3330.19 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:36,060 epoch 3 - iter 511/738 - loss 0.07639684 - time (sec): 34.46 - samples/sec: 3344.43 - lr: 0.000041 - momentum: 0.000000
2023-10-13 17:49:41,408 epoch 3 - iter 584/738 - loss 0.07393004 - time (sec): 39.80 - samples/sec: 3333.01 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:46,311 epoch 3 - iter 657/738 - loss 0.07260392 - time (sec): 44.71 - samples/sec: 3315.95 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:49:51,524 epoch 3 - iter 730/738 - loss 0.07247522 - time (sec): 49.92 - samples/sec: 3305.85 - lr: 0.000039 - momentum: 0.000000
2023-10-13 17:49:51,967 ----------------------------------------------------------------------------------------------------
2023-10-13 17:49:51,967 EPOCH 3 done: loss 0.0725 - lr: 0.000039
2023-10-13 17:50:03,343 DEV : loss 0.1486140638589859 - f1-score (micro avg)  0.7833
2023-10-13 17:50:03,372 saving best model
2023-10-13 17:50:03,852 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:09,135 epoch 4 - iter 73/738 - loss 0.05135216 - time (sec): 5.28 - samples/sec: 3383.66 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:13,787 epoch 4 - iter 146/738 - loss 0.05140785 - time (sec): 9.93 - samples/sec: 3335.68 - lr: 0.000038 - momentum: 0.000000
2023-10-13 17:50:19,538 epoch 4 - iter 219/738 - loss 0.04837867 - time (sec): 15.68 - samples/sec: 3374.63 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:24,656 epoch 4 - iter 292/738 - loss 0.05298887 - time (sec): 20.80 - samples/sec: 3349.51 - lr: 0.000037 - momentum: 0.000000
2023-10-13 17:50:29,288 epoch 4 - iter 365/738 - loss 0.05192526 - time (sec): 25.43 - samples/sec: 3358.42 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:34,582 epoch 4 - iter 438/738 - loss 0.05297484 - time (sec): 30.72 - samples/sec: 3368.28 - lr: 0.000036 - momentum: 0.000000
2023-10-13 17:50:39,204 epoch 4 - iter 511/738 - loss 0.05272183 - time (sec): 35.34 - samples/sec: 3368.87 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:43,882 epoch 4 - iter 584/738 - loss 0.05406746 - time (sec): 40.02 - samples/sec: 3349.33 - lr: 0.000035 - momentum: 0.000000
2023-10-13 17:50:48,214 epoch 4 - iter 657/738 - loss 0.05441271 - time (sec): 44.35 - samples/sec: 3352.26 - lr: 0.000034 - momentum: 0.000000
2023-10-13 17:50:52,924 epoch 4 - iter 730/738 - loss 0.05358667 - time (sec): 49.06 - samples/sec: 3359.88 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:50:53,379 ----------------------------------------------------------------------------------------------------
2023-10-13 17:50:53,379 EPOCH 4 done: loss 0.0533 - lr: 0.000033
2023-10-13 17:51:04,589 DEV : loss 0.1737738847732544 - f1-score (micro avg)  0.8049
2023-10-13 17:51:04,619 saving best model
2023-10-13 17:51:05,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:10,235 epoch 5 - iter 73/738 - loss 0.03032376 - time (sec): 5.09 - samples/sec: 3312.13 - lr: 0.000033 - momentum: 0.000000
2023-10-13 17:51:14,799 epoch 5 - iter 146/738 - loss 0.03581647 - time (sec): 9.65 - samples/sec: 3328.98 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:19,339 epoch 5 - iter 219/738 - loss 0.03520240 - time (sec): 14.19 - samples/sec: 3390.73 - lr: 0.000032 - momentum: 0.000000
2023-10-13 17:51:24,352 epoch 5 - iter 292/738 - loss 0.03677044 - time (sec): 19.21 - samples/sec: 3407.81 - lr: 0.000031 - momentum: 0.000000
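The per-iteration entries above share a fixed layout, so loss and learning-rate curves can be scraped from a saved log with a small regex. A sketch (hypothetical helper names; assumes the standard Flair progress-line format shown in this log):

```python
import re

# Matches lines like:
# "... epoch 4 - iter 365/738 - loss 0.05192526 - time (sec): 25.43
#  - samples/sec: 3358.42 - lr: 0.000036 - momentum: 0.000000"
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+)"
    r" - samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line):
    """Return the iteration metrics as a dict, or None for other lines."""
    m = ITER_RE.search(line)
    if not m:
        return None
    return {k: (int(v) if k in ("epoch", "iter", "total") else float(v))
            for k, v in m.groupdict().items()}

sample = ("2023-10-13 17:50:29,288 epoch 4 - iter 365/738 - loss 0.05192526 "
          "- time (sec): 25.43 - samples/sec: 3358.42 - lr: 0.000036 "
          "- momentum: 0.000000")
rec = parse_iter_line(sample)
```

Feeding every line of the log through `parse_iter_line` and keeping the non-`None` results yields one record per progress entry, ready for plotting.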
2023-10-13 17:51:29,378 epoch 5 - iter 365/738 - loss 0.03507287 - time (sec): 24.23 - samples/sec: 3366.45 - lr: 0.000031 - momentum: 0.000000
2023-10-13 17:51:34,241 epoch 5 - iter 438/738 - loss 0.03455833 - time (sec): 29.10 - samples/sec: 3355.44 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:39,901 epoch 5 - iter 511/738 - loss 0.03549297 - time (sec): 34.76 - samples/sec: 3357.51 - lr: 0.000030 - momentum: 0.000000
2023-10-13 17:51:44,117 epoch 5 - iter 584/738 - loss 0.03649335 - time (sec): 38.97 - samples/sec: 3377.19 - lr: 0.000029 - momentum: 0.000000
2023-10-13 17:51:49,172 epoch 5 - iter 657/738 - loss 0.03572356 - time (sec): 44.03 - samples/sec: 3376.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:53,851 epoch 5 - iter 730/738 - loss 0.03593262 - time (sec): 48.71 - samples/sec: 3383.72 - lr: 0.000028 - momentum: 0.000000
2023-10-13 17:51:54,287 ----------------------------------------------------------------------------------------------------
2023-10-13 17:51:54,287 EPOCH 5 done: loss 0.0360 - lr: 0.000028
2023-10-13 17:52:05,499 DEV : loss 0.1812753677368164 - f1-score (micro avg)  0.8177
2023-10-13 17:52:05,531 saving best model
2023-10-13 17:52:06,113 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:11,732 epoch 6 - iter 73/738 - loss 0.01621660 - time (sec): 5.61 - samples/sec: 3000.04 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:16,775 epoch 6 - iter 146/738 - loss 0.02080977 - time (sec): 10.66 - samples/sec: 3100.09 - lr: 0.000027 - momentum: 0.000000
2023-10-13 17:52:21,263 epoch 6 - iter 219/738 - loss 0.01876135 - time (sec): 15.14 - samples/sec: 3144.78 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:25,848 epoch 6 - iter 292/738 - loss 0.02138464 - time (sec): 19.73 - samples/sec: 3183.99 - lr: 0.000026 - momentum: 0.000000
2023-10-13 17:52:31,015 epoch 6 - iter 365/738 - loss 0.01978816 - time (sec): 24.90 - samples/sec: 3209.75 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:35,258 epoch 6 - iter 438/738 - loss 0.01938038 - time (sec): 29.14 - samples/sec: 3225.51 - lr: 0.000025 - momentum: 0.000000
2023-10-13 17:52:40,274 epoch 6 - iter 511/738 - loss 0.01928215 - time (sec): 34.16 - samples/sec: 3254.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 17:52:45,579 epoch 6 - iter 584/738 - loss 0.01987479 - time (sec): 39.46 - samples/sec: 3280.26 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:51,330 epoch 6 - iter 657/738 - loss 0.02178292 - time (sec): 45.21 - samples/sec: 3297.22 - lr: 0.000023 - momentum: 0.000000
2023-10-13 17:52:56,145 epoch 6 - iter 730/738 - loss 0.02282192 - time (sec): 50.03 - samples/sec: 3300.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:52:56,552 ----------------------------------------------------------------------------------------------------
2023-10-13 17:52:56,552 EPOCH 6 done: loss 0.0228 - lr: 0.000022
2023-10-13 17:53:07,779 DEV : loss 0.21827659010887146 - f1-score (micro avg)  0.7988
2023-10-13 17:53:07,809 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:12,402 epoch 7 - iter 73/738 - loss 0.01465345 - time (sec): 4.59 - samples/sec: 3353.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 17:53:16,861 epoch 7 - iter 146/738 - loss 0.01573536 - time (sec): 9.05 - samples/sec: 3298.73 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:21,855 epoch 7 - iter 219/738 - loss 0.01844721 - time (sec): 14.04 - samples/sec: 3360.00 - lr: 0.000021 - momentum: 0.000000
2023-10-13 17:53:26,583 epoch 7 - iter 292/738 - loss 0.01755483 - time (sec): 18.77 - samples/sec: 3348.38 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:31,556 epoch 7 - iter 365/738 - loss 0.01732233 - time (sec): 23.75 - samples/sec: 3348.92 - lr: 0.000020 - momentum: 0.000000
2023-10-13 17:53:36,455 epoch 7 - iter 438/738 - loss 0.01872450 - time (sec): 28.64 - samples/sec: 3348.83 - lr: 0.000019 - momentum: 0.000000
2023-10-13 17:53:41,246 epoch 7 - iter 511/738 - loss 0.01801696 - time (sec): 33.44 - samples/sec: 3353.96 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:46,182 epoch 7 - iter 584/738 - loss 0.01924045 - time (sec): 38.37 - samples/sec: 3350.46 - lr: 0.000018 - momentum: 0.000000
2023-10-13 17:53:51,818 epoch 7 - iter 657/738 - loss 0.01897246 - time (sec): 44.01 - samples/sec: 3363.15 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:56,775 epoch 7 - iter 730/738 - loss 0.01878918 - time (sec): 48.96 - samples/sec: 3357.79 - lr: 0.000017 - momentum: 0.000000
2023-10-13 17:53:57,373 ----------------------------------------------------------------------------------------------------
2023-10-13 17:53:57,373 EPOCH 7 done: loss 0.0186 - lr: 0.000017
2023-10-13 17:54:08,578 DEV : loss 0.20159663259983063 - f1-score (micro avg)  0.8255
2023-10-13 17:54:08,607 saving best model
2023-10-13 17:54:09,182 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:14,384 epoch 8 - iter 73/738 - loss 0.00867283 - time (sec): 5.20 - samples/sec: 3376.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:18,924 epoch 8 - iter 146/738 - loss 0.00929264 - time (sec): 9.74 - samples/sec: 3338.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 17:54:23,900 epoch 8 - iter 219/738 - loss 0.00988928 - time (sec): 14.71 - samples/sec: 3360.24 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:28,467 epoch 8 - iter 292/738 - loss 0.01078611 - time (sec): 19.28 - samples/sec: 3357.48 - lr: 0.000015 - momentum: 0.000000
2023-10-13 17:54:33,531 epoch 8 - iter 365/738 - loss 0.01188262 - time (sec): 24.34 - samples/sec: 3332.34 - lr: 0.000014 - momentum: 0.000000
2023-10-13 17:54:39,068 epoch 8 - iter 438/738 - loss 0.01168210 - time (sec): 29.88 - samples/sec: 3313.89 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:43,328 epoch 8 - iter 511/738 - loss 0.01108505 - time (sec): 34.14 - samples/sec: 3334.40 - lr: 0.000013 - momentum: 0.000000
2023-10-13 17:54:48,539 epoch 8 - iter 584/738 - loss 0.01139473 - time (sec): 39.35 - samples/sec: 3327.81 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:53,175 epoch 8 - iter 657/738 - loss 0.01059925 - time (sec): 43.99 - samples/sec: 3333.14 - lr: 0.000012 - momentum: 0.000000
2023-10-13 17:54:58,385 epoch 8 - iter 730/738 - loss 0.01213062 - time (sec): 49.20 - samples/sec: 3351.79 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:54:58,847 ----------------------------------------------------------------------------------------------------
2023-10-13 17:54:58,847 EPOCH 8 done: loss 0.0120 - lr: 0.000011
2023-10-13 17:55:10,116 DEV : loss 0.2121274471282959 - f1-score (micro avg)  0.8167
2023-10-13 17:55:10,146 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:15,100 epoch 9 - iter 73/738 - loss 0.00785780 - time (sec): 4.95 - samples/sec: 3384.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 17:55:20,217 epoch 9 - iter 146/738 - loss 0.00968859 - time (sec): 10.07 - samples/sec: 3323.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:24,538 epoch 9 - iter 219/738 - loss 0.00766936 - time (sec): 14.39 - samples/sec: 3355.87 - lr: 0.000010 - momentum: 0.000000
2023-10-13 17:55:29,150 epoch 9 - iter 292/738 - loss 0.00776847 - time (sec): 19.00 - samples/sec: 3343.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 17:55:34,169 epoch 9 - iter 365/738 - loss 0.00786568 - time (sec): 24.02 - samples/sec: 3303.61 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:39,513 epoch 9 - iter 438/738 - loss 0.00805319 - time (sec): 29.37 - samples/sec: 3304.31 - lr: 0.000008 - momentum: 0.000000
2023-10-13 17:55:44,830 epoch 9 - iter 511/738 - loss 0.00739363 - time (sec): 34.68 - samples/sec: 3308.13 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:49,338 epoch 9 - iter 584/738 - loss 0.00731685 - time (sec): 39.19 - samples/sec: 3324.15 - lr: 0.000007 - momentum: 0.000000
2023-10-13 17:55:54,064 epoch 9 - iter 657/738 - loss 0.00765869 - time (sec): 43.92 - samples/sec: 3324.65 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,126 epoch 9 - iter 730/738 - loss 0.00759718 - time (sec): 48.98 - samples/sec: 3359.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 17:55:59,614 ----------------------------------------------------------------------------------------------------
2023-10-13 17:55:59,614 EPOCH 9 done: loss 0.0075 - lr: 0.000006
2023-10-13 17:56:10,875 DEV : loss 0.22374621033668518 - f1-score (micro avg)  0.8242
2023-10-13 17:56:10,904 ----------------------------------------------------------------------------------------------------
2023-10-13 17:56:16,195 epoch 10 - iter 73/738 - loss 0.00432633 - time (sec): 5.29 - samples/sec: 3017.62 - lr: 0.000005 - momentum: 0.000000
2023-10-13 17:56:21,075 epoch 10 - iter 146/738 - loss 0.00341698 - time (sec): 10.17 - samples/sec: 3203.85 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:25,457 epoch 10 - iter 219/738 - loss 0.00467938 - time (sec): 14.55 - samples/sec: 3261.52 - lr: 0.000004 - momentum: 0.000000
2023-10-13 17:56:30,710 epoch 10 - iter 292/738 - loss 0.00480875 - time (sec): 19.81 - samples/sec: 3313.14 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:36,255 epoch 10 - iter 365/738 - loss 0.00575497 - time (sec): 25.35 - samples/sec: 3311.60 - lr: 0.000003 - momentum: 0.000000
2023-10-13 17:56:40,976 epoch 10 - iter 438/738 - loss 0.00574872 - time (sec): 30.07 - samples/sec: 3312.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:45,946 epoch 10 - iter 511/738 - loss 0.00539595 - time (sec): 35.04 - samples/sec: 3329.58 - lr: 0.000002 - momentum: 0.000000
2023-10-13 17:56:51,370 epoch 10 - iter 584/738 - loss 0.00518260 - time (sec): 40.47 - samples/sec: 3321.26 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:56:56,106 epoch 10 - iter 657/738 - loss 0.00512442 - time (sec): 45.20 - samples/sec: 3321.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 17:57:00,525 epoch 10 - iter 730/738 - loss 0.00499521 - time (sec): 49.62 - samples/sec: 3319.82 - lr: 0.000000 - momentum: 0.000000
2023-10-13 17:57:00,998 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:00,999 EPOCH 10 done: loss 0.0049 - lr: 0.000000
2023-10-13 17:57:12,280 DEV : loss 0.22519326210021973 - f1-score (micro avg)  0.8266
2023-10-13 17:57:12,310 saving best model
2023-10-13 17:57:13,140 ----------------------------------------------------------------------------------------------------
2023-10-13 17:57:13,141 Loading model from best epoch ...
2023-10-13 17:57:14,542 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 17:57:20,591 Results:
- F-score (micro) 0.8013
- F-score (macro) 0.7071
- Accuracy 0.6949

By class:
              precision    recall  f1-score   support

         loc     0.8622    0.8823    0.8721       858
        pers     0.7549    0.7970    0.7754       537
         org     0.5652    0.5909    0.5778       132
        time     0.5484    0.6296    0.5862        54
        prod     0.7636    0.6885    0.7241        61

   micro avg     0.7876    0.8155    0.8013      1642
   macro avg     0.6989    0.7177    0.7071      1642
weighted avg     0.7892    0.8155    0.8019      1642

2023-10-13 17:57:20,591 ----------------------------------------------------------------------------------------------------
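As a quick consistency check, the reported micro and macro F-scores can be recomputed from the "By class" figures above (values copied verbatim from the log):

```python
# Micro F1 from the micro-avg precision/recall row.
precision, recall = 0.7876, 0.8155
micro_f1 = 2 * precision * recall / (precision + recall)

# Macro F1 as the unweighted mean of the per-class F1 scores
# (loc, pers, org, time, prod).
class_f1 = [0.8721, 0.7754, 0.5778, 0.5862, 0.7241]
macro_f1 = sum(class_f1) / len(class_f1)
```

Rounded to four decimals these reproduce the logged 0.8013 (micro) and 0.7071 (macro).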