2023-10-14 21:14:10,263 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Train: 14465 sentences
2023-10-14 21:14:10,264 (train_with_dev=False, train_with_test=False)
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Training Params:
2023-10-14 21:14:10,264 - learning_rate: "3e-05"
2023-10-14 21:14:10,264 - mini_batch_size: "8"
2023-10-14 21:14:10,264 - max_epochs: "10"
2023-10-14 21:14:10,264 - shuffle: "True"
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Plugins:
2023-10-14 21:14:10,264 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 21:14:10,264 - metric: "('micro avg', 'f1-score')"
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Computation:
2023-10-14 21:14:10,264 - compute on device: cuda:0
2023-10-14 21:14:10,264 - embedding storage: none
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:10,264 ----------------------------------------------------------------------------------------------------
2023-10-14 21:14:21,346 epoch 1 - iter 180/1809 - loss 1.90381785 - time (sec): 11.08 - samples/sec:
3376.03 - lr: 0.000003 - momentum: 0.000000
2023-10-14 21:14:32,403 epoch 1 - iter 360/1809 - loss 1.06803482 - time (sec): 22.14 - samples/sec: 3397.48 - lr: 0.000006 - momentum: 0.000000
2023-10-14 21:14:43,052 epoch 1 - iter 540/1809 - loss 0.77636418 - time (sec): 32.79 - samples/sec: 3390.47 - lr: 0.000009 - momentum: 0.000000
2023-10-14 21:14:53,957 epoch 1 - iter 720/1809 - loss 0.61914340 - time (sec): 43.69 - samples/sec: 3387.88 - lr: 0.000012 - momentum: 0.000000
2023-10-14 21:15:05,506 epoch 1 - iter 900/1809 - loss 0.51652229 - time (sec): 55.24 - samples/sec: 3397.47 - lr: 0.000015 - momentum: 0.000000
2023-10-14 21:15:16,373 epoch 1 - iter 1080/1809 - loss 0.45106548 - time (sec): 66.11 - samples/sec: 3405.24 - lr: 0.000018 - momentum: 0.000000
2023-10-14 21:15:27,887 epoch 1 - iter 1260/1809 - loss 0.40095933 - time (sec): 77.62 - samples/sec: 3414.66 - lr: 0.000021 - momentum: 0.000000
2023-10-14 21:15:38,882 epoch 1 - iter 1440/1809 - loss 0.36564921 - time (sec): 88.62 - samples/sec: 3418.04 - lr: 0.000024 - momentum: 0.000000
2023-10-14 21:15:49,939 epoch 1 - iter 1620/1809 - loss 0.33661742 - time (sec): 99.67 - samples/sec: 3414.98 - lr: 0.000027 - momentum: 0.000000
2023-10-14 21:16:00,909 epoch 1 - iter 1800/1809 - loss 0.31361967 - time (sec): 110.64 - samples/sec: 3414.89 - lr: 0.000030 - momentum: 0.000000
2023-10-14 21:16:01,443 ----------------------------------------------------------------------------------------------------
2023-10-14 21:16:01,443 EPOCH 1 done: loss 0.3126 - lr: 0.000030
2023-10-14 21:16:07,875 DEV : loss 0.11117860674858093 - f1-score (micro avg) 0.6269
2023-10-14 21:16:07,921 saving best model
2023-10-14 21:16:08,358 ----------------------------------------------------------------------------------------------------
2023-10-14 21:16:19,590 epoch 2 - iter 180/1809 - loss 0.08921046 - time (sec): 11.23 - samples/sec: 3404.09 - lr: 0.000030 - momentum: 0.000000
2023-10-14 21:16:30,539 epoch 2 - iter 360/1809 - loss 0.08748107 - time (sec): 22.18 - samples/sec: 3399.90 - lr: 0.000029 - momentum: 0.000000
2023-10-14 21:16:41,559 epoch 2 - iter 540/1809 - loss 0.08786843 - time (sec): 33.20 - samples/sec: 3384.05 - lr: 0.000029 - momentum: 0.000000
2023-10-14 21:16:52,961 epoch 2 - iter 720/1809 - loss 0.08653904 - time (sec): 44.60 - samples/sec: 3409.28 - lr: 0.000029 - momentum: 0.000000
2023-10-14 21:17:03,984 epoch 2 - iter 900/1809 - loss 0.08652461 - time (sec): 55.62 - samples/sec: 3406.12 - lr: 0.000028 - momentum: 0.000000
2023-10-14 21:17:15,172 epoch 2 - iter 1080/1809 - loss 0.08558330 - time (sec): 66.81 - samples/sec: 3414.01 - lr: 0.000028 - momentum: 0.000000
2023-10-14 21:17:26,791 epoch 2 - iter 1260/1809 - loss 0.08469632 - time (sec): 78.43 - samples/sec: 3409.34 - lr: 0.000028 - momentum: 0.000000
2023-10-14 21:17:37,861 epoch 2 - iter 1440/1809 - loss 0.08498290 - time (sec): 89.50 - samples/sec: 3405.49 - lr: 0.000027 - momentum: 0.000000
2023-10-14 21:17:48,517 epoch 2 - iter 1620/1809 - loss 0.08441384 - time (sec): 100.16 - samples/sec: 3398.36 - lr: 0.000027 - momentum: 0.000000
2023-10-14 21:17:59,613 epoch 2 - iter 1800/1809 - loss 0.08398362 - time (sec): 111.25 - samples/sec: 3399.35 - lr: 0.000027 - momentum: 0.000000
2023-10-14 21:18:00,196 ----------------------------------------------------------------------------------------------------
2023-10-14 21:18:00,196 EPOCH 2 done: loss 0.0841 - lr: 0.000027
2023-10-14 21:18:05,895 DEV : loss 0.1181371659040451 - f1-score (micro avg) 0.6241
2023-10-14 21:18:05,930 ----------------------------------------------------------------------------------------------------
2023-10-14 21:18:17,254 epoch 3 - iter 180/1809 - loss 0.04597861 - time (sec): 11.32 - samples/sec: 3256.04 - lr: 0.000026 - momentum: 0.000000
2023-10-14 21:18:29,594 epoch 3 - iter 360/1809 - loss 0.05194200 - time (sec): 23.66 - samples/sec: 3153.44 - lr: 0.000026 - momentum: 0.000000
2023-10-14 21:18:41,113 epoch 3 - iter 540/1809 - loss 0.05725339 - time (sec): 35.18 - samples/sec: 3194.67 - lr: 0.000026 - momentum: 0.000000
2023-10-14 21:18:52,263 epoch 3 - iter 720/1809 - loss 0.05751126 - time (sec): 46.33 - samples/sec: 3234.26 - lr: 0.000025 - momentum: 0.000000
2023-10-14 21:19:03,375 epoch 3 - iter 900/1809 - loss 0.05698117 - time (sec): 57.44 - samples/sec: 3266.53 - lr: 0.000025 - momentum: 0.000000
2023-10-14 21:19:14,364 epoch 3 - iter 1080/1809 - loss 0.05635563 - time (sec): 68.43 - samples/sec: 3296.08 - lr: 0.000025 - momentum: 0.000000
2023-10-14 21:19:26,004 epoch 3 - iter 1260/1809 - loss 0.05644181 - time (sec): 80.07 - samples/sec: 3306.44 - lr: 0.000024 - momentum: 0.000000
2023-10-14 21:19:37,443 epoch 3 - iter 1440/1809 - loss 0.05590284 - time (sec): 91.51 - samples/sec: 3306.55 - lr: 0.000024 - momentum: 0.000000
2023-10-14 21:19:48,688 epoch 3 - iter 1620/1809 - loss 0.05687337 - time (sec): 102.76 - samples/sec: 3313.21 - lr: 0.000024 - momentum: 0.000000
2023-10-14 21:19:59,697 epoch 3 - iter 1800/1809 - loss 0.05715098 - time (sec): 113.76 - samples/sec: 3325.13 - lr: 0.000023 - momentum: 0.000000
2023-10-14 21:20:00,218 ----------------------------------------------------------------------------------------------------
2023-10-14 21:20:00,218 EPOCH 3 done: loss 0.0570 - lr: 0.000023
2023-10-14 21:20:05,948 DEV : loss 0.15852448344230652 - f1-score (micro avg) 0.6331
2023-10-14 21:20:05,993 saving best model
2023-10-14 21:20:06,557 ----------------------------------------------------------------------------------------------------
2023-10-14 21:20:17,420 epoch 4 - iter 180/1809 - loss 0.03774158 - time (sec): 10.86 - samples/sec: 3375.73 - lr: 0.000023 - momentum: 0.000000
2023-10-14 21:20:28,751 epoch 4 - iter 360/1809 - loss 0.04107676 - time (sec): 22.19 - samples/sec: 3376.57 - lr: 0.000023 - momentum: 0.000000
2023-10-14 21:20:39,811 epoch 4 - iter 540/1809 - loss 0.04036546 - time (sec): 33.25 - samples/sec: 3377.69 - lr: 0.000022 - momentum: 0.000000
2023-10-14 21:20:50,772 epoch 4 - iter 720/1809 - loss 0.04111588 - time (sec): 44.21 - samples/sec: 3394.90 - lr: 0.000022 - momentum: 0.000000
2023-10-14 21:21:02,835 epoch 4 - iter 900/1809 - loss 0.04004334 - time (sec): 56.28 - samples/sec: 3360.21 - lr: 0.000022 - momentum: 0.000000
2023-10-14 21:21:13,868 epoch 4 - iter 1080/1809 - loss 0.04015642 - time (sec): 67.31 - samples/sec: 3384.56 - lr: 0.000021 - momentum: 0.000000
2023-10-14 21:21:24,788 epoch 4 - iter 1260/1809 - loss 0.04109986 - time (sec): 78.23 - samples/sec: 3395.23 - lr: 0.000021 - momentum: 0.000000
2023-10-14 21:21:35,715 epoch 4 - iter 1440/1809 - loss 0.04150162 - time (sec): 89.16 - samples/sec: 3408.60 - lr: 0.000021 - momentum: 0.000000
2023-10-14 21:21:47,110 epoch 4 - iter 1620/1809 - loss 0.04171498 - time (sec): 100.55 - samples/sec: 3396.77 - lr: 0.000020 - momentum: 0.000000
2023-10-14 21:21:58,023 epoch 4 - iter 1800/1809 - loss 0.04159954 - time (sec): 111.46 - samples/sec: 3392.39 - lr: 0.000020 - momentum: 0.000000
2023-10-14 21:21:58,529 ----------------------------------------------------------------------------------------------------
2023-10-14 21:21:58,529 EPOCH 4 done: loss 0.0415 - lr: 0.000020
2023-10-14 21:22:04,200 DEV : loss 0.23644335567951202 - f1-score (micro avg) 0.6361
2023-10-14 21:22:04,231 saving best model
2023-10-14 21:22:04,809 ----------------------------------------------------------------------------------------------------
2023-10-14 21:22:15,861 epoch 5 - iter 180/1809 - loss 0.03030318 - time (sec): 11.05 - samples/sec: 3333.61 - lr: 0.000020 - momentum: 0.000000
2023-10-14 21:22:26,995 epoch 5 - iter 360/1809 - loss 0.03079855 - time (sec): 22.18 - samples/sec: 3422.91 - lr: 0.000019 - momentum: 0.000000
2023-10-14 21:22:38,145 epoch 5 - iter 540/1809 - loss 0.03069607 - time (sec): 33.33 - samples/sec: 3419.90 - lr: 0.000019 - momentum: 0.000000
2023-10-14 21:22:49,063 epoch 5 - iter 720/1809 - loss 0.03008395 - time (sec): 44.25 - samples/sec: 3414.06 - lr: 0.000019 - momentum: 0.000000
2023-10-14 21:23:00,143 epoch 5 - iter 900/1809 - loss 0.03015912 - time (sec): 55.33 - samples/sec: 3399.24 - lr: 0.000018 - momentum: 0.000000
2023-10-14 21:23:11,578 epoch 5 - iter 1080/1809 - loss 0.03105206 - time (sec): 66.77 - samples/sec: 3409.94 - lr: 0.000018 - momentum: 0.000000
2023-10-14 21:23:22,723 epoch 5 - iter 1260/1809 - loss 0.03115092 - time (sec): 77.91 - samples/sec: 3411.10 - lr: 0.000018 - momentum: 0.000000
2023-10-14 21:23:33,719 epoch 5 - iter 1440/1809 - loss 0.03091193 - time (sec): 88.91 - samples/sec: 3415.79 - lr: 0.000017 - momentum: 0.000000
2023-10-14 21:23:44,567 epoch 5 - iter 1620/1809 - loss 0.03146358 - time (sec): 99.75 - samples/sec: 3418.58 - lr: 0.000017 - momentum: 0.000000
2023-10-14 21:23:55,443 epoch 5 - iter 1800/1809 - loss 0.03089906 - time (sec): 110.63 - samples/sec: 3417.50 - lr: 0.000017 - momentum: 0.000000
2023-10-14 21:23:55,964 ----------------------------------------------------------------------------------------------------
2023-10-14 21:23:55,964 EPOCH 5 done: loss 0.0308 - lr: 0.000017
2023-10-14 21:24:02,349 DEV : loss 0.29626017808914185 - f1-score (micro avg) 0.6348
2023-10-14 21:24:02,382 ----------------------------------------------------------------------------------------------------
2023-10-14 21:24:13,473 epoch 6 - iter 180/1809 - loss 0.02177454 - time (sec): 11.09 - samples/sec: 3480.30 - lr: 0.000016 - momentum: 0.000000
2023-10-14 21:24:24,551 epoch 6 - iter 360/1809 - loss 0.01815594 - time (sec): 22.17 - samples/sec: 3419.15 - lr: 0.000016 - momentum: 0.000000
2023-10-14 21:24:35,749 epoch 6 - iter 540/1809 - loss 0.01951666 - time (sec): 33.37 - samples/sec: 3413.28 - lr: 0.000016 - momentum: 0.000000
2023-10-14 21:24:46,567 epoch 6 - iter 720/1809 - loss 0.01954928 - time (sec): 44.18 - samples/sec: 3399.04 - lr: 0.000015 - momentum: 0.000000
2023-10-14 21:24:57,651 epoch 6 - iter 900/1809 - loss 0.02065711 - time (sec): 55.27 - samples/sec: 3391.16 - lr: 0.000015 - momentum: 0.000000
2023-10-14 21:25:08,694 epoch 6 - iter 1080/1809 - loss 0.02124727 - time (sec): 66.31 - samples/sec: 3387.12 - lr: 0.000015 - momentum: 0.000000
2023-10-14 21:25:19,606 epoch 6 - iter 1260/1809 - loss 0.02069374 - time (sec): 77.22 - samples/sec: 3394.31 - lr: 0.000014 - momentum: 0.000000
2023-10-14 21:25:30,722 epoch 6 - iter 1440/1809 - loss 0.02117259 - time (sec): 88.34 - samples/sec: 3402.25 - lr: 0.000014 - momentum: 0.000000
2023-10-14 21:25:41,996 epoch 6 - iter 1620/1809 - loss 0.02125642 - time (sec): 99.61 - samples/sec: 3403.99 - lr: 0.000014 - momentum: 0.000000
2023-10-14 21:25:53,227 epoch 6 - iter 1800/1809 - loss 0.02103519 - time (sec): 110.84 - samples/sec: 3409.62 - lr: 0.000013 - momentum: 0.000000
2023-10-14 21:25:53,744 ----------------------------------------------------------------------------------------------------
2023-10-14 21:25:53,744 EPOCH 6 done: loss 0.0209 - lr: 0.000013
2023-10-14 21:26:01,401 DEV : loss 0.35041970014572144 - f1-score (micro avg) 0.6271
2023-10-14 21:26:01,436 ----------------------------------------------------------------------------------------------------
2023-10-14 21:26:12,224 epoch 7 - iter 180/1809 - loss 0.01038516 - time (sec): 10.79 - samples/sec: 3579.72 - lr: 0.000013 - momentum: 0.000000
2023-10-14 21:26:23,538 epoch 7 - iter 360/1809 - loss 0.01077422 - time (sec): 22.10 - samples/sec: 3448.80 - lr: 0.000013 - momentum: 0.000000
2023-10-14 21:26:34,673 epoch 7 - iter 540/1809 - loss 0.01201565 - time (sec): 33.24 - samples/sec: 3417.81 - lr: 0.000012 - momentum: 0.000000
2023-10-14 21:26:45,591 epoch 7 - iter 720/1809 - loss 0.01213645 - time (sec): 44.15 - samples/sec: 3443.25 - lr: 0.000012 - momentum: 0.000000
2023-10-14 21:26:56,659 epoch 7 - iter 900/1809 - loss 0.01412555 - time (sec): 55.22 - samples/sec: 3438.25 - lr: 0.000012 - momentum: 0.000000
2023-10-14 21:27:07,537 epoch 7 - iter 1080/1809 - loss 0.01408455 - time (sec): 66.10 - samples/sec: 3435.92 - lr: 0.000011 - momentum: 0.000000
2023-10-14 21:27:18,660 epoch 7 - iter 1260/1809 - loss 0.01332679 - time (sec): 77.22 - samples/sec: 3422.80 - lr: 0.000011 - momentum: 0.000000
2023-10-14 21:27:30,101 epoch 7 - iter 1440/1809 - loss 0.01394835 - time (sec): 88.66 - samples/sec: 3428.58 - lr: 0.000011 - momentum: 0.000000
2023-10-14 21:27:41,000 epoch 7 - iter 1620/1809 - loss 0.01439966 - time (sec): 99.56 - samples/sec: 3424.09 - lr: 0.000010 - momentum: 0.000000
2023-10-14 21:27:51,982 epoch 7 - iter 1800/1809 - loss 0.01437537 - time (sec): 110.54 - samples/sec: 3422.81 - lr: 0.000010 - momentum: 0.000000
2023-10-14 21:27:52,474 ----------------------------------------------------------------------------------------------------
2023-10-14 21:27:52,474 EPOCH 7 done: loss 0.0144 - lr: 0.000010
2023-10-14 21:27:58,730 DEV : loss 0.3561807870864868 - f1-score (micro avg) 0.6432
2023-10-14 21:27:58,762 saving best model
2023-10-14 21:27:59,385 ----------------------------------------------------------------------------------------------------
2023-10-14 21:28:10,412 epoch 8 - iter 180/1809 - loss 0.00796504 - time (sec): 11.03 - samples/sec: 3440.41 - lr: 0.000010 - momentum: 0.000000
2023-10-14 21:28:21,842 epoch 8 - iter 360/1809 - loss 0.00775403 - time (sec): 22.46 - samples/sec: 3402.77 - lr: 0.000009 - momentum: 0.000000
2023-10-14 21:28:32,860 epoch 8 - iter 540/1809 - loss 0.00765121 - time (sec): 33.47 - samples/sec: 3380.13 - lr: 0.000009 - momentum: 0.000000
2023-10-14 21:28:43,863 epoch 8 - iter 720/1809 - loss 0.00901281 - time (sec): 44.48 - samples/sec: 3403.24 - lr: 0.000009 - momentum: 0.000000
2023-10-14 21:28:54,924 epoch 8 - iter 900/1809 - loss 0.00858084 - time (sec): 55.54 - samples/sec: 3420.72 - lr: 0.000008 - momentum: 0.000000
2023-10-14 21:29:06,065 epoch 8 - iter 1080/1809 - loss 0.00865634 - time (sec): 66.68 - samples/sec: 3409.39 - lr: 0.000008 - momentum: 0.000000
2023-10-14 21:29:16,924 epoch 8 - iter 1260/1809 - loss 0.00989641 - time (sec): 77.54 - samples/sec: 3417.94 - lr: 0.000008 - momentum: 0.000000
2023-10-14 21:29:27,925 epoch 8 - iter 1440/1809 - loss 0.01016223 - time (sec): 88.54 - samples/sec: 3422.99 - lr: 0.000007 - momentum: 0.000000
2023-10-14 21:29:38,969 epoch 8 - iter 1620/1809 - loss 0.01064998 - time (sec): 99.58 - samples/sec: 3422.54 - lr: 0.000007 - momentum: 0.000000
2023-10-14 21:29:50,129 epoch 8 - iter 1800/1809 - loss 0.01033000 - time (sec): 110.74 - samples/sec: 3417.04 - lr: 0.000007 - momentum: 0.000000
2023-10-14 21:29:50,633 ----------------------------------------------------------------------------------------------------
2023-10-14 21:29:50,633 EPOCH 8 done: loss 0.0103 - lr: 0.000007
2023-10-14 21:29:56,979 DEV : loss 0.38320016860961914 - f1-score (micro avg) 0.6464
2023-10-14 21:29:57,010 saving best model
2023-10-14 21:29:57,602 ----------------------------------------------------------------------------------------------------
2023-10-14 21:30:08,728 epoch 9 - iter 180/1809 - loss 0.00972321 - time (sec): 11.12 - samples/sec: 3412.05 - lr: 0.000006 - momentum: 0.000000
2023-10-14 21:30:19,700 epoch 9 - iter 360/1809 - loss 0.00656536 - time (sec): 22.10 - samples/sec: 3439.93 - lr: 0.000006 - momentum: 0.000000
2023-10-14 21:30:31,056 epoch 9 - iter 540/1809 - loss 0.00671814 - time (sec): 33.45 - samples/sec: 3452.06 - lr: 0.000006 - momentum: 0.000000
2023-10-14 21:30:42,103 epoch 9 - iter 720/1809 - loss 0.00664463 - time (sec): 44.50 - samples/sec: 3426.80 - lr: 0.000005 - momentum: 0.000000
2023-10-14 21:30:52,969 epoch 9 - iter 900/1809 - loss 0.00691173 - time (sec): 55.37 - samples/sec: 3419.30 - lr: 0.000005 - momentum: 0.000000
2023-10-14 21:31:03,885 epoch 9 - iter 1080/1809 - loss 0.00690807 - time (sec): 66.28 - samples/sec: 3415.64 - lr: 0.000005 - momentum: 0.000000
2023-10-14 21:31:14,931 epoch 9 - iter 1260/1809 - loss 0.00726721 - time (sec): 77.33 - samples/sec: 3419.26 - lr: 0.000004 - momentum: 0.000000
2023-10-14 21:31:25,828 epoch 9 - iter 1440/1809 - loss 0.00726244 - time (sec): 88.22 - samples/sec: 3421.98 - lr: 0.000004 - momentum: 0.000000
2023-10-14 21:31:36,948 epoch 9 - iter 1620/1809 - loss 0.00705247 - time (sec): 99.34 - samples/sec: 3423.83 - lr: 0.000004 - momentum: 0.000000
2023-10-14 21:31:48,785 epoch 9 - iter 1800/1809 - loss 0.00696297 - time (sec): 111.18 - samples/sec: 3402.85 - lr: 0.000003 - momentum: 0.000000
2023-10-14 21:31:49,342 ----------------------------------------------------------------------------------------------------
2023-10-14 21:31:49,342 EPOCH 9 done: loss 0.0070 - lr: 0.000003
2023-10-14 21:31:54,979 DEV : loss 0.38886281847953796 - f1-score (micro avg) 0.6445
2023-10-14 21:31:55,023 ----------------------------------------------------------------------------------------------------
2023-10-14 21:32:08,159 epoch 10 - iter 180/1809 - loss 0.00786543 - time (sec): 13.14 - samples/sec: 2875.89 - lr: 0.000003 - momentum: 0.000000
2023-10-14 21:32:19,712 epoch 10 - iter 360/1809 - loss 0.00565639 - time (sec): 24.69 - samples/sec: 3046.75 - lr: 0.000003 - momentum: 0.000000
2023-10-14 21:32:31,193 epoch 10 - iter 540/1809 - loss 0.00526643 - time (sec): 36.17 - samples/sec: 3136.28 - lr: 0.000002 - momentum: 0.000000
2023-10-14 21:32:42,536 epoch 10 - iter 720/1809 - loss 0.00464233 - time (sec): 47.51 - samples/sec: 3170.36 - lr: 0.000002 - momentum: 0.000000
2023-10-14 21:32:54,160 epoch 10 - iter 900/1809 - loss 0.00541164 - time (sec): 59.14 - samples/sec: 3196.61 - lr: 0.000002 - momentum: 0.000000
2023-10-14 21:33:04,934 epoch 10 - iter 1080/1809 - loss 0.00573972 - time (sec): 69.91 - samples/sec: 3225.60 - lr: 0.000001 - momentum: 0.000000
2023-10-14 21:33:15,716 epoch 10 - iter 1260/1809 - loss 0.00553462 - time (sec): 80.69 - samples/sec: 3260.00 - lr: 0.000001 - momentum: 0.000000
2023-10-14 21:33:27,092 epoch 10 - iter 1440/1809 - loss 0.00551375 - time (sec): 92.07 - samples/sec: 3286.62 - lr: 0.000001 -
momentum: 0.000000
2023-10-14 21:33:37,913 epoch 10 - iter 1620/1809 - loss 0.00525782 - time (sec): 102.89 - samples/sec: 3308.15 - lr: 0.000000 - momentum: 0.000000
2023-10-14 21:33:49,470 epoch 10 - iter 1800/1809 - loss 0.00511494 - time (sec): 114.45 - samples/sec: 3304.67 - lr: 0.000000 - momentum: 0.000000
2023-10-14 21:33:49,989 ----------------------------------------------------------------------------------------------------
2023-10-14 21:33:49,990 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-14 21:33:55,618 DEV : loss 0.4081748127937317 - f1-score (micro avg) 0.6428
2023-10-14 21:33:56,145 ----------------------------------------------------------------------------------------------------
2023-10-14 21:33:56,146 Loading model from best epoch ...
2023-10-14 21:33:57,658 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 21:34:05,332 Results:
- F-score (micro) 0.6527
- F-score (macro) 0.5176
- Accuracy 0.4984

By class:
              precision    recall  f1-score   support

         loc     0.6280    0.7970    0.7025       591
        pers     0.5703    0.7843    0.6604       357
         org     0.1899    0.1899    0.1899        79

   micro avg     0.5803    0.7459    0.6527      1027
   macro avg     0.4627    0.5904    0.5176      1027
weighted avg     0.5742    0.7459    0.6484      1027

2023-10-14 21:34:05,332 ----------------------------------------------------------------------------------------------------