2023-10-25 15:07:16,252 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:07:16,253 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Train:  7142 sentences
2023-10-25 15:07:16,254         (train_with_dev=False, train_with_test=False)
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Training Params:
2023-10-25 15:07:16,254  - learning_rate: "3e-05"
2023-10-25 15:07:16,254  - mini_batch_size: "8"
2023-10-25 15:07:16,254  - max_epochs: "10"
2023-10-25 15:07:16,254  - shuffle: "True"
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Plugins:
2023-10-25 15:07:16,254  - TensorboardLogger
2023-10-25 15:07:16,254  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:07:16,254  - metric: "('micro avg', 'f1-score')"
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Computation:
2023-10-25 15:07:16,254  - compute on device: cuda:0
2023-10-25 15:07:16,254  - embedding storage: none
2023-10-25 15:07:16,254 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,254 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:07:16,255 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,255 ----------------------------------------------------------------------------------------------------
2023-10-25 15:07:16,255 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:07:22,287 epoch 1 - iter 89/893 - loss 2.32679756 - time (sec): 6.03 - samples/sec: 4228.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:07:27,984 epoch 1 - iter 178/893 - loss 1.51396462 - time (sec): 11.73 - samples/sec: 4163.39 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:07:33,714 epoch 1 - iter 267/893 - loss 1.14478279 - time (sec): 17.46 - samples/sec: 4138.90 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:07:39,758 epoch 1 - iter 356/893 - loss 0.92624548 - time (sec): 23.50 - samples/sec: 4118.92 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:07:45,647 epoch 1 - iter 445/893 - loss 0.77920214 - time (sec): 29.39 - samples/sec: 4149.48 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:07:51,499 epoch 1 - iter 534/893 - loss 0.67179663 - time (sec): 35.24 - samples/sec: 4208.38 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:07:57,019 epoch 1 - iter 623/893 - loss 0.60130729 - time (sec): 40.76 - samples/sec: 4247.79 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:08:02,490 epoch 1 - iter 712/893 - loss 0.54657076 - time (sec): 46.23 - samples/sec: 4282.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:08:07,955 epoch 1 - iter 801/893 - loss 0.50230078 - time (sec): 51.70 - samples/sec: 4306.93 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:08:13,549 epoch 1 - iter 890/893 - loss 0.46653814 - time (sec): 57.29 - samples/sec: 4330.30 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:08:13,712 ----------------------------------------------------------------------------------------------------
2023-10-25 15:08:13,712 EPOCH 1 done: loss 0.4656 - lr: 0.000030
2023-10-25 15:08:17,345 DEV : loss 0.1060444563627243 - f1-score (micro avg)  0.7387
2023-10-25 15:08:17,369 saving best model
2023-10-25 15:08:17,905 ----------------------------------------------------------------------------------------------------
2023-10-25 15:08:23,847 epoch 2 - iter 89/893 - loss 0.11786204 - time (sec): 5.94 - samples/sec: 4321.04 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:08:29,333 epoch 2 - iter 178/893 - loss 0.11830957 - time (sec): 11.43 - samples/sec: 4126.75 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:35,534 epoch 2 - iter 267/893 - loss 0.11117703 - time (sec): 17.63 - samples/sec: 4187.28 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:41,384 epoch 2 - iter 356/893 - loss 0.11088492 - time (sec): 23.48 - samples/sec: 4194.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:08:47,307 epoch 2 - iter 445/893 - loss 0.10753992 - time (sec): 29.40 - samples/sec: 4218.91 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:08:53,058 epoch 2 - iter 534/893 - loss 0.10789428 - time (sec): 35.15 - samples/sec: 4232.20 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:08:58,673 epoch 2 - iter 623/893 - loss 0.10558977 - time (sec): 40.77 - samples/sec: 4289.97 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:09:04,088 epoch 2 - iter 712/893 - loss 0.10375529 - time (sec): 46.18 - samples/sec: 4271.55 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:09,613 epoch 2 - iter 801/893 - loss 0.10372405 - time (sec): 51.71 - samples/sec: 4305.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:15,240 epoch 2 - iter 890/893 - loss 0.10324515 - time (sec): 57.33 - samples/sec: 4321.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:09:15,429 ----------------------------------------------------------------------------------------------------
2023-10-25 15:09:15,430 EPOCH 2 done: loss 0.1031 - lr: 0.000027
2023-10-25 15:09:20,245 DEV : loss 0.09593858569860458 - f1-score (micro avg)  0.777
2023-10-25 15:09:20,268 saving best model
2023-10-25 15:09:20,917 ----------------------------------------------------------------------------------------------------
2023-10-25 15:09:26,476 epoch 3 - iter 89/893 - loss 0.06107853 - time (sec): 5.56 - samples/sec: 4578.32 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:31,970 epoch 3 - iter 178/893 - loss 0.05655927 - time (sec): 11.05 - samples/sec: 4428.83 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:37,621 epoch 3 - iter 267/893 - loss 0.05765556 - time (sec): 16.70 - samples/sec: 4489.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:09:43,134 epoch 3 - iter 356/893 - loss 0.05899019 - time (sec): 22.22 - samples/sec: 4487.90 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:09:48,684 epoch 3 - iter 445/893 - loss 0.06059285 - time (sec): 27.77 - samples/sec: 4430.84 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:09:54,437 epoch 3 - iter 534/893 - loss 0.06203883 - time (sec): 33.52 - samples/sec: 4406.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:10:00,394 epoch 3 - iter 623/893 - loss 0.06228959 - time (sec): 39.48 - samples/sec: 4403.41 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:06,271 epoch 3 - iter 712/893 - loss 0.06223592 - time (sec): 45.35 - samples/sec: 4403.81 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:12,116 epoch 3 - iter 801/893 - loss 0.06176464 - time (sec): 51.20 - samples/sec: 4397.44 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:10:17,781 epoch 3 - iter 890/893 - loss 0.06140542 - time (sec): 56.86 - samples/sec: 4364.69 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:17,965 ----------------------------------------------------------------------------------------------------
2023-10-25 15:10:17,965 EPOCH 3 done: loss 0.0613 - lr: 0.000023
2023-10-25 15:10:22,849 DEV : loss 0.10392870754003525 - f1-score (micro avg)  0.7824
2023-10-25 15:10:22,870 saving best model
2023-10-25 15:10:23,572 ----------------------------------------------------------------------------------------------------
2023-10-25 15:10:29,407 epoch 4 - iter 89/893 - loss 0.04410301 - time (sec): 5.83 - samples/sec: 4278.04 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:35,183 epoch 4 - iter 178/893 - loss 0.04544728 - time (sec): 11.61 - samples/sec: 4301.00 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:10:40,777 epoch 4 - iter 267/893 - loss 0.04597693 - time (sec): 17.20 - samples/sec: 4268.80 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:46,358 epoch 4 - iter 356/893 - loss 0.04537082 - time (sec): 22.78 - samples/sec: 4353.99 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:52,237 epoch 4 - iter 445/893 - loss 0.04624815 - time (sec): 28.66 - samples/sec: 4335.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:10:58,294 epoch 4 - iter 534/893 - loss 0.04475990 - time (sec): 34.72 - samples/sec: 4348.02 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:04,125 epoch 4 - iter 623/893 - loss 0.04559760 - time (sec): 40.55 - samples/sec: 4316.02 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:09,950 epoch 4 - iter 712/893 - loss 0.04461195 - time (sec): 46.38 - samples/sec: 4271.43 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:11:15,961 epoch 4 - iter 801/893 - loss 0.04371497 - time (sec): 52.39 - samples/sec: 4285.47 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:21,628 epoch 4 - iter 890/893 - loss 0.04355757 - time (sec): 58.05 - samples/sec: 4275.23 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:21,806 ----------------------------------------------------------------------------------------------------
2023-10-25 15:11:21,807 EPOCH 4 done: loss 0.0437 - lr: 0.000020
2023-10-25 15:11:25,871 DEV : loss 0.1405394971370697 - f1-score (micro avg)  0.7739
2023-10-25 15:11:25,895 ----------------------------------------------------------------------------------------------------
2023-10-25 15:11:31,806 epoch 5 - iter 89/893 - loss 0.03086483 - time (sec): 5.91 - samples/sec: 4018.56 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:11:37,763 epoch 5 - iter 178/893 - loss 0.03408901 - time (sec): 11.87 - samples/sec: 4169.34 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:43,934 epoch 5 - iter 267/893 - loss 0.03368610 - time (sec): 18.04 - samples/sec: 4168.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:49,742 epoch 5 - iter 356/893 - loss 0.03342150 - time (sec): 23.84 - samples/sec: 4171.56 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:11:55,551 epoch 5 - iter 445/893 - loss 0.03366136 - time (sec): 29.65 - samples/sec: 4169.40 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:01,573 epoch 5 - iter 534/893 - loss 0.03390282 - time (sec): 35.68 - samples/sec: 4202.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:07,368 epoch 5 - iter 623/893 - loss 0.03357706 - time (sec): 41.47 - samples/sec: 4197.80 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:12:13,096 epoch 5 - iter 712/893 - loss 0.03266215 - time (sec): 47.20 - samples/sec: 4199.37 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:18,874 epoch 5 - iter 801/893 - loss 0.03391376 - time (sec): 52.98 - samples/sec: 4207.00 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:24,683 epoch 5 - iter 890/893 - loss 0.03419362 - time (sec): 58.79 - samples/sec: 4219.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:12:24,885 ----------------------------------------------------------------------------------------------------
2023-10-25 15:12:24,885 EPOCH 5 done: loss 0.0341 - lr: 0.000017
2023-10-25 15:12:29,919 DEV : loss 0.16618064045906067 - f1-score (micro avg)  0.8051
2023-10-25 15:12:29,940 saving best model
2023-10-25 15:12:30,620 ----------------------------------------------------------------------------------------------------
2023-10-25 15:12:36,386 epoch 6 - iter 89/893 - loss 0.01934906 - time (sec): 5.76 - samples/sec: 4367.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:42,034 epoch 6 - iter 178/893 - loss 0.01971571 - time (sec): 11.41 - samples/sec: 4277.35 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:48,099 epoch 6 - iter 267/893 - loss 0.01891099 - time (sec): 17.48 - samples/sec: 4240.69 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:12:54,052 epoch 6 - iter 356/893 - loss 0.02463843 - time (sec): 23.43 - samples/sec: 4232.23 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:12:59,977 epoch 6 - iter 445/893 - loss 0.02485032 - time (sec): 29.35 - samples/sec: 4266.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:13:05,810 epoch 6 - iter 534/893 - loss 0.02602922 - time (sec): 35.19 - samples/sec: 4217.86 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:13:11,965 epoch 6 - iter 623/893 - loss 0.02550398 - time (sec): 41.34 - samples/sec: 4206.73 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:17,850 epoch 6 - iter 712/893 - loss 0.02490406 - time (sec): 47.23 - samples/sec: 4219.72 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:23,687 epoch 6 - iter 801/893 - loss 0.02516364 - time (sec): 53.06 - samples/sec: 4220.25 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:13:29,433 epoch 6 - iter 890/893 - loss 0.02571510 - time (sec): 58.81 - samples/sec: 4220.50 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:29,621 ----------------------------------------------------------------------------------------------------
2023-10-25 15:13:29,622 EPOCH 6 done: loss 0.0258 - lr: 0.000013
2023-10-25 15:13:34,396 DEV : loss 0.17371046543121338 - f1-score (micro avg)  0.8112
2023-10-25 15:13:34,417 saving best model
2023-10-25 15:13:35,072 ----------------------------------------------------------------------------------------------------
2023-10-25 15:13:41,193 epoch 7 - iter 89/893 - loss 0.02011470 - time (sec): 6.12 - samples/sec: 4355.79 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:47,009 epoch 7 - iter 178/893 - loss 0.02325248 - time (sec): 11.94 - samples/sec: 4245.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:13:52,906 epoch 7 - iter 267/893 - loss 0.02032808 - time (sec): 17.83 - samples/sec: 4206.61 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:13:58,744 epoch 7 - iter 356/893 - loss 0.02038788 - time (sec): 23.67 - samples/sec: 4235.40 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:14:04,629 epoch 7 - iter 445/893 - loss 0.01997941 - time (sec): 29.56 - samples/sec: 4217.35 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:14:10,590 epoch 7 - iter 534/893 - loss 0.01925592 - time (sec): 35.52 - samples/sec: 4185.73 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:16,316 epoch 7 - iter 623/893 - loss 0.01978486 - time (sec): 41.24 - samples/sec: 4162.69 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:22,572 epoch 7 - iter 712/893 - loss 0.01991371 - time (sec): 47.50 - samples/sec: 4168.20 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:14:28,265 epoch 7 - iter 801/893 - loss 0.01994353 - time (sec): 53.19 - samples/sec: 4181.97 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:34,108 epoch 7 - iter 890/893 - loss 0.02008156 - time (sec): 59.03 - samples/sec: 4202.26 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:34,306 ----------------------------------------------------------------------------------------------------
2023-10-25 15:14:34,306 EPOCH 7 done: loss 0.0200 - lr: 0.000010
2023-10-25 15:14:38,320 DEV : loss 0.17937816679477692 - f1-score (micro avg)  0.8123
2023-10-25 15:14:38,342 saving best model
2023-10-25 15:14:39,015 ----------------------------------------------------------------------------------------------------
2023-10-25 15:14:44,770 epoch 8 - iter 89/893 - loss 0.01881002 - time (sec): 5.75 - samples/sec: 4163.23 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:14:50,519 epoch 8 - iter 178/893 - loss 0.01785540 - time (sec): 11.50 - samples/sec: 4223.67 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:14:56,466 epoch 8 - iter 267/893 - loss 0.01663373 - time (sec): 17.45 - samples/sec: 4239.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:15:02,200 epoch 8 - iter 356/893 - loss 0.01641502 - time (sec): 23.18 - samples/sec: 4212.11 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:15:08,033 epoch 8 - iter 445/893 - loss 0.01638939 - time (sec): 29.02 - samples/sec: 4217.12 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:14,046 epoch 8 - iter 534/893 - loss 0.01532943 - time (sec): 35.03 - samples/sec: 4218.47 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:20,415 epoch 8 - iter 623/893 - loss 0.01502360 - time (sec): 41.40 - samples/sec: 4207.79 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:15:26,296 epoch 8 - iter 712/893 - loss 0.01549364 - time (sec): 47.28 - samples/sec: 4180.53 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:32,060 epoch 8 - iter 801/893 - loss 0.01600174 - time (sec): 53.04 - samples/sec: 4193.62 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:38,072 epoch 8 - iter 890/893 - loss 0.01593321 - time (sec): 59.06 - samples/sec: 4200.02 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:15:38,267 ----------------------------------------------------------------------------------------------------
2023-10-25 15:15:38,267 EPOCH 8 done: loss 0.0159 - lr: 0.000007
2023-10-25 15:15:43,263 DEV : loss 0.21227356791496277 - f1-score (micro avg)  0.7971
2023-10-25 15:15:43,284 ----------------------------------------------------------------------------------------------------
2023-10-25 15:15:49,118 epoch 9 - iter 89/893 - loss 0.00920994 - time (sec): 5.83 - samples/sec: 4231.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:15:54,872 epoch 9 - iter 178/893 - loss 0.00891932 - time (sec): 11.59 - samples/sec: 4226.85 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:16:00,508 epoch 9 - iter 267/893 - loss 0.01039436 - time (sec): 17.22 - samples/sec: 4278.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:16:06,782 epoch 9 - iter 356/893 - loss 0.01056567 - time (sec): 23.50 - samples/sec: 4283.75 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:12,664 epoch 9 - iter 445/893 - loss 0.01205154 - time (sec): 29.38 - samples/sec: 4283.65 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:18,609 epoch 9 - iter 534/893 - loss 0.01179973 - time (sec): 35.32 - samples/sec: 4285.62 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:16:24,456 epoch 9 - iter 623/893 - loss 0.01148151 - time (sec): 41.17 - samples/sec: 4253.07 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:30,157 epoch 9 - iter 712/893 - loss 0.01108116 - time (sec): 46.87 - samples/sec: 4279.34 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:35,747 epoch 9 - iter 801/893 - loss 0.01083303 - time (sec): 52.46 - samples/sec: 4280.96 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:16:41,190 epoch 9 - iter 890/893 - loss 0.01082633 - time (sec): 57.90 - samples/sec: 4284.21 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:16:41,365 ----------------------------------------------------------------------------------------------------
2023-10-25 15:16:41,366 EPOCH 9 done: loss 0.0108 - lr: 0.000003
2023-10-25 15:16:46,208 DEV : loss 0.21176157891750336 - f1-score (micro avg)  0.8104
2023-10-25 15:16:46,230 ----------------------------------------------------------------------------------------------------
2023-10-25 15:16:51,777 epoch 10 - iter 89/893 - loss 0.01317723 - time (sec): 5.55 - samples/sec: 4308.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:16:57,614 epoch 10 - iter 178/893 - loss 0.00887501 - time (sec): 11.38 - samples/sec: 4360.90 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:17:03,568 epoch 10 - iter 267/893 - loss 0.00690912 - time (sec): 17.34 - samples/sec: 4311.16 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:09,586 epoch 10 - iter 356/893 - loss 0.00707851 - time (sec): 23.35 - samples/sec: 4278.17 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:15,125 epoch 10 - iter 445/893 - loss 0.00657208 - time (sec): 28.89 - samples/sec: 4225.74 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:17:20,950 epoch 10 - iter 534/893 - loss 0.00656213 - time (sec): 34.72 - samples/sec: 4263.47 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:26,715 epoch 10 - iter 623/893 - loss 0.00748677 - time (sec): 40.48 - samples/sec: 4268.60 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:32,254 epoch 10 - iter 712/893 - loss 0.00756648 - time (sec): 46.02 - samples/sec: 4296.21 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:17:37,995 epoch 10 - iter 801/893 - loss 0.00771944 - time (sec): 51.76 - samples/sec: 4279.89 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:17:43,827 epoch 10 - iter 890/893 - loss 0.00812425 - time (sec): 57.60 - samples/sec: 4307.35 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:17:43,999 ----------------------------------------------------------------------------------------------------
2023-10-25 15:17:43,999 EPOCH 10 done: loss 0.0081 - lr: 0.000000
2023-10-25 15:17:47,978 DEV : loss 0.21720275282859802 - f1-score (micro avg)  0.8147
2023-10-25 15:17:47,999 saving best model
2023-10-25 15:17:49,119 ----------------------------------------------------------------------------------------------------
2023-10-25 15:17:49,120 Loading model from best epoch ...
2023-10-25 15:17:51,035 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 15:18:03,575 Results:
- F-score (micro) 0.6992
- F-score (macro) 0.6245
- Accuracy 0.5561

By class:
              precision    recall  f1-score   support

         LOC     0.7038    0.6986    0.7012      1095
         PER     0.7808    0.7816    0.7812      1012
         ORG     0.4549    0.5798    0.5099       357
   HumanProd     0.4074    0.6667    0.5057        33

   micro avg     0.6842    0.7149    0.6992      2497
   macro avg     0.5867    0.6817    0.6245      2497
weighted avg     0.6955    0.7149    0.7037      2497

2023-10-25 15:18:03,576 ----------------------------------------------------------------------------------------------------
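The lr column in the log above can be re-derived offline. This is a sketch under the assumption that the `LinearScheduler` plugin (warmup_fraction 0.1) warms the learning rate up linearly over the first 10% of all optimizer steps and then decays it linearly to zero; the step counts (893 batches/epoch × 10 epochs) are read off the log, and `lr_at` is a hypothetical helper, not a Flair API.

```python
# Sketch: linear warmup + linear decay, matching the logged lr column.
# Assumption: warmup spans warmup_fraction of all steps, then lr decays to 0.
base_lr = 3e-05           # learning_rate from Training Params
total_steps = 893 * 10    # 893 batches/epoch * max_epochs 10
warmup_steps = int(0.1 * total_steps)  # warmup_fraction: '0.1' -> 893 steps

def lr_at(step):
    """Learning rate after `step` optimizer updates (1-based). Hypothetical helper."""
    if step <= warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot-checks against the log (values are rounded to 6 decimals there):
print(f"{lr_at(89):.6f}")    # epoch 1, iter 89  -> logged lr: 0.000003
print(f"{lr_at(890):.6f}")   # epoch 1, iter 890 -> logged lr: 0.000030
print(f"{lr_at(8920):.6f}")  # epoch 10, iter 890 -> logged lr: 0.000000
```

The spot-checks reproduce the rounded lr values at the start, warmup peak, and end of training, which is consistent with the warmup-then-decay shape visible in the log.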
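The aggregate test scores can be sanity-checked from the per-class table alone. The sketch below uses only the rounded precision/recall/support values printed in the log; macro F1 is the unweighted mean of per-class F1, while micro F1 pools true positives and predictions across classes (TP ≈ recall × support, predicted ≈ TP / precision). Small deviations in the last decimal are expected because the inputs are rounded.

```python
# Per-class (precision, recall, f1, support) as printed in the log.
per_class = {
    "LOC":       (0.7038, 0.6986, 0.7012, 1095),
    "PER":       (0.7808, 0.7816, 0.7812, 1012),
    "ORG":       (0.4549, 0.5798, 0.5099,  357),
    "HumanProd": (0.4074, 0.6667, 0.5057,   33),
}

def f1(p, r):
    return 2 * p * r / (p + r)

# Macro average: unweighted mean of per-class F1.
macro_f1 = sum(f1(p, r) for p, r, _, _ in per_class.values()) / len(per_class)

# Micro average: pool TP/FP/FN over all classes, then compute P/R/F1.
tp = sum(r * s for _, r, _, s in per_class.values())        # TP ~ recall * support
pred = sum(r * s / p for p, r, _, s in per_class.values())  # predicted ~ TP / precision
micro_p = tp / pred
micro_r = tp / sum(s for _, _, _, s in per_class.values())
micro_f1 = f1(micro_p, micro_r)

print(f"macro F1 ~ {macro_f1:.4f}")  # logged: 0.6245
print(f"micro F1 ~ {micro_f1:.4f}")  # logged: 0.6992
```

This recovers the logged macro F-score (0.6245) and micro F-score (0.6992) to four decimals, confirming the table and the headline numbers are mutually consistent.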