2023-10-13 15:56:41,973 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,974 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
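The layer shapes in the printed module tree are enough to estimate the model size by hand. A minimal sketch (plain Python; every dimension is copied from the repr above, no checkpoint is loaded, and parameter-free layers like Dropout/Tanh/GELU are skipped):

```python
# Tally parameters of the printed model from the layer shapes in the repr above.
H, FF, V, L = 768, 3072, 32001, 12  # hidden, feed-forward, vocab, encoder layers

def linear(n_in, n_out):
    return n_in * n_out + n_out      # weight matrix + bias vector

embeddings = V * H + 512 * H + 2 * H + 2 * H   # word + position + token-type + LayerNorm (w, b)
per_layer = (
    3 * linear(H, H)                 # query / key / value
    + linear(H, H) + 2 * H           # attention output dense + LayerNorm
    + linear(H, FF)                  # intermediate
    + linear(FF, H) + 2 * H          # output dense + LayerNorm
)
pooler = linear(H, H)
tagger_head = linear(H, 21)          # single linear layer over the 21 tags

total = embeddings + L * per_layer + pooler + tagger_head
print(total)  # 110634261, i.e. ~110.6M parameters
```

The total is the usual size of a BERT-base encoder plus a small tagging head; the base path below shows `crfFalse`, so the head really is just the single `Linear(768, 21)` layer.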
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Train:  5901 sentences
2023-10-13 15:56:41,975         (train_with_dev=False, train_with_test=False)
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Training Params:
2023-10-13 15:56:41,975  - learning_rate: "3e-05"
2023-10-13 15:56:41,975  - mini_batch_size: "8"
2023-10-13 15:56:41,975  - max_epochs: "10"
2023-10-13 15:56:41,975  - shuffle: "True"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Plugins:
2023-10-13 15:56:41,975  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 15:56:41,975  - metric: "('micro avg', 'f1-score')"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Computation:
2023-10-13 15:56:41,975  - compute on device: cuda:0
2023-10-13 15:56:41,975  - embedding storage: none
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:47,173 epoch 1 - iter 73/738 - loss 3.05853360 - time (sec): 5.20 - samples/sec: 3385.89 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:56:52,161 epoch 1 - iter 146/738 - loss 2.01396427 - time (sec): 10.18 - samples/sec: 3514.39 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:56:56,988 epoch 1 - iter 219/738 - loss 1.53824384 - time (sec): 15.01 - samples/sec: 3444.77 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:57:01,882 epoch 1 - iter 292/738 - loss 1.24891774 - time (sec): 19.91 - samples/sec: 3432.48 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:57:06,840 epoch 1 - iter 365/738 - loss 1.07452913 - time (sec): 24.86 - samples/sec: 3418.88 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:57:11,658 epoch 1 - iter 438/738 - loss 0.94521822 - time (sec): 29.68 - samples/sec: 3426.34 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:57:16,509 epoch 1 - iter 511/738 - loss 0.85158135 - time (sec): 34.53 - samples/sec: 3412.22 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:57:20,999 epoch 1 - iter 584/738 - loss 0.78332215 - time (sec): 39.02 - samples/sec: 3392.42 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:57:25,830 epoch 1 - iter 657/738 - loss 0.72187463 - time (sec): 43.85 - samples/sec: 3388.05 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:57:30,631 epoch 1 - iter 730/738 - loss 0.66940959 - time (sec): 48.65 - samples/sec: 3389.93 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:57:31,100 ----------------------------------------------------------------------------------------------------
2023-10-13 15:57:31,100 EPOCH 1 done: loss 0.6654 - lr: 0.000030
2023-10-13 15:57:37,172 DEV : loss 0.14468906819820404 - f1-score (micro avg) 0.7202
2023-10-13 15:57:37,201 saving best model
2023-10-13 15:57:37,657 ----------------------------------------------------------------------------------------------------
2023-10-13 15:57:41,990 epoch 2 - iter 73/738 - loss 0.14759708 - time (sec): 4.33 - samples/sec: 3490.78 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:57:46,742 epoch 2 - iter 146/738 - loss 0.14672610 - time (sec): 9.08 - samples/sec: 3447.45 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:57:51,446 epoch 2 - iter 219/738 - loss 0.14523852 - time (sec): 13.79 - samples/sec: 3442.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:57:56,392 epoch 2 - iter 292/738 - loss 0.14107499 - time (sec): 18.73 - samples/sec: 3369.54 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:58:01,243 epoch 2 - iter 365/738 - loss 0.14216356 - time (sec): 23.58 - samples/sec: 3333.33 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:58:06,232 epoch 2 - iter 438/738 - loss 0.13781464 - time (sec): 28.57 - samples/sec: 3340.70 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:58:11,636 epoch 2 - iter 511/738 - loss 0.13577481 - time (sec): 33.98 - samples/sec: 3350.22 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:58:16,471 epoch 2 - iter 584/738 - loss 0.13005589 - time (sec): 38.81 - samples/sec: 3352.85 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:58:21,461 epoch 2 - iter 657/738 - loss 0.12956762 - time (sec): 43.80 - samples/sec: 3362.65 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:58:26,771 epoch 2 - iter 730/738 - loss 0.12951791 - time (sec): 49.11 - samples/sec: 3353.97 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:58:27,274 ----------------------------------------------------------------------------------------------------
2023-10-13 15:58:27,274 EPOCH 2 done: loss 0.1294 - lr: 0.000027
2023-10-13 15:58:38,435 DEV : loss 0.11062650382518768 - f1-score (micro avg) 0.7675
2023-10-13 15:58:38,464 saving best model
2023-10-13 15:58:39,090 ----------------------------------------------------------------------------------------------------
2023-10-13 15:58:43,863 epoch 3 - iter 73/738 - loss 0.06065277 - time (sec): 4.77 - samples/sec: 3236.91 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:58:48,700 epoch 3 - iter 146/738 - loss 0.07316894 - time (sec): 9.61 - samples/sec: 3338.82 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:58:54,048 epoch 3 - iter 219/738 - loss 0.07863303 - time (sec): 14.95 - samples/sec: 3233.00 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:58:58,383 epoch 3 - iter 292/738 - loss 0.07743735 - time (sec): 19.29 - samples/sec: 3281.31 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:59:03,909 epoch 3 - iter 365/738 - loss 0.07606154 - time (sec): 24.81 - samples/sec: 3262.96 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:59:09,129 epoch 3 - iter 438/738 - loss 0.07463572 - time (sec): 30.03 - samples/sec: 3309.18 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:59:13,987 epoch 3 - iter 511/738 - loss 0.07177934 - time (sec): 34.89 - samples/sec: 3306.95 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:59:18,942 epoch 3 - iter 584/738 - loss 0.07324880 - time (sec): 39.85 - samples/sec: 3320.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:59:24,146 epoch 3 - iter 657/738 - loss 0.07181062 - time (sec): 45.05 - samples/sec: 3315.65 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:59:28,957 epoch 3 - iter 730/738 - loss 0.07338875 - time (sec): 49.86 - samples/sec: 3305.07 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:59:29,439 ----------------------------------------------------------------------------------------------------
2023-10-13 15:59:29,439 EPOCH 3 done: loss 0.0738 - lr: 0.000023
2023-10-13 15:59:40,672 DEV : loss 0.10864270478487015 - f1-score (micro avg) 0.8175
2023-10-13 15:59:40,703 saving best model
2023-10-13 15:59:41,239 ----------------------------------------------------------------------------------------------------
2023-10-13 15:59:46,028 epoch 4 - iter 73/738 - loss 0.04491633 - time (sec): 4.79 - samples/sec: 3162.59 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:59:50,685 epoch 4 - iter 146/738 - loss 0.04351311 - time (sec): 9.44 - samples/sec: 3264.79 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:59:55,499 epoch 4 - iter 219/738 - loss 0.04436716 - time (sec): 14.26 - samples/sec: 3321.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:00:00,085 epoch 4 - iter 292/738 - loss 0.04559344 - time (sec): 18.84 - samples/sec: 3331.80 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:00:05,124 epoch 4 - iter 365/738 - loss 0.04601732 - time (sec): 23.88 - samples/sec: 3329.15 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:00:10,439 epoch 4 - iter 438/738 - loss 0.04450045 - time (sec): 29.20 - samples/sec: 3316.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:00:16,073 epoch 4 - iter 511/738 - loss 0.04465368 - time (sec): 34.83 - samples/sec: 3316.58 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:00:20,833 epoch 4 - iter 584/738 - loss 0.04598766 - time (sec): 39.59 - samples/sec: 3332.91 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:00:25,931 epoch 4 - iter 657/738 - loss 0.05017371 - time (sec): 44.69 - samples/sec: 3328.90 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:00:30,598 epoch 4 - iter 730/738 - loss 0.04880929 - time (sec): 49.36 - samples/sec: 3338.99 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:00:31,086 ----------------------------------------------------------------------------------------------------
2023-10-13 16:00:31,086 EPOCH 4 done: loss 0.0490 - lr: 0.000020
2023-10-13 16:00:42,240 DEV : loss 0.1474699079990387 - f1-score (micro avg) 0.7874
2023-10-13 16:00:42,273 ----------------------------------------------------------------------------------------------------
2023-10-13 16:00:47,244 epoch 5 - iter 73/738 - loss 0.04483958 - time (sec): 4.97 - samples/sec: 3309.95 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:00:52,261 epoch 5 - iter 146/738 - loss 0.03940692 - time (sec): 9.99 - samples/sec: 3306.44 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:00:57,081 epoch 5 - iter 219/738 - loss 0.03823729 - time (sec): 14.81 - samples/sec: 3372.73 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:01:01,852 epoch 5 - iter 292/738 - loss 0.03570809 - time (sec): 19.58 - samples/sec: 3372.43 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:01:06,660 epoch 5 - iter 365/738 - loss 0.03611074 - time (sec): 24.39 - samples/sec: 3366.09 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:01:11,422 epoch 5 - iter 438/738 - loss 0.03549566 - time (sec): 29.15 - samples/sec: 3345.91 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:01:16,434 epoch 5 - iter 511/738 - loss 0.03481242 - time (sec): 34.16 - samples/sec: 3329.04 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:01:21,988 epoch 5 - iter 584/738 - loss 0.03612795 - time (sec): 39.71 - samples/sec: 3312.86 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:01:27,611 epoch 5 - iter 657/738 - loss 0.03541289 - time (sec): 45.34 - samples/sec: 3304.32 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:01:32,079 epoch 5 - iter 730/738 - loss 0.03603975 - time (sec): 49.80 - samples/sec: 3305.92 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:01:32,629 ----------------------------------------------------------------------------------------------------
2023-10-13 16:01:32,630 EPOCH 5 done: loss 0.0359 - lr: 0.000017
2023-10-13 16:01:43,734 DEV : loss 0.15243035554885864 - f1-score (micro avg) 0.8137
2023-10-13 16:01:43,764 ----------------------------------------------------------------------------------------------------
2023-10-13 16:01:48,291 epoch 6 - iter 73/738 - loss 0.02753601 - time (sec): 4.53 - samples/sec: 3278.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:01:53,834 epoch 6 - iter 146/738 - loss 0.02412278 - time (sec): 10.07 - samples/sec: 3370.68 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:01:58,832 epoch 6 - iter 219/738 - loss 0.02730955 - time (sec): 15.07 - samples/sec: 3372.32 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:02:03,512 epoch 6 - iter 292/738 - loss 0.02914052 - time (sec): 19.75 - samples/sec: 3355.51 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:02:08,994 epoch 6 - iter 365/738 - loss 0.02963576 - time (sec): 25.23 - samples/sec: 3347.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:02:13,940 epoch 6 - iter 438/738 - loss 0.03069200 - time (sec): 30.18 - samples/sec: 3357.87 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:02:18,409 epoch 6 - iter 511/738 - loss 0.02902534 - time (sec): 34.64 - samples/sec: 3367.48 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:02:23,343 epoch 6 - iter 584/738 - loss 0.02744013 - time (sec): 39.58 - samples/sec: 3365.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:02:28,153 epoch 6 - iter 657/738 - loss 0.02726539 - time (sec): 44.39 - samples/sec: 3353.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:02:33,016 epoch 6 - iter 730/738 - loss 0.02719915 - time (sec): 49.25 - samples/sec: 3346.24 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:02:33,486 ----------------------------------------------------------------------------------------------------
2023-10-13 16:02:33,487 EPOCH 6 done: loss 0.0272 - lr: 0.000013
2023-10-13 16:02:44,645 DEV : loss 0.17250441014766693 - f1-score (micro avg) 0.8204
2023-10-13 16:02:44,676 saving best model
2023-10-13 16:02:45,316 ----------------------------------------------------------------------------------------------------
2023-10-13 16:02:50,958 epoch 7 - iter 73/738 - loss 0.01634149 - time (sec): 5.64 - samples/sec: 3018.98 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:02:55,302 epoch 7 - iter 146/738 - loss 0.01514888 - time (sec): 9.98 - samples/sec: 3189.06 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:03:00,715 epoch 7 - iter 219/738 - loss 0.01838780 - time (sec): 15.40 - samples/sec: 3270.41 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:03:06,168 epoch 7 - iter 292/738 - loss 0.01756859 - time (sec): 20.85 - samples/sec: 3303.19 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:03:10,622 epoch 7 - iter 365/738 - loss 0.01866420 - time (sec): 25.30 - samples/sec: 3307.34 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:03:15,201 epoch 7 - iter 438/738 - loss 0.01851098 - time (sec): 29.88 - samples/sec: 3315.07 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:03:19,804 epoch 7 - iter 511/738 - loss 0.01926756 - time (sec): 34.48 - samples/sec: 3336.22 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:03:24,445 epoch 7 - iter 584/738 - loss 0.01977881 - time (sec): 39.13 - samples/sec: 3335.09 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:03:29,619 epoch 7 - iter 657/738 - loss 0.01907489 - time (sec): 44.30 - samples/sec: 3305.95 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:03:35,450 epoch 7 - iter 730/738 - loss 0.01894560 - time (sec): 50.13 - samples/sec: 3288.28 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:03:35,947 ----------------------------------------------------------------------------------------------------
2023-10-13 16:03:35,948 EPOCH 7 done: loss 0.0190 - lr: 0.000010
2023-10-13 16:03:47,105 DEV : loss 0.20412878692150116 - f1-score (micro avg) 0.8154
2023-10-13 16:03:47,134 ----------------------------------------------------------------------------------------------------
2023-10-13 16:03:52,161 epoch 8 - iter 73/738 - loss 0.01344325 - time (sec): 5.03 - samples/sec: 3320.25 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:03:57,698 epoch 8 - iter 146/738 - loss 0.01748030 - time (sec): 10.56 - samples/sec: 3185.06 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:04:04,280 epoch 8 - iter 219/738 - loss 0.01866343 - time (sec): 17.14 - samples/sec: 3081.13 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:04:09,445 epoch 8 - iter 292/738 - loss 0.01990917 - time (sec): 22.31 - samples/sec: 3033.63 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:04:14,015 epoch 8 - iter 365/738 - loss 0.01831245 - time (sec): 26.88 - samples/sec: 3065.14 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:04:18,968 epoch 8 - iter 438/738 - loss 0.01923268 - time (sec): 31.83 - samples/sec: 3095.99 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:04:23,888 epoch 8 - iter 511/738 - loss 0.01795623 - time (sec): 36.75 - samples/sec: 3121.73 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:04:28,303 epoch 8 - iter 584/738 - loss 0.01764510 - time (sec): 41.17 - samples/sec: 3142.12 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:04:33,109 epoch 8 - iter 657/738 - loss 0.01668878 - time (sec): 45.97 - samples/sec: 3159.38 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:04:38,749 epoch 8 - iter 730/738 - loss 0.01600711 - time (sec): 51.61 - samples/sec: 3191.33 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:04:39,252 ----------------------------------------------------------------------------------------------------
2023-10-13 16:04:39,252 EPOCH 8 done: loss 0.0158 - lr: 0.000007
2023-10-13 16:04:50,416 DEV : loss 0.19149629771709442 - f1-score (micro avg) 0.826
2023-10-13 16:04:50,445 saving best model
2023-10-13 16:04:50,981 ----------------------------------------------------------------------------------------------------
2023-10-13 16:04:55,805 epoch 9 - iter 73/738 - loss 0.00698935 - time (sec): 4.82 - samples/sec: 3214.71 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:05:00,887 epoch 9 - iter 146/738 - loss 0.00952456 - time (sec): 9.91 - samples/sec: 3262.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:05:05,508 epoch 9 - iter 219/738 - loss 0.00949156 - time (sec): 14.53 - samples/sec: 3325.48 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:05:10,381 epoch 9 - iter 292/738 - loss 0.01022930 - time (sec): 19.40 - samples/sec: 3341.20 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:05:15,595 epoch 9 - iter 365/738 - loss 0.01120093 - time (sec): 24.61 - samples/sec: 3363.39 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:05:20,184 epoch 9 - iter 438/738 - loss 0.01043318 - time (sec): 29.20 - samples/sec: 3365.10 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:05:25,342 epoch 9 - iter 511/738 - loss 0.01017895 - time (sec): 34.36 - samples/sec: 3352.34 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:05:29,969 epoch 9 - iter 584/738 - loss 0.01017742 - time (sec): 38.99 - samples/sec: 3339.96 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:05:34,734 epoch 9 - iter 657/738 - loss 0.00985239 - time (sec): 43.75 - samples/sec: 3352.03 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:05:40,086 epoch 9 - iter 730/738 - loss 0.01123444 - time (sec): 49.10 - samples/sec: 3352.47 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:05:40,697 ----------------------------------------------------------------------------------------------------
2023-10-13 16:05:40,698 EPOCH 9 done: loss 0.0117 - lr: 0.000003
2023-10-13 16:05:51,786 DEV : loss 0.19480924308300018 - f1-score (micro avg) 0.8267
2023-10-13 16:05:51,816 saving best model
2023-10-13 16:05:52,417 ----------------------------------------------------------------------------------------------------
2023-10-13 16:05:57,660 epoch 10 - iter 73/738 - loss 0.01186703 - time (sec): 5.24 - samples/sec: 3361.08 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:06:02,305 epoch 10 - iter 146/738 - loss 0.00836274 - time (sec): 9.89 - samples/sec: 3390.77 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:06:06,585 epoch 10 - iter 219/738 - loss 0.01000260 - time (sec): 14.17 - samples/sec: 3456.57 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:06:11,468 epoch 10 - iter 292/738 - loss 0.00903905 - time (sec): 19.05 - samples/sec: 3427.99 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:06:16,389 epoch 10 - iter 365/738 - loss 0.00871698 - time (sec): 23.97 - samples/sec: 3393.43 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:06:21,868 epoch 10 - iter 438/738 - loss 0.00896638 - time (sec): 29.45 - samples/sec: 3398.02 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:06:26,485 epoch 10 - iter 511/738 - loss 0.00840529 - time (sec): 34.07 - samples/sec: 3368.60 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:06:32,017 epoch 10 - iter 584/738 - loss 0.00852755 - time (sec): 39.60 - samples/sec: 3327.66 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:06:37,311 epoch 10 - iter 657/738 - loss 0.00884580 - time (sec): 44.89 - samples/sec: 3302.90 - lr: 0.000000 - momentum: 0.000000
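The lr values logged above trace a triangular schedule: linear warmup over the first 10% of steps (`warmup_fraction: '0.1'`, i.e. exactly the first of the ten 738-batch epochs) up to the peak of 3e-05, then linear decay to zero. A sketch that reproduces the logged values, assuming the standard linear warmup/decay shape (the `LinearScheduler` internals are not shown in the log, so this is a reconstruction from its printed lr column):

```python
# Reconstruct the lr column of the log: warmup_fraction 0.1, peak lr 3e-05,
# 738 batches/epoch x 10 epochs = 7380 optimizer steps (all from the log above).
PEAK_LR, TOTAL_STEPS, WARMUP_FRACTION = 3e-05, 7380, 0.1
warmup_steps = int(TOTAL_STEPS * WARMUP_FRACTION)  # 738: the whole first epoch

def lr_at(step):
    if step < warmup_steps:  # linear warmup from 0 to the peak lr
        return PEAK_LR * step / warmup_steps
    # linear decay from the peak back down to zero at the final step
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - warmup_steps)

print(round(lr_at(73), 6))    # 3e-06  -> logged as 0.000003 (epoch 1, iter 73/738)
print(round(lr_at(738), 6))   # 3e-05  -> logged as 0.000030 (end of epoch 1)
print(round(lr_at(7380), 6))  # 0.0    -> logged as 0.000000 (end of training)
```

The momentum column stays at 0.000000 throughout because fine-tuning here uses AdamW-style optimization rather than SGD with momentum.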
2023-10-13 16:06:42,635 epoch 10 - iter 730/738 - loss 0.00832851 - time (sec): 50.22 - samples/sec: 3284.25 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:06:43,054 ----------------------------------------------------------------------------------------------------
2023-10-13 16:06:43,054 EPOCH 10 done: loss 0.0083 - lr: 0.000000
2023-10-13 16:06:54,526 DEV : loss 0.19598710536956787 - f1-score (micro avg) 0.8274
2023-10-13 16:06:54,557 saving best model
2023-10-13 16:06:55,518 ----------------------------------------------------------------------------------------------------
2023-10-13 16:06:55,520 Loading model from best epoch ...
2023-10-13 16:06:57,087 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 16:07:03,859 Results:
- F-score (micro) 0.7929
- F-score (macro) 0.6926
- Accuracy 0.6785

By class:
              precision    recall  f1-score   support

         loc     0.8673    0.8683    0.8678       858
        pers     0.7402    0.8119    0.7744       537
         org     0.5067    0.5758    0.5390       132
        prod     0.6885    0.6885    0.6885        61
        time     0.5469    0.6481    0.5932        54

   micro avg     0.7742    0.8124    0.7929      1642
   macro avg     0.6699    0.7185    0.6926      1642
weighted avg     0.7796    0.8124    0.7951      1642
2023-10-13 16:07:03,859 ----------------------------------------------------------------------------------------------------
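The aggregate rows of the final table can be sanity-checked from the per-class values. Macro avg is the unweighted mean of per-class f1, weighted avg is the support-weighted mean; micro avg cannot be recomputed from rounded per-class scores alone (it needs raw TP/FP/FN counts), so only the other two are checked here:

```python
# (precision, recall, f1-score, support) per class, copied from the table above.
table = {
    "loc":  (0.8673, 0.8683, 0.8678, 858),
    "pers": (0.7402, 0.8119, 0.7744, 537),
    "org":  (0.5067, 0.5758, 0.5390, 132),
    "prod": (0.6885, 0.6885, 0.6885, 61),
    "time": (0.5469, 0.6481, 0.5932, 54),
}
total_support = sum(s for *_, s in table.values())  # 1642, as in the table
macro_f1 = sum(f1 for _, _, f1, _ in table.values()) / len(table)
weighted_f1 = sum(f1 * s for _, _, f1, s in table.values()) / total_support

print(round(macro_f1, 4))     # 0.6926, matching the macro avg row
print(round(weighted_f1, 4))  # 0.7951, matching the weighted avg row
```

The gap between micro f1 (0.7929) and macro f1 (0.6926) reflects the class imbalance: the frequent loc/pers classes score well, while the small org and time classes drag the unweighted mean down.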