2023-10-17 10:57:18,983 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,984 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 10:57:18,984 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,984 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Train: 966 sentences
2023-10-17 10:57:18,985 (train_with_dev=False, train_with_test=False)
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Training Params:
2023-10-17 10:57:18,985 - learning_rate: "3e-05"
2023-10-17 10:57:18,985 - mini_batch_size: "8"
2023-10-17 10:57:18,985 - max_epochs: "10"
2023-10-17 10:57:18,985 - shuffle: "True"
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Plugins:
2023-10-17 10:57:18,985 - TensorboardLogger
2023-10-17 10:57:18,985 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 10:57:18,985 - metric: "('micro avg', 'f1-score')"
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Computation:
2023-10-17 10:57:18,985 - compute on device: cuda:0
2023-10-17 10:57:18,985 - embedding storage: none
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Model training base path: "hmbench-ajmc/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 ----------------------------------------------------------------------------------------------------
2023-10-17 10:57:18,985 Logging anything other than scalars to TensorBoard is
currently not supported. 2023-10-17 10:57:19,714 epoch 1 - iter 12/121 - loss 4.11176953 - time (sec): 0.73 - samples/sec: 3310.46 - lr: 0.000003 - momentum: 0.000000 2023-10-17 10:57:20,438 epoch 1 - iter 24/121 - loss 3.76005368 - time (sec): 1.45 - samples/sec: 3272.87 - lr: 0.000006 - momentum: 0.000000 2023-10-17 10:57:21,175 epoch 1 - iter 36/121 - loss 3.10341988 - time (sec): 2.19 - samples/sec: 3232.18 - lr: 0.000009 - momentum: 0.000000 2023-10-17 10:57:21,939 epoch 1 - iter 48/121 - loss 2.54022751 - time (sec): 2.95 - samples/sec: 3306.53 - lr: 0.000012 - momentum: 0.000000 2023-10-17 10:57:22,623 epoch 1 - iter 60/121 - loss 2.20458174 - time (sec): 3.64 - samples/sec: 3312.27 - lr: 0.000015 - momentum: 0.000000 2023-10-17 10:57:23,405 epoch 1 - iter 72/121 - loss 1.91912829 - time (sec): 4.42 - samples/sec: 3309.16 - lr: 0.000018 - momentum: 0.000000 2023-10-17 10:57:24,148 epoch 1 - iter 84/121 - loss 1.68059317 - time (sec): 5.16 - samples/sec: 3353.48 - lr: 0.000021 - momentum: 0.000000 2023-10-17 10:57:24,937 epoch 1 - iter 96/121 - loss 1.49481711 - time (sec): 5.95 - samples/sec: 3380.76 - lr: 0.000024 - momentum: 0.000000 2023-10-17 10:57:25,639 epoch 1 - iter 108/121 - loss 1.38776546 - time (sec): 6.65 - samples/sec: 3359.76 - lr: 0.000027 - momentum: 0.000000 2023-10-17 10:57:26,354 epoch 1 - iter 120/121 - loss 1.28622241 - time (sec): 7.37 - samples/sec: 3340.58 - lr: 0.000030 - momentum: 0.000000 2023-10-17 10:57:26,401 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:26,401 EPOCH 1 done: loss 1.2827 - lr: 0.000030 2023-10-17 10:57:27,280 DEV : loss 0.2441699206829071 - f1-score (micro avg) 0.5721 2023-10-17 10:57:27,288 saving best model 2023-10-17 10:57:27,722 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:28,445 epoch 2 - iter 12/121 - loss 0.20649691 - time (sec): 0.72 - samples/sec: 3182.29 - lr: 
0.000030 - momentum: 0.000000 2023-10-17 10:57:29,175 epoch 2 - iter 24/121 - loss 0.22995980 - time (sec): 1.45 - samples/sec: 3218.68 - lr: 0.000029 - momentum: 0.000000 2023-10-17 10:57:29,928 epoch 2 - iter 36/121 - loss 0.23059976 - time (sec): 2.20 - samples/sec: 3223.00 - lr: 0.000029 - momentum: 0.000000 2023-10-17 10:57:30,663 epoch 2 - iter 48/121 - loss 0.22267191 - time (sec): 2.94 - samples/sec: 3238.25 - lr: 0.000029 - momentum: 0.000000 2023-10-17 10:57:31,424 epoch 2 - iter 60/121 - loss 0.21515131 - time (sec): 3.70 - samples/sec: 3306.79 - lr: 0.000028 - momentum: 0.000000 2023-10-17 10:57:32,133 epoch 2 - iter 72/121 - loss 0.21250351 - time (sec): 4.41 - samples/sec: 3298.44 - lr: 0.000028 - momentum: 0.000000 2023-10-17 10:57:32,892 epoch 2 - iter 84/121 - loss 0.20931862 - time (sec): 5.17 - samples/sec: 3316.16 - lr: 0.000028 - momentum: 0.000000 2023-10-17 10:57:33,676 epoch 2 - iter 96/121 - loss 0.20235956 - time (sec): 5.95 - samples/sec: 3306.08 - lr: 0.000027 - momentum: 0.000000 2023-10-17 10:57:34,437 epoch 2 - iter 108/121 - loss 0.20444909 - time (sec): 6.71 - samples/sec: 3296.50 - lr: 0.000027 - momentum: 0.000000 2023-10-17 10:57:35,196 epoch 2 - iter 120/121 - loss 0.19934455 - time (sec): 7.47 - samples/sec: 3287.52 - lr: 0.000027 - momentum: 0.000000 2023-10-17 10:57:35,266 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:35,266 EPOCH 2 done: loss 0.1981 - lr: 0.000027 2023-10-17 10:57:36,024 DEV : loss 0.13820908963680267 - f1-score (micro avg) 0.8161 2023-10-17 10:57:36,030 saving best model 2023-10-17 10:57:36,558 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:37,315 epoch 3 - iter 12/121 - loss 0.10671627 - time (sec): 0.75 - samples/sec: 3199.00 - lr: 0.000026 - momentum: 0.000000 2023-10-17 10:57:38,087 epoch 3 - iter 24/121 - loss 0.10895843 - time (sec): 1.53 - samples/sec: 3252.71 
- lr: 0.000026 - momentum: 0.000000 2023-10-17 10:57:38,888 epoch 3 - iter 36/121 - loss 0.09820473 - time (sec): 2.33 - samples/sec: 3307.37 - lr: 0.000026 - momentum: 0.000000 2023-10-17 10:57:39,619 epoch 3 - iter 48/121 - loss 0.09907103 - time (sec): 3.06 - samples/sec: 3313.97 - lr: 0.000025 - momentum: 0.000000 2023-10-17 10:57:40,388 epoch 3 - iter 60/121 - loss 0.10159016 - time (sec): 3.83 - samples/sec: 3279.99 - lr: 0.000025 - momentum: 0.000000 2023-10-17 10:57:41,125 epoch 3 - iter 72/121 - loss 0.10049162 - time (sec): 4.56 - samples/sec: 3290.90 - lr: 0.000025 - momentum: 0.000000 2023-10-17 10:57:41,905 epoch 3 - iter 84/121 - loss 0.10314644 - time (sec): 5.34 - samples/sec: 3293.60 - lr: 0.000024 - momentum: 0.000000 2023-10-17 10:57:42,586 epoch 3 - iter 96/121 - loss 0.10316924 - time (sec): 6.03 - samples/sec: 3247.80 - lr: 0.000024 - momentum: 0.000000 2023-10-17 10:57:43,425 epoch 3 - iter 108/121 - loss 0.10749021 - time (sec): 6.86 - samples/sec: 3250.57 - lr: 0.000024 - momentum: 0.000000 2023-10-17 10:57:44,135 epoch 3 - iter 120/121 - loss 0.10975500 - time (sec): 7.57 - samples/sec: 3241.35 - lr: 0.000023 - momentum: 0.000000 2023-10-17 10:57:44,190 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:44,190 EPOCH 3 done: loss 0.1105 - lr: 0.000023 2023-10-17 10:57:44,981 DEV : loss 0.1287086457014084 - f1-score (micro avg) 0.7931 2023-10-17 10:57:44,987 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:45,701 epoch 4 - iter 12/121 - loss 0.09747709 - time (sec): 0.71 - samples/sec: 2907.36 - lr: 0.000023 - momentum: 0.000000 2023-10-17 10:57:46,425 epoch 4 - iter 24/121 - loss 0.09056427 - time (sec): 1.44 - samples/sec: 3172.85 - lr: 0.000023 - momentum: 0.000000 2023-10-17 10:57:47,181 epoch 4 - iter 36/121 - loss 0.07817966 - time (sec): 2.19 - samples/sec: 3174.55 - lr: 0.000022 - momentum: 0.000000 
2023-10-17 10:57:47,980 epoch 4 - iter 48/121 - loss 0.07983687 - time (sec): 2.99 - samples/sec: 3194.07 - lr: 0.000022 - momentum: 0.000000 2023-10-17 10:57:48,746 epoch 4 - iter 60/121 - loss 0.07961880 - time (sec): 3.76 - samples/sec: 3252.18 - lr: 0.000022 - momentum: 0.000000 2023-10-17 10:57:49,457 epoch 4 - iter 72/121 - loss 0.08005860 - time (sec): 4.47 - samples/sec: 3224.86 - lr: 0.000021 - momentum: 0.000000 2023-10-17 10:57:50,244 epoch 4 - iter 84/121 - loss 0.08381911 - time (sec): 5.26 - samples/sec: 3199.52 - lr: 0.000021 - momentum: 0.000000 2023-10-17 10:57:51,052 epoch 4 - iter 96/121 - loss 0.07951076 - time (sec): 6.06 - samples/sec: 3191.68 - lr: 0.000021 - momentum: 0.000000 2023-10-17 10:57:51,880 epoch 4 - iter 108/121 - loss 0.08050723 - time (sec): 6.89 - samples/sec: 3178.37 - lr: 0.000020 - momentum: 0.000000 2023-10-17 10:57:52,607 epoch 4 - iter 120/121 - loss 0.07858401 - time (sec): 7.62 - samples/sec: 3232.92 - lr: 0.000020 - momentum: 0.000000 2023-10-17 10:57:52,656 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:52,656 EPOCH 4 done: loss 0.0783 - lr: 0.000020 2023-10-17 10:57:53,416 DEV : loss 0.1368216872215271 - f1-score (micro avg) 0.8428 2023-10-17 10:57:53,421 saving best model 2023-10-17 10:57:53,955 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:57:54,692 epoch 5 - iter 12/121 - loss 0.05030366 - time (sec): 0.73 - samples/sec: 2978.83 - lr: 0.000020 - momentum: 0.000000 2023-10-17 10:57:55,441 epoch 5 - iter 24/121 - loss 0.05842618 - time (sec): 1.48 - samples/sec: 3004.98 - lr: 0.000019 - momentum: 0.000000 2023-10-17 10:57:56,202 epoch 5 - iter 36/121 - loss 0.05402638 - time (sec): 2.24 - samples/sec: 3143.73 - lr: 0.000019 - momentum: 0.000000 2023-10-17 10:57:56,977 epoch 5 - iter 48/121 - loss 0.05579888 - time (sec): 3.01 - samples/sec: 3107.83 - lr: 0.000019 - momentum: 
0.000000 2023-10-17 10:57:57,688 epoch 5 - iter 60/121 - loss 0.05433023 - time (sec): 3.73 - samples/sec: 3159.77 - lr: 0.000018 - momentum: 0.000000 2023-10-17 10:57:58,487 epoch 5 - iter 72/121 - loss 0.05588435 - time (sec): 4.52 - samples/sec: 3222.11 - lr: 0.000018 - momentum: 0.000000 2023-10-17 10:57:59,183 epoch 5 - iter 84/121 - loss 0.06059224 - time (sec): 5.22 - samples/sec: 3228.21 - lr: 0.000018 - momentum: 0.000000 2023-10-17 10:57:59,937 epoch 5 - iter 96/121 - loss 0.05994432 - time (sec): 5.98 - samples/sec: 3237.32 - lr: 0.000017 - momentum: 0.000000 2023-10-17 10:58:00,703 epoch 5 - iter 108/121 - loss 0.05885631 - time (sec): 6.74 - samples/sec: 3244.65 - lr: 0.000017 - momentum: 0.000000 2023-10-17 10:58:01,507 epoch 5 - iter 120/121 - loss 0.05832928 - time (sec): 7.54 - samples/sec: 3267.73 - lr: 0.000017 - momentum: 0.000000 2023-10-17 10:58:01,553 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:01,553 EPOCH 5 done: loss 0.0585 - lr: 0.000017 2023-10-17 10:58:02,302 DEV : loss 0.16064168512821198 - f1-score (micro avg) 0.8162 2023-10-17 10:58:02,307 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:03,052 epoch 6 - iter 12/121 - loss 0.04426736 - time (sec): 0.74 - samples/sec: 3294.18 - lr: 0.000016 - momentum: 0.000000 2023-10-17 10:58:03,802 epoch 6 - iter 24/121 - loss 0.05380387 - time (sec): 1.49 - samples/sec: 3310.51 - lr: 0.000016 - momentum: 0.000000 2023-10-17 10:58:04,550 epoch 6 - iter 36/121 - loss 0.05088120 - time (sec): 2.24 - samples/sec: 3350.87 - lr: 0.000016 - momentum: 0.000000 2023-10-17 10:58:05,236 epoch 6 - iter 48/121 - loss 0.04892439 - time (sec): 2.93 - samples/sec: 3321.57 - lr: 0.000015 - momentum: 0.000000 2023-10-17 10:58:06,003 epoch 6 - iter 60/121 - loss 0.04484164 - time (sec): 3.70 - samples/sec: 3304.78 - lr: 0.000015 - momentum: 0.000000 2023-10-17 10:58:06,692 
epoch 6 - iter 72/121 - loss 0.04516285 - time (sec): 4.38 - samples/sec: 3254.88 - lr: 0.000015 - momentum: 0.000000 2023-10-17 10:58:07,506 epoch 6 - iter 84/121 - loss 0.04649830 - time (sec): 5.20 - samples/sec: 3282.01 - lr: 0.000014 - momentum: 0.000000 2023-10-17 10:58:08,260 epoch 6 - iter 96/121 - loss 0.04408763 - time (sec): 5.95 - samples/sec: 3282.93 - lr: 0.000014 - momentum: 0.000000 2023-10-17 10:58:09,055 epoch 6 - iter 108/121 - loss 0.04327841 - time (sec): 6.75 - samples/sec: 3250.65 - lr: 0.000014 - momentum: 0.000000 2023-10-17 10:58:09,841 epoch 6 - iter 120/121 - loss 0.04469431 - time (sec): 7.53 - samples/sec: 3261.23 - lr: 0.000013 - momentum: 0.000000 2023-10-17 10:58:09,899 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:09,899 EPOCH 6 done: loss 0.0444 - lr: 0.000013 2023-10-17 10:58:10,661 DEV : loss 0.16509920358657837 - f1-score (micro avg) 0.8256 2023-10-17 10:58:10,666 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:11,437 epoch 7 - iter 12/121 - loss 0.01541501 - time (sec): 0.77 - samples/sec: 3108.65 - lr: 0.000013 - momentum: 0.000000 2023-10-17 10:58:12,150 epoch 7 - iter 24/121 - loss 0.02410835 - time (sec): 1.48 - samples/sec: 3142.06 - lr: 0.000013 - momentum: 0.000000 2023-10-17 10:58:12,860 epoch 7 - iter 36/121 - loss 0.02628934 - time (sec): 2.19 - samples/sec: 3297.76 - lr: 0.000012 - momentum: 0.000000 2023-10-17 10:58:13,611 epoch 7 - iter 48/121 - loss 0.02740439 - time (sec): 2.94 - samples/sec: 3283.05 - lr: 0.000012 - momentum: 0.000000 2023-10-17 10:58:14,402 epoch 7 - iter 60/121 - loss 0.03048279 - time (sec): 3.73 - samples/sec: 3282.57 - lr: 0.000012 - momentum: 0.000000 2023-10-17 10:58:15,133 epoch 7 - iter 72/121 - loss 0.03201688 - time (sec): 4.47 - samples/sec: 3263.94 - lr: 0.000011 - momentum: 0.000000 2023-10-17 10:58:15,878 epoch 7 - iter 84/121 - loss 
0.03254549 - time (sec): 5.21 - samples/sec: 3247.21 - lr: 0.000011 - momentum: 0.000000 2023-10-17 10:58:16,610 epoch 7 - iter 96/121 - loss 0.03124162 - time (sec): 5.94 - samples/sec: 3271.43 - lr: 0.000011 - momentum: 0.000000 2023-10-17 10:58:17,415 epoch 7 - iter 108/121 - loss 0.03286000 - time (sec): 6.75 - samples/sec: 3266.85 - lr: 0.000010 - momentum: 0.000000 2023-10-17 10:58:18,164 epoch 7 - iter 120/121 - loss 0.03208405 - time (sec): 7.50 - samples/sec: 3286.59 - lr: 0.000010 - momentum: 0.000000 2023-10-17 10:58:18,219 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:18,219 EPOCH 7 done: loss 0.0320 - lr: 0.000010 2023-10-17 10:58:18,971 DEV : loss 0.18511991202831268 - f1-score (micro avg) 0.8542 2023-10-17 10:58:18,976 saving best model 2023-10-17 10:58:19,628 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:20,370 epoch 8 - iter 12/121 - loss 0.03326804 - time (sec): 0.74 - samples/sec: 2896.45 - lr: 0.000010 - momentum: 0.000000 2023-10-17 10:58:21,124 epoch 8 - iter 24/121 - loss 0.02660413 - time (sec): 1.49 - samples/sec: 3211.26 - lr: 0.000009 - momentum: 0.000000 2023-10-17 10:58:21,883 epoch 8 - iter 36/121 - loss 0.02728278 - time (sec): 2.25 - samples/sec: 3177.93 - lr: 0.000009 - momentum: 0.000000 2023-10-17 10:58:22,704 epoch 8 - iter 48/121 - loss 0.02521362 - time (sec): 3.07 - samples/sec: 3259.47 - lr: 0.000009 - momentum: 0.000000 2023-10-17 10:58:23,482 epoch 8 - iter 60/121 - loss 0.02360167 - time (sec): 3.85 - samples/sec: 3252.99 - lr: 0.000008 - momentum: 0.000000 2023-10-17 10:58:24,230 epoch 8 - iter 72/121 - loss 0.02285615 - time (sec): 4.60 - samples/sec: 3257.76 - lr: 0.000008 - momentum: 0.000000 2023-10-17 10:58:24,950 epoch 8 - iter 84/121 - loss 0.02475812 - time (sec): 5.32 - samples/sec: 3227.04 - lr: 0.000008 - momentum: 0.000000 2023-10-17 10:58:25,699 epoch 8 - iter 96/121 
- loss 0.02481499 - time (sec): 6.07 - samples/sec: 3234.77 - lr: 0.000008 - momentum: 0.000000 2023-10-17 10:58:26,456 epoch 8 - iter 108/121 - loss 0.02381609 - time (sec): 6.83 - samples/sec: 3260.39 - lr: 0.000007 - momentum: 0.000000 2023-10-17 10:58:27,226 epoch 8 - iter 120/121 - loss 0.02539417 - time (sec): 7.60 - samples/sec: 3239.88 - lr: 0.000007 - momentum: 0.000000 2023-10-17 10:58:27,278 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:27,278 EPOCH 8 done: loss 0.0255 - lr: 0.000007 2023-10-17 10:58:28,030 DEV : loss 0.1753813475370407 - f1-score (micro avg) 0.8515 2023-10-17 10:58:28,035 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:28,780 epoch 9 - iter 12/121 - loss 0.02319572 - time (sec): 0.74 - samples/sec: 3358.61 - lr: 0.000006 - momentum: 0.000000 2023-10-17 10:58:29,544 epoch 9 - iter 24/121 - loss 0.02035831 - time (sec): 1.51 - samples/sec: 3182.04 - lr: 0.000006 - momentum: 0.000000 2023-10-17 10:58:30,299 epoch 9 - iter 36/121 - loss 0.01757965 - time (sec): 2.26 - samples/sec: 3098.64 - lr: 0.000006 - momentum: 0.000000 2023-10-17 10:58:31,049 epoch 9 - iter 48/121 - loss 0.01978687 - time (sec): 3.01 - samples/sec: 3092.72 - lr: 0.000006 - momentum: 0.000000 2023-10-17 10:58:31,770 epoch 9 - iter 60/121 - loss 0.01965078 - time (sec): 3.73 - samples/sec: 3098.63 - lr: 0.000005 - momentum: 0.000000 2023-10-17 10:58:32,538 epoch 9 - iter 72/121 - loss 0.01917681 - time (sec): 4.50 - samples/sec: 3162.47 - lr: 0.000005 - momentum: 0.000000 2023-10-17 10:58:33,313 epoch 9 - iter 84/121 - loss 0.02104908 - time (sec): 5.28 - samples/sec: 3188.02 - lr: 0.000005 - momentum: 0.000000 2023-10-17 10:58:34,078 epoch 9 - iter 96/121 - loss 0.01936059 - time (sec): 6.04 - samples/sec: 3228.86 - lr: 0.000004 - momentum: 0.000000 2023-10-17 10:58:34,903 epoch 9 - iter 108/121 - loss 0.01966485 - time (sec): 
6.87 - samples/sec: 3233.29 - lr: 0.000004 - momentum: 0.000000 2023-10-17 10:58:35,661 epoch 9 - iter 120/121 - loss 0.01996074 - time (sec): 7.62 - samples/sec: 3226.11 - lr: 0.000004 - momentum: 0.000000 2023-10-17 10:58:35,713 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:35,713 EPOCH 9 done: loss 0.0202 - lr: 0.000004 2023-10-17 10:58:36,476 DEV : loss 0.18414919078350067 - f1-score (micro avg) 0.8446 2023-10-17 10:58:36,481 ---------------------------------------------------------------------------------------------------- 2023-10-17 10:58:37,192 epoch 10 - iter 12/121 - loss 0.01422985 - time (sec): 0.71 - samples/sec: 3331.69 - lr: 0.000003 - momentum: 0.000000 2023-10-17 10:58:37,934 epoch 10 - iter 24/121 - loss 0.02050312 - time (sec): 1.45 - samples/sec: 3144.41 - lr: 0.000003 - momentum: 0.000000 2023-10-17 10:58:38,683 epoch 10 - iter 36/121 - loss 0.01665361 - time (sec): 2.20 - samples/sec: 3252.90 - lr: 0.000003 - momentum: 0.000000 2023-10-17 10:58:39,417 epoch 10 - iter 48/121 - loss 0.01607792 - time (sec): 2.93 - samples/sec: 3302.06 - lr: 0.000002 - momentum: 0.000000 2023-10-17 10:58:40,247 epoch 10 - iter 60/121 - loss 0.01888085 - time (sec): 3.76 - samples/sec: 3376.90 - lr: 0.000002 - momentum: 0.000000 2023-10-17 10:58:40,984 epoch 10 - iter 72/121 - loss 0.01759198 - time (sec): 4.50 - samples/sec: 3302.85 - lr: 0.000002 - momentum: 0.000000 2023-10-17 10:58:41,737 epoch 10 - iter 84/121 - loss 0.01703542 - time (sec): 5.26 - samples/sec: 3248.04 - lr: 0.000001 - momentum: 0.000000 2023-10-17 10:58:42,542 epoch 10 - iter 96/121 - loss 0.01766957 - time (sec): 6.06 - samples/sec: 3251.25 - lr: 0.000001 - momentum: 0.000000 2023-10-17 10:58:43,353 epoch 10 - iter 108/121 - loss 0.01802283 - time (sec): 6.87 - samples/sec: 3264.13 - lr: 0.000001 - momentum: 0.000000 2023-10-17 10:58:44,058 epoch 10 - iter 120/121 - loss 0.01776160 - time (sec): 7.58 - samples/sec: 
3243.89 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:58:44,116 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:44,116 EPOCH 10 done: loss 0.0176 - lr: 0.000000
2023-10-17 10:58:44,870 DEV : loss 0.19578655064105988 - f1-score (micro avg) 0.8375
2023-10-17 10:58:45,260 ----------------------------------------------------------------------------------------------------
2023-10-17 10:58:45,261 Loading model from best epoch ...
2023-10-17 10:58:46,626 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 10:58:47,299
Results:
- F-score (micro) 0.8071
- F-score (macro) 0.5559
- Accuracy 0.6974

By class:
              precision    recall  f1-score   support

        pers     0.8194    0.8489    0.8339       139
       scope     0.8626    0.8760    0.8692       129
        work     0.6596    0.7750    0.7126        80
         loc     1.0000    0.2222    0.3636         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7951    0.8194    0.8071       360
   macro avg     0.6683    0.5444    0.5559       360
weighted avg     0.7971    0.8194    0.8009       360

2023-10-17 10:58:47,299 ----------------------------------------------------------------------------------------------------