2023-10-17 08:44:19,028 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,028 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 Train: 1100 sentences
2023-10-17 08:44:19,029         (train_with_dev=False, train_with_test=False)
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 Training Params:
2023-10-17 08:44:19,029  - learning_rate: "5e-05"
2023-10-17 08:44:19,029  - mini_batch_size: "8"
2023-10-17 08:44:19,029  - max_epochs: "10"
2023-10-17 08:44:19,029  - shuffle: "True"
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 Plugins:
2023-10-17 08:44:19,029  - TensorboardLogger
2023-10-17 08:44:19,029  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:44:19,029  - metric: "('micro avg', 'f1-score')"
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 Computation:
2023-10-17 08:44:19,029  - compute on device: cuda:0
2023-10-17 08:44:19,029  - embedding storage: none
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,029 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:19,030 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:44:19,768 epoch 1 - iter 13/138 - loss 4.23689924 - time (sec): 0.74 - samples/sec: 2774.07 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:44:20,495 epoch 1 - iter 26/138 - loss 3.72475472 - time (sec): 1.46 - samples/sec: 2913.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:44:21,251 epoch 1 - iter 39/138 - loss 2.98979238 - time (sec): 2.22 - samples/sec: 2921.26 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:44:21,935 epoch 1 - iter 52/138 - loss 2.54874337 - time (sec): 2.90 - samples/sec: 2897.74 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:44:22,680 epoch 1 - iter 65/138 - loss 2.15097506 - time (sec): 3.65 - samples/sec: 2930.58 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:44:23,413 epoch 1 - iter 78/138 - loss 1.87984736 - time (sec): 4.38 - samples/sec: 2955.37 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:44:24,111 epoch 1 - iter 91/138 - loss 1.68323414 - time (sec): 5.08 - samples/sec: 2946.32 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:44:24,831 epoch 1 - iter 104/138 - loss 1.52893736 - time (sec): 5.80 - samples/sec: 2944.10 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:44:25,580 epoch 1 - iter 117/138 - loss 1.38071873 - time (sec): 6.55 - samples/sec: 2965.22 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:44:26,331 epoch 1 - iter 130/138 - loss 1.27710721 - time (sec): 7.30 - samples/sec: 2942.46 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:44:26,778 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:26,779 EPOCH 1 done: loss 1.2205 - lr: 0.000047
2023-10-17 08:44:27,298 DEV : loss 0.2180010974407196 - f1-score (micro avg) 0.6165
2023-10-17 08:44:27,302 saving best model
2023-10-17 08:44:27,635 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:28,352 epoch 2 - iter 13/138 - loss 0.32735430 - time (sec): 0.72 - samples/sec: 2955.79 - lr: 0.000050 - momentum: 0.000000
2023-10-17 08:44:29,091 epoch 2 - iter 26/138 - loss 0.25572315 - time (sec): 1.45 - samples/sec: 2994.77 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:44:29,834 epoch 2 - iter 39/138 - loss 0.22680767 - time (sec): 2.20 - samples/sec: 3035.05 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:44:30,583 epoch 2 - iter 52/138 - loss 0.21737319 - time (sec): 2.95 - samples/sec: 2983.02 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:44:31,379 epoch 2 - iter 65/138 - loss 0.22002365 - time (sec): 3.74 - samples/sec: 2953.68 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:44:32,138 epoch 2 - iter 78/138 - loss 0.21253262 - time (sec): 4.50 - samples/sec: 2911.39 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:44:32,878 epoch 2 - iter 91/138 - loss 0.20313431 - time (sec): 5.24 - samples/sec: 2935.34 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:44:33,599 epoch 2 - iter 104/138 - loss 0.19280458 - time (sec): 5.96 - samples/sec: 2929.87 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:44:34,321 epoch 2 - iter 117/138 - loss 0.18675924 - time (sec): 6.69 - samples/sec: 2907.25 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:44:35,032 epoch 2 - iter 130/138 - loss 0.18564163 - time (sec): 7.40 - samples/sec: 2925.93 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:44:35,445 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:35,445 EPOCH 2 done: loss 0.1811 - lr: 0.000045
2023-10-17 08:44:36,075 DEV : loss 0.1245567575097084 - f1-score (micro avg) 0.8305
2023-10-17 08:44:36,081 saving best model
2023-10-17 08:44:36,514 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:37,267 epoch 3 - iter 13/138 - loss 0.09322958 - time (sec): 0.75 - samples/sec: 3137.87 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:44:38,045 epoch 3 - iter 26/138 - loss 0.09112179 - time (sec): 1.53 - samples/sec: 3017.03 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:44:38,773 epoch 3 - iter 39/138 - loss 0.08652228 - time (sec): 2.26 - samples/sec: 2969.66 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:44:39,577 epoch 3 - iter 52/138 - loss 0.08449509 - time (sec): 3.06 - samples/sec: 2976.32 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:44:40,288 epoch 3 - iter 65/138 - loss 0.08587226 - time (sec): 3.77 - samples/sec: 2933.58 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:44:40,994 epoch 3 - iter 78/138 - loss 0.08699599 - time (sec): 4.48 - samples/sec: 2945.29 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:44:41,741 epoch 3 - iter 91/138 - loss 0.08855977 - time (sec): 5.22 - samples/sec: 2925.78 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:44:42,515 epoch 3 - iter 104/138 - loss 0.09546851 - time (sec): 6.00 - samples/sec: 2930.81 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:44:43,229 epoch 3 - iter 117/138 - loss 0.09882774 - time (sec): 6.71 - samples/sec: 2931.36 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:44:43,923 epoch 3 - iter 130/138 - loss 0.10085115 - time (sec): 7.41 - samples/sec: 2918.81 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:44:44,378 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:44,378 EPOCH 3 done: loss 0.1007 - lr: 0.000039
2023-10-17 08:44:45,057 DEV : loss 0.12144241482019424 - f1-score (micro avg) 0.8712
2023-10-17 08:44:45,062 saving best model
2023-10-17 08:44:45,506 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:46,236 epoch 4 - iter 13/138 - loss 0.04907389 - time (sec): 0.73 - samples/sec: 2851.54 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:44:46,974 epoch 4 - iter 26/138 - loss 0.05122455 - time (sec): 1.46 - samples/sec: 2921.88 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:44:47,691 epoch 4 - iter 39/138 - loss 0.04522759 - time (sec): 2.18 - samples/sec: 2934.94 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:44:48,379 epoch 4 - iter 52/138 - loss 0.05104925 - time (sec): 2.87 - samples/sec: 2931.29 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:44:49,118 epoch 4 - iter 65/138 - loss 0.05515405 - time (sec): 3.61 - samples/sec: 2912.28 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:44:49,867 epoch 4 - iter 78/138 - loss 0.05772425 - time (sec): 4.36 - samples/sec: 2916.87 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:44:50,808 epoch 4 - iter 91/138 - loss 0.06243765 - time (sec): 5.30 - samples/sec: 2776.13 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:44:51,582 epoch 4 - iter 104/138 - loss 0.06726186 - time (sec): 6.07 - samples/sec: 2789.80 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:44:52,345 epoch 4 - iter 117/138 - loss 0.07264164 - time (sec): 6.83 - samples/sec: 2814.01 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:44:53,114 epoch 4 - iter 130/138 - loss 0.07138648 - time (sec): 7.60 - samples/sec: 2816.62 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:44:53,545 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:53,546 EPOCH 4 done: loss 0.0710 - lr: 0.000034
2023-10-17 08:44:54,237 DEV : loss 0.14281411468982697 - f1-score (micro avg) 0.862
2023-10-17 08:44:54,241 ----------------------------------------------------------------------------------------------------
2023-10-17 08:44:54,974 epoch 5 - iter 13/138 - loss 0.08915125 - time (sec): 0.73 - samples/sec: 3058.74 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:44:55,741 epoch 5 - iter 26/138 - loss 0.08538151 - time (sec): 1.50 - samples/sec: 2962.06 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:44:56,460 epoch 5 - iter 39/138 - loss 0.07451282 - time (sec): 2.22 - samples/sec: 2981.90 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:44:57,151 epoch 5 - iter 52/138 - loss 0.07480996 - time (sec): 2.91 - samples/sec: 2949.67 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:44:57,881 epoch 5 - iter 65/138 - loss 0.07246926 - time (sec): 3.64 - samples/sec: 2991.74 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:44:58,615 epoch 5 - iter 78/138 - loss 0.07599919 - time (sec): 4.37 - samples/sec: 2985.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:44:59,411 epoch 5 - iter 91/138 - loss 0.06939059 - time (sec): 5.17 - samples/sec: 2936.72 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:45:00,192 epoch 5 - iter 104/138 - loss 0.06497114 - time (sec): 5.95 - samples/sec: 2929.39 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:45:00,968 epoch 5 - iter 117/138 - loss 0.06053147 - time (sec): 6.73 - samples/sec: 2901.84 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:45:01,711 epoch 5 - iter 130/138 - loss 0.05852661 - time (sec): 7.47 - samples/sec: 2896.28 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:45:02,146 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:02,147 EPOCH 5 done: loss 0.0586 - lr: 0.000028
2023-10-17 08:45:02,908 DEV : loss 0.1629737764596939 - f1-score (micro avg) 0.8708
2023-10-17 08:45:02,913 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:03,685 epoch 6 - iter 13/138 - loss 0.04102128 - time (sec): 0.77 - samples/sec: 2812.89 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:45:04,460 epoch 6 - iter 26/138 - loss 0.04110212 - time (sec): 1.55 - samples/sec: 2806.20 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:45:05,203 epoch 6 - iter 39/138 - loss 0.05651555 - time (sec): 2.29 - samples/sec: 2772.29 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:45:05,948 epoch 6 - iter 52/138 - loss 0.06576880 - time (sec): 3.03 - samples/sec: 2757.34 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:45:06,784 epoch 6 - iter 65/138 - loss 0.06388934 - time (sec): 3.87 - samples/sec: 2730.08 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:45:07,534 epoch 6 - iter 78/138 - loss 0.06510905 - time (sec): 4.62 - samples/sec: 2741.94 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:45:08,290 epoch 6 - iter 91/138 - loss 0.06051985 - time (sec): 5.38 - samples/sec: 2776.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:45:09,031 epoch 6 - iter 104/138 - loss 0.05874561 - time (sec): 6.12 - samples/sec: 2790.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:45:09,785 epoch 6 - iter 117/138 - loss 0.05433949 - time (sec): 6.87 - samples/sec: 2798.80 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:45:10,511 epoch 6 - iter 130/138 - loss 0.05085432 - time (sec): 7.60 - samples/sec: 2812.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:45:10,968 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:10,968 EPOCH 6 done: loss 0.0483 - lr: 0.000023
2023-10-17 08:45:11,693 DEV : loss 0.170999675989151 - f1-score (micro avg) 0.8633
2023-10-17 08:45:11,698 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:12,423 epoch 7 - iter 13/138 - loss 0.03378651 - time (sec): 0.72 - samples/sec: 3015.17 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:45:13,108 epoch 7 - iter 26/138 - loss 0.02358591 - time (sec): 1.41 - samples/sec: 3105.48 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:45:13,838 epoch 7 - iter 39/138 - loss 0.02146202 - time (sec): 2.14 - samples/sec: 2952.00 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:45:14,582 epoch 7 - iter 52/138 - loss 0.02877528 - time (sec): 2.88 - samples/sec: 2903.41 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:45:15,433 epoch 7 - iter 65/138 - loss 0.03053761 - time (sec): 3.73 - samples/sec: 2887.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:45:16,172 epoch 7 - iter 78/138 - loss 0.03095385 - time (sec): 4.47 - samples/sec: 2900.40 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:45:16,946 epoch 7 - iter 91/138 - loss 0.03425468 - time (sec): 5.25 - samples/sec: 2880.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:45:17,724 epoch 7 - iter 104/138 - loss 0.03297833 - time (sec): 6.02 - samples/sec: 2879.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:45:18,528 epoch 7 - iter 117/138 - loss 0.03274128 - time (sec): 6.83 - samples/sec: 2851.13 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:45:19,270 epoch 7 - iter 130/138 - loss 0.02995608 - time (sec): 7.57 - samples/sec: 2854.86 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:45:19,722 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:19,722 EPOCH 7 done: loss 0.0314 - lr: 0.000017
2023-10-17 08:45:20,357 DEV : loss 0.17921938002109528 - f1-score (micro avg) 0.87
2023-10-17 08:45:20,361 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:21,112 epoch 8 - iter 13/138 - loss 0.02216531 - time (sec): 0.75 - samples/sec: 2828.84 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:45:21,865 epoch 8 - iter 26/138 - loss 0.02828258 - time (sec): 1.50 - samples/sec: 2755.00 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:45:22,673 epoch 8 - iter 39/138 - loss 0.02250268 - time (sec): 2.31 - samples/sec: 2815.84 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:45:23,452 epoch 8 - iter 52/138 - loss 0.02828756 - time (sec): 3.09 - samples/sec: 2851.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:45:24,194 epoch 8 - iter 65/138 - loss 0.02593428 - time (sec): 3.83 - samples/sec: 2859.14 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:45:24,899 epoch 8 - iter 78/138 - loss 0.02247652 - time (sec): 4.54 - samples/sec: 2816.39 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:45:25,713 epoch 8 - iter 91/138 - loss 0.02261621 - time (sec): 5.35 - samples/sec: 2810.12 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:45:26,434 epoch 8 - iter 104/138 - loss 0.02328122 - time (sec): 6.07 - samples/sec: 2832.31 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:45:27,199 epoch 8 - iter 117/138 - loss 0.02375440 - time (sec): 6.84 - samples/sec: 2830.24 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:45:27,931 epoch 8 - iter 130/138 - loss 0.02207057 - time (sec): 7.57 - samples/sec: 2852.00 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:45:28,389 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:28,389 EPOCH 8 done: loss 0.0274 - lr: 0.000012
2023-10-17 08:45:29,033 DEV : loss 0.17290453612804413 - f1-score (micro avg) 0.8766
2023-10-17 08:45:29,038 saving best model
2023-10-17 08:45:29,506 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:30,198 epoch 9 - iter 13/138 - loss 0.01479278 - time (sec): 0.69 - samples/sec: 3106.28 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:45:30,964 epoch 9 - iter 26/138 - loss 0.02037863 - time (sec): 1.46 - samples/sec: 3100.06 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:45:31,702 epoch 9 - iter 39/138 - loss 0.01773260 - time (sec): 2.19 - samples/sec: 3000.70 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:45:32,385 epoch 9 - iter 52/138 - loss 0.01953734 - time (sec): 2.88 - samples/sec: 2969.58 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:45:33,085 epoch 9 - iter 65/138 - loss 0.01974078 - time (sec): 3.58 - samples/sec: 3023.44 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:45:33,826 epoch 9 - iter 78/138 - loss 0.01741408 - time (sec): 4.32 - samples/sec: 3019.24 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:45:34,528 epoch 9 - iter 91/138 - loss 0.02365520 - time (sec): 5.02 - samples/sec: 2999.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:45:35,271 epoch 9 - iter 104/138 - loss 0.02358021 - time (sec): 5.76 - samples/sec: 2981.94 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:45:35,978 epoch 9 - iter 117/138 - loss 0.02217472 - time (sec): 6.47 - samples/sec: 2980.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:45:36,677 epoch 9 - iter 130/138 - loss 0.02094984 - time (sec): 7.17 - samples/sec: 2990.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:45:37,134 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:37,134 EPOCH 9 done: loss 0.0202 - lr: 0.000006
2023-10-17 08:45:37,790 DEV : loss 0.1823827624320984 - f1-score (micro avg) 0.872
2023-10-17 08:45:37,795 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:38,585 epoch 10 - iter 13/138 - loss 0.08784586 - time (sec): 0.79 - samples/sec: 3159.87 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:45:39,345 epoch 10 - iter 26/138 - loss 0.04966499 - time (sec): 1.55 - samples/sec: 3082.74 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:45:40,068 epoch 10 - iter 39/138 - loss 0.03598407 - time (sec): 2.27 - samples/sec: 3108.91 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:45:40,765 epoch 10 - iter 52/138 - loss 0.02807500 - time (sec): 2.97 - samples/sec: 3070.41 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:45:41,488 epoch 10 - iter 65/138 - loss 0.02399162 - time (sec): 3.69 - samples/sec: 3046.98 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:45:42,166 epoch 10 - iter 78/138 - loss 0.02165534 - time (sec): 4.37 - samples/sec: 3027.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:45:42,927 epoch 10 - iter 91/138 - loss 0.01912312 - time (sec): 5.13 - samples/sec: 2985.88 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:45:43,591 epoch 10 - iter 104/138 - loss 0.01886851 - time (sec): 5.79 - samples/sec: 2971.80 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:45:44,320 epoch 10 - iter 117/138 - loss 0.01778748 - time (sec): 6.52 - samples/sec: 2975.48 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:45:45,028 epoch 10 - iter 130/138 - loss 0.01786949 - time (sec): 7.23 - samples/sec: 2985.97 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:45:45,509 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:45,510 EPOCH 10 done: loss 0.0176 - lr: 0.000000
2023-10-17 08:45:46,146 DEV : loss 0.18640285730361938 - f1-score (micro avg) 0.872
2023-10-17 08:45:46,499 ----------------------------------------------------------------------------------------------------
2023-10-17 08:45:46,500 Loading model from best epoch ...
2023-10-17 08:45:47,856 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:45:48,650 
Results:
- F-score (micro) 0.9067
- F-score (macro) 0.9372
- Accuracy 0.8413

By class:
              precision    recall  f1-score   support

       scope     0.8895    0.9148    0.9020       176
        pers     0.9683    0.9531    0.9606       128
        work     0.7975    0.8514    0.8235        74
      object     1.0000    1.0000    1.0000         2
         loc     1.0000    1.0000    1.0000         2

   micro avg     0.8974    0.9162    0.9067       382
   macro avg     0.9310    0.9438    0.9372       382
weighted avg     0.8992    0.9162    0.9075       382

2023-10-17 08:45:48,650 ----------------------------------------------------------------------------------------------------