2023-10-16 18:03:29,369 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,369 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Train:  1166 sentences
2023-10-16 18:03:29,370         (train_with_dev=False, train_with_test=False)
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Training Params:
2023-10-16 18:03:29,370  - learning_rate: "5e-05"
2023-10-16 18:03:29,370  - mini_batch_size: "8"
2023-10-16 18:03:29,370  - max_epochs: "10"
2023-10-16 18:03:29,370  - shuffle: "True"
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Plugins:
2023-10-16 18:03:29,370  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:03:29,371  - metric: "('micro avg', 'f1-score')"
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 Computation:
2023-10-16 18:03:29,371  - compute on device: cuda:0
2023-10-16 18:03:29,371  - embedding storage: none
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:30,832 epoch 1 - iter 14/146 - loss 3.01497338 - time (sec): 1.46 - samples/sec: 2840.69 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:03:32,586 epoch 1 - iter 28/146 - loss 2.66432612 - time (sec): 3.21 - samples/sec: 2976.51 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:03:33,881 epoch 1 - iter 42/146 - loss 2.22017613 - time (sec): 4.51 - samples/sec: 2961.61 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:03:35,066 epoch 1 - iter 56/146 - loss 1.88824002 - time (sec): 5.69 - samples/sec: 3029.64 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:03:36,628 epoch 1 - iter 70/146 - loss 1.59845017 - time (sec): 7.26 - samples/sec: 3000.46 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:03:38,129 epoch 1 - iter 84/146 - loss 1.41561660 - time (sec): 8.76 - samples/sec: 2977.65 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:03:39,479 epoch 1 - iter 98/146 - loss 1.31091327 - time (sec): 10.11 - samples/sec: 3005.20 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:03:40,770 epoch 1 - iter 112/146 - loss 1.19566679 - time (sec): 11.40 - samples/sec: 3019.72 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:03:42,096 epoch 1 - iter 126/146 - loss 1.10677521 - time (sec): 12.72 - samples/sec: 2998.40 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:03:43,586 epoch 1 - iter 140/146 - loss 1.01852525 - time (sec): 14.21 - samples/sec: 3008.21 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:44,182 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:44,183 EPOCH 1 done: loss 0.9945 - lr: 0.000048
2023-10-16 18:03:45,011 DEV : loss 0.21267659962177277 - f1-score (micro avg) 0.4215
2023-10-16 18:03:45,017 saving best model
2023-10-16 18:03:45,421 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:47,141 epoch 2 - iter 14/146 - loss 0.19846158 - time (sec): 1.72 - samples/sec: 2526.85 - lr: 0.000050 - momentum: 0.000000
2023-10-16 18:03:48,426 epoch 2 - iter 28/146 - loss 0.25337461 - time (sec): 3.00 - samples/sec: 2817.46 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:03:49,670 epoch 2 - iter 42/146 - loss 0.24819521 - time (sec): 4.25 - samples/sec: 3003.01 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:51,044 epoch 2 - iter 56/146 - loss 0.22898805 - time (sec): 5.62 - samples/sec: 2998.40 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:52,231 epoch 2 - iter 70/146 - loss 0.21737529 - time (sec): 6.81 - samples/sec: 2987.47 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:03:53,641 epoch 2 - iter 84/146 - loss 0.21818368 - time (sec): 8.22 - samples/sec: 2953.87 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:03:55,531 epoch 2 - iter 98/146 - loss 0.21871720 - time (sec): 10.11 - samples/sec: 2914.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:03:57,179 epoch 2 - iter 112/146 - loss 0.21468634 - time (sec): 11.76 - samples/sec: 2886.73 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:03:58,761 epoch 2 - iter 126/146 - loss 0.21766037 - time (sec): 13.34 - samples/sec: 2867.10 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:04:00,342 epoch 2 - iter 140/146 - loss 0.20901073 - time (sec): 14.92 - samples/sec: 2891.15 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:04:00,765 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:00,765 EPOCH 2 done: loss 0.2086 - lr: 0.000045
2023-10-16 18:04:02,072 DEV : loss 0.1228543296456337 - f1-score (micro avg) 0.6842
2023-10-16 18:04:02,078 saving best model
2023-10-16 18:04:02,611 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:04,294 epoch 3 - iter 14/146 - loss 0.14117504 - time (sec): 1.68 - samples/sec: 3205.04 - lr: 0.000044 - momentum: 0.000000
2023-10-16 18:04:05,541 epoch 3 - iter 28/146 - loss 0.12690368 - time (sec): 2.93 - samples/sec: 2907.39 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:04:06,769 epoch 3 - iter 42/146 - loss 0.13116279 - time (sec): 4.16 - samples/sec: 3099.71 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:04:08,164 epoch 3 - iter 56/146 - loss 0.12578848 - time (sec): 5.55 - samples/sec: 3162.13 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:04:09,644 epoch 3 - iter 70/146 - loss 0.11850282 - time (sec): 7.03 - samples/sec: 3196.95 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:04:11,025 epoch 3 - iter 84/146 - loss 0.11666530 - time (sec): 8.41 - samples/sec: 3173.50 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:04:12,066 epoch 3 - iter 98/146 - loss 0.11535129 - time (sec): 9.45 - samples/sec: 3129.27 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:04:13,501 epoch 3 - iter 112/146 - loss 0.11610264 - time (sec): 10.89 - samples/sec: 3096.61 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:04:14,912 epoch 3 - iter 126/146 - loss 0.11279962 - time (sec): 12.30 - samples/sec: 3085.89 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:04:16,435 epoch 3 - iter 140/146 - loss 0.11327376 - time (sec): 13.82 - samples/sec: 3090.83 - lr: 0.000039 - momentum: 0.000000
2023-10-16 18:04:17,059 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:17,059 EPOCH 3 done: loss 0.1124 - lr: 0.000039
2023-10-16 18:04:18,296 DEV : loss 0.11312269419431686 - f1-score (micro avg) 0.7093
2023-10-16 18:04:18,300 saving best model
2023-10-16 18:04:18,736 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:20,303 epoch 4 - iter 14/146 - loss 0.07271600 - time (sec): 1.56 - samples/sec: 2859.63 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:04:21,947 epoch 4 - iter 28/146 - loss 0.08092845 - time (sec): 3.21 - samples/sec: 2813.11 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:04:23,283 epoch 4 - iter 42/146 - loss 0.07155158 - time (sec): 4.54 - samples/sec: 2848.06 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:04:24,679 epoch 4 - iter 56/146 - loss 0.06922215 - time (sec): 5.94 - samples/sec: 2916.61 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:04:26,059 epoch 4 - iter 70/146 - loss 0.07262218 - time (sec): 7.32 - samples/sec: 2935.48 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:04:27,517 epoch 4 - iter 84/146 - loss 0.07370568 - time (sec): 8.78 - samples/sec: 2884.56 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:04:28,799 epoch 4 - iter 98/146 - loss 0.07320739 - time (sec): 10.06 - samples/sec: 2889.11 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:04:30,342 epoch 4 - iter 112/146 - loss 0.07873008 - time (sec): 11.60 - samples/sec: 2889.88 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:04:31,904 epoch 4 - iter 126/146 - loss 0.07512300 - time (sec): 13.16 - samples/sec: 2927.05 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:04:33,248 epoch 4 - iter 140/146 - loss 0.07444114 - time (sec): 14.51 - samples/sec: 2929.89 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:04:33,902 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:33,902 EPOCH 4 done: loss 0.0751 - lr: 0.000034
2023-10-16 18:04:35,153 DEV : loss 0.10812485218048096 - f1-score (micro avg) 0.7446
2023-10-16 18:04:35,158 saving best model
2023-10-16 18:04:35,666 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:37,282 epoch 5 - iter 14/146 - loss 0.05012667 - time (sec): 1.61 - samples/sec: 2978.46 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:04:38,724 epoch 5 - iter 28/146 - loss 0.05362344 - time (sec): 3.05 - samples/sec: 2896.15 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:04:39,994 epoch 5 - iter 42/146 - loss 0.05602592 - time (sec): 4.32 - samples/sec: 3049.46 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:04:41,436 epoch 5 - iter 56/146 - loss 0.05228982 - time (sec): 5.76 - samples/sec: 3067.34 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:04:43,165 epoch 5 - iter 70/146 - loss 0.05023073 - time (sec): 7.49 - samples/sec: 2978.39 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:04:44,560 epoch 5 - iter 84/146 - loss 0.05370307 - time (sec): 8.89 - samples/sec: 2991.28 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:04:45,946 epoch 5 - iter 98/146 - loss 0.05217695 - time (sec): 10.28 - samples/sec: 3019.20 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:04:47,101 epoch 5 - iter 112/146 - loss 0.05378885 - time (sec): 11.43 - samples/sec: 3022.07 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:04:48,507 epoch 5 - iter 126/146 - loss 0.05094919 - time (sec): 12.84 - samples/sec: 3017.31 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:04:49,841 epoch 5 - iter 140/146 - loss 0.05093383 - time (sec): 14.17 - samples/sec: 3019.63 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:04:50,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:50,475 EPOCH 5 done: loss 0.0505 - lr: 0.000028
2023-10-16 18:04:51,717 DEV : loss 0.12893062829971313 - f1-score (micro avg) 0.7451
2023-10-16 18:04:51,722 saving best model
2023-10-16 18:04:52,219 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:53,686 epoch 6 - iter 14/146 - loss 0.05350656 - time (sec): 1.46 - samples/sec: 2785.60 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:04:55,172 epoch 6 - iter 28/146 - loss 0.04160296 - time (sec): 2.95 - samples/sec: 2864.73 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:04:56,281 epoch 6 - iter 42/146 - loss 0.04397600 - time (sec): 4.06 - samples/sec: 2902.44 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:04:57,844 epoch 6 - iter 56/146 - loss 0.04078383 - time (sec): 5.62 - samples/sec: 2960.99 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:04:58,952 epoch 6 - iter 70/146 - loss 0.03860104 - time (sec): 6.73 - samples/sec: 2941.14 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:05:00,612 epoch 6 - iter 84/146 - loss 0.03725238 - time (sec): 8.39 - samples/sec: 2888.25 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:05:02,296 epoch 6 - iter 98/146 - loss 0.03930290 - time (sec): 10.07 - samples/sec: 2919.23 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:05:03,668 epoch 6 - iter 112/146 - loss 0.03821762 - time (sec): 11.44 - samples/sec: 2953.46 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:05:05,152 epoch 6 - iter 126/146 - loss 0.03812065 - time (sec): 12.93 - samples/sec: 2955.70 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:05:06,486 epoch 6 - iter 140/146 - loss 0.03699667 - time (sec): 14.26 - samples/sec: 2988.15 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:05:07,036 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:07,036 EPOCH 6 done: loss 0.0362 - lr: 0.000023
2023-10-16 18:05:08,266 DEV : loss 0.12390300631523132 - f1-score (micro avg) 0.7414
2023-10-16 18:05:08,271 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:09,553 epoch 7 - iter 14/146 - loss 0.01743334 - time (sec): 1.28 - samples/sec: 3239.48 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:05:10,816 epoch 7 - iter 28/146 - loss 0.01770293 - time (sec): 2.54 - samples/sec: 3207.44 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:05:12,064 epoch 7 - iter 42/146 - loss 0.03132041 - time (sec): 3.79 - samples/sec: 3161.58 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:05:13,842 epoch 7 - iter 56/146 - loss 0.03031072 - time (sec): 5.57 - samples/sec: 3063.29 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:05:15,512 epoch 7 - iter 70/146 - loss 0.03222962 - time (sec): 7.24 - samples/sec: 2966.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:05:17,073 epoch 7 - iter 84/146 - loss 0.03034968 - time (sec): 8.80 - samples/sec: 2895.56 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:05:18,505 epoch 7 - iter 98/146 - loss 0.02880115 - time (sec): 10.23 - samples/sec: 2895.60 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:05:19,805 epoch 7 - iter 112/146 - loss 0.02780285 - time (sec): 11.53 - samples/sec: 2939.47 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:05:21,415 epoch 7 - iter 126/146 - loss 0.02885371 - time (sec): 13.14 - samples/sec: 2937.77 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:05:22,691 epoch 7 - iter 140/146 - loss 0.02990561 - time (sec): 14.42 - samples/sec: 2952.74 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:05:23,307 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:23,307 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-16 18:05:24,837 DEV : loss 0.13431765139102936 - f1-score (micro avg) 0.7837
2023-10-16 18:05:24,843 saving best model
2023-10-16 18:05:25,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:26,644 epoch 8 - iter 14/146 - loss 0.01998228 - time (sec): 1.26 - samples/sec: 2827.62 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:05:28,217 epoch 8 - iter 28/146 - loss 0.01806968 - time (sec): 2.83 - samples/sec: 2953.41 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:05:29,581 epoch 8 - iter 42/146 - loss 0.01593014 - time (sec): 4.20 - samples/sec: 2977.35 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:05:31,066 epoch 8 - iter 56/146 - loss 0.01649775 - time (sec): 5.68 - samples/sec: 2983.75 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:05:32,577 epoch 8 - iter 70/146 - loss 0.01690638 - time (sec): 7.19 - samples/sec: 2953.54 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:05:34,016 epoch 8 - iter 84/146 - loss 0.01728427 - time (sec): 8.63 - samples/sec: 2931.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:05:35,351 epoch 8 - iter 98/146 - loss 0.01793460 - time (sec): 9.97 - samples/sec: 2912.50 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:05:36,764 epoch 8 - iter 112/146 - loss 0.02025946 - time (sec): 11.38 - samples/sec: 2919.80 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:05:37,985 epoch 8 - iter 126/146 - loss 0.02049963 - time (sec): 12.60 - samples/sec: 2942.14 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:05:39,695 epoch 8 - iter 140/146 - loss 0.02027892 - time (sec): 14.31 - samples/sec: 2967.94 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:05:40,439 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:40,439 EPOCH 8 done: loss 0.0198 - lr: 0.000012
2023-10-16 18:05:41,765 DEV : loss 0.15437102317810059 - f1-score (micro avg) 0.7479
2023-10-16 18:05:41,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:43,312 epoch 9 - iter 14/146 - loss 0.01195428 - time (sec): 1.54 - samples/sec: 2860.41 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:05:44,705 epoch 9 - iter 28/146 - loss 0.00962441 - time (sec): 2.93 - samples/sec: 2836.03 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:05:46,047 epoch 9 - iter 42/146 - loss 0.01059304 - time (sec): 4.27 - samples/sec: 2842.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:05:47,676 epoch 9 - iter 56/146 - loss 0.01847025 - time (sec): 5.90 - samples/sec: 2896.60 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:05:49,125 epoch 9 - iter 70/146 - loss 0.01586688 - time (sec): 7.35 - samples/sec: 2895.07 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:05:50,344 epoch 9 - iter 84/146 - loss 0.01537808 - time (sec): 8.57 - samples/sec: 2959.60 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:05:51,778 epoch 9 - iter 98/146 - loss 0.01688221 - time (sec): 10.00 - samples/sec: 2940.78 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:05:53,187 epoch 9 - iter 112/146 - loss 0.01695533 - time (sec): 11.41 - samples/sec: 2952.98 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:05:54,787 epoch 9 - iter 126/146 - loss 0.01675493 - time (sec): 13.01 - samples/sec: 2933.72 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:05:56,392 epoch 9 - iter 140/146 - loss 0.01614276 - time (sec): 14.62 - samples/sec: 2926.73 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:05:56,937 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:56,938 EPOCH 9 done: loss 0.0156 - lr: 0.000006
2023-10-16 18:05:58,185 DEV : loss 0.15635186433792114 - f1-score (micro avg) 0.7368
2023-10-16 18:05:58,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:59,477 epoch 10 - iter 14/146 - loss 0.00619091 - time (sec): 1.29 - samples/sec: 2928.89 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:06:01,000 epoch 10 - iter 28/146 - loss 0.00503078 - time (sec): 2.81 - samples/sec: 3186.76 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:06:02,933 epoch 10 - iter 42/146 - loss 0.01384968 - time (sec): 4.74 - samples/sec: 3059.21 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:06:04,330 epoch 10 - iter 56/146 - loss 0.01118529 - time (sec): 6.14 - samples/sec: 3127.52 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:06:05,732 epoch 10 - iter 70/146 - loss 0.01020773 - time (sec): 7.54 - samples/sec: 3083.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:07,104 epoch 10 - iter 84/146 - loss 0.01045028 - time (sec): 8.91 - samples/sec: 3031.46 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:08,416 epoch 10 - iter 98/146 - loss 0.00963056 - time (sec): 10.22 - samples/sec: 3020.03 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:06:09,777 epoch 10 - iter 112/146 - loss 0.01023756 - time (sec): 11.59 - samples/sec: 2991.65 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:06:11,036 epoch 10 - iter 126/146 - loss 0.01175292 - time (sec): 12.84 - samples/sec: 2989.54 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:06:12,247 epoch 10 - iter 140/146 - loss 0.01174241 - time (sec): 14.06 - samples/sec: 3014.32 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:06:12,946 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:12,946 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-16 18:06:14,485 DEV : loss 0.16552899777889252 - f1-score (micro avg) 0.7468
2023-10-16 18:06:14,965 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:14,967 Loading model from best epoch ...
2023-10-16 18:06:16,437 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:06:18,838 Results:
- F-score (micro) 0.7565
- F-score (macro) 0.6684
- Accuracy 0.6286

By class:
              precision    recall  f1-score   support

         PER     0.7958    0.8621    0.8276       348
         LOC     0.6524    0.8199    0.7267       261
         ORG     0.4773    0.4038    0.4375        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7134    0.8053    0.7565       683
   macro avg     0.6518    0.6919    0.6684       683
weighted avg     0.7131    0.8053    0.7546       683

2023-10-16 18:06:18,838 ----------------------------------------------------------------------------------------------------
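For anyone who wants to track how the dev score evolved across the ten epochs without reading the whole log, a small parser is enough: each epoch ends with a `DEV : loss … - f1-score (micro avg) …` record. The sketch below is my own helper (the regex and the name `parse_dev_scores` are not part of Flair); it assumes the standard Flair log line format shown above.

```python
import re

# Matches Flair's per-epoch dev line, e.g.:
#   "... DEV : loss 0.2126... - f1-score (micro avg) 0.4215"
DEV_RE = re.compile(
    r"DEV : loss (?P<loss>[0-9.]+) - f1-score \(micro avg\)\s+(?P<f1>[0-9.]+)"
)

def parse_dev_scores(log_text: str) -> list[tuple[int, float, float]]:
    """Return (epoch, dev_loss, dev_micro_f1) tuples in log order.

    Epoch numbers are assigned by position, assuming one DEV line per epoch.
    """
    return [
        (epoch, float(m.group("loss")), float(m.group("f1")))
        for epoch, m in enumerate(DEV_RE.finditer(log_text), start=1)
    ]

# Two DEV records copied from the log above:
sample = (
    "2023-10-16 18:03:45,011 DEV : loss 0.21267659962177277 - f1-score (micro avg) 0.4215\n"
    "2023-10-16 18:04:02,072 DEV : loss 0.1228543296456337 - f1-score (micro avg) 0.6842\n"
)

if __name__ == "__main__":
    for epoch, dev_loss, dev_f1 in parse_dev_scores(sample):
        print(f"epoch {epoch}: dev loss {dev_loss:.4f}, micro-F1 {dev_f1:.4f}")
```

Run over the full log, this makes the pattern easy to see: dev micro-F1 peaks at 0.7837 in epoch 7 (the checkpoint kept as `best-model.pt`), while dev loss bottoms out earlier, in epoch 4.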