2023-10-14 11:58:32,284 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,285 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 11:58:32,285 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 Train: 5777 sentences
2023-10-14 11:58:32,286 (train_with_dev=False, train_with_test=False)
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 Training Params:
2023-10-14 11:58:32,286 - learning_rate: "5e-05"
2023-10-14 11:58:32,286 - mini_batch_size: "8"
2023-10-14 11:58:32,286 - max_epochs: "10"
2023-10-14 11:58:32,286 - shuffle: "True"
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 Plugins:
2023-10-14 11:58:32,286 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
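The LinearScheduler plugin above warms the learning rate up over the first 10% of all optimizer steps and then decays it linearly to zero, which is exactly the `lr:` trajectory visible in the per-iteration lines below (rising through epoch 1, then falling). A minimal sketch of that schedule in plain Python, assuming 723 iterations/epoch x 10 epochs and the 5e-05 peak from this run (the function name is illustrative, not Flair's API):

```python
def linear_schedule_lr(step, peak_lr=5e-05, total_steps=7230, warmup_fraction=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay.

    total_steps = 723 iterations/epoch * 10 epochs, matching this log.
    """
    warmup_steps = warmup_fraction * total_steps
    if step < warmup_steps:
        # warmup phase: ramp 0 -> peak_lr over the first 10% of steps
        return peak_lr * step / warmup_steps
    # decay phase: ramp peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This reproduces the logged values, e.g. ~0.000005 at epoch 1 iter 72 (step 72), the 5e-05 peak at the end of epoch 1 (step 723), 0.000044 at the end of epoch 2 (step 1446), and 0 at the final step.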
2023-10-14 11:58:32,286 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 11:58:32,286 - metric: "('micro avg', 'f1-score')"
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 Computation:
2023-10-14 11:58:32,286 - compute on device: cuda:0
2023-10-14 11:58:32,286 - embedding storage: none
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:32,286 ----------------------------------------------------------------------------------------------------
2023-10-14 11:58:38,229 epoch 1 - iter 72/723 - loss 1.83466572 - time (sec): 5.94 - samples/sec: 3121.04 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:58:44,358 epoch 1 - iter 144/723 - loss 1.07312952 - time (sec): 12.07 - samples/sec: 2975.76 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:58:50,195 epoch 1 - iter 216/723 - loss 0.79762119 - time (sec): 17.91 - samples/sec: 2978.27 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:58:56,375 epoch 1 - iter 288/723 - loss 0.64875789 - time (sec): 24.09 - samples/sec: 2962.96 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:59:02,475 epoch 1 - iter 360/723 - loss 0.55533977 - time (sec): 30.19 - samples/sec: 2937.76 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:59:08,437 epoch 1 - iter 432/723 - loss 0.48997863 - time (sec): 36.15 - samples/sec: 2939.05 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:59:14,448 epoch 1 - iter 504/723 - loss 0.44372217 - time (sec): 42.16 - samples/sec: 2940.61 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:59:20,095 epoch 1 - iter 576/723 - loss 0.40786476 - time (sec): 47.81 - samples/sec: 2933.54 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:59:25,851 epoch 1 - iter 648/723 - loss 0.37553601 - time (sec): 53.56 - samples/sec: 2947.47 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:59:31,836 epoch 1 - iter 720/723 - loss 0.35055959 - time (sec): 59.55 - samples/sec: 2949.53 - lr: 0.000050 - momentum: 0.000000
2023-10-14 11:59:32,042 ----------------------------------------------------------------------------------------------------
2023-10-14 11:59:32,042 EPOCH 1 done: loss 0.3500 - lr: 0.000050
2023-10-14 11:59:35,204 DEV : loss 0.13189315795898438 - f1-score (micro avg) 0.6496
2023-10-14 11:59:35,220 saving best model
2023-10-14 11:59:35,629 ----------------------------------------------------------------------------------------------------
2023-10-14 11:59:41,987 epoch 2 - iter 72/723 - loss 0.11751025 - time (sec): 6.36 - samples/sec: 2691.71 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:59:47,893 epoch 2 - iter 144/723 - loss 0.11290402 - time (sec): 12.26 - samples/sec: 2784.35 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:59:54,056 epoch 2 - iter 216/723 - loss 0.10761347 - time (sec): 18.43 - samples/sec: 2844.82 - lr: 0.000048 - momentum: 0.000000
2023-10-14 12:00:00,524 epoch 2 - iter 288/723 - loss 0.10620566 - time (sec): 24.89 - samples/sec: 2848.53 - lr: 0.000048 - momentum: 0.000000
2023-10-14 12:00:06,097 epoch 2 - iter 360/723 - loss 0.10503397 - time (sec): 30.47 - samples/sec: 2887.49 - lr: 0.000047 - momentum: 0.000000
2023-10-14 12:00:11,541 epoch 2 - iter 432/723 - loss 0.10197538 - time (sec): 35.91 - samples/sec: 2916.16 - lr: 0.000047 - momentum: 0.000000
2023-10-14 12:00:17,722 epoch 2 - iter 504/723 - loss 0.10172999 - time (sec): 42.09 - samples/sec: 2906.40 - lr: 0.000046 - momentum: 0.000000
2023-10-14 12:00:24,040 epoch 2 - iter 576/723 - loss 0.09966793 - time (sec): 48.41 - samples/sec: 2896.00 - lr: 0.000046 - momentum: 0.000000
2023-10-14 12:00:30,082 epoch 2 - iter 648/723 - loss 0.09976307 - time (sec): 54.45 - samples/sec: 2905.29 - lr: 0.000045 - momentum: 0.000000
2023-10-14 12:00:35,771 epoch 2 - iter 720/723 - loss 0.09860537 - time (sec): 60.14 - samples/sec: 2922.83 - lr: 0.000044 - momentum: 0.000000
2023-10-14 12:00:35,927 ----------------------------------------------------------------------------------------------------
2023-10-14 12:00:35,928 EPOCH 2 done: loss 0.0986 - lr: 0.000044
2023-10-14 12:00:40,278 DEV : loss 0.08656182885169983 - f1-score (micro avg) 0.7867
2023-10-14 12:00:40,298 saving best model
2023-10-14 12:00:40,971 ----------------------------------------------------------------------------------------------------
2023-10-14 12:00:46,886 epoch 3 - iter 72/723 - loss 0.06819931 - time (sec): 5.91 - samples/sec: 2870.31 - lr: 0.000044 - momentum: 0.000000
2023-10-14 12:00:52,807 epoch 3 - iter 144/723 - loss 0.06728335 - time (sec): 11.83 - samples/sec: 2880.12 - lr: 0.000043 - momentum: 0.000000
2023-10-14 12:00:58,921 epoch 3 - iter 216/723 - loss 0.06427625 - time (sec): 17.95 - samples/sec: 2851.72 - lr: 0.000043 - momentum: 0.000000
2023-10-14 12:01:04,551 epoch 3 - iter 288/723 - loss 0.06626178 - time (sec): 23.58 - samples/sec: 2881.51 - lr: 0.000042 - momentum: 0.000000
2023-10-14 12:01:10,181 epoch 3 - iter 360/723 - loss 0.06476197 - time (sec): 29.21 - samples/sec: 2886.60 - lr: 0.000042 - momentum: 0.000000
2023-10-14 12:01:16,383 epoch 3 - iter 432/723 - loss 0.06317834 - time (sec): 35.41 - samples/sec: 2915.82 - lr: 0.000041 - momentum: 0.000000
2023-10-14 12:01:22,271 epoch 3 - iter 504/723 - loss 0.06316978 - time (sec): 41.30 - samples/sec: 2915.94 - lr: 0.000041 - momentum: 0.000000
2023-10-14 12:01:28,315 epoch 3 - iter 576/723 - loss 0.06518816 - time (sec): 47.34 - samples/sec: 2938.91 - lr: 0.000040 - momentum: 0.000000
2023-10-14 12:01:34,586 epoch 3 - iter 648/723 - loss 0.06373533 - time (sec): 53.61 - samples/sec: 2926.35 - lr: 0.000039 - momentum: 0.000000
2023-10-14 12:01:40,894 epoch 3 - iter 720/723 - loss 0.06257771 - time (sec): 59.92 - samples/sec: 2933.69 - lr: 0.000039 - momentum: 0.000000
2023-10-14 12:01:41,060 ----------------------------------------------------------------------------------------------------
2023-10-14 12:01:41,061 EPOCH 3 done: loss 0.0627 - lr: 0.000039
2023-10-14 12:01:44,703 DEV : loss 0.08839963376522064 - f1-score (micro avg) 0.8068
2023-10-14 12:01:44,730 saving best model
2023-10-14 12:01:45,274 ----------------------------------------------------------------------------------------------------
2023-10-14 12:01:51,166 epoch 4 - iter 72/723 - loss 0.03595879 - time (sec): 5.89 - samples/sec: 2878.82 - lr: 0.000038 - momentum: 0.000000
2023-10-14 12:01:57,619 epoch 4 - iter 144/723 - loss 0.03883693 - time (sec): 12.34 - samples/sec: 2904.68 - lr: 0.000038 - momentum: 0.000000
2023-10-14 12:02:03,480 epoch 4 - iter 216/723 - loss 0.03762509 - time (sec): 18.20 - samples/sec: 2912.16 - lr: 0.000037 - momentum: 0.000000
2023-10-14 12:02:09,721 epoch 4 - iter 288/723 - loss 0.04213370 - time (sec): 24.44 - samples/sec: 2891.27 - lr: 0.000037 - momentum: 0.000000
2023-10-14 12:02:15,341 epoch 4 - iter 360/723 - loss 0.04229835 - time (sec): 30.06 - samples/sec: 2907.80 - lr: 0.000036 - momentum: 0.000000
2023-10-14 12:02:21,491 epoch 4 - iter 432/723 - loss 0.04293415 - time (sec): 36.21 - samples/sec: 2900.78 - lr: 0.000036 - momentum: 0.000000
2023-10-14 12:02:27,520 epoch 4 - iter 504/723 - loss 0.04229960 - time (sec): 42.24 - samples/sec: 2913.46 - lr: 0.000035 - momentum: 0.000000
2023-10-14 12:02:33,219 epoch 4 - iter 576/723 - loss 0.04154991 - time (sec): 47.94 - samples/sec: 2914.53 - lr: 0.000034 - momentum: 0.000000
2023-10-14 12:02:39,129 epoch 4 - iter 648/723 - loss 0.04153434 - time (sec): 53.85 - samples/sec: 2930.31 - lr: 0.000034 - momentum: 0.000000
2023-10-14 12:02:45,170 epoch 4 - iter 720/723 - loss 0.04184976 - time (sec): 59.89 - samples/sec: 2935.18 - lr: 0.000033 - momentum: 0.000000
2023-10-14 12:02:45,325 ----------------------------------------------------------------------------------------------------
2023-10-14 12:02:45,325 EPOCH 4 done: loss 0.0418 - lr: 0.000033
2023-10-14 12:02:48,939 DEV : loss 0.09616296738386154 - f1-score (micro avg) 0.7935
2023-10-14 12:02:48,964 ----------------------------------------------------------------------------------------------------
2023-10-14 12:02:54,782 epoch 5 - iter 72/723 - loss 0.02118830 - time (sec): 5.82 - samples/sec: 2858.88 - lr: 0.000033 - momentum: 0.000000
2023-10-14 12:03:00,634 epoch 5 - iter 144/723 - loss 0.02600065 - time (sec): 11.67 - samples/sec: 2863.50 - lr: 0.000032 - momentum: 0.000000
2023-10-14 12:03:06,570 epoch 5 - iter 216/723 - loss 0.02863738 - time (sec): 17.60 - samples/sec: 2872.70 - lr: 0.000032 - momentum: 0.000000
2023-10-14 12:03:12,682 epoch 5 - iter 288/723 - loss 0.02992902 - time (sec): 23.72 - samples/sec: 2916.41 - lr: 0.000031 - momentum: 0.000000
2023-10-14 12:03:18,848 epoch 5 - iter 360/723 - loss 0.03303827 - time (sec): 29.88 - samples/sec: 2917.46 - lr: 0.000031 - momentum: 0.000000
2023-10-14 12:03:24,875 epoch 5 - iter 432/723 - loss 0.03302250 - time (sec): 35.91 - samples/sec: 2933.74 - lr: 0.000030 - momentum: 0.000000
2023-10-14 12:03:31,252 epoch 5 - iter 504/723 - loss 0.03221675 - time (sec): 42.29 - samples/sec: 2935.50 - lr: 0.000029 - momentum: 0.000000
2023-10-14 12:03:36,940 epoch 5 - iter 576/723 - loss 0.03142146 - time (sec): 47.97 - samples/sec: 2936.47 - lr: 0.000029 - momentum: 0.000000
2023-10-14 12:03:42,493 epoch 5 - iter 648/723 - loss 0.03036131 - time (sec): 53.53 - samples/sec: 2952.83 - lr: 0.000028 - momentum: 0.000000
2023-10-14 12:03:48,304 epoch 5 - iter 720/723 - loss 0.03141318 - time (sec): 59.34 - samples/sec: 2955.78 - lr: 0.000028 - momentum: 0.000000
2023-10-14 12:03:48,556 ----------------------------------------------------------------------------------------------------
2023-10-14 12:03:48,556 EPOCH 5 done: loss 0.0315 - lr: 0.000028
2023-10-14 12:03:52,530 DEV : loss 0.11451639235019684 - f1-score (micro avg) 0.8235
2023-10-14 12:03:52,550 saving best model
2023-10-14 12:03:53,116 ----------------------------------------------------------------------------------------------------
2023-10-14 12:03:59,048 epoch 6 - iter 72/723 - loss 0.02304111 - time (sec): 5.93 - samples/sec: 2851.22 - lr: 0.000027 - momentum: 0.000000
2023-10-14 12:04:04,697 epoch 6 - iter 144/723 - loss 0.02294005 - time (sec): 11.58 - samples/sec: 2956.84 - lr: 0.000027 - momentum: 0.000000
2023-10-14 12:04:10,873 epoch 6 - iter 216/723 - loss 0.02326946 - time (sec): 17.75 - samples/sec: 2934.08 - lr: 0.000026 - momentum: 0.000000
2023-10-14 12:04:17,008 epoch 6 - iter 288/723 - loss 0.02667079 - time (sec): 23.89 - samples/sec: 2944.44 - lr: 0.000026 - momentum: 0.000000
2023-10-14 12:04:23,616 epoch 6 - iter 360/723 - loss 0.02748106 - time (sec): 30.50 - samples/sec: 2934.75 - lr: 0.000025 - momentum: 0.000000
2023-10-14 12:04:29,904 epoch 6 - iter 432/723 - loss 0.02624476 - time (sec): 36.79 - samples/sec: 2904.61 - lr: 0.000024 - momentum: 0.000000
2023-10-14 12:04:35,334 epoch 6 - iter 504/723 - loss 0.02505646 - time (sec): 42.22 - samples/sec: 2928.64 - lr: 0.000024 - momentum: 0.000000
2023-10-14 12:04:41,250 epoch 6 - iter 576/723 - loss 0.02403994 - time (sec): 48.13 - samples/sec: 2925.55 - lr: 0.000023 - momentum: 0.000000
2023-10-14 12:04:46,875 epoch 6 - iter 648/723 - loss 0.02367198 - time (sec): 53.76 - samples/sec: 2944.02 - lr: 0.000023 - momentum: 0.000000
2023-10-14 12:04:52,633 epoch 6 - iter 720/723 - loss 0.02343289 - time (sec): 59.51 - samples/sec: 2952.24 - lr: 0.000022 - momentum: 0.000000
2023-10-14 12:04:52,843 ----------------------------------------------------------------------------------------------------
2023-10-14 12:04:52,843 EPOCH 6 done: loss 0.0234 - lr: 0.000022
2023-10-14 12:04:56,349 DEV : loss 0.145940899848938 - f1-score (micro avg) 0.8041
2023-10-14 12:04:56,367 ----------------------------------------------------------------------------------------------------
2023-10-14 12:05:02,368 epoch 7 - iter 72/723 - loss 0.01742657 - time (sec): 6.00 - samples/sec: 2837.14 - lr: 0.000022 - momentum: 0.000000
2023-10-14 12:05:08,372 epoch 7 - iter 144/723 - loss 0.01539851 - time (sec): 12.00 - samples/sec: 2851.72 - lr: 0.000021 - momentum: 0.000000
2023-10-14 12:05:15,025 epoch 7 - iter 216/723 - loss 0.01540275 - time (sec): 18.66 - samples/sec: 2821.55 - lr: 0.000021 - momentum: 0.000000
2023-10-14 12:05:21,000 epoch 7 - iter 288/723 - loss 0.01735549 - time (sec): 24.63 - samples/sec: 2842.93 - lr: 0.000020 - momentum: 0.000000
2023-10-14 12:05:26,955 epoch 7 - iter 360/723 - loss 0.01650779 - time (sec): 30.59 - samples/sec: 2880.15 - lr: 0.000019 - momentum: 0.000000
2023-10-14 12:05:32,689 epoch 7 - iter 432/723 - loss 0.01640627 - time (sec): 36.32 - samples/sec: 2901.85 - lr: 0.000019 - momentum: 0.000000
2023-10-14 12:05:38,703 epoch 7 - iter 504/723 - loss 0.01624031 - time (sec): 42.33 - samples/sec: 2904.94 - lr: 0.000018 - momentum: 0.000000
2023-10-14 12:05:44,741 epoch 7 - iter 576/723 - loss 0.01639039 - time (sec): 48.37 - samples/sec: 2907.00 - lr: 0.000018 - momentum: 0.000000
2023-10-14 12:05:50,379 epoch 7 - iter 648/723 - loss 0.01690292 - time (sec): 54.01 - samples/sec: 2909.39 - lr: 0.000017 - momentum: 0.000000
2023-10-14 12:05:56,792 epoch 7 - iter 720/723 - loss 0.01707398 - time (sec): 60.42 - samples/sec: 2907.57 - lr: 0.000017 - momentum: 0.000000
2023-10-14 12:05:57,013 ----------------------------------------------------------------------------------------------------
2023-10-14 12:05:57,013 EPOCH 7 done: loss 0.0172 - lr: 0.000017
2023-10-14 12:06:00,508 DEV : loss 0.15821607410907745 - f1-score (micro avg) 0.8065
2023-10-14 12:06:00,524 ----------------------------------------------------------------------------------------------------
2023-10-14 12:06:06,459 epoch 8 - iter 72/723 - loss 0.01327058 - time (sec): 5.93 - samples/sec: 3001.68 - lr: 0.000016 - momentum: 0.000000
2023-10-14 12:06:12,306 epoch 8 - iter 144/723 - loss 0.01093102 - time (sec): 11.78 - samples/sec: 2984.11 - lr: 0.000016 - momentum: 0.000000
2023-10-14 12:06:18,692 epoch 8 - iter 216/723 - loss 0.01220562 - time (sec): 18.17 - samples/sec: 2931.79 - lr: 0.000015 - momentum: 0.000000
2023-10-14 12:06:24,532 epoch 8 - iter 288/723 - loss 0.01255941 - time (sec): 24.01 - samples/sec: 2940.87 - lr: 0.000014 - momentum: 0.000000
2023-10-14 12:06:30,631 epoch 8 - iter 360/723 - loss 0.01253527 - time (sec): 30.11 - samples/sec: 2941.73 - lr: 0.000014 - momentum: 0.000000
2023-10-14 12:06:36,280 epoch 8 - iter 432/723 - loss 0.01294269 - time (sec): 35.75 - samples/sec: 2963.97 - lr: 0.000013 - momentum: 0.000000
2023-10-14 12:06:41,907 epoch 8 - iter 504/723 - loss 0.01262547 - time (sec): 41.38 - samples/sec: 2956.91 - lr: 0.000013 - momentum: 0.000000
2023-10-14 12:06:48,186 epoch 8 - iter 576/723 - loss 0.01217781 - time (sec): 47.66 - samples/sec: 2944.97 - lr: 0.000012 - momentum: 0.000000
2023-10-14 12:06:54,429 epoch 8 - iter 648/723 - loss 0.01292360 - time (sec): 53.90 - samples/sec: 2940.90 - lr: 0.000012 - momentum: 0.000000
2023-10-14 12:07:00,216 epoch 8 - iter 720/723 - loss 0.01290911 - time (sec): 59.69 - samples/sec: 2939.40 - lr: 0.000011 - momentum: 0.000000
2023-10-14 12:07:00,519 ----------------------------------------------------------------------------------------------------
2023-10-14 12:07:00,519 EPOCH 8 done: loss 0.0129 - lr: 0.000011
2023-10-14 12:07:05,301 DEV : loss 0.1818445473909378 - f1-score (micro avg) 0.8002
2023-10-14 12:07:05,325 ----------------------------------------------------------------------------------------------------
2023-10-14 12:07:11,461 epoch 9 - iter 72/723 - loss 0.00671491 - time (sec): 6.14 - samples/sec: 2841.22 - lr: 0.000011 - momentum: 0.000000
2023-10-14 12:07:17,079 epoch 9 - iter 144/723 - loss 0.00809482 - time (sec): 11.75 - samples/sec: 2847.47 - lr: 0.000010 - momentum: 0.000000
2023-10-14 12:07:23,944 epoch 9 - iter 216/723 - loss 0.00970598 - time (sec): 18.62 - samples/sec: 2880.97 - lr: 0.000009 - momentum: 0.000000
2023-10-14 12:07:29,480 epoch 9 - iter 288/723 - loss 0.00914238 - time (sec): 24.15 - samples/sec: 2911.70 - lr: 0.000009 - momentum: 0.000000
2023-10-14 12:07:35,420 epoch 9 - iter 360/723 - loss 0.00931955 - time (sec): 30.09 - samples/sec: 2937.31 - lr: 0.000008 - momentum: 0.000000
2023-10-14 12:07:41,293 epoch 9 - iter 432/723 - loss 0.00901514 - time (sec): 35.97 - samples/sec: 2943.28 - lr: 0.000008 - momentum: 0.000000
2023-10-14 12:07:47,142 epoch 9 - iter 504/723 - loss 0.00881443 - time (sec): 41.82 - samples/sec: 2949.50 - lr: 0.000007 - momentum: 0.000000
2023-10-14 12:07:53,123 epoch 9 - iter 576/723 - loss 0.00888603 - time (sec): 47.80 - samples/sec: 2937.46 - lr: 0.000007 - momentum: 0.000000
2023-10-14 12:07:58,907 epoch 9 - iter 648/723 - loss 0.00902752 - time (sec): 53.58 - samples/sec: 2943.85 - lr: 0.000006 - momentum: 0.000000
2023-10-14 12:08:04,899 epoch 9 - iter 720/723 - loss 0.00947007 - time (sec): 59.57 - samples/sec: 2945.66 - lr: 0.000006 - momentum: 0.000000
2023-10-14 12:08:05,164 ----------------------------------------------------------------------------------------------------
2023-10-14 12:08:05,164 EPOCH 9 done: loss 0.0094 - lr: 0.000006
2023-10-14 12:08:08,738 DEV : loss 0.1774718016386032 - f1-score (micro avg) 0.8009
2023-10-14 12:08:08,755 ----------------------------------------------------------------------------------------------------
2023-10-14 12:08:15,069 epoch 10 - iter 72/723 - loss 0.00669000 - time (sec): 6.31 - samples/sec: 2867.51 - lr: 0.000005 - momentum: 0.000000
2023-10-14 12:08:20,844 epoch 10 - iter 144/723 - loss 0.00736923 - time (sec): 12.09 - samples/sec: 2948.87 - lr: 0.000004 - momentum: 0.000000
2023-10-14 12:08:27,527 epoch 10 - iter 216/723 - loss 0.00741936 - time (sec): 18.77 - samples/sec: 2838.69 - lr: 0.000004 - momentum: 0.000000
2023-10-14 12:08:33,953 epoch 10 - iter 288/723 - loss 0.00692285 - time (sec): 25.20 - samples/sec: 2850.72 - lr: 0.000003 - momentum: 0.000000
2023-10-14 12:08:39,756 epoch 10 - iter 360/723 - loss 0.00587923 - time (sec): 31.00 - samples/sec: 2879.20 - lr: 0.000003 - momentum: 0.000000
2023-10-14 12:08:45,462 epoch 10 - iter 432/723 - loss 0.00582156 - time (sec): 36.71 - samples/sec: 2909.42 - lr: 0.000002 - momentum: 0.000000
2023-10-14 12:08:51,557 epoch 10 - iter 504/723 - loss 0.00635550 - time (sec): 42.80 - samples/sec: 2904.28 - lr: 0.000002 - momentum: 0.000000
2023-10-14 12:08:57,278 epoch 10 - iter 576/723 - loss 0.00666718 - time (sec): 48.52 - samples/sec: 2915.73 - lr: 0.000001 - momentum: 0.000000
2023-10-14 12:09:02,977 epoch 10 - iter 648/723 - loss 0.00681012 - time (sec): 54.22 - samples/sec: 2914.70 - lr: 0.000001 - momentum: 0.000000
2023-10-14 12:09:08,726 epoch 10 - iter 720/723 - loss 0.00652961 - time (sec): 59.97 - samples/sec: 2927.48 - lr: 0.000000 - momentum: 0.000000
2023-10-14 12:09:09,024 ----------------------------------------------------------------------------------------------------
2023-10-14 12:09:09,024 EPOCH 10 done: loss 0.0065 - lr: 0.000000
2023-10-14 12:09:12,568 DEV : loss 0.18370041251182556 - f1-score (micro avg) 0.8102
2023-10-14 12:09:12,998 ----------------------------------------------------------------------------------------------------
2023-10-14 12:09:12,999 Loading model from best epoch ...
2023-10-14 12:09:14,567 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
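The 13 tags above follow the BIOES scheme: `O` for non-entities, `S-` for single-token entities, and `B-`/`I-`/`E-` for the begin/inside/end tokens of multi-token entities. A rough sketch of how such a tag sequence maps back to entity spans (a simplified decoder for illustration, not Flair's internal implementation):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence to (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":            # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":          # entity begins here
            start = i
        elif prefix == "E" and start is not None:  # entity ends here
            spans.append((label, start, i + 1))
            start = None
    return spans

# e.g. ["B-PER", "E-PER", "O", "S-LOC"] -> [("PER", 0, 2), ("LOC", 3, 4)]
```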
2023-10-14 12:09:17,806
Results:
- F-score (micro) 0.7924
- F-score (macro) 0.6936
- Accuracy 0.6775
By class:

              precision    recall  f1-score   support

         PER     0.7451    0.8610    0.7988       482
         LOC     0.8682    0.8057    0.8358       458
         ORG     0.4754    0.4203    0.4462        69

   micro avg     0.7795    0.8057    0.7924      1009
   macro avg     0.6962    0.6957    0.6936      1009
weighted avg     0.7825    0.8057    0.7915      1009
2023-10-14 12:09:17,806 ----------------------------------------------------------------------------------------------------
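As a sanity check, the micro F-score reported above is the harmonic mean of the micro-average precision and recall from the per-class table:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg from the table: precision 0.7795, recall 0.8057
print(round(f1(0.7795, 0.8057), 4))  # -> 0.7924, the reported micro F-score
```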