2023-10-16 18:57:47,070 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Train: 1166 sentences
2023-10-16 18:57:47,071 (train_with_dev=False, train_with_test=False)
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Training Params:
2023-10-16 18:57:47,071 - learning_rate: "5e-05"
2023-10-16 18:57:47,071 - mini_batch_size: "8"
2023-10-16 18:57:47,071 - max_epochs: "10"
2023-10-16 18:57:47,071 - shuffle: "True"
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Plugins:
2023-10-16 18:57:47,071 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
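The `lr` column in the iteration lines below follows the `LinearScheduler` plugin declared above: a linear warmup over the first 10% of updates (here 146 of the 10 × 146 = 1460 total steps, i.e. exactly epoch 1), then linear decay to zero. A minimal stdlib sketch of that schedule (a reconstruction from the logged values, not Flair's actual scheduler code):

```python
def linear_lr(step, total_steps=1460, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (per optimizer step)."""
    warmup = int(total_steps * warmup_fraction)  # 146 steps with these settings
    if step < warmup:
        return peak_lr * step / warmup
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup))

# Matches the log: lr ~ 0.000004 at epoch 1, iter 14; peak 0.00005 entering
# epoch 2; 0.0 at the final step of epoch 10.
```

The `momentum: 0.000000` entries are consistent with this: fine-tuning uses AdamW-style optimization, so classical momentum is simply reported as zero.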
2023-10-16 18:57:47,071 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:57:47,071 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Computation:
2023-10-16 18:57:47,071 - compute on device: cuda:0
2023-10-16 18:57:47,071 - embedding storage: none
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 18:57:47,072 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,072 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:48,866 epoch 1 - iter 14/146 - loss 2.85275888 - time (sec): 1.79 - samples/sec: 2393.23 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:57:50,298 epoch 1 - iter 28/146 - loss 2.58461077 - time (sec): 3.23 - samples/sec: 2720.72 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:57:51,822 epoch 1 - iter 42/146 - loss 1.93686695 - time (sec): 4.75 - samples/sec: 2838.89 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:57:53,169 epoch 1 - iter 56/146 - loss 1.60748031 - time (sec): 6.10 - samples/sec: 2902.19 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:57:54,313 epoch 1 - iter 70/146 - loss 1.43716847 - time (sec): 7.24 - samples/sec: 2929.14 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:57:55,684 epoch 1 - iter 84/146 - loss 1.28923078 - time (sec): 8.61 - samples/sec: 2982.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:57:57,034 epoch 1 - iter 98/146 - loss 1.16652726 - time (sec): 9.96 - samples/sec: 3025.72 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:57:58,553 epoch 1 - iter 112/146 - loss 1.06810315 - time (sec): 11.48 - samples/sec: 2995.07 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:57:59,899 epoch 1 - iter 126/146 - loss 0.99228364 - time (sec): 12.83 - samples/sec: 2985.10 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:58:01,226 epoch 1 - iter 140/146 - loss 0.92536534 - time (sec): 14.15 - samples/sec: 2991.18 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:58:01,941 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:01,941 EPOCH 1 done: loss 0.9018 - lr: 0.000048
2023-10-16 18:58:02,793 DEV : loss 0.21601726114749908 - f1-score (micro avg) 0.4723
2023-10-16 18:58:02,797 saving best model
2023-10-16 18:58:03,162 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:04,713 epoch 2 - iter 14/146 - loss 0.25471801 - time (sec): 1.55 - samples/sec: 3359.13 - lr: 0.000050 - momentum: 0.000000
2023-10-16 18:58:06,065 epoch 2 - iter 28/146 - loss 0.22264246 - time (sec): 2.90 - samples/sec: 3275.91 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:58:07,595 epoch 2 - iter 42/146 - loss 0.22382866 - time (sec): 4.43 - samples/sec: 3045.51 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:58:08,980 epoch 2 - iter 56/146 - loss 0.23555787 - time (sec): 5.82 - samples/sec: 3090.55 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:58:10,657 epoch 2 - iter 70/146 - loss 0.23862866 - time (sec): 7.49 - samples/sec: 3061.24 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:58:12,056 epoch 2 - iter 84/146 - loss 0.23544150 - time (sec): 8.89 - samples/sec: 3083.02 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:58:13,142 epoch 2 - iter 98/146 - loss 0.22702864 - time (sec): 9.98 - samples/sec: 3111.58 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:58:14,334 epoch 2 - iter 112/146 - loss 0.22689614 - time (sec): 11.17 - samples/sec: 3094.05 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:58:15,689 epoch 2 - iter 126/146 - loss 0.21684551 - time (sec): 12.53 - samples/sec: 3119.01 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:58:16,942 epoch 2 - iter 140/146 - loss 0.21064804 - time (sec): 13.78 - samples/sec: 3109.01 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:58:17,445 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:17,446 EPOCH 2 done: loss 0.2084 - lr: 0.000045
2023-10-16 18:58:18,868 DEV : loss 0.12920989096164703 - f1-score (micro avg) 0.6143
2023-10-16 18:58:18,873 saving best model
2023-10-16 18:58:19,350 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:20,707 epoch 3 - iter 14/146 - loss 0.13862225 - time (sec): 1.36 - samples/sec: 2920.89 - lr: 0.000044 - momentum: 0.000000
2023-10-16 18:58:22,006 epoch 3 - iter 28/146 - loss 0.14338984 - time (sec): 2.65 - samples/sec: 3137.06 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:58:23,500 epoch 3 - iter 42/146 - loss 0.12008848 - time (sec): 4.15 - samples/sec: 3074.81 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:58:24,920 epoch 3 - iter 56/146 - loss 0.12313137 - time (sec): 5.57 - samples/sec: 3057.58 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:58:26,459 epoch 3 - iter 70/146 - loss 0.11907608 - time (sec): 7.11 - samples/sec: 3064.78 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:58:27,921 epoch 3 - iter 84/146 - loss 0.11172872 - time (sec): 8.57 - samples/sec: 3038.33 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:58:29,406 epoch 3 - iter 98/146 - loss 0.11976382 - time (sec): 10.05 - samples/sec: 2997.42 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:58:30,755 epoch 3 - iter 112/146 - loss 0.12122439 - time (sec): 11.40 - samples/sec: 2997.06 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:58:32,142 epoch 3 - iter 126/146 - loss 0.11649488 - time (sec): 12.79 - samples/sec: 3002.35 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:58:33,356 epoch 3 - iter 140/146 - loss 0.11320558 - time (sec): 14.00 - samples/sec: 3005.59 - lr: 0.000039 - momentum: 0.000000
2023-10-16 18:58:34,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:34,158 EPOCH 3 done: loss 0.1136 - lr: 0.000039
2023-10-16 18:58:35,350 DEV : loss 0.11097574234008789 - f1-score (micro avg) 0.6785
2023-10-16 18:58:35,354 saving best model
2023-10-16 18:58:35,838 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:37,199 epoch 4 - iter 14/146 - loss 0.08881171 - time (sec): 1.36 - samples/sec: 3127.00 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:58:38,665 epoch 4 - iter 28/146 - loss 0.07105259 - time (sec): 2.83 - samples/sec: 2892.67 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:58:40,007 epoch 4 - iter 42/146 - loss 0.07224152 - time (sec): 4.17 - samples/sec: 2928.82 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:58:41,353 epoch 4 - iter 56/146 - loss 0.06886082 - time (sec): 5.51 - samples/sec: 2884.09 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:58:43,046 epoch 4 - iter 70/146 - loss 0.06718119 - time (sec): 7.21 - samples/sec: 2985.23 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:58:44,340 epoch 4 - iter 84/146 - loss 0.06805473 - time (sec): 8.50 - samples/sec: 2977.97 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:58:45,622 epoch 4 - iter 98/146 - loss 0.06783253 - time (sec): 9.78 - samples/sec: 2998.88 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:58:47,049 epoch 4 - iter 112/146 - loss 0.07079343 - time (sec): 11.21 - samples/sec: 3008.91 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:58:48,589 epoch 4 - iter 126/146 - loss 0.07149100 - time (sec): 12.75 - samples/sec: 3001.32 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:58:50,248 epoch 4 - iter 140/146 - loss 0.07007930 - time (sec): 14.41 - samples/sec: 2977.64 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:58:50,724 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:50,724 EPOCH 4 done: loss 0.0705 - lr: 0.000034
2023-10-16 18:58:51,927 DEV : loss 0.12358613312244415 - f1-score (micro avg) 0.7131
2023-10-16 18:58:51,931 saving best model
2023-10-16 18:58:52,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:53,904 epoch 5 - iter 14/146 - loss 0.04490097 - time (sec): 1.52 - samples/sec: 2638.04 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:58:55,255 epoch 5 - iter 28/146 - loss 0.03883960 - time (sec): 2.87 - samples/sec: 2837.28 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:58:57,007 epoch 5 - iter 42/146 - loss 0.05208017 - time (sec): 4.62 - samples/sec: 2747.83 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:58:58,377 epoch 5 - iter 56/146 - loss 0.04547796 - time (sec): 5.99 - samples/sec: 2890.39 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:58:59,661 epoch 5 - iter 70/146 - loss 0.05042304 - time (sec): 7.28 - samples/sec: 2956.50 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:59:00,782 epoch 5 - iter 84/146 - loss 0.05143826 - time (sec): 8.40 - samples/sec: 3006.47 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:59:02,276 epoch 5 - iter 98/146 - loss 0.05076338 - time (sec): 9.89 - samples/sec: 3023.21 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:59:03,691 epoch 5 - iter 112/146 - loss 0.04889364 - time (sec): 11.31 - samples/sec: 3019.47 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:59:05,074 epoch 5 - iter 126/146 - loss 0.04787092 - time (sec): 12.69 - samples/sec: 3015.71 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:59:06,762 epoch 5 - iter 140/146 - loss 0.04743290 - time (sec): 14.38 - samples/sec: 2984.12 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:59:07,242 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:07,243 EPOCH 5 done: loss 0.0476 - lr: 0.000028
2023-10-16 18:59:08,648 DEV : loss 0.13587994873523712 - f1-score (micro avg) 0.7331
2023-10-16 18:59:08,653 saving best model
2023-10-16 18:59:09,133 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:10,622 epoch 6 - iter 14/146 - loss 0.02716840 - time (sec): 1.49 - samples/sec: 2952.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:59:12,060 epoch 6 - iter 28/146 - loss 0.03073166 - time (sec): 2.92 - samples/sec: 2977.07 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:59:13,474 epoch 6 - iter 42/146 - loss 0.02853045 - time (sec): 4.34 - samples/sec: 3024.73 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:59:14,809 epoch 6 - iter 56/146 - loss 0.02737164 - time (sec): 5.67 - samples/sec: 3027.74 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:59:16,593 epoch 6 - iter 70/146 - loss 0.02631676 - time (sec): 7.46 - samples/sec: 2988.27 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:59:17,972 epoch 6 - iter 84/146 - loss 0.02888977 - time (sec): 8.84 - samples/sec: 2996.66 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:59:19,111 epoch 6 - iter 98/146 - loss 0.02768274 - time (sec): 9.98 - samples/sec: 3015.19 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:59:20,669 epoch 6 - iter 112/146 - loss 0.03162461 - time (sec): 11.53 - samples/sec: 3016.17 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:59:21,841 epoch 6 - iter 126/146 - loss 0.03177464 - time (sec): 12.71 - samples/sec: 2998.90 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:59:23,327 epoch 6 - iter 140/146 - loss 0.03314290 - time (sec): 14.19 - samples/sec: 3002.25 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:59:23,949 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:23,949 EPOCH 6 done: loss 0.0327 - lr: 0.000023
2023-10-16 18:59:25,151 DEV : loss 0.15191714465618134 - f1-score (micro avg) 0.7207
2023-10-16 18:59:25,155 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:26,988 epoch 7 - iter 14/146 - loss 0.03275114 - time (sec): 1.83 - samples/sec: 3054.97 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:59:28,329 epoch 7 - iter 28/146 - loss 0.02638257 - time (sec): 3.17 - samples/sec: 3053.94 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:59:29,630 epoch 7 - iter 42/146 - loss 0.02770277 - time (sec): 4.47 - samples/sec: 3068.58 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:59:31,036 epoch 7 - iter 56/146 - loss 0.02917022 - time (sec): 5.88 - samples/sec: 3057.27 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:59:32,414 epoch 7 - iter 70/146 - loss 0.03103427 - time (sec): 7.26 - samples/sec: 2947.06 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:59:33,978 epoch 7 - iter 84/146 - loss 0.02980021 - time (sec): 8.82 - samples/sec: 2954.29 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:59:35,178 epoch 7 - iter 98/146 - loss 0.02806417 - time (sec): 10.02 - samples/sec: 2983.95 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:59:36,697 epoch 7 - iter 112/146 - loss 0.03003321 - time (sec): 11.54 - samples/sec: 2958.78 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:59:37,886 epoch 7 - iter 126/146 - loss 0.02851813 - time (sec): 12.73 - samples/sec: 3015.31 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:59:39,499 epoch 7 - iter 140/146 - loss 0.02688230 - time (sec): 14.34 - samples/sec: 2989.65 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:59:40,113 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:40,113 EPOCH 7 done: loss 0.0266 - lr: 0.000017
2023-10-16 18:59:41,315 DEV : loss 0.15551069378852844 - f1-score (micro avg) 0.7387
2023-10-16 18:59:41,319 saving best model
2023-10-16 18:59:41,779 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:43,142 epoch 8 - iter 14/146 - loss 0.01842562 - time (sec): 1.36 - samples/sec: 3279.69 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:59:44,490 epoch 8 - iter 28/146 - loss 0.01534966 - time (sec): 2.71 - samples/sec: 3154.00 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:59:45,797 epoch 8 - iter 42/146 - loss 0.01695812 - time (sec): 4.01 - samples/sec: 3039.07 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:59:47,255 epoch 8 - iter 56/146 - loss 0.01512349 - time (sec): 5.47 - samples/sec: 3093.51 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:59:48,604 epoch 8 - iter 70/146 - loss 0.01497165 - time (sec): 6.82 - samples/sec: 3057.89 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:59:50,071 epoch 8 - iter 84/146 - loss 0.01522721 - time (sec): 8.29 - samples/sec: 3099.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:59:51,646 epoch 8 - iter 98/146 - loss 0.01645300 - time (sec): 9.86 - samples/sec: 3000.10 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:59:52,852 epoch 8 - iter 112/146 - loss 0.01726818 - time (sec): 11.07 - samples/sec: 3021.29 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:59:54,575 epoch 8 - iter 126/146 - loss 0.01719019 - time (sec): 12.79 - samples/sec: 2981.47 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:59:55,955 epoch 8 - iter 140/146 - loss 0.01826140 - time (sec): 14.17 - samples/sec: 3006.41 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:59:56,538 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:56,538 EPOCH 8 done: loss 0.0179 - lr: 0.000012
2023-10-16 18:59:57,914 DEV : loss 0.15476062893867493 - f1-score (micro avg) 0.7441
2023-10-16 18:59:57,919 saving best model
2023-10-16 18:59:58,377 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:59,944 epoch 9 - iter 14/146 - loss 0.00651781 - time (sec): 1.57 - samples/sec: 3065.95 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:00:01,273 epoch 9 - iter 28/146 - loss 0.00630319 - time (sec): 2.90 - samples/sec: 3034.00 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:00:02,679 epoch 9 - iter 42/146 - loss 0.01064903 - time (sec): 4.30 - samples/sec: 3030.33 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:00:04,115 epoch 9 - iter 56/146 - loss 0.01097113 - time (sec): 5.74 - samples/sec: 3054.11 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:00:05,828 epoch 9 - iter 70/146 - loss 0.01132419 - time (sec): 7.45 - samples/sec: 3062.43 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:00:07,108 epoch 9 - iter 84/146 - loss 0.01113241 - time (sec): 8.73 - samples/sec: 3035.50 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:00:08,428 epoch 9 - iter 98/146 - loss 0.01193416 - time (sec): 10.05 - samples/sec: 2998.95 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:00:09,859 epoch 9 - iter 112/146 - loss 0.01118978 - time (sec): 11.48 - samples/sec: 3003.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:00:11,211 epoch 9 - iter 126/146 - loss 0.01259782 - time (sec): 12.83 - samples/sec: 3007.55 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:00:12,764 epoch 9 - iter 140/146 - loss 0.01261505 - time (sec): 14.39 - samples/sec: 2984.20 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:00:13,300 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:13,300 EPOCH 9 done: loss 0.0135 - lr: 0.000006
2023-10-16 19:00:14,540 DEV : loss 0.16921888291835785 - f1-score (micro avg) 0.7426
2023-10-16 19:00:14,544 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:15,911 epoch 10 - iter 14/146 - loss 0.00550706 - time (sec): 1.37 - samples/sec: 2894.73 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:00:17,391 epoch 10 - iter 28/146 - loss 0.00941868 - time (sec): 2.85 - samples/sec: 2962.71 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:00:18,928 epoch 10 - iter 42/146 - loss 0.01303489 - time (sec): 4.38 - samples/sec: 2955.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:00:20,430 epoch 10 - iter 56/146 - loss 0.01123757 - time (sec): 5.88 - samples/sec: 3078.55 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:00:22,022 epoch 10 - iter 70/146 - loss 0.01121363 - time (sec): 7.48 - samples/sec: 3031.28 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:00:23,415 epoch 10 - iter 84/146 - loss 0.01021310 - time (sec): 8.87 - samples/sec: 3034.98 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:00:24,791 epoch 10 - iter 98/146 - loss 0.01085500 - time (sec): 10.25 - samples/sec: 2977.25 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:00:26,176 epoch 10 - iter 112/146 - loss 0.01012481 - time (sec): 11.63 - samples/sec: 2988.53 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:00:27,362 epoch 10 - iter 126/146 - loss 0.01085383 - time (sec): 12.82 - samples/sec: 3013.29 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:00:28,767 epoch 10 - iter 140/146 - loss 0.01079744 - time (sec): 14.22 - samples/sec: 3012.17 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:00:29,277 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:29,277 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-16 19:00:30,484 DEV : loss 0.15949256718158722 - f1-score (micro avg) 0.7666
2023-10-16 19:00:30,488 saving best model
2023-10-16 19:00:31,322 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:31,323 Loading model from best epoch ...
2023-10-16 19:00:32,795 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
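The 17-tag dictionary is a BIOES encoding: `O` plus `S`(ingle), `B`(egin), `I`(nside), `E`(nd) prefixes for each of the four entity types (LOC, PER, ORG, HumanProd), giving 1 + 4 × 4 = 17 tags. A minimal decoder sketch showing how such tag sequences map back to entity spans (`bioes_to_spans` is a hypothetical helper, not Flair's internal implementation):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence to (start, end, label) spans, inclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((i, i, label))
        elif prefix == "B":               # entity opens here
            start = (i, label)
        elif prefix == "E" and start and start[1] == label:
            spans.append((start[0], i, label))  # entity closes here
        # "I" tags just continue an open entity
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER"]))
# → [(1, 1, 'LOC'), (2, 4, 'PER')]
```

Span-level evaluation (the F-scores reported below) counts a prediction as correct only if the decoded span boundaries and label both match the gold span.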
2023-10-16 19:00:35,112
Results:
- F-score (micro) 0.745
- F-score (macro) 0.6798
- Accuracy 0.618
By class:
                precision    recall  f1-score   support

          PER      0.7778    0.8448    0.8099       348
          LOC      0.6480    0.7969    0.7148       261
          ORG      0.4694    0.4423    0.4554        52
    HumanProd      0.7083    0.7727    0.7391        22

    micro avg      0.7021    0.7936    0.7450       683
    macro avg      0.6509    0.7142    0.6798       683
 weighted avg      0.7025    0.7936    0.7443       683
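The three averages in the table follow the standard aggregation rules: micro averaging pools all spans before computing F1, macro averaging takes the unweighted mean of per-class F1, and weighted averaging weights each class by its support. A quick consistency check recomputed from the rounded table values (the last digit can differ slightly from the log because of rounding):

```python
# (precision, recall, support) per class, from the table above
per_class = {
    "PER":       (0.7778, 0.8448, 348),
    "LOC":       (0.6480, 0.7969, 261),
    "ORG":       (0.4694, 0.4423, 52),
    "HumanProd": (0.7083, 0.7727, 22),
}

def f1(p, r):
    return 2 * p * r / (p + r)

total = sum(s for _, _, s in per_class.values())                       # 683
macro_f1 = sum(f1(p, r) for p, r, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f1(p, r) * s for p, r, s in per_class.values()) / total
micro_f1 = f1(0.7021, 0.7936)  # from the micro-avg precision/recall row

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

All three reproduce the reported 0.6798 macro, 0.7443 weighted, and 0.7450 micro F-scores to within rounding.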
2023-10-16 19:00:35,112 ----------------------------------------------------------------------------------------------------