2023-10-16 18:57:47,070 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
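As a side note (not part of the original log): the `Embedding(num, dim)` shapes printed above determine their parameter counts directly, since an embedding table holds `num * dim` weights. A minimal sketch to check the embedding-layer sizes:

```python
# Parameter counts implied by the Embedding(num_embeddings, dim) lines above.
def embedding_params(num_embeddings: int, dim: int) -> int:
    return num_embeddings * dim

word = embedding_params(32001, 768)        # (word_embeddings)
position = embedding_params(512, 768)      # (position_embeddings)
token_type = embedding_params(2, 768)      # (token_type_embeddings)

print(word)                                # 24576768
print(word + position + token_type)        # 24971520
```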
2023-10-16 18:57:47,071 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Train: 1166 sentences
2023-10-16 18:57:47,071 (train_with_dev=False, train_with_test=False)
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Training Params:
2023-10-16 18:57:47,071 - learning_rate: "5e-05"
2023-10-16 18:57:47,071 - mini_batch_size: "8"
2023-10-16 18:57:47,071 - max_epochs: "10"
2023-10-16 18:57:47,071 - shuffle: "True"
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
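A quick arithmetic check (a sketch, not part of the original log): the `iter .../146` counts in the per-epoch progress lines follow from the 1166 training sentences and `mini_batch_size: 8` above.

```python
import math

train_sentences = 1166
mini_batch_size = 8

# Last batch is partial, so the step count rounds up.
steps_per_epoch = math.ceil(train_sentences / mini_batch_size)
print(steps_per_epoch)  # 146, matching the "iter x/146" progress lines
```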
2023-10-16 18:57:47,071 Plugins:
2023-10-16 18:57:47,071 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
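The `lr:` values in the progress lines (ramping up to 5e-05 during epoch 1, then decaying toward 0) are consistent with a linear schedule with `warmup_fraction: 0.1`. A minimal sketch of that shape, assuming the common warmup-then-linear-decay formula (the exact per-step values in the log may differ by an off-by-one depending on the scheduler implementation):

```python
def linear_lr(step: int, total_steps: int, peak_lr: float,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (a common scheme)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 146 * 10   # 146 steps/epoch * 10 epochs
peak = 5e-05
print(linear_lr(146, total, peak))    # peak lr right at the end of warmup
print(linear_lr(total, total, peak))  # 0.0 at the final step
```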
2023-10-16 18:57:47,071 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:57:47,071 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Computation:
2023-10-16 18:57:47,071 - compute on device: cuda:0
2023-10-16 18:57:47,071 - embedding storage: none
2023-10-16 18:57:47,071 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,071 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 18:57:47,072 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:47,072 ----------------------------------------------------------------------------------------------------
2023-10-16 18:57:48,866 epoch 1 - iter 14/146 - loss 2.85275888 - time (sec): 1.79 - samples/sec: 2393.23 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:57:50,298 epoch 1 - iter 28/146 - loss 2.58461077 - time (sec): 3.23 - samples/sec: 2720.72 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:57:51,822 epoch 1 - iter 42/146 - loss 1.93686695 - time (sec): 4.75 - samples/sec: 2838.89 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:57:53,169 epoch 1 - iter 56/146 - loss 1.60748031 - time (sec): 6.10 - samples/sec: 2902.19 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:57:54,313 epoch 1 - iter 70/146 - loss 1.43716847 - time (sec): 7.24 - samples/sec: 2929.14 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:57:55,684 epoch 1 - iter 84/146 - loss 1.28923078 - time (sec): 8.61 - samples/sec: 2982.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:57:57,034 epoch 1 - iter 98/146 - loss 1.16652726 - time (sec): 9.96 - samples/sec: 3025.72 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:57:58,553 epoch 1 - iter 112/146 - loss 1.06810315 - time (sec): 11.48 - samples/sec: 2995.07 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:57:59,899 epoch 1 - iter 126/146 - loss 0.99228364 - time (sec): 12.83 - samples/sec: 2985.10 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:58:01,226 epoch 1 - iter 140/146 - loss 0.92536534 - time (sec): 14.15 - samples/sec: 2991.18 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:58:01,941 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:01,941 EPOCH 1 done: loss 0.9018 - lr: 0.000048
2023-10-16 18:58:02,793 DEV : loss 0.21601726114749908 - f1-score (micro avg)  0.4723
2023-10-16 18:58:02,797 saving best model
2023-10-16 18:58:03,162 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:04,713 epoch 2 - iter 14/146 - loss 0.25471801 - time (sec): 1.55 - samples/sec: 3359.13 - lr: 0.000050 - momentum: 0.000000
2023-10-16 18:58:06,065 epoch 2 - iter 28/146 - loss 0.22264246 - time (sec): 2.90 - samples/sec: 3275.91 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:58:07,595 epoch 2 - iter 42/146 - loss 0.22382866 - time (sec): 4.43 - samples/sec: 3045.51 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:58:08,980 epoch 2 - iter 56/146 - loss 0.23555787 - time (sec): 5.82 - samples/sec: 3090.55 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:58:10,657 epoch 2 - iter 70/146 - loss 0.23862866 - time (sec): 7.49 - samples/sec: 3061.24 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:58:12,056 epoch 2 - iter 84/146 - loss 0.23544150 - time (sec): 8.89 - samples/sec: 3083.02 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:58:13,142 epoch 2 - iter 98/146 - loss 0.22702864 - time (sec): 9.98 - samples/sec: 3111.58 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:58:14,334 epoch 2 - iter 112/146 - loss 0.22689614 - time (sec): 11.17 - samples/sec: 3094.05 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:58:15,689 epoch 2 - iter 126/146 - loss 0.21684551 - time (sec): 12.53 - samples/sec: 3119.01 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:58:16,942 epoch 2 - iter 140/146 - loss 0.21064804 - time (sec): 13.78 - samples/sec: 3109.01 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:58:17,445 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:17,446 EPOCH 2 done: loss 0.2084 - lr: 0.000045
2023-10-16 18:58:18,868 DEV : loss 0.12920989096164703 - f1-score (micro avg)  0.6143
2023-10-16 18:58:18,873 saving best model
2023-10-16 18:58:19,350 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:20,707 epoch 3 - iter 14/146 - loss 0.13862225 - time (sec): 1.36 - samples/sec: 2920.89 - lr: 0.000044 - momentum: 0.000000
2023-10-16 18:58:22,006 epoch 3 - iter 28/146 - loss 0.14338984 - time (sec): 2.65 - samples/sec: 3137.06 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:58:23,500 epoch 3 - iter 42/146 - loss 0.12008848 - time (sec): 4.15 - samples/sec: 3074.81 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:58:24,920 epoch 3 - iter 56/146 - loss 0.12313137 - time (sec): 5.57 - samples/sec: 3057.58 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:58:26,459 epoch 3 - iter 70/146 - loss 0.11907608 - time (sec): 7.11 - samples/sec: 3064.78 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:58:27,921 epoch 3 - iter 84/146 - loss 0.11172872 - time (sec): 8.57 - samples/sec: 3038.33 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:58:29,406 epoch 3 - iter 98/146 - loss 0.11976382 - time (sec): 10.05 - samples/sec: 2997.42 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:58:30,755 epoch 3 - iter 112/146 - loss 0.12122439 - time (sec): 11.40 - samples/sec: 2997.06 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:58:32,142 epoch 3 - iter 126/146 - loss 0.11649488 - time (sec): 12.79 - samples/sec: 3002.35 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:58:33,356 epoch 3 - iter 140/146 - loss 0.11320558 - time (sec): 14.00 - samples/sec: 3005.59 - lr: 0.000039 - momentum: 0.000000
2023-10-16 18:58:34,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:34,158 EPOCH 3 done: loss 0.1136 - lr: 0.000039
2023-10-16 18:58:35,350 DEV : loss 0.11097574234008789 - f1-score (micro avg)  0.6785
2023-10-16 18:58:35,354 saving best model
2023-10-16 18:58:35,838 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:37,199 epoch 4 - iter 14/146 - loss 0.08881171 - time (sec): 1.36 - samples/sec: 3127.00 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:58:38,665 epoch 4 - iter 28/146 - loss 0.07105259 - time (sec): 2.83 - samples/sec: 2892.67 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:58:40,007 epoch 4 - iter 42/146 - loss 0.07224152 - time (sec): 4.17 - samples/sec: 2928.82 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:58:41,353 epoch 4 - iter 56/146 - loss 0.06886082 - time (sec): 5.51 - samples/sec: 2884.09 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:58:43,046 epoch 4 - iter 70/146 - loss 0.06718119 - time (sec): 7.21 - samples/sec: 2985.23 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:58:44,340 epoch 4 - iter 84/146 - loss 0.06805473 - time (sec): 8.50 - samples/sec: 2977.97 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:58:45,622 epoch 4 - iter 98/146 - loss 0.06783253 - time (sec): 9.78 - samples/sec: 2998.88 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:58:47,049 epoch 4 - iter 112/146 - loss 0.07079343 - time (sec): 11.21 - samples/sec: 3008.91 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:58:48,589 epoch 4 - iter 126/146 - loss 0.07149100 - time (sec): 12.75 - samples/sec: 3001.32 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:58:50,248 epoch 4 - iter 140/146 - loss 0.07007930 - time (sec): 14.41 - samples/sec: 2977.64 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:58:50,724 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:50,724 EPOCH 4 done: loss 0.0705 - lr: 0.000034
2023-10-16 18:58:51,927 DEV : loss 0.12358613312244415 - f1-score (micro avg)  0.7131
2023-10-16 18:58:51,931 saving best model
2023-10-16 18:58:52,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:58:53,904 epoch 5 - iter 14/146 - loss 0.04490097 - time (sec): 1.52 - samples/sec: 2638.04 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:58:55,255 epoch 5 - iter 28/146 - loss 0.03883960 - time (sec): 2.87 - samples/sec: 2837.28 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:58:57,007 epoch 5 - iter 42/146 - loss 0.05208017 - time (sec): 4.62 - samples/sec: 2747.83 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:58:58,377 epoch 5 - iter 56/146 - loss 0.04547796 - time (sec): 5.99 - samples/sec: 2890.39 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:58:59,661 epoch 5 - iter 70/146 - loss 0.05042304 - time (sec): 7.28 - samples/sec: 2956.50 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:59:00,782 epoch 5 - iter 84/146 - loss 0.05143826 - time (sec): 8.40 - samples/sec: 3006.47 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:59:02,276 epoch 5 - iter 98/146 - loss 0.05076338 - time (sec): 9.89 - samples/sec: 3023.21 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:59:03,691 epoch 5 - iter 112/146 - loss 0.04889364 - time (sec): 11.31 - samples/sec: 3019.47 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:59:05,074 epoch 5 - iter 126/146 - loss 0.04787092 - time (sec): 12.69 - samples/sec: 3015.71 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:59:06,762 epoch 5 - iter 140/146 - loss 0.04743290 - time (sec): 14.38 - samples/sec: 2984.12 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:59:07,242 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:07,243 EPOCH 5 done: loss 0.0476 - lr: 0.000028
2023-10-16 18:59:08,648 DEV : loss 0.13587994873523712 - f1-score (micro avg)  0.7331
2023-10-16 18:59:08,653 saving best model
2023-10-16 18:59:09,133 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:10,622 epoch 6 - iter 14/146 - loss 0.02716840 - time (sec): 1.49 - samples/sec: 2952.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:59:12,060 epoch 6 - iter 28/146 - loss 0.03073166 - time (sec): 2.92 - samples/sec: 2977.07 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:59:13,474 epoch 6 - iter 42/146 - loss 0.02853045 - time (sec): 4.34 - samples/sec: 3024.73 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:59:14,809 epoch 6 - iter 56/146 - loss 0.02737164 - time (sec): 5.67 - samples/sec: 3027.74 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:59:16,593 epoch 6 - iter 70/146 - loss 0.02631676 - time (sec): 7.46 - samples/sec: 2988.27 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:59:17,972 epoch 6 - iter 84/146 - loss 0.02888977 - time (sec): 8.84 - samples/sec: 2996.66 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:59:19,111 epoch 6 - iter 98/146 - loss 0.02768274 - time (sec): 9.98 - samples/sec: 3015.19 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:59:20,669 epoch 6 - iter 112/146 - loss 0.03162461 - time (sec): 11.53 - samples/sec: 3016.17 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:59:21,841 epoch 6 - iter 126/146 - loss 0.03177464 - time (sec): 12.71 - samples/sec: 2998.90 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:59:23,327 epoch 6 - iter 140/146 - loss 0.03314290 - time (sec): 14.19 - samples/sec: 3002.25 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:59:23,949 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:23,949 EPOCH 6 done: loss 0.0327 - lr: 0.000023
2023-10-16 18:59:25,151 DEV : loss 0.15191714465618134 - f1-score (micro avg)  0.7207
2023-10-16 18:59:25,155 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:26,988 epoch 7 - iter 14/146 - loss 0.03275114 - time (sec): 1.83 - samples/sec: 3054.97 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:59:28,329 epoch 7 - iter 28/146 - loss 0.02638257 - time (sec): 3.17 - samples/sec: 3053.94 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:59:29,630 epoch 7 - iter 42/146 - loss 0.02770277 - time (sec): 4.47 - samples/sec: 3068.58 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:59:31,036 epoch 7 - iter 56/146 - loss 0.02917022 - time (sec): 5.88 - samples/sec: 3057.27 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:59:32,414 epoch 7 - iter 70/146 - loss 0.03103427 - time (sec): 7.26 - samples/sec: 2947.06 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:59:33,978 epoch 7 - iter 84/146 - loss 0.02980021 - time (sec): 8.82 - samples/sec: 2954.29 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:59:35,178 epoch 7 - iter 98/146 - loss 0.02806417 - time (sec): 10.02 - samples/sec: 2983.95 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:59:36,697 epoch 7 - iter 112/146 - loss 0.03003321 - time (sec): 11.54 - samples/sec: 2958.78 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:59:37,886 epoch 7 - iter 126/146 - loss 0.02851813 - time (sec): 12.73 - samples/sec: 3015.31 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:59:39,499 epoch 7 - iter 140/146 - loss 0.02688230 - time (sec): 14.34 - samples/sec: 2989.65 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:59:40,113 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:40,113 EPOCH 7 done: loss 0.0266 - lr: 0.000017
2023-10-16 18:59:41,315 DEV : loss 0.15551069378852844 - f1-score (micro avg)  0.7387
2023-10-16 18:59:41,319 saving best model
2023-10-16 18:59:41,779 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:43,142 epoch 8 - iter 14/146 - loss 0.01842562 - time (sec): 1.36 - samples/sec: 3279.69 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:59:44,490 epoch 8 - iter 28/146 - loss 0.01534966 - time (sec): 2.71 - samples/sec: 3154.00 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:59:45,797 epoch 8 - iter 42/146 - loss 0.01695812 - time (sec): 4.01 - samples/sec: 3039.07 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:59:47,255 epoch 8 - iter 56/146 - loss 0.01512349 - time (sec): 5.47 - samples/sec: 3093.51 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:59:48,604 epoch 8 - iter 70/146 - loss 0.01497165 - time (sec): 6.82 - samples/sec: 3057.89 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:59:50,071 epoch 8 - iter 84/146 - loss 0.01522721 - time (sec): 8.29 - samples/sec: 3099.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:59:51,646 epoch 8 - iter 98/146 - loss 0.01645300 - time (sec): 9.86 - samples/sec: 3000.10 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:59:52,852 epoch 8 - iter 112/146 - loss 0.01726818 - time (sec): 11.07 - samples/sec: 3021.29 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:59:54,575 epoch 8 - iter 126/146 - loss 0.01719019 - time (sec): 12.79 - samples/sec: 2981.47 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:59:55,955 epoch 8 - iter 140/146 - loss 0.01826140 - time (sec): 14.17 - samples/sec: 3006.41 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:59:56,538 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:56,538 EPOCH 8 done: loss 0.0179 - lr: 0.000012
2023-10-16 18:59:57,914 DEV : loss 0.15476062893867493 - f1-score (micro avg)  0.7441
2023-10-16 18:59:57,919 saving best model
2023-10-16 18:59:58,377 ----------------------------------------------------------------------------------------------------
2023-10-16 18:59:59,944 epoch 9 - iter 14/146 - loss 0.00651781 - time (sec): 1.57 - samples/sec: 3065.95 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:00:01,273 epoch 9 - iter 28/146 - loss 0.00630319 - time (sec): 2.90 - samples/sec: 3034.00 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:00:02,679 epoch 9 - iter 42/146 - loss 0.01064903 - time (sec): 4.30 - samples/sec: 3030.33 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:00:04,115 epoch 9 - iter 56/146 - loss 0.01097113 - time (sec): 5.74 - samples/sec: 3054.11 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:00:05,828 epoch 9 - iter 70/146 - loss 0.01132419 - time (sec): 7.45 - samples/sec: 3062.43 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:00:07,108 epoch 9 - iter 84/146 - loss 0.01113241 - time (sec): 8.73 - samples/sec: 3035.50 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:00:08,428 epoch 9 - iter 98/146 - loss 0.01193416 - time (sec): 10.05 - samples/sec: 2998.95 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:00:09,859 epoch 9 - iter 112/146 - loss 0.01118978 - time (sec): 11.48 - samples/sec: 3003.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:00:11,211 epoch 9 - iter 126/146 - loss 0.01259782 - time (sec): 12.83 - samples/sec: 3007.55 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:00:12,764 epoch 9 - iter 140/146 - loss 0.01261505 - time (sec): 14.39 - samples/sec: 2984.20 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:00:13,300 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:13,300 EPOCH 9 done: loss 0.0135 - lr: 0.000006
2023-10-16 19:00:14,540 DEV : loss 0.16921888291835785 - f1-score (micro avg)  0.7426
2023-10-16 19:00:14,544 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:15,911 epoch 10 - iter 14/146 - loss 0.00550706 - time (sec): 1.37 - samples/sec: 2894.73 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:00:17,391 epoch 10 - iter 28/146 - loss 0.00941868 - time (sec): 2.85 - samples/sec: 2962.71 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:00:18,928 epoch 10 - iter 42/146 - loss 0.01303489 - time (sec): 4.38 - samples/sec: 2955.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:00:20,430 epoch 10 - iter 56/146 - loss 0.01123757 - time (sec): 5.88 - samples/sec: 3078.55 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:00:22,022 epoch 10 - iter 70/146 - loss 0.01121363 - time (sec): 7.48 - samples/sec: 3031.28 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:00:23,415 epoch 10 - iter 84/146 - loss 0.01021310 - time (sec): 8.87 - samples/sec: 3034.98 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:00:24,791 epoch 10 - iter 98/146 - loss 0.01085500 - time (sec): 10.25 - samples/sec: 2977.25 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:00:26,176 epoch 10 - iter 112/146 - loss 0.01012481 - time (sec): 11.63 - samples/sec: 2988.53 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:00:27,362 epoch 10 - iter 126/146 - loss 0.01085383 - time (sec): 12.82 - samples/sec: 3013.29 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:00:28,767 epoch 10 - iter 140/146 - loss 0.01079744 - time (sec): 14.22 - samples/sec: 3012.17 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:00:29,277 ----------------------------------------------------------------------------------------------------
2023-10-16 19:00:29,277 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-16 19:00:30,484 DEV : loss 0.15949256718158722 - f1-score (micro avg)  0.7666
2023-10-16 19:00:30,488 saving best model
2023-10-16 19:00:31,322 ----------------------------------------------------------------------------------------------------
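The "saving best model" lines above track the dev micro F1, since the selection metric is `('micro avg', 'f1-score')`. A minimal reconstruction of the best-epoch choice, using the per-epoch DEV scores copied from this log:

```python
# Dev micro-F1 per epoch, copied from the DEV lines of this log (epochs 1..10).
dev_f1 = [0.4723, 0.6143, 0.6785, 0.7131, 0.7331,
          0.7207, 0.7387, 0.7441, 0.7426, 0.7666]

# best-model.pt keeps the epoch with the highest dev micro F1.
best_epoch = max(range(len(dev_f1)), key=lambda i: dev_f1[i]) + 1
print(best_epoch, dev_f1[best_epoch - 1])  # 10 0.7666
```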
2023-10-16 19:00:31,323 Loading model from best epoch ...
2023-10-16 19:00:32,795 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
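The 17-tag dictionary above is the BIOES scheme over the four entity types (LOC, PER, ORG, HumanProd) plus the outside tag `O`, which also explains the `out_features=17` of the tagger's final linear layer. A quick sketch reproducing that tagset:

```python
# BIOES tagset over the four entity types in this corpus, plus "O".
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S", "B", "E", "I"]  # Single, Begin, End, Inside

tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in prefixes]
print(len(tags))  # 17
```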
2023-10-16 19:00:35,112 
Results:
- F-score (micro) 0.745
- F-score (macro) 0.6798
- Accuracy 0.618

By class:
              precision    recall  f1-score   support

         PER     0.7778    0.8448    0.8099       348
         LOC     0.6480    0.7969    0.7148       261
         ORG     0.4694    0.4423    0.4554        52
   HumanProd     0.7083    0.7727    0.7391        22

   micro avg     0.7021    0.7936    0.7450       683
   macro avg     0.6509    0.7142    0.6798       683
weighted avg     0.7025    0.7936    0.7443       683

2023-10-16 19:00:35,112 ----------------------------------------------------------------------------------------------------
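A sketch (not part of the original log) verifying how the average rows of the table follow from the per-class rows: macro F1 is the unweighted mean of the per-class F1 scores, while micro F1 is the harmonic mean of micro precision and recall (small drift is expected when recomputing from the rounded table values).

```python
# Per-class F1 scores, copied from the "By class" table above.
per_class_f1 = {"PER": 0.8099, "LOC": 0.7148, "ORG": 0.4554, "HumanProd": 0.7391}

macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(round(macro_f1, 4))  # 0.6798, matching the "macro avg" row

# Micro F1 from the printed micro precision/recall (harmonic mean).
p, r = 0.7021, 0.7936
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 3))  # 0.745, matching the "micro avg" row up to input rounding
```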