2023-11-16 03:28:06,601 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
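As a sanity check on the architecture printout above, the printed shapes imply roughly 560M parameters, in line with XLM-R large. A minimal tally (a sketch derived only from the module sizes shown, ignoring buffers; function names are illustrative):

```python
# Parameter tally from the shapes in the model printout above.
# Sketch only: derived from the printed module sizes, not from the checkpoint.
def linear(din, dout):
    # weight matrix + bias vector
    return din * dout + dout

def layer_norm(d):
    # gamma + beta
    return 2 * d

d, ffn, layers = 1024, 4096, 24

embeddings = (250003 * d       # word_embeddings
              + 514 * d        # position_embeddings
              + 1 * d          # token_type_embeddings
              + layer_norm(d))

per_layer = (4 * linear(d, d)      # query, key, value, attention output dense
             + 2 * layer_norm(d)   # self-output and output LayerNorms
             + linear(d, ffn)      # intermediate
             + linear(ffn, d))     # output

pooler = linear(d, d)
head = linear(d, 13)               # tag projection over the 13-label dictionary

total = embeddings + layers * per_layer + pooler + head
print(f"{total:,}")  # roughly 560M parameters
```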
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
 - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
 - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Train: 30000 sentences
2023-11-16 03:28:06,603 (train_with_dev=False, train_with_test=False)
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Training Params:
2023-11-16 03:28:06,604 - learning_rate: "5e-06"
2023-11-16 03:28:06,604 - mini_batch_size: "4"
2023-11-16 03:28:06,604 - max_epochs: "10"
2023-11-16 03:28:06,604 - shuffle: "True"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Plugins:
2023-11-16 03:28:06,604 - TensorboardLogger
2023-11-16 03:28:06,604 - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 03:28:06,604 - metric: "('micro avg', 'f1-score')"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Computation:
2023-11-16 03:28:06,604 - compute on device: cuda:0
2023-11-16 03:28:06,604 - embedding storage: none
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Logging anything other than scalars to TensorBoard is currently not supported.
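The lr column in the iteration lines below follows the LinearScheduler plugin with warmup_fraction 0.1: linear warmup over the first 10% of steps, then linear decay to zero, peaking at the configured 5e-06 around the end of epoch 1. A minimal sketch of that shape (an illustrative helper, not Flair's internal implementation; exact per-iteration values depend on Flair's step accounting):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Mirrors the shape implied by the log's LinearScheduler with
    warmup_fraction 0.1; the function name and step bookkeeping here
    are assumptions, not Flair's own code.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    # linear decay over the remaining 90% of training
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

# 7500 mini-batches per epoch x 10 epochs, peak lr 5e-06, as in this run
TOTAL = 7500 * 10
print(linear_schedule_lr(7500, TOTAL, 5e-06))   # peak at end of warmup (~5e-06)
print(linear_schedule_lr(75000, TOTAL, 5e-06))  # decayed to zero at the last step
```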
2023-11-16 03:29:38,193 epoch 1 - iter 750/7500 - loss 2.70469216 - time (sec): 91.59 - samples/sec: 264.85 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:31:09,213 epoch 1 - iter 1500/7500 - loss 2.24893654 - time (sec): 182.61 - samples/sec: 261.81 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:32:42,308 epoch 1 - iter 2250/7500 - loss 1.97006153 - time (sec): 275.70 - samples/sec: 260.33 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:34:16,815 epoch 1 - iter 3000/7500 - loss 1.72031860 - time (sec): 370.21 - samples/sec: 260.02 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:35:50,112 epoch 1 - iter 3750/7500 - loss 1.52308109 - time (sec): 463.51 - samples/sec: 259.42 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:37:23,760 epoch 1 - iter 4500/7500 - loss 1.36457847 - time (sec): 557.15 - samples/sec: 259.48 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:38:57,168 epoch 1 - iter 5250/7500 - loss 1.24407079 - time (sec): 650.56 - samples/sec: 259.07 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:40:28,972 epoch 1 - iter 6000/7500 - loss 1.15260515 - time (sec): 742.37 - samples/sec: 259.75 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:42:03,894 epoch 1 - iter 6750/7500 - loss 1.07519645 - time (sec): 837.29 - samples/sec: 258.95 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:43:39,060 epoch 1 - iter 7500/7500 - loss 1.01557427 - time (sec): 932.45 - samples/sec: 258.24 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:43:39,062 ----------------------------------------------------------------------------------------------------
2023-11-16 03:43:39,063 EPOCH 1 done: loss 1.0156 - lr: 0.000005
2023-11-16 03:44:06,229 DEV : loss 0.27559971809387207 - f1-score (micro avg) 0.8152
2023-11-16 03:44:08,725 saving best model
2023-11-16 03:44:10,470 ----------------------------------------------------------------------------------------------------
2023-11-16 03:45:42,474 epoch 2 - iter 750/7500 - loss 0.39106376 - time (sec): 92.00 - samples/sec: 261.03 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:47:15,847 epoch 2 - iter 1500/7500 - loss 0.40555598 - time (sec): 185.37 - samples/sec: 261.97 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:48:49,533 epoch 2 - iter 2250/7500 - loss 0.40652252 - time (sec): 279.06 - samples/sec: 260.36 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:50:24,376 epoch 2 - iter 3000/7500 - loss 0.40712357 - time (sec): 373.90 - samples/sec: 258.58 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:52:01,501 epoch 2 - iter 3750/7500 - loss 0.40345429 - time (sec): 471.03 - samples/sec: 256.65 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:53:38,242 epoch 2 - iter 4500/7500 - loss 0.40372313 - time (sec): 567.77 - samples/sec: 255.87 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:55:11,702 epoch 2 - iter 5250/7500 - loss 0.40504927 - time (sec): 661.23 - samples/sec: 255.50 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:56:44,579 epoch 2 - iter 6000/7500 - loss 0.40569421 - time (sec): 754.11 - samples/sec: 256.15 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:58:17,886 epoch 2 - iter 6750/7500 - loss 0.40571892 - time (sec): 847.41 - samples/sec: 256.18 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:59:50,847 epoch 2 - iter 7500/7500 - loss 0.40365851 - time (sec): 940.37 - samples/sec: 256.06 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:59:50,849 ----------------------------------------------------------------------------------------------------
2023-11-16 03:59:50,849 EPOCH 2 done: loss 0.4037 - lr: 0.000004
2023-11-16 04:00:17,681 DEV : loss 0.271997332572937 - f1-score (micro avg) 0.8697
2023-11-16 04:00:20,070 saving best model
2023-11-16 04:00:23,060 ----------------------------------------------------------------------------------------------------
2023-11-16 04:01:57,142 epoch 3 - iter 750/7500 - loss 0.34646794 - time (sec): 94.08 - samples/sec: 250.74 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:03:32,257 epoch 3 - iter 1500/7500 - loss 0.33277165 - time (sec): 189.19 - samples/sec: 253.91 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:05:06,742 epoch 3 - iter 2250/7500 - loss 0.34013081 - time (sec): 283.68 - samples/sec: 253.23 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:06:41,133 epoch 3 - iter 3000/7500 - loss 0.33864371 - time (sec): 378.07 - samples/sec: 253.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:08:14,833 epoch 3 - iter 3750/7500 - loss 0.34190452 - time (sec): 471.77 - samples/sec: 254.37 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:09:45,391 epoch 3 - iter 4500/7500 - loss 0.34219639 - time (sec): 562.33 - samples/sec: 256.12 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:11:18,334 epoch 3 - iter 5250/7500 - loss 0.34365478 - time (sec): 655.27 - samples/sec: 256.94 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:12:52,829 epoch 3 - iter 6000/7500 - loss 0.34431528 - time (sec): 749.76 - samples/sec: 256.24 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:14:25,065 epoch 3 - iter 6750/7500 - loss 0.34309773 - time (sec): 842.00 - samples/sec: 257.59 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,201 epoch 3 - iter 7500/7500 - loss 0.34251715 - time (sec): 934.14 - samples/sec: 257.77 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,204 ----------------------------------------------------------------------------------------------------
2023-11-16 04:15:57,204 EPOCH 3 done: loss 0.3425 - lr: 0.000004
2023-11-16 04:16:24,728 DEV : loss 0.2714731991291046 - f1-score (micro avg) 0.8842
2023-11-16 04:16:27,191 saving best model
2023-11-16 04:16:29,639 ----------------------------------------------------------------------------------------------------
2023-11-16 04:18:06,042 epoch 4 - iter 750/7500 - loss 0.29074268 - time (sec): 96.40 - samples/sec: 252.40 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:19:39,895 epoch 4 - iter 1500/7500 - loss 0.29294947 - time (sec): 190.25 - samples/sec: 256.92 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:21:12,116 epoch 4 - iter 2250/7500 - loss 0.29693683 - time (sec): 282.47 - samples/sec: 257.67 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:22:43,053 epoch 4 - iter 3000/7500 - loss 0.29670062 - time (sec): 373.41 - samples/sec: 259.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:24:17,342 epoch 4 - iter 3750/7500 - loss 0.29561519 - time (sec): 467.70 - samples/sec: 257.80 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:25:50,373 epoch 4 - iter 4500/7500 - loss 0.29194840 - time (sec): 560.73 - samples/sec: 258.18 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:27:24,822 epoch 4 - iter 5250/7500 - loss 0.29857267 - time (sec): 655.18 - samples/sec: 257.96 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:28:58,980 epoch 4 - iter 6000/7500 - loss 0.30018714 - time (sec): 749.34 - samples/sec: 257.24 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:30:33,294 epoch 4 - iter 6750/7500 - loss 0.30336094 - time (sec): 843.65 - samples/sec: 257.08 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,976 epoch 4 - iter 7500/7500 - loss 0.30240959 - time (sec): 938.33 - samples/sec: 256.62 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,980 ----------------------------------------------------------------------------------------------------
2023-11-16 04:32:07,980 EPOCH 4 done: loss 0.3024 - lr: 0.000003
2023-11-16 04:32:35,569 DEV : loss 0.2897871732711792 - f1-score (micro avg) 0.8922
2023-11-16 04:32:38,075 saving best model
2023-11-16 04:32:40,983 ----------------------------------------------------------------------------------------------------
2023-11-16 04:34:14,736 epoch 5 - iter 750/7500 - loss 0.22168761 - time (sec): 93.75 - samples/sec: 260.88 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:35:48,275 epoch 5 - iter 1500/7500 - loss 0.23358638 - time (sec): 187.29 - samples/sec: 258.77 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:37:23,490 epoch 5 - iter 2250/7500 - loss 0.24130242 - time (sec): 282.50 - samples/sec: 256.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:38:57,959 epoch 5 - iter 3000/7500 - loss 0.24848714 - time (sec): 376.97 - samples/sec: 257.38 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:40:32,648 epoch 5 - iter 3750/7500 - loss 0.25384312 - time (sec): 471.66 - samples/sec: 255.68 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:42:08,140 epoch 5 - iter 4500/7500 - loss 0.25352346 - time (sec): 567.15 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:43:42,002 epoch 5 - iter 5250/7500 - loss 0.25599881 - time (sec): 661.01 - samples/sec: 255.09 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:45:13,434 epoch 5 - iter 6000/7500 - loss 0.25515887 - time (sec): 752.45 - samples/sec: 255.70 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:46:47,959 epoch 5 - iter 6750/7500 - loss 0.25539887 - time (sec): 846.97 - samples/sec: 255.87 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,835 epoch 5 - iter 7500/7500 - loss 0.25660205 - time (sec): 940.85 - samples/sec: 255.94 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,838 ----------------------------------------------------------------------------------------------------
2023-11-16 04:48:21,838 EPOCH 5 done: loss 0.2566 - lr: 0.000003
2023-11-16 04:48:49,130 DEV : loss 0.28101304173469543 - f1-score (micro avg) 0.8973
2023-11-16 04:48:51,696 saving best model
2023-11-16 04:48:53,741 ----------------------------------------------------------------------------------------------------
2023-11-16 04:50:26,277 epoch 6 - iter 750/7500 - loss 0.22465859 - time (sec): 92.53 - samples/sec: 255.27 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:52:00,738 epoch 6 - iter 1500/7500 - loss 0.21970656 - time (sec): 186.99 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:53:34,911 epoch 6 - iter 2250/7500 - loss 0.21946764 - time (sec): 281.17 - samples/sec: 255.61 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:55:09,828 epoch 6 - iter 3000/7500 - loss 0.21638489 - time (sec): 376.08 - samples/sec: 255.02 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:56:43,313 epoch 6 - iter 3750/7500 - loss 0.21414458 - time (sec): 469.57 - samples/sec: 255.78 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:58:15,828 epoch 6 - iter 4500/7500 - loss 0.21434532 - time (sec): 562.08 - samples/sec: 256.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 04:59:47,824 epoch 6 - iter 5250/7500 - loss 0.21772911 - time (sec): 654.08 - samples/sec: 257.12 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:01:19,648 epoch 6 - iter 6000/7500 - loss 0.21657089 - time (sec): 745.90 - samples/sec: 257.68 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:02:53,268 epoch 6 - iter 6750/7500 - loss 0.21549326 - time (sec): 839.52 - samples/sec: 257.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,550 epoch 6 - iter 7500/7500 - loss 0.21351207 - time (sec): 932.80 - samples/sec: 258.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,555 ----------------------------------------------------------------------------------------------------
2023-11-16 05:04:26,555 EPOCH 6 done: loss 0.2135 - lr: 0.000002
2023-11-16 05:04:53,798 DEV : loss 0.3079068958759308 - f1-score (micro avg) 0.9002
2023-11-16 05:04:56,055 saving best model
2023-11-16 05:04:58,666 ----------------------------------------------------------------------------------------------------
2023-11-16 05:06:33,192 epoch 7 - iter 750/7500 - loss 0.18268097 - time (sec): 94.52 - samples/sec: 251.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:08:05,085 epoch 7 - iter 1500/7500 - loss 0.18175139 - time (sec): 186.42 - samples/sec: 257.03 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:09:42,400 epoch 7 - iter 2250/7500 - loss 0.19001507 - time (sec): 283.73 - samples/sec: 252.39 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:11:16,393 epoch 7 - iter 3000/7500 - loss 0.18641112 - time (sec): 377.72 - samples/sec: 253.16 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:12:51,038 epoch 7 - iter 3750/7500 - loss 0.18515279 - time (sec): 472.37 - samples/sec: 253.66 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:14:26,806 epoch 7 - iter 4500/7500 - loss 0.18525402 - time (sec): 568.14 - samples/sec: 253.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:16:01,622 epoch 7 - iter 5250/7500 - loss 0.18863436 - time (sec): 662.95 - samples/sec: 253.50 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:17:35,258 epoch 7 - iter 6000/7500 - loss 0.18494686 - time (sec): 756.59 - samples/sec: 253.73 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:19:09,299 epoch 7 - iter 6750/7500 - loss 0.18556342 - time (sec): 850.63 - samples/sec: 254.64 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,065 epoch 7 - iter 7500/7500 - loss 0.18644460 - time (sec): 944.40 - samples/sec: 254.97 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,069 ----------------------------------------------------------------------------------------------------
2023-11-16 05:20:43,069 EPOCH 7 done: loss 0.1864 - lr: 0.000002
2023-11-16 05:21:10,981 DEV : loss 0.2802160382270813 - f1-score (micro avg) 0.9048
2023-11-16 05:21:13,241 saving best model
2023-11-16 05:21:15,612 ----------------------------------------------------------------------------------------------------
2023-11-16 05:22:49,896 epoch 8 - iter 750/7500 - loss 0.14122739 - time (sec): 94.28 - samples/sec: 259.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:24:23,137 epoch 8 - iter 1500/7500 - loss 0.14874139 - time (sec): 187.52 - samples/sec: 258.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:25:56,602 epoch 8 - iter 2250/7500 - loss 0.15341856 - time (sec): 280.99 - samples/sec: 257.70 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:27:28,992 epoch 8 - iter 3000/7500 - loss 0.15416389 - time (sec): 373.38 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:29:03,053 epoch 8 - iter 3750/7500 - loss 0.15634692 - time (sec): 467.44 - samples/sec: 257.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:30:37,470 epoch 8 - iter 4500/7500 - loss 0.15700278 - time (sec): 561.85 - samples/sec: 256.31 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:32:10,851 epoch 8 - iter 5250/7500 - loss 0.15692674 - time (sec): 655.24 - samples/sec: 256.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:33:44,315 epoch 8 - iter 6000/7500 - loss 0.15879525 - time (sec): 748.70 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:35:16,930 epoch 8 - iter 6750/7500 - loss 0.15726830 - time (sec): 841.31 - samples/sec: 257.53 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,298 epoch 8 - iter 7500/7500 - loss 0.15647824 - time (sec): 936.68 - samples/sec: 257.07 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,300 ----------------------------------------------------------------------------------------------------
2023-11-16 05:36:52,301 EPOCH 8 done: loss 0.1565 - lr: 0.000001
2023-11-16 05:37:19,973 DEV : loss 0.3105733096599579 - f1-score (micro avg) 0.9056
2023-11-16 05:37:21,975 saving best model
2023-11-16 05:37:24,268 ----------------------------------------------------------------------------------------------------
2023-11-16 05:38:57,543 epoch 9 - iter 750/7500 - loss 0.13578701 - time (sec): 93.27 - samples/sec: 260.82 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:40:30,477 epoch 9 - iter 1500/7500 - loss 0.13977943 - time (sec): 186.21 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:42:04,338 epoch 9 - iter 2250/7500 - loss 0.13579281 - time (sec): 280.07 - samples/sec: 257.18 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:43:38,134 epoch 9 - iter 3000/7500 - loss 0.13083188 - time (sec): 373.86 - samples/sec: 257.67 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:45:11,777 epoch 9 - iter 3750/7500 - loss 0.13761002 - time (sec): 467.51 - samples/sec: 257.61 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:46:46,321 epoch 9 - iter 4500/7500 - loss 0.13992387 - time (sec): 562.05 - samples/sec: 256.71 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:48:20,030 epoch 9 - iter 5250/7500 - loss 0.13868841 - time (sec): 655.76 - samples/sec: 256.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:49:52,363 epoch 9 - iter 6000/7500 - loss 0.13924211 - time (sec): 748.09 - samples/sec: 256.89 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:51:24,421 epoch 9 - iter 6750/7500 - loss 0.13714285 - time (sec): 840.15 - samples/sec: 257.44 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,141 epoch 9 - iter 7500/7500 - loss 0.13574777 - time (sec): 932.87 - samples/sec: 258.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,144 ----------------------------------------------------------------------------------------------------
2023-11-16 05:52:57,144 EPOCH 9 done: loss 0.1357 - lr: 0.000001
2023-11-16 05:53:24,122 DEV : loss 0.30354949831962585 - f1-score (micro avg) 0.9069
2023-11-16 05:53:26,256 saving best model
2023-11-16 05:53:28,623 ----------------------------------------------------------------------------------------------------
2023-11-16 05:55:01,065 epoch 10 - iter 750/7500 - loss 0.11662424 - time (sec): 92.44 - samples/sec: 258.78 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:56:34,198 epoch 10 - iter 1500/7500 - loss 0.10739844 - time (sec): 185.57 - samples/sec: 260.22 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:58:07,104 epoch 10 - iter 2250/7500 - loss 0.11728002 - time (sec): 278.48 - samples/sec: 261.23 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:59:38,545 epoch 10 - iter 3000/7500 - loss 0.11111246 - time (sec): 369.92 - samples/sec: 263.47 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:01:13,020 epoch 10 - iter 3750/7500 - loss 0.11185424 - time (sec): 464.39 - samples/sec: 261.72 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:02:47,617 epoch 10 - iter 4500/7500 - loss 0.11443883 - time (sec): 558.99 - samples/sec: 260.01 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:04:22,079 epoch 10 - iter 5250/7500 - loss 0.11684866 - time (sec): 653.45 - samples/sec: 259.18 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:05:55,899 epoch 10 - iter 6000/7500 - loss 0.11690532 - time (sec): 747.27 - samples/sec: 258.80 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:07:29,604 epoch 10 - iter 6750/7500 - loss 0.11669926 - time (sec): 840.98 - samples/sec: 258.11 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,429 epoch 10 - iter 7500/7500 - loss 0.11723510 - time (sec): 933.80 - samples/sec: 257.87 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,432 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:02,432 EPOCH 10 done: loss 0.1172 - lr: 0.000000
2023-11-16 06:09:29,736 DEV : loss 0.3160940110683441 - f1-score (micro avg) 0.9064
2023-11-16 06:09:34,595 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:34,598 Loading model from best epoch ...
2023-11-16 06:09:44,517 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
2023-11-16 06:10:13,551
Results:
- F-score (micro) 0.9076
- F-score (macro) 0.9067
- Accuracy 0.8601

By class:
              precision    recall  f1-score   support

         LOC     0.9066    0.9143    0.9105      5288
         PER     0.9231    0.9485    0.9356      3962
         ORG     0.8737    0.8742    0.8739      3807

   micro avg     0.9022    0.9130    0.9076     13057
   macro avg     0.9012    0.9123    0.9067     13057
weighted avg     0.9020    0.9130    0.9075     13057

2023-11-16 06:10:13,551 ----------------------------------------------------------------------------------------------------
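The micro and macro rows above can be recovered from the per-class rows: micro-averaging pools true positives, predicted spans and gold spans across classes before computing precision/recall/F1, while macro-averaging is the plain mean of the per-class F1 scores. A quick cross-check (a sketch; the counts are reconstructed from the rounded precision/recall/support figures, so small rounding error is expected):

```python
# Recompute the micro/macro summary rows from the per-class table above.
# Counts are reconstructed from rounded values, hence only approximate.
per_class = {
    # label: (precision, recall, support)
    "LOC": (0.9066, 0.9143, 5288),
    "PER": (0.9231, 0.9485, 3962),
    "ORG": (0.8737, 0.8742, 3807),
}

tp = sum(r * s for _, r, s in per_class.values())        # true positives
pred = sum(r * s / p for p, r, s in per_class.values())  # predicted spans
gold = sum(s for _, _, s in per_class.values())          # gold spans

micro_p = tp / pred
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

f1s = [2 * p * r / (p + r) for p, r, _ in per_class.values()]
macro_f1 = sum(f1s) / len(f1s)

print(f"micro F1 ≈ {micro_f1:.4f}, macro F1 ≈ {macro_f1:.4f}")
```

Both values land on the reported 0.9076 (micro) and 0.9067 (macro) to within rounding.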