|
2023-10-14 06:59:00,996 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:00,999 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-14 06:59:00,999 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,000 MultiCorpus: 6183 train + 680 dev + 2113 test sentences |
|
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator |
|
2023-10-14 06:59:01,000 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,000 Train: 6183 sentences |
|
2023-10-14 06:59:01,000 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 06:59:01,000 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,000 Training Params: |
|
2023-10-14 06:59:01,000 - learning_rate: "0.00016" |
|
2023-10-14 06:59:01,000 - mini_batch_size: "4" |
|
2023-10-14 06:59:01,000 - max_epochs: "10" |
|
2023-10-14 06:59:01,000 - shuffle: "True" |
|
2023-10-14 06:59:01,000 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,000 Plugins: |
|
2023-10-14 06:59:01,001 - TensorboardLogger |
|
2023-10-14 06:59:01,001 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 06:59:01,001 ---------------------------------------------------------------------------------------------------- |
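Note on the logged `lr` values: the training parameters and the `LinearScheduler` plugin (`warmup_fraction: '0.1'`) listed above are consistent with a linear warmup to the peak learning rate followed by a linear decay to zero. The sketch below is not Flair's implementation; `linear_schedule_lr` is a hypothetical helper, and it assumes one scheduler step per mini-batch, 1546 iterations per epoch (as shown in the per-iteration log lines), and 10 epochs.

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative only)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: lr falls linearly from peak_lr to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: learning_rate 0.00016, 1546 iterations x 10 epochs.
ITERS_PER_EPOCH = 1546
TOTAL_STEPS = ITERS_PER_EPOCH * 10
PEAK_LR = 0.00016

lr_epoch1_iter154 = linear_schedule_lr(154, TOTAL_STEPS, PEAK_LR, 0.1)
lr_epoch1_iter1540 = linear_schedule_lr(1540, TOTAL_STEPS, PEAK_LR, 0.1)
lr_epoch2_iter154 = linear_schedule_lr(ITERS_PER_EPOCH + 154, TOTAL_STEPS, PEAK_LR, 0.1)

for name, value in [
    ("epoch 1, iter 154", lr_epoch1_iter154),
    ("epoch 1, iter 1540", lr_epoch1_iter1540),
    ("epoch 2, iter 154", lr_epoch2_iter154),
]:
    print(f"{name}: lr = {value:.6f}")
```

Under these assumptions the computed values round to the `lr` figures reported for the same iterations in the log; small off-by-one differences are possible depending on whether the rate is logged before or after the scheduler step.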
|
2023-10-14 06:59:01,001 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 06:59:01,001 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 06:59:01,001 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,001 Computation: |
|
2023-10-14 06:59:01,001 - compute on device: cuda:0 |
|
2023-10-14 06:59:01,001 - embedding storage: none |
|
2023-10-14 06:59:01,001 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,001 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-14 06:59:01,001 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,001 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 06:59:01,002 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-14 06:59:45,519 epoch 1 - iter 154/1546 - loss 2.53367752 - time (sec): 44.51 - samples/sec: 293.16 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 07:00:28,897 epoch 1 - iter 308/1546 - loss 2.39511462 - time (sec): 87.89 - samples/sec: 277.77 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 07:01:13,669 epoch 1 - iter 462/1546 - loss 2.09233496 - time (sec): 132.66 - samples/sec: 286.47 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 07:01:57,687 epoch 1 - iter 616/1546 - loss 1.80104935 - time (sec): 176.68 - samples/sec: 285.87 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-14 07:02:41,286 epoch 1 - iter 770/1546 - loss 1.52017352 - time (sec): 220.28 - samples/sec: 285.13 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-14 07:03:25,160 epoch 1 - iter 924/1546 - loss 1.31701764 - time (sec): 264.16 - samples/sec: 282.41 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-14 07:04:08,908 epoch 1 - iter 1078/1546 - loss 1.16397770 - time (sec): 307.90 - samples/sec: 282.18 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-14 07:04:51,938 epoch 1 - iter 1232/1546 - loss 1.04301728 - time (sec): 350.93 - samples/sec: 282.89 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-14 07:05:34,982 epoch 1 - iter 1386/1546 - loss 0.94525638 - time (sec): 393.98 - samples/sec: 283.41 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-14 07:06:17,655 epoch 1 - iter 1540/1546 - loss 0.86473047 - time (sec): 436.65 - samples/sec: 283.83 - lr: 0.000159 - momentum: 0.000000 |
|
2023-10-14 07:06:19,183 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:06:19,184 EPOCH 1 done: loss 0.8628 - lr: 0.000159 |
|
2023-10-14 07:06:36,909 DEV : loss 0.08073323220014572 - f1-score (micro avg) 0.5703 |
|
2023-10-14 07:06:36,938 saving best model |
|
2023-10-14 07:06:37,882 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:07:21,356 epoch 2 - iter 154/1546 - loss 0.11828154 - time (sec): 43.47 - samples/sec: 286.96 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-14 07:08:04,722 epoch 2 - iter 308/1546 - loss 0.10931633 - time (sec): 86.84 - samples/sec: 284.02 - lr: 0.000156 - momentum: 0.000000 |
|
2023-10-14 07:08:48,768 epoch 2 - iter 462/1546 - loss 0.10610638 - time (sec): 130.88 - samples/sec: 287.25 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-14 07:09:31,719 epoch 2 - iter 616/1546 - loss 0.10331153 - time (sec): 173.83 - samples/sec: 285.11 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-14 07:10:14,886 epoch 2 - iter 770/1546 - loss 0.10152915 - time (sec): 217.00 - samples/sec: 287.19 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-14 07:10:57,523 epoch 2 - iter 924/1546 - loss 0.10297707 - time (sec): 259.64 - samples/sec: 286.08 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-14 07:11:40,202 epoch 2 - iter 1078/1546 - loss 0.10041430 - time (sec): 302.32 - samples/sec: 285.61 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-14 07:12:23,049 epoch 2 - iter 1232/1546 - loss 0.09737347 - time (sec): 345.16 - samples/sec: 284.96 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-14 07:13:05,542 epoch 2 - iter 1386/1546 - loss 0.09560133 - time (sec): 387.66 - samples/sec: 285.21 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-14 07:13:49,783 epoch 2 - iter 1540/1546 - loss 0.09267652 - time (sec): 431.90 - samples/sec: 286.68 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-14 07:13:51,495 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:13:51,495 EPOCH 2 done: loss 0.0926 - lr: 0.000142 |
|
2023-10-14 07:14:09,092 DEV : loss 0.05598240718245506 - f1-score (micro avg) 0.7705 |
|
2023-10-14 07:14:09,121 saving best model |
|
2023-10-14 07:14:11,747 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:14:55,113 epoch 3 - iter 154/1546 - loss 0.05909403 - time (sec): 43.36 - samples/sec: 272.89 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-14 07:15:38,763 epoch 3 - iter 308/1546 - loss 0.06590260 - time (sec): 87.01 - samples/sec: 277.37 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-14 07:16:21,328 epoch 3 - iter 462/1546 - loss 0.05889868 - time (sec): 129.58 - samples/sec: 276.83 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-14 07:17:04,710 epoch 3 - iter 616/1546 - loss 0.05905022 - time (sec): 172.96 - samples/sec: 276.19 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-14 07:17:47,405 epoch 3 - iter 770/1546 - loss 0.05708013 - time (sec): 215.65 - samples/sec: 273.80 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-14 07:18:31,678 epoch 3 - iter 924/1546 - loss 0.05694215 - time (sec): 259.93 - samples/sec: 277.08 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-14 07:19:14,995 epoch 3 - iter 1078/1546 - loss 0.05820291 - time (sec): 303.24 - samples/sec: 277.93 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-14 07:19:59,332 epoch 3 - iter 1232/1546 - loss 0.05618785 - time (sec): 347.58 - samples/sec: 281.89 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-14 07:20:43,894 epoch 3 - iter 1386/1546 - loss 0.05533088 - time (sec): 392.14 - samples/sec: 283.77 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-14 07:21:28,210 epoch 3 - iter 1540/1546 - loss 0.05550675 - time (sec): 436.46 - samples/sec: 283.58 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-14 07:21:29,889 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:21:29,889 EPOCH 3 done: loss 0.0553 - lr: 0.000125 |
|
2023-10-14 07:21:47,577 DEV : loss 0.06477273255586624 - f1-score (micro avg) 0.7854 |
|
2023-10-14 07:21:47,624 saving best model |
|
2023-10-14 07:21:50,208 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:22:33,257 epoch 4 - iter 154/1546 - loss 0.02285673 - time (sec): 43.04 - samples/sec: 275.32 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-14 07:23:16,926 epoch 4 - iter 308/1546 - loss 0.02948801 - time (sec): 86.71 - samples/sec: 269.79 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-14 07:24:01,630 epoch 4 - iter 462/1546 - loss 0.03110973 - time (sec): 131.42 - samples/sec: 279.38 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-14 07:24:44,269 epoch 4 - iter 616/1546 - loss 0.03527505 - time (sec): 174.06 - samples/sec: 278.43 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-14 07:25:27,884 epoch 4 - iter 770/1546 - loss 0.03456284 - time (sec): 217.67 - samples/sec: 277.62 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-14 07:26:11,459 epoch 4 - iter 924/1546 - loss 0.03333344 - time (sec): 261.25 - samples/sec: 281.49 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-14 07:26:55,225 epoch 4 - iter 1078/1546 - loss 0.03195906 - time (sec): 305.01 - samples/sec: 284.13 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-14 07:27:39,092 epoch 4 - iter 1232/1546 - loss 0.03314907 - time (sec): 348.88 - samples/sec: 283.50 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-14 07:28:22,264 epoch 4 - iter 1386/1546 - loss 0.03375538 - time (sec): 392.05 - samples/sec: 282.34 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-14 07:29:06,490 epoch 4 - iter 1540/1546 - loss 0.03359796 - time (sec): 436.28 - samples/sec: 283.68 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-14 07:29:08,128 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:29:08,128 EPOCH 4 done: loss 0.0335 - lr: 0.000107 |
|
2023-10-14 07:29:26,346 DEV : loss 0.07549448311328888 - f1-score (micro avg) 0.7856 |
|
2023-10-14 07:29:26,374 saving best model |
|
2023-10-14 07:29:28,946 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:30:13,352 epoch 5 - iter 154/1546 - loss 0.01905230 - time (sec): 44.40 - samples/sec: 285.75 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-14 07:30:57,264 epoch 5 - iter 308/1546 - loss 0.02049148 - time (sec): 88.31 - samples/sec: 284.49 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-14 07:31:41,538 epoch 5 - iter 462/1546 - loss 0.02040651 - time (sec): 132.59 - samples/sec: 281.32 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-14 07:32:25,272 epoch 5 - iter 616/1546 - loss 0.02477982 - time (sec): 176.32 - samples/sec: 280.46 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-14 07:33:08,950 epoch 5 - iter 770/1546 - loss 0.02419349 - time (sec): 220.00 - samples/sec: 283.34 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-14 07:33:51,465 epoch 5 - iter 924/1546 - loss 0.02422524 - time (sec): 262.51 - samples/sec: 283.64 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-14 07:34:34,416 epoch 5 - iter 1078/1546 - loss 0.02383068 - time (sec): 305.46 - samples/sec: 285.41 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-14 07:35:18,710 epoch 5 - iter 1232/1546 - loss 0.02325916 - time (sec): 349.76 - samples/sec: 285.14 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-14 07:36:02,509 epoch 5 - iter 1386/1546 - loss 0.02337776 - time (sec): 393.56 - samples/sec: 283.43 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-14 07:36:46,153 epoch 5 - iter 1540/1546 - loss 0.02279623 - time (sec): 437.20 - samples/sec: 283.18 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-14 07:36:47,837 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:36:47,837 EPOCH 5 done: loss 0.0229 - lr: 0.000089 |
|
2023-10-14 07:37:04,753 DEV : loss 0.08190007507801056 - f1-score (micro avg) 0.7896 |
|
2023-10-14 07:37:04,790 saving best model |
|
2023-10-14 07:37:07,402 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:37:50,256 epoch 6 - iter 154/1546 - loss 0.00969539 - time (sec): 42.85 - samples/sec: 268.78 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-14 07:38:33,112 epoch 6 - iter 308/1546 - loss 0.01538073 - time (sec): 85.71 - samples/sec: 279.75 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-14 07:39:16,585 epoch 6 - iter 462/1546 - loss 0.01616084 - time (sec): 129.18 - samples/sec: 279.28 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-14 07:39:59,763 epoch 6 - iter 616/1546 - loss 0.01406096 - time (sec): 172.36 - samples/sec: 283.73 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-14 07:40:43,602 epoch 6 - iter 770/1546 - loss 0.01462753 - time (sec): 216.20 - samples/sec: 287.06 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-14 07:41:27,629 epoch 6 - iter 924/1546 - loss 0.01427468 - time (sec): 260.22 - samples/sec: 285.89 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-14 07:42:10,163 epoch 6 - iter 1078/1546 - loss 0.01327148 - time (sec): 302.76 - samples/sec: 285.70 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-14 07:42:54,225 epoch 6 - iter 1232/1546 - loss 0.01466265 - time (sec): 346.82 - samples/sec: 285.32 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-14 07:43:36,876 epoch 6 - iter 1386/1546 - loss 0.01463610 - time (sec): 389.47 - samples/sec: 286.71 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-14 07:44:19,458 epoch 6 - iter 1540/1546 - loss 0.01426892 - time (sec): 432.05 - samples/sec: 286.86 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-14 07:44:21,038 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:44:21,038 EPOCH 6 done: loss 0.0142 - lr: 0.000071 |
|
2023-10-14 07:44:38,731 DEV : loss 0.0895775631070137 - f1-score (micro avg) 0.8016 |
|
2023-10-14 07:44:38,763 saving best model |
|
2023-10-14 07:44:41,375 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:45:24,823 epoch 7 - iter 154/1546 - loss 0.01162543 - time (sec): 43.44 - samples/sec: 262.18 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-14 07:46:08,196 epoch 7 - iter 308/1546 - loss 0.00959638 - time (sec): 86.82 - samples/sec: 272.23 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-14 07:46:52,473 epoch 7 - iter 462/1546 - loss 0.01027633 - time (sec): 131.09 - samples/sec: 279.51 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-14 07:47:36,887 epoch 7 - iter 616/1546 - loss 0.01002455 - time (sec): 175.51 - samples/sec: 283.22 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-14 07:48:20,985 epoch 7 - iter 770/1546 - loss 0.01007218 - time (sec): 219.61 - samples/sec: 285.76 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-14 07:49:05,694 epoch 7 - iter 924/1546 - loss 0.01004780 - time (sec): 264.31 - samples/sec: 287.39 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-14 07:49:48,406 epoch 7 - iter 1078/1546 - loss 0.00935674 - time (sec): 307.03 - samples/sec: 286.00 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-14 07:50:31,126 epoch 7 - iter 1232/1546 - loss 0.01000516 - time (sec): 349.75 - samples/sec: 282.72 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-14 07:51:16,008 epoch 7 - iter 1386/1546 - loss 0.00999930 - time (sec): 394.63 - samples/sec: 282.98 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-14 07:51:59,420 epoch 7 - iter 1540/1546 - loss 0.00970238 - time (sec): 438.04 - samples/sec: 282.76 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-14 07:52:00,988 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:52:00,988 EPOCH 7 done: loss 0.0097 - lr: 0.000053 |
|
2023-10-14 07:52:17,824 DEV : loss 0.09281705319881439 - f1-score (micro avg) 0.8008 |
|
2023-10-14 07:52:17,856 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:53:00,643 epoch 8 - iter 154/1546 - loss 0.00605265 - time (sec): 42.78 - samples/sec: 287.96 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-14 07:53:43,503 epoch 8 - iter 308/1546 - loss 0.00569176 - time (sec): 85.64 - samples/sec: 289.88 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-14 07:54:26,501 epoch 8 - iter 462/1546 - loss 0.00483968 - time (sec): 128.64 - samples/sec: 288.97 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-14 07:55:09,907 epoch 8 - iter 616/1546 - loss 0.00473354 - time (sec): 172.05 - samples/sec: 287.27 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-14 07:55:53,331 epoch 8 - iter 770/1546 - loss 0.00459032 - time (sec): 215.47 - samples/sec: 288.37 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 07:56:37,373 epoch 8 - iter 924/1546 - loss 0.00529213 - time (sec): 259.51 - samples/sec: 287.59 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-14 07:57:20,568 epoch 8 - iter 1078/1546 - loss 0.00545007 - time (sec): 302.71 - samples/sec: 286.23 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-14 07:58:04,028 epoch 8 - iter 1232/1546 - loss 0.00536983 - time (sec): 346.17 - samples/sec: 284.30 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-14 07:58:47,400 epoch 8 - iter 1386/1546 - loss 0.00547668 - time (sec): 389.54 - samples/sec: 285.96 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-14 07:59:30,679 epoch 8 - iter 1540/1546 - loss 0.00522496 - time (sec): 432.82 - samples/sec: 286.10 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-14 07:59:32,290 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 07:59:32,291 EPOCH 8 done: loss 0.0052 - lr: 0.000036 |
|
2023-10-14 07:59:50,168 DEV : loss 0.1020515114068985 - f1-score (micro avg) 0.8056 |
|
2023-10-14 07:59:50,197 saving best model |
|
2023-10-14 07:59:52,823 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:00:35,552 epoch 9 - iter 154/1546 - loss 0.00161671 - time (sec): 42.73 - samples/sec: 260.83 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-14 08:01:17,988 epoch 9 - iter 308/1546 - loss 0.00448654 - time (sec): 85.16 - samples/sec: 263.58 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 08:02:00,783 epoch 9 - iter 462/1546 - loss 0.00312069 - time (sec): 127.96 - samples/sec: 274.73 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 08:02:43,818 epoch 9 - iter 616/1546 - loss 0.00504142 - time (sec): 170.99 - samples/sec: 280.01 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 08:03:27,578 epoch 9 - iter 770/1546 - loss 0.00630168 - time (sec): 214.75 - samples/sec: 284.54 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 08:04:10,248 epoch 9 - iter 924/1546 - loss 0.00569486 - time (sec): 257.42 - samples/sec: 285.86 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 08:04:53,452 epoch 9 - iter 1078/1546 - loss 0.00555538 - time (sec): 300.62 - samples/sec: 287.98 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-14 08:05:36,500 epoch 9 - iter 1232/1546 - loss 0.00503775 - time (sec): 343.67 - samples/sec: 288.22 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-14 08:06:20,118 epoch 9 - iter 1386/1546 - loss 0.00467164 - time (sec): 387.29 - samples/sec: 287.45 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 08:07:04,078 epoch 9 - iter 1540/1546 - loss 0.00441975 - time (sec): 431.25 - samples/sec: 287.10 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-14 08:07:05,680 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:07:05,681 EPOCH 9 done: loss 0.0044 - lr: 0.000018 |
|
2023-10-14 08:07:22,832 DEV : loss 0.10757710039615631 - f1-score (micro avg) 0.8025 |
|
2023-10-14 08:07:22,868 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:08:06,885 epoch 10 - iter 154/1546 - loss 0.00142778 - time (sec): 44.01 - samples/sec: 284.56 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-14 08:08:50,212 epoch 10 - iter 308/1546 - loss 0.00144232 - time (sec): 87.34 - samples/sec: 285.77 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 08:09:32,857 epoch 10 - iter 462/1546 - loss 0.00105990 - time (sec): 129.99 - samples/sec: 281.37 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-14 08:10:15,405 epoch 10 - iter 616/1546 - loss 0.00174004 - time (sec): 172.53 - samples/sec: 280.91 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-14 08:10:59,065 epoch 10 - iter 770/1546 - loss 0.00189463 - time (sec): 216.19 - samples/sec: 280.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 08:11:42,514 epoch 10 - iter 924/1546 - loss 0.00211620 - time (sec): 259.64 - samples/sec: 281.95 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 08:12:26,108 epoch 10 - iter 1078/1546 - loss 0.00230920 - time (sec): 303.24 - samples/sec: 281.66 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 08:13:09,820 epoch 10 - iter 1232/1546 - loss 0.00241433 - time (sec): 346.95 - samples/sec: 280.54 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 08:13:54,659 epoch 10 - iter 1386/1546 - loss 0.00262085 - time (sec): 391.79 - samples/sec: 282.66 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 08:14:39,324 epoch 10 - iter 1540/1546 - loss 0.00254029 - time (sec): 436.45 - samples/sec: 283.66 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-14 08:14:41,029 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:14:41,029 EPOCH 10 done: loss 0.0025 - lr: 0.000000 |
|
2023-10-14 08:14:59,686 DEV : loss 0.10781844705343246 - f1-score (micro avg) 0.7992 |
|
2023-10-14 08:15:00,633 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 08:15:00,635 Loading model from best epoch ... |
|
2023-10-14 08:15:04,628 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET |
|
2023-10-14 08:15:59,143 |
|
Results: |
|
- F-score (micro) 0.7935 |
|
- F-score (macro) 0.7115 |
|
- Accuracy 0.6783 |
|
|
|
By class: |
|
              precision    recall  f1-score   support

         LOC     0.8358    0.8446    0.8402       946
    BUILDING     0.5737    0.5892    0.5813       185
      STREET     0.6949    0.7321    0.7130        56

   micro avg     0.7876    0.7995    0.7935      1187
   macro avg     0.7015    0.7220    0.7115      1187
weighted avg     0.7883    0.7995    0.7938      1187
|
|
|
2023-10-14 08:15:59,143 ---------------------------------------------------------------------------------------------------- |
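Note on the aggregate scores: the macro, weighted, and micro F1 values in the final report can be re-derived from the per-class rows. The following is a minimal sketch using only the numbers printed in the log; the variable names are illustrative, and the micro F1 is recomputed from the reported micro-averaged precision and recall rather than from raw true/false positive counts, which the log does not contain.

```python
# Per-class rows from the report: (precision, recall, f1, support).
per_class = {
    "LOC": (0.8358, 0.8446, 0.8402, 946),
    "BUILDING": (0.5737, 0.5892, 0.5813, 185),
    "STREET": (0.6949, 0.7321, 0.7130, 56),
}

total_support = sum(row[3] for row in per_class.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 averaged with support as the weight.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1: harmonic mean of the reported micro-averaged precision and recall.
micro_precision, micro_recall = 0.7876, 0.7995
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The gap between the micro and macro scores reflects the class imbalance: LOC accounts for 946 of the 1187 test entities, so its higher F1 dominates the micro average, while BUILDING pulls the macro average down.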
|
|