|
2023-10-13 05:43:31,169 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,171 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-13 05:43:31,172 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,172 MultiCorpus: 7936 train + 992 dev + 992 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr |
|
2023-10-13 05:43:31,172 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,172 Train: 7936 sentences |
|
2023-10-13 05:43:31,172 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 05:43:31,172 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,172 Training Params: |
|
2023-10-13 05:43:31,172 - learning_rate: "0.00015" |
|
2023-10-13 05:43:31,172 - mini_batch_size: "4" |
|
2023-10-13 05:43:31,172 - max_epochs: "10" |
|
2023-10-13 05:43:31,172 - shuffle: "True" |
|
2023-10-13 05:43:31,173 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,173 Plugins: |
|
2023-10-13 05:43:31,173 - TensorboardLogger |
|
2023-10-13 05:43:31,173 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 05:43:31,173 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,173 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 05:43:31,173 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 05:43:31,173 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,173 Computation: |
|
2023-10-13 05:43:31,173 - compute on device: cuda:0 |
|
2023-10-13 05:43:31,173 - embedding storage: none |
|
2023-10-13 05:43:31,173 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,173 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-13 05:43:31,173 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,173 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:43:31,174 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-13 05:44:23,669 epoch 1 - iter 198/1984 - loss 2.56023642 - time (sec): 52.49 - samples/sec: 310.74 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 05:45:18,867 epoch 1 - iter 396/1984 - loss 2.35774357 - time (sec): 107.69 - samples/sec: 309.60 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 05:46:14,303 epoch 1 - iter 594/1984 - loss 2.05704108 - time (sec): 163.13 - samples/sec: 307.54 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 05:47:06,479 epoch 1 - iter 792/1984 - loss 1.74769940 - time (sec): 215.30 - samples/sec: 307.47 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 05:47:57,574 epoch 1 - iter 990/1984 - loss 1.49982570 - time (sec): 266.40 - samples/sec: 310.06 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 05:48:51,134 epoch 1 - iter 1188/1984 - loss 1.30358343 - time (sec): 319.96 - samples/sec: 308.94 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-13 05:49:47,616 epoch 1 - iter 1386/1984 - loss 1.15490690 - time (sec): 376.44 - samples/sec: 304.80 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-13 05:50:43,445 epoch 1 - iter 1584/1984 - loss 1.03965057 - time (sec): 432.27 - samples/sec: 303.80 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-13 05:51:37,606 epoch 1 - iter 1782/1984 - loss 0.94460620 - time (sec): 486.43 - samples/sec: 303.83 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-13 05:52:30,621 epoch 1 - iter 1980/1984 - loss 0.87250029 - time (sec): 539.45 - samples/sec: 303.15 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-13 05:52:31,783 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:52:31,783 EPOCH 1 done: loss 0.8711 - lr: 0.000150 |
|
2023-10-13 05:52:57,860 DEV : loss 0.16231723129749298 - f1-score (micro avg) 0.663 |
|
2023-10-13 05:52:57,906 saving best model |
|
2023-10-13 05:52:58,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 05:53:54,708 epoch 2 - iter 198/1984 - loss 0.15963194 - time (sec): 55.92 - samples/sec: 295.50 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-13 05:54:49,891 epoch 2 - iter 396/1984 - loss 0.15446338 - time (sec): 111.10 - samples/sec: 295.65 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-13 05:55:42,962 epoch 2 - iter 594/1984 - loss 0.14934617 - time (sec): 164.17 - samples/sec: 298.92 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-13 05:56:38,570 epoch 2 - iter 792/1984 - loss 0.14004568 - time (sec): 219.78 - samples/sec: 293.01 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-13 05:57:34,636 epoch 2 - iter 990/1984 - loss 0.13664831 - time (sec): 275.85 - samples/sec: 296.06 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-13 05:58:28,427 epoch 2 - iter 1188/1984 - loss 0.13641309 - time (sec): 329.64 - samples/sec: 297.14 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-13 05:59:22,135 epoch 2 - iter 1386/1984 - loss 0.13311507 - time (sec): 383.35 - samples/sec: 298.85 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-13 06:00:19,474 epoch 2 - iter 1584/1984 - loss 0.13084502 - time (sec): 440.68 - samples/sec: 296.84 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-13 06:01:14,119 epoch 2 - iter 1782/1984 - loss 0.12778979 - time (sec): 495.33 - samples/sec: 297.40 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-13 06:02:07,974 epoch 2 - iter 1980/1984 - loss 0.12586842 - time (sec): 549.18 - samples/sec: 298.12 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-13 06:02:08,971 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:02:08,972 EPOCH 2 done: loss 0.1258 - lr: 0.000133 |
|
2023-10-13 06:02:35,243 DEV : loss 0.08949719369411469 - f1-score (micro avg) 0.7352 |
|
2023-10-13 06:02:35,285 saving best model |
|
2023-10-13 06:02:37,860 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:03:32,717 epoch 3 - iter 198/1984 - loss 0.07354557 - time (sec): 54.85 - samples/sec: 312.62 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-13 06:04:25,496 epoch 3 - iter 396/1984 - loss 0.07621576 - time (sec): 107.63 - samples/sec: 308.46 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-13 06:05:19,074 epoch 3 - iter 594/1984 - loss 0.08125367 - time (sec): 161.21 - samples/sec: 306.72 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-13 06:06:15,512 epoch 3 - iter 792/1984 - loss 0.07585305 - time (sec): 217.65 - samples/sec: 303.05 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-13 06:07:10,722 epoch 3 - iter 990/1984 - loss 0.07972188 - time (sec): 272.86 - samples/sec: 298.46 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-13 06:08:07,849 epoch 3 - iter 1188/1984 - loss 0.07889533 - time (sec): 329.98 - samples/sec: 295.56 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-13 06:09:01,056 epoch 3 - iter 1386/1984 - loss 0.07614265 - time (sec): 383.19 - samples/sec: 297.04 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-13 06:09:55,354 epoch 3 - iter 1584/1984 - loss 0.07610978 - time (sec): 437.49 - samples/sec: 297.93 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-13 06:10:50,635 epoch 3 - iter 1782/1984 - loss 0.07645598 - time (sec): 492.77 - samples/sec: 299.47 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-13 06:11:45,893 epoch 3 - iter 1980/1984 - loss 0.07656055 - time (sec): 548.03 - samples/sec: 298.44 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-13 06:11:47,067 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:11:47,067 EPOCH 3 done: loss 0.0764 - lr: 0.000117 |
|
2023-10-13 06:12:13,772 DEV : loss 0.10229434072971344 - f1-score (micro avg) 0.7421 |
|
2023-10-13 06:12:13,819 saving best model |
|
2023-10-13 06:12:16,515 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:13:11,781 epoch 4 - iter 198/1984 - loss 0.06257893 - time (sec): 55.26 - samples/sec: 301.67 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-13 06:14:06,955 epoch 4 - iter 396/1984 - loss 0.05453809 - time (sec): 110.44 - samples/sec: 294.81 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-13 06:15:02,500 epoch 4 - iter 594/1984 - loss 0.05487829 - time (sec): 165.98 - samples/sec: 304.37 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-13 06:15:57,389 epoch 4 - iter 792/1984 - loss 0.05252948 - time (sec): 220.87 - samples/sec: 301.83 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-13 06:16:50,636 epoch 4 - iter 990/1984 - loss 0.05421408 - time (sec): 274.12 - samples/sec: 303.57 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-13 06:17:42,895 epoch 4 - iter 1188/1984 - loss 0.05404910 - time (sec): 326.38 - samples/sec: 304.71 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-13 06:18:36,607 epoch 4 - iter 1386/1984 - loss 0.05254585 - time (sec): 380.09 - samples/sec: 304.61 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-13 06:19:34,329 epoch 4 - iter 1584/1984 - loss 0.05326865 - time (sec): 437.81 - samples/sec: 299.95 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-13 06:20:28,956 epoch 4 - iter 1782/1984 - loss 0.05356980 - time (sec): 492.44 - samples/sec: 300.72 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-13 06:21:26,190 epoch 4 - iter 1980/1984 - loss 0.05411207 - time (sec): 549.67 - samples/sec: 297.93 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-13 06:21:27,442 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:21:27,442 EPOCH 4 done: loss 0.0544 - lr: 0.000100 |
|
2023-10-13 06:21:56,062 DEV : loss 0.1296338140964508 - f1-score (micro avg) 0.7448 |
|
2023-10-13 06:21:56,106 saving best model |
|
2023-10-13 06:22:00,166 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:22:57,064 epoch 5 - iter 198/1984 - loss 0.03401430 - time (sec): 56.89 - samples/sec: 285.27 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-13 06:23:49,847 epoch 5 - iter 396/1984 - loss 0.03367653 - time (sec): 109.68 - samples/sec: 287.03 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-13 06:24:44,133 epoch 5 - iter 594/1984 - loss 0.03772318 - time (sec): 163.96 - samples/sec: 294.26 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-13 06:25:37,666 epoch 5 - iter 792/1984 - loss 0.03702507 - time (sec): 217.49 - samples/sec: 296.65 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-13 06:26:37,327 epoch 5 - iter 990/1984 - loss 0.03647125 - time (sec): 277.16 - samples/sec: 298.28 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-13 06:27:29,315 epoch 5 - iter 1188/1984 - loss 0.03821053 - time (sec): 329.14 - samples/sec: 298.94 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-13 06:28:20,911 epoch 5 - iter 1386/1984 - loss 0.04020892 - time (sec): 380.74 - samples/sec: 300.13 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-13 06:29:15,349 epoch 5 - iter 1584/1984 - loss 0.04121393 - time (sec): 435.18 - samples/sec: 298.01 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-13 06:30:15,057 epoch 5 - iter 1782/1984 - loss 0.03983358 - time (sec): 494.89 - samples/sec: 295.16 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-13 06:31:16,160 epoch 5 - iter 1980/1984 - loss 0.04078366 - time (sec): 555.99 - samples/sec: 294.31 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-13 06:31:17,222 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:31:17,222 EPOCH 5 done: loss 0.0407 - lr: 0.000083 |
|
2023-10-13 06:31:42,184 DEV : loss 0.14384247362613678 - f1-score (micro avg) 0.7497 |
|
2023-10-13 06:31:42,224 saving best model |
|
2023-10-13 06:31:44,772 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:32:37,857 epoch 6 - iter 198/1984 - loss 0.02600490 - time (sec): 53.08 - samples/sec: 290.45 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-13 06:33:33,217 epoch 6 - iter 396/1984 - loss 0.02863585 - time (sec): 108.44 - samples/sec: 290.63 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 06:34:28,450 epoch 6 - iter 594/1984 - loss 0.03111200 - time (sec): 163.67 - samples/sec: 291.52 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-13 06:35:21,690 epoch 6 - iter 792/1984 - loss 0.03154475 - time (sec): 216.91 - samples/sec: 297.11 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-13 06:36:14,761 epoch 6 - iter 990/1984 - loss 0.03052400 - time (sec): 269.98 - samples/sec: 302.74 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 06:37:09,857 epoch 6 - iter 1188/1984 - loss 0.02980984 - time (sec): 325.08 - samples/sec: 303.28 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-13 06:38:05,132 epoch 6 - iter 1386/1984 - loss 0.02858646 - time (sec): 380.36 - samples/sec: 301.79 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-13 06:38:57,131 epoch 6 - iter 1584/1984 - loss 0.02911257 - time (sec): 432.35 - samples/sec: 301.30 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-13 06:39:51,485 epoch 6 - iter 1782/1984 - loss 0.02938630 - time (sec): 486.71 - samples/sec: 302.81 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-13 06:40:47,211 epoch 6 - iter 1980/1984 - loss 0.02932146 - time (sec): 542.43 - samples/sec: 301.60 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-13 06:40:48,350 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:40:48,350 EPOCH 6 done: loss 0.0293 - lr: 0.000067 |
|
2023-10-13 06:41:17,254 DEV : loss 0.1786336749792099 - f1-score (micro avg) 0.7585 |
|
2023-10-13 06:41:17,296 saving best model |
|
2023-10-13 06:41:18,383 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:42:15,602 epoch 7 - iter 198/1984 - loss 0.01565821 - time (sec): 57.22 - samples/sec: 273.44 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-13 06:43:13,372 epoch 7 - iter 396/1984 - loss 0.02160073 - time (sec): 114.99 - samples/sec: 275.78 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-13 06:44:11,042 epoch 7 - iter 594/1984 - loss 0.02297071 - time (sec): 172.66 - samples/sec: 279.79 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-13 06:45:06,994 epoch 7 - iter 792/1984 - loss 0.02194959 - time (sec): 228.61 - samples/sec: 281.23 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 06:46:02,561 epoch 7 - iter 990/1984 - loss 0.02145332 - time (sec): 284.18 - samples/sec: 283.36 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-13 06:46:55,114 epoch 7 - iter 1188/1984 - loss 0.02157394 - time (sec): 336.73 - samples/sec: 288.44 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-13 06:47:45,302 epoch 7 - iter 1386/1984 - loss 0.02232190 - time (sec): 386.92 - samples/sec: 294.58 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-13 06:48:36,186 epoch 7 - iter 1584/1984 - loss 0.02126233 - time (sec): 437.80 - samples/sec: 297.31 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-13 06:49:30,550 epoch 7 - iter 1782/1984 - loss 0.02124752 - time (sec): 492.16 - samples/sec: 296.78 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-13 06:50:24,014 epoch 7 - iter 1980/1984 - loss 0.02216943 - time (sec): 545.63 - samples/sec: 299.99 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 06:50:25,017 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:50:25,018 EPOCH 7 done: loss 0.0221 - lr: 0.000050 |
|
2023-10-13 06:50:50,990 DEV : loss 0.19668884575366974 - f1-score (micro avg) 0.7557 |
|
2023-10-13 06:50:51,030 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:51:42,700 epoch 8 - iter 198/1984 - loss 0.00771193 - time (sec): 51.67 - samples/sec: 307.77 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 06:52:33,732 epoch 8 - iter 396/1984 - loss 0.01096548 - time (sec): 102.70 - samples/sec: 311.86 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 06:53:25,866 epoch 8 - iter 594/1984 - loss 0.01124620 - time (sec): 154.83 - samples/sec: 306.71 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 06:54:21,829 epoch 8 - iter 792/1984 - loss 0.01189251 - time (sec): 210.80 - samples/sec: 303.37 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 06:55:16,537 epoch 8 - iter 990/1984 - loss 0.01234024 - time (sec): 265.50 - samples/sec: 303.81 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 06:56:09,844 epoch 8 - iter 1188/1984 - loss 0.01263419 - time (sec): 318.81 - samples/sec: 305.71 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 06:57:04,037 epoch 8 - iter 1386/1984 - loss 0.01265776 - time (sec): 373.00 - samples/sec: 303.96 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 06:57:55,226 epoch 8 - iter 1584/1984 - loss 0.01301713 - time (sec): 424.19 - samples/sec: 306.78 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 06:58:47,190 epoch 8 - iter 1782/1984 - loss 0.01413211 - time (sec): 476.16 - samples/sec: 309.92 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 06:59:38,513 epoch 8 - iter 1980/1984 - loss 0.01490286 - time (sec): 527.48 - samples/sec: 310.17 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 06:59:39,610 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 06:59:39,610 EPOCH 8 done: loss 0.0149 - lr: 0.000033 |
|
2023-10-13 07:00:04,753 DEV : loss 0.2151404768228531 - f1-score (micro avg) 0.7413 |
|
2023-10-13 07:00:04,794 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 07:00:55,949 epoch 9 - iter 198/1984 - loss 0.00807895 - time (sec): 51.15 - samples/sec: 325.86 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 07:01:51,407 epoch 9 - iter 396/1984 - loss 0.01077008 - time (sec): 106.61 - samples/sec: 318.81 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 07:02:43,202 epoch 9 - iter 594/1984 - loss 0.01143847 - time (sec): 158.41 - samples/sec: 316.85 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 07:03:35,088 epoch 9 - iter 792/1984 - loss 0.01163397 - time (sec): 210.29 - samples/sec: 315.64 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 07:04:27,415 epoch 9 - iter 990/1984 - loss 0.01100476 - time (sec): 262.62 - samples/sec: 314.62 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 07:05:20,750 epoch 9 - iter 1188/1984 - loss 0.01112938 - time (sec): 315.95 - samples/sec: 308.28 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 07:06:13,433 epoch 9 - iter 1386/1984 - loss 0.01138202 - time (sec): 368.64 - samples/sec: 307.01 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 07:07:07,062 epoch 9 - iter 1584/1984 - loss 0.01089230 - time (sec): 422.27 - samples/sec: 307.08 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 07:08:02,575 epoch 9 - iter 1782/1984 - loss 0.01217880 - time (sec): 477.78 - samples/sec: 306.61 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 07:08:58,446 epoch 9 - iter 1980/1984 - loss 0.01165758 - time (sec): 533.65 - samples/sec: 306.75 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 07:08:59,508 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 07:08:59,508 EPOCH 9 done: loss 0.0117 - lr: 0.000017 |
|
2023-10-13 07:09:24,667 DEV : loss 0.22990703582763672 - f1-score (micro avg) 0.7597 |
|
2023-10-13 07:09:24,711 saving best model |
|
2023-10-13 07:09:27,881 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 07:10:20,160 epoch 10 - iter 198/1984 - loss 0.00796034 - time (sec): 52.27 - samples/sec: 315.72 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 07:11:12,693 epoch 10 - iter 396/1984 - loss 0.00819348 - time (sec): 104.81 - samples/sec: 314.94 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 07:12:06,325 epoch 10 - iter 594/1984 - loss 0.00998088 - time (sec): 158.44 - samples/sec: 311.36 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 07:12:58,511 epoch 10 - iter 792/1984 - loss 0.00872009 - time (sec): 210.63 - samples/sec: 315.36 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 07:13:49,092 epoch 10 - iter 990/1984 - loss 0.00830259 - time (sec): 261.21 - samples/sec: 315.50 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 07:14:40,128 epoch 10 - iter 1188/1984 - loss 0.00852866 - time (sec): 312.24 - samples/sec: 315.34 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 07:15:33,304 epoch 10 - iter 1386/1984 - loss 0.00801943 - time (sec): 365.42 - samples/sec: 316.72 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 07:16:28,514 epoch 10 - iter 1584/1984 - loss 0.00810705 - time (sec): 420.63 - samples/sec: 314.79 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 07:17:20,494 epoch 10 - iter 1782/1984 - loss 0.00802002 - time (sec): 472.61 - samples/sec: 312.72 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 07:18:10,876 epoch 10 - iter 1980/1984 - loss 0.00791535 - time (sec): 522.99 - samples/sec: 312.85 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 07:18:11,919 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 07:18:11,920 EPOCH 10 done: loss 0.0079 - lr: 0.000000 |
|
2023-10-13 07:18:36,515 DEV : loss 0.23257124423980713 - f1-score (micro avg) 0.7575 |
|
2023-10-13 07:18:37,476 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 07:18:37,478 Loading model from best epoch ... |
|
2023-10-13 07:18:41,683 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-13 07:19:08,466 |
|
Results:
- F-score (micro) 0.7605
- F-score (macro) 0.6686
- Accuracy 0.6421

By class:
              precision    recall  f1-score   support

         LOC     0.8172    0.8397    0.8283       655
         PER     0.6693    0.7713    0.7167       223
         ORG     0.5146    0.4173    0.4609       127

   micro avg     0.7502    0.7711    0.7605      1005
   macro avg     0.6670    0.6761    0.6686      1005
weighted avg     0.7462    0.7711    0.7571      1005
|
|
|
2023-10-13 07:19:08,466 ---------------------------------------------------------------------------------------------------- |
|
|