Upload folder using huggingface_hub

- best-model.pt +3 -0
- dev.tsv +0 -0
- final-model.pt +3 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697108791.de2e83fddbee.1952.2 +3 -0
- test.tsv +0 -0
- training.log +263 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:31c16296e221d2ca81eb429e0ee99ad5b572f2a976e628668113b8ef6fd899b7
+size 870793839
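The three added lines above are not the checkpoint itself but a Git LFS pointer file: the actual weights are fetched by LFS and can be checked against the pointer's `oid` and `size`. A minimal sketch of that check (the `parse_lfs_pointer` and `verify` helpers are illustrative, not part of any library):

```python
import hashlib

def parse_lfs_pointer(text: str) -> dict:
    """Split a git-lfs pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:31c16296e221d2ca81eb429e0ee99ad5b572f2a976e628668113b8ef6fd899b7
size 870793839"""

info = parse_lfs_pointer(pointer)
expected_sha = info["oid"].removeprefix("sha256:")
expected_size = int(info["size"])  # 870793839 bytes, roughly 830 MB

def verify(path: str) -> bool:
    """Hash a downloaded file in 1 MiB chunks and compare to the pointer."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha
```

The same check applies to final-model.pt and the TensorBoard event file below, which are stored as LFS pointers in the same format.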
dev.tsv ADDED
(The diff for this file is too large to render.)
final-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:46b2f4e95ae3622feebe06f0b777b2acc66b96a1b4b879fc130ee9a43698cf27
+size 870793956
loss.tsv ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      11:16:24   0.0001         0.8622      0.1457    0.6215         0.6482      0.6346  0.5018
+2      11:26:19   0.0001         0.1257      0.0886    0.7206         0.7353      0.7279  0.5936
+3      11:35:54   0.0001         0.0764      0.0982    0.7372         0.7805      0.7582  0.6313
+4      11:45:33   0.0001         0.0570      0.1236    0.7287         0.7839      0.7553  0.6271
+5      11:55:01   0.0001         0.0429      0.1341    0.7452         0.7941      0.7689  0.6464
+6      12:04:42   0.0001         0.0333      0.1606    0.7479         0.7952      0.7708  0.6455
+7      12:14:00   0.0001         0.0223      0.1936    0.7433         0.7896      0.7658  0.6415
+8      12:23:15   0.0000         0.0159      0.2119    0.7550         0.7738      0.7642  0.6393
+9      12:32:07   0.0000         0.0123      0.2116    0.7634         0.7885      0.7757  0.6551
+10     12:42:28   0.0000         0.0090      0.2183    0.7519         0.7817      0.7665  0.6440
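loss.tsv is a plain per-epoch metrics table, so the checkpoint selected as best-model.pt can be recovered from it directly: Flair keeps the epoch with the highest dev micro-F1. A small sketch (the table is inlined here for illustration; in practice you would read the file from a checkout of this repo):

```python
# Per-epoch metrics as logged in loss.tsv (whitespace-separated here).
loss_tsv = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 11:16:24 0.0001 0.8622 0.1457 0.6215 0.6482 0.6346 0.5018
2 11:26:19 0.0001 0.1257 0.0886 0.7206 0.7353 0.7279 0.5936
3 11:35:54 0.0001 0.0764 0.0982 0.7372 0.7805 0.7582 0.6313
4 11:45:33 0.0001 0.0570 0.1236 0.7287 0.7839 0.7553 0.6271
5 11:55:01 0.0001 0.0429 0.1341 0.7452 0.7941 0.7689 0.6464
6 12:04:42 0.0001 0.0333 0.1606 0.7479 0.7952 0.7708 0.6455
7 12:14:00 0.0001 0.0223 0.1936 0.7433 0.7896 0.7658 0.6415
8 12:23:15 0.0000 0.0159 0.2119 0.7550 0.7738 0.7642 0.6393
9 12:32:07 0.0000 0.0123 0.2116 0.7634 0.7885 0.7757 0.6551
10 12:42:28 0.0000 0.0090 0.2183 0.7519 0.7817 0.7665 0.6440
"""

lines = [line.split() for line in loss_tsv.strip().splitlines()]
header, rows = lines[0], lines[1:]
records = [dict(zip(header, row)) for row in rows]

# The best checkpoint is the epoch with the highest dev micro-F1.
best = max(records, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # -> 9 0.7757
```

Epoch 9 scores highest on DEV_F1 (0.7757), matching the final "saving best model" entry in training.log below.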
runs/events.out.tfevents.1697108791.de2e83fddbee.1952.2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:208552e0787d8e2af0bbfd51bffd1de02aa25b8214a405809543ed8e328616c7
+size 1108164
test.tsv ADDED
(The diff for this file is too large to render.)
training.log ADDED
@@ -0,0 +1,263 @@
+2023-10-12 11:06:31,480 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,482 Model: "SequenceTagger(
+  (embeddings): ByT5Embeddings(
+    (model): T5EncoderModel(
+      (shared): Embedding(384, 1472)
+      (encoder): T5Stack(
+        (embed_tokens): Embedding(384, 1472)
+        (block): ModuleList(
+          (0): T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                  (relative_attention_bias): Embedding(32, 6)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+          (1-11): 11 x T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+        )
+        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=1472, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-12 11:06:31,482 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,482 MultiCorpus: 7936 train + 992 dev + 992 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
+2023-10-12 11:06:31,482 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,482 Train: 7936 sentences
+2023-10-12 11:06:31,483 (train_with_dev=False, train_with_test=False)
+2023-10-12 11:06:31,483 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,483 Training Params:
+2023-10-12 11:06:31,483 - learning_rate: "0.00015"
+2023-10-12 11:06:31,483 - mini_batch_size: "4"
+2023-10-12 11:06:31,483 - max_epochs: "10"
+2023-10-12 11:06:31,483 - shuffle: "True"
+2023-10-12 11:06:31,483 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,483 Plugins:
+2023-10-12 11:06:31,483 - TensorboardLogger
+2023-10-12 11:06:31,483 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-12 11:06:31,483 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,483 Final evaluation on model from best epoch (best-model.pt)
+2023-10-12 11:06:31,483 - metric: "('micro avg', 'f1-score')"
+2023-10-12 11:06:31,484 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,484 Computation:
+2023-10-12 11:06:31,484 - compute on device: cuda:0
+2023-10-12 11:06:31,484 - embedding storage: none
+2023-10-12 11:06:31,484 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,484 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1"
+2023-10-12 11:06:31,484 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,484 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:06:31,484 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-12 11:07:26,022 epoch 1 - iter 198/1984 - loss 2.57999255 - time (sec): 54.54 - samples/sec: 283.61 - lr: 0.000015 - momentum: 0.000000
+2023-10-12 11:08:22,459 epoch 1 - iter 396/1984 - loss 2.44981053 - time (sec): 110.97 - samples/sec: 284.09 - lr: 0.000030 - momentum: 0.000000
+2023-10-12 11:09:15,722 epoch 1 - iter 594/1984 - loss 2.13411946 - time (sec): 164.24 - samples/sec: 291.69 - lr: 0.000045 - momentum: 0.000000
+2023-10-12 11:10:10,549 epoch 1 - iter 792/1984 - loss 1.79298227 - time (sec): 219.06 - samples/sec: 293.62 - lr: 0.000060 - momentum: 0.000000
+2023-10-12 11:11:04,778 epoch 1 - iter 990/1984 - loss 1.52165692 - time (sec): 273.29 - samples/sec: 293.74 - lr: 0.000075 - momentum: 0.000000
+2023-10-12 11:12:01,502 epoch 1 - iter 1188/1984 - loss 1.30787581 - time (sec): 330.02 - samples/sec: 293.94 - lr: 0.000090 - momentum: 0.000000
+2023-10-12 11:13:02,102 epoch 1 - iter 1386/1984 - loss 1.14333260 - time (sec): 390.62 - samples/sec: 292.77 - lr: 0.000105 - momentum: 0.000000
+2023-10-12 11:14:03,282 epoch 1 - iter 1584/1984 - loss 1.02809814 - time (sec): 451.80 - samples/sec: 290.09 - lr: 0.000120 - momentum: 0.000000
+2023-10-12 11:14:59,314 epoch 1 - iter 1782/1984 - loss 0.93905606 - time (sec): 507.83 - samples/sec: 290.13 - lr: 0.000135 - momentum: 0.000000
+2023-10-12 11:15:57,372 epoch 1 - iter 1980/1984 - loss 0.86344728 - time (sec): 565.89 - samples/sec: 289.24 - lr: 0.000150 - momentum: 0.000000
+2023-10-12 11:15:58,552 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:15:58,552 EPOCH 1 done: loss 0.8622 - lr: 0.000150
+2023-10-12 11:16:24,769 DEV : loss 0.14566047489643097 - f1-score (micro avg) 0.6346
+2023-10-12 11:16:24,809 saving best model
+2023-10-12 11:16:25,688 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:17:20,017 epoch 2 - iter 198/1984 - loss 0.16983836 - time (sec): 54.33 - samples/sec: 303.20 - lr: 0.000148 - momentum: 0.000000
+2023-10-12 11:18:13,216 epoch 2 - iter 396/1984 - loss 0.14876062 - time (sec): 107.53 - samples/sec: 304.03 - lr: 0.000147 - momentum: 0.000000
+2023-10-12 11:19:12,234 epoch 2 - iter 594/1984 - loss 0.14225847 - time (sec): 166.54 - samples/sec: 294.50 - lr: 0.000145 - momentum: 0.000000
+2023-10-12 11:20:16,804 epoch 2 - iter 792/1984 - loss 0.14082827 - time (sec): 231.11 - samples/sec: 284.38 - lr: 0.000143 - momentum: 0.000000
+2023-10-12 11:21:13,372 epoch 2 - iter 990/1984 - loss 0.13649551 - time (sec): 287.68 - samples/sec: 286.33 - lr: 0.000142 - momentum: 0.000000
+2023-10-12 11:22:09,716 epoch 2 - iter 1188/1984 - loss 0.13419348 - time (sec): 344.03 - samples/sec: 286.44 - lr: 0.000140 - momentum: 0.000000
+2023-10-12 11:23:05,261 epoch 2 - iter 1386/1984 - loss 0.13452875 - time (sec): 399.57 - samples/sec: 288.02 - lr: 0.000138 - momentum: 0.000000
+2023-10-12 11:24:01,182 epoch 2 - iter 1584/1984 - loss 0.13063731 - time (sec): 455.49 - samples/sec: 287.18 - lr: 0.000137 - momentum: 0.000000
+2023-10-12 11:24:57,266 epoch 2 - iter 1782/1984 - loss 0.12821277 - time (sec): 511.58 - samples/sec: 287.32 - lr: 0.000135 - momentum: 0.000000
+2023-10-12 11:25:50,691 epoch 2 - iter 1980/1984 - loss 0.12598244 - time (sec): 565.00 - samples/sec: 289.39 - lr: 0.000133 - momentum: 0.000000
+2023-10-12 11:25:51,942 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:25:51,942 EPOCH 2 done: loss 0.1257 - lr: 0.000133
+2023-10-12 11:26:18,959 DEV : loss 0.08862575143575668 - f1-score (micro avg) 0.7279
+2023-10-12 11:26:19,000 saving best model
+2023-10-12 11:26:21,592 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:27:15,300 epoch 3 - iter 198/1984 - loss 0.08035378 - time (sec): 53.70 - samples/sec: 293.08 - lr: 0.000132 - momentum: 0.000000
+2023-10-12 11:28:07,651 epoch 3 - iter 396/1984 - loss 0.08045312 - time (sec): 106.05 - samples/sec: 301.32 - lr: 0.000130 - momentum: 0.000000
+2023-10-12 11:29:04,386 epoch 3 - iter 594/1984 - loss 0.08136632 - time (sec): 162.79 - samples/sec: 299.79 - lr: 0.000128 - momentum: 0.000000
+2023-10-12 11:29:59,889 epoch 3 - iter 792/1984 - loss 0.08074706 - time (sec): 218.29 - samples/sec: 298.62 - lr: 0.000127 - momentum: 0.000000
+2023-10-12 11:30:55,332 epoch 3 - iter 990/1984 - loss 0.07862473 - time (sec): 273.73 - samples/sec: 297.34 - lr: 0.000125 - momentum: 0.000000
+2023-10-12 11:31:48,283 epoch 3 - iter 1188/1984 - loss 0.07766833 - time (sec): 326.69 - samples/sec: 298.33 - lr: 0.000123 - momentum: 0.000000
+2023-10-12 11:32:41,000 epoch 3 - iter 1386/1984 - loss 0.07860726 - time (sec): 379.40 - samples/sec: 298.38 - lr: 0.000122 - momentum: 0.000000
+2023-10-12 11:33:38,620 epoch 3 - iter 1584/1984 - loss 0.07716988 - time (sec): 437.02 - samples/sec: 299.44 - lr: 0.000120 - momentum: 0.000000
+2023-10-12 11:34:33,138 epoch 3 - iter 1782/1984 - loss 0.07591753 - time (sec): 491.54 - samples/sec: 300.57 - lr: 0.000118 - momentum: 0.000000
+2023-10-12 11:35:27,361 epoch 3 - iter 1980/1984 - loss 0.07646179 - time (sec): 545.76 - samples/sec: 299.93 - lr: 0.000117 - momentum: 0.000000
+2023-10-12 11:35:28,378 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:35:28,378 EPOCH 3 done: loss 0.0764 - lr: 0.000117
+2023-10-12 11:35:54,450 DEV : loss 0.09820140898227692 - f1-score (micro avg) 0.7582
+2023-10-12 11:35:54,501 saving best model
+2023-10-12 11:35:57,216 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:36:54,454 epoch 4 - iter 198/1984 - loss 0.05107641 - time (sec): 57.23 - samples/sec: 299.51 - lr: 0.000115 - momentum: 0.000000
+2023-10-12 11:37:49,239 epoch 4 - iter 396/1984 - loss 0.05351624 - time (sec): 112.01 - samples/sec: 304.06 - lr: 0.000113 - momentum: 0.000000
+2023-10-12 11:38:41,612 epoch 4 - iter 594/1984 - loss 0.05652355 - time (sec): 164.39 - samples/sec: 304.03 - lr: 0.000112 - momentum: 0.000000
+2023-10-12 11:39:34,981 epoch 4 - iter 792/1984 - loss 0.05647937 - time (sec): 217.76 - samples/sec: 301.77 - lr: 0.000110 - momentum: 0.000000
+2023-10-12 11:40:27,996 epoch 4 - iter 990/1984 - loss 0.05659483 - time (sec): 270.77 - samples/sec: 302.74 - lr: 0.000108 - momentum: 0.000000
+2023-10-12 11:41:21,031 epoch 4 - iter 1188/1984 - loss 0.05624160 - time (sec): 323.81 - samples/sec: 302.79 - lr: 0.000107 - momentum: 0.000000
+2023-10-12 11:42:19,315 epoch 4 - iter 1386/1984 - loss 0.05573244 - time (sec): 382.09 - samples/sec: 299.89 - lr: 0.000105 - momentum: 0.000000
+2023-10-12 11:43:14,049 epoch 4 - iter 1584/1984 - loss 0.05755650 - time (sec): 436.82 - samples/sec: 299.30 - lr: 0.000103 - momentum: 0.000000
+2023-10-12 11:44:08,504 epoch 4 - iter 1782/1984 - loss 0.05629255 - time (sec): 491.28 - samples/sec: 300.73 - lr: 0.000102 - momentum: 0.000000
+2023-10-12 11:45:04,669 epoch 4 - iter 1980/1984 - loss 0.05700810 - time (sec): 547.44 - samples/sec: 299.11 - lr: 0.000100 - momentum: 0.000000
+2023-10-12 11:45:05,749 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:45:05,749 EPOCH 4 done: loss 0.0570 - lr: 0.000100
+2023-10-12 11:45:33,306 DEV : loss 0.12358254194259644 - f1-score (micro avg) 0.7553
+2023-10-12 11:45:33,357 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:46:26,712 epoch 5 - iter 198/1984 - loss 0.04324754 - time (sec): 53.35 - samples/sec: 302.80 - lr: 0.000098 - momentum: 0.000000
+2023-10-12 11:47:19,893 epoch 5 - iter 396/1984 - loss 0.03649823 - time (sec): 106.53 - samples/sec: 303.56 - lr: 0.000097 - momentum: 0.000000
+2023-10-12 11:48:13,499 epoch 5 - iter 594/1984 - loss 0.03805430 - time (sec): 160.14 - samples/sec: 303.80 - lr: 0.000095 - momentum: 0.000000
+2023-10-12 11:49:05,591 epoch 5 - iter 792/1984 - loss 0.03840236 - time (sec): 212.23 - samples/sec: 306.73 - lr: 0.000093 - momentum: 0.000000
+2023-10-12 11:49:59,452 epoch 5 - iter 990/1984 - loss 0.03877875 - time (sec): 266.09 - samples/sec: 305.54 - lr: 0.000092 - momentum: 0.000000
+2023-10-12 11:50:53,853 epoch 5 - iter 1188/1984 - loss 0.04135513 - time (sec): 320.49 - samples/sec: 305.54 - lr: 0.000090 - momentum: 0.000000
+2023-10-12 11:51:50,814 epoch 5 - iter 1386/1984 - loss 0.04131838 - time (sec): 377.45 - samples/sec: 303.18 - lr: 0.000088 - momentum: 0.000000
+2023-10-12 11:52:45,024 epoch 5 - iter 1584/1984 - loss 0.04180636 - time (sec): 431.66 - samples/sec: 304.24 - lr: 0.000087 - momentum: 0.000000
+2023-10-12 11:53:37,845 epoch 5 - iter 1782/1984 - loss 0.04178494 - time (sec): 484.49 - samples/sec: 305.16 - lr: 0.000085 - momentum: 0.000000
+2023-10-12 11:54:32,459 epoch 5 - iter 1980/1984 - loss 0.04294845 - time (sec): 539.10 - samples/sec: 303.51 - lr: 0.000083 - momentum: 0.000000
+2023-10-12 11:54:33,704 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:54:33,704 EPOCH 5 done: loss 0.0429 - lr: 0.000083
+2023-10-12 11:55:01,009 DEV : loss 0.13407257199287415 - f1-score (micro avg) 0.7689
+2023-10-12 11:55:01,055 saving best model
+2023-10-12 11:55:03,705 ----------------------------------------------------------------------------------------------------
+2023-10-12 11:55:59,952 epoch 6 - iter 198/1984 - loss 0.03113160 - time (sec): 56.24 - samples/sec: 278.53 - lr: 0.000082 - momentum: 0.000000
+2023-10-12 11:56:56,451 epoch 6 - iter 396/1984 - loss 0.02887060 - time (sec): 112.74 - samples/sec: 284.02 - lr: 0.000080 - momentum: 0.000000
+2023-10-12 11:57:53,189 epoch 6 - iter 594/1984 - loss 0.02945609 - time (sec): 169.48 - samples/sec: 284.26 - lr: 0.000078 - momentum: 0.000000
+2023-10-12 11:58:50,010 epoch 6 - iter 792/1984 - loss 0.02963186 - time (sec): 226.30 - samples/sec: 287.98 - lr: 0.000077 - momentum: 0.000000
+2023-10-12 11:59:45,930 epoch 6 - iter 990/1984 - loss 0.02807978 - time (sec): 282.22 - samples/sec: 287.49 - lr: 0.000075 - momentum: 0.000000
+2023-10-12 12:00:42,181 epoch 6 - iter 1188/1984 - loss 0.02971360 - time (sec): 338.47 - samples/sec: 289.04 - lr: 0.000073 - momentum: 0.000000
+2023-10-12 12:01:34,537 epoch 6 - iter 1386/1984 - loss 0.02997489 - time (sec): 390.83 - samples/sec: 293.47 - lr: 0.000072 - momentum: 0.000000
+2023-10-12 12:02:27,693 epoch 6 - iter 1584/1984 - loss 0.03200441 - time (sec): 443.98 - samples/sec: 294.44 - lr: 0.000070 - momentum: 0.000000
+2023-10-12 12:03:23,539 epoch 6 - iter 1782/1984 - loss 0.03269817 - time (sec): 499.83 - samples/sec: 294.76 - lr: 0.000068 - momentum: 0.000000
+2023-10-12 12:04:15,284 epoch 6 - iter 1980/1984 - loss 0.03337058 - time (sec): 551.57 - samples/sec: 296.63 - lr: 0.000067 - momentum: 0.000000
+2023-10-12 12:04:16,359 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:04:16,359 EPOCH 6 done: loss 0.0333 - lr: 0.000067
+2023-10-12 12:04:42,197 DEV : loss 0.1605551391839981 - f1-score (micro avg) 0.7708
+2023-10-12 12:04:42,242 saving best model
+2023-10-12 12:04:44,860 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:05:38,725 epoch 7 - iter 198/1984 - loss 0.01630351 - time (sec): 53.86 - samples/sec: 302.76 - lr: 0.000065 - momentum: 0.000000
+2023-10-12 12:06:32,943 epoch 7 - iter 396/1984 - loss 0.02088592 - time (sec): 108.08 - samples/sec: 305.05 - lr: 0.000063 - momentum: 0.000000
+2023-10-12 12:07:25,178 epoch 7 - iter 594/1984 - loss 0.02089722 - time (sec): 160.31 - samples/sec: 305.28 - lr: 0.000062 - momentum: 0.000000
+2023-10-12 12:08:19,504 epoch 7 - iter 792/1984 - loss 0.02127973 - time (sec): 214.64 - samples/sec: 305.85 - lr: 0.000060 - momentum: 0.000000
+2023-10-12 12:09:12,089 epoch 7 - iter 990/1984 - loss 0.02061894 - time (sec): 267.23 - samples/sec: 305.21 - lr: 0.000058 - momentum: 0.000000
+2023-10-12 12:10:03,837 epoch 7 - iter 1188/1984 - loss 0.02132952 - time (sec): 318.97 - samples/sec: 306.78 - lr: 0.000057 - momentum: 0.000000
+2023-10-12 12:10:56,791 epoch 7 - iter 1386/1984 - loss 0.02181938 - time (sec): 371.93 - samples/sec: 308.54 - lr: 0.000055 - momentum: 0.000000
+2023-10-12 12:11:48,135 epoch 7 - iter 1584/1984 - loss 0.02244607 - time (sec): 423.27 - samples/sec: 306.27 - lr: 0.000053 - momentum: 0.000000
+2023-10-12 12:12:41,566 epoch 7 - iter 1782/1984 - loss 0.02208528 - time (sec): 476.70 - samples/sec: 306.92 - lr: 0.000052 - momentum: 0.000000
+2023-10-12 12:13:34,953 epoch 7 - iter 1980/1984 - loss 0.02228252 - time (sec): 530.09 - samples/sec: 308.63 - lr: 0.000050 - momentum: 0.000000
+2023-10-12 12:13:36,037 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:13:36,037 EPOCH 7 done: loss 0.0223 - lr: 0.000050
+2023-10-12 12:14:00,840 DEV : loss 0.193552166223526 - f1-score (micro avg) 0.7658
+2023-10-12 12:14:00,880 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:14:53,000 epoch 8 - iter 198/1984 - loss 0.01942802 - time (sec): 52.12 - samples/sec: 324.79 - lr: 0.000048 - momentum: 0.000000
+2023-10-12 12:15:44,809 epoch 8 - iter 396/1984 - loss 0.01739382 - time (sec): 103.93 - samples/sec: 310.32 - lr: 0.000047 - momentum: 0.000000
+2023-10-12 12:16:38,516 epoch 8 - iter 594/1984 - loss 0.01651369 - time (sec): 157.63 - samples/sec: 302.92 - lr: 0.000045 - momentum: 0.000000
+2023-10-12 12:17:34,341 epoch 8 - iter 792/1984 - loss 0.01549617 - time (sec): 213.46 - samples/sec: 298.75 - lr: 0.000043 - momentum: 0.000000
+2023-10-12 12:18:27,661 epoch 8 - iter 990/1984 - loss 0.01499724 - time (sec): 266.78 - samples/sec: 301.86 - lr: 0.000042 - momentum: 0.000000
+2023-10-12 12:19:19,537 epoch 8 - iter 1188/1984 - loss 0.01661391 - time (sec): 318.65 - samples/sec: 307.15 - lr: 0.000040 - momentum: 0.000000
+2023-10-12 12:20:11,598 epoch 8 - iter 1386/1984 - loss 0.01696979 - time (sec): 370.72 - samples/sec: 307.35 - lr: 0.000038 - momentum: 0.000000
+2023-10-12 12:21:05,660 epoch 8 - iter 1584/1984 - loss 0.01658658 - time (sec): 424.78 - samples/sec: 307.53 - lr: 0.000037 - momentum: 0.000000
+2023-10-12 12:21:59,556 epoch 8 - iter 1782/1984 - loss 0.01660014 - time (sec): 478.67 - samples/sec: 306.38 - lr: 0.000035 - momentum: 0.000000
+2023-10-12 12:22:50,214 epoch 8 - iter 1980/1984 - loss 0.01587470 - time (sec): 529.33 - samples/sec: 309.34 - lr: 0.000033 - momentum: 0.000000
+2023-10-12 12:22:51,156 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:22:51,156 EPOCH 8 done: loss 0.0159 - lr: 0.000033
+2023-10-12 12:23:15,305 DEV : loss 0.21188023686408997 - f1-score (micro avg) 0.7642
+2023-10-12 12:23:15,344 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:24:05,207 epoch 9 - iter 198/1984 - loss 0.01503978 - time (sec): 49.86 - samples/sec: 345.20 - lr: 0.000032 - momentum: 0.000000
+2023-10-12 12:24:55,182 epoch 9 - iter 396/1984 - loss 0.01537065 - time (sec): 99.84 - samples/sec: 337.06 - lr: 0.000030 - momentum: 0.000000
+2023-10-12 12:25:45,441 epoch 9 - iter 594/1984 - loss 0.01380477 - time (sec): 150.09 - samples/sec: 337.26 - lr: 0.000028 - momentum: 0.000000
+2023-10-12 12:26:34,719 epoch 9 - iter 792/1984 - loss 0.01424107 - time (sec): 199.37 - samples/sec: 331.93 - lr: 0.000027 - momentum: 0.000000
+2023-10-12 12:27:24,733 epoch 9 - iter 990/1984 - loss 0.01368776 - time (sec): 249.39 - samples/sec: 331.51 - lr: 0.000025 - momentum: 0.000000
+2023-10-12 12:28:16,126 epoch 9 - iter 1188/1984 - loss 0.01282396 - time (sec): 300.78 - samples/sec: 329.38 - lr: 0.000023 - momentum: 0.000000
+2023-10-12 12:29:07,987 epoch 9 - iter 1386/1984 - loss 0.01240754 - time (sec): 352.64 - samples/sec: 328.17 - lr: 0.000022 - momentum: 0.000000
+2023-10-12 12:29:58,734 epoch 9 - iter 1584/1984 - loss 0.01272824 - time (sec): 403.39 - samples/sec: 325.31 - lr: 0.000020 - momentum: 0.000000
+2023-10-12 12:30:49,736 epoch 9 - iter 1782/1984 - loss 0.01309551 - time (sec): 454.39 - samples/sec: 324.27 - lr: 0.000018 - momentum: 0.000000
+2023-10-12 12:31:40,744 epoch 9 - iter 1980/1984 - loss 0.01231972 - time (sec): 505.40 - samples/sec: 323.79 - lr: 0.000017 - momentum: 0.000000
+2023-10-12 12:31:41,792 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:31:41,793 EPOCH 9 done: loss 0.0123 - lr: 0.000017
+2023-10-12 12:32:07,050 DEV : loss 0.21163929998874664 - f1-score (micro avg) 0.7757
+2023-10-12 12:32:07,090 saving best model
+2023-10-12 12:32:09,671 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:33:08,394 epoch 10 - iter 198/1984 - loss 0.00652323 - time (sec): 58.72 - samples/sec: 284.22 - lr: 0.000015 - momentum: 0.000000
+2023-10-12 12:34:09,493 epoch 10 - iter 396/1984 - loss 0.00848912 - time (sec): 119.82 - samples/sec: 273.48 - lr: 0.000013 - momentum: 0.000000
+2023-10-12 12:35:07,187 epoch 10 - iter 594/1984 - loss 0.00744853 - time (sec): 177.51 - samples/sec: 278.27 - lr: 0.000012 - momentum: 0.000000
+2023-10-12 12:36:07,684 epoch 10 - iter 792/1984 - loss 0.00864282 - time (sec): 238.01 - samples/sec: 277.06 - lr: 0.000010 - momentum: 0.000000
+2023-10-12 12:37:07,295 epoch 10 - iter 990/1984 - loss 0.00791604 - time (sec): 297.62 - samples/sec: 277.30 - lr: 0.000008 - momentum: 0.000000
+2023-10-12 12:38:04,898 epoch 10 - iter 1188/1984 - loss 0.00821374 - time (sec): 355.22 - samples/sec: 277.02 - lr: 0.000007 - momentum: 0.000000
+2023-10-12 12:39:04,148 epoch 10 - iter 1386/1984 - loss 0.00830391 - time (sec): 414.47 - samples/sec: 275.43 - lr: 0.000005 - momentum: 0.000000
+2023-10-12 12:40:03,881 epoch 10 - iter 1584/1984 - loss 0.00845608 - time (sec): 474.21 - samples/sec: 275.41 - lr: 0.000003 - momentum: 0.000000
+2023-10-12 12:41:01,804 epoch 10 - iter 1782/1984 - loss 0.00817011 - time (sec): 532.13 - samples/sec: 276.69 - lr: 0.000002 - momentum: 0.000000
+2023-10-12 12:41:59,075 epoch 10 - iter 1980/1984 - loss 0.00863686 - time (sec): 589.40 - samples/sec: 277.86 - lr: 0.000000 - momentum: 0.000000
+2023-10-12 12:42:00,374 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:42:00,374 EPOCH 10 done: loss 0.0090 - lr: 0.000000
+2023-10-12 12:42:28,840 DEV : loss 0.2182641178369522 - f1-score (micro avg) 0.7665
+2023-10-12 12:42:29,907 ----------------------------------------------------------------------------------------------------
+2023-10-12 12:42:29,909 Loading model from best epoch ...
+2023-10-12 12:42:33,658 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-12 12:42:58,091
+Results:
+- F-score (micro) 0.7533
+- F-score (macro) 0.6701
+- Accuracy 0.6309
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.8056    0.8290    0.8172       655
+         PER     0.6577    0.7668    0.7081       223
+         ORG     0.5278    0.4488    0.4851       127
+
+   micro avg     0.7399    0.7672    0.7533      1005
+   macro avg     0.6637    0.6815    0.6701      1005
+weighted avg     0.7377    0.7672    0.7510      1005
+
+2023-10-12 12:42:58,091 ----------------------------------------------------------------------------------------------------
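The `lr:` values in the log are consistent with the LinearScheduler plugin listed in the training params (warmup_fraction 0.1): with 1,984 mini-batches per epoch over 10 epochs, the rate climbs linearly to the 0.00015 peak over the first 1,984 steps (exactly epoch 1) and then decays linearly to zero. A small sketch re-deriving that schedule from the logged numbers (this is a reconstruction, not Flair's own scheduler code):

```python
PEAK_LR = 0.00015                        # learning_rate from the training params
STEPS_PER_EPOCH = 1984                   # iters per epoch in the log
TOTAL_STEPS = STEPS_PER_EPOCH * 10       # 10 epochs -> 19840 steps
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # warmup_fraction 0.1 -> 1984 steps

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to 0."""
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the log: epoch 1 iter 198 -> lr 0.000015, epoch 1 iter 990 -> 0.000075,
# epoch 2 iter 198 (global step 2182) -> 0.000148, last step -> 0.000000.
print(round(lr_at(198), 6), round(lr_at(2182), 6))
```

This also explains why loss.tsv shows 0.0001 for early epochs and 0.0000 later: it rounds the decaying per-step rate to four decimals.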