Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- final-model.pt +3 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697145825.c8b2203b18a8.2408.11 +3 -0
- test.tsv +0 -0
- training.log +261 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:375fc5e10946e949c65177a19309cd5b40bd532d576318b99eab0ae59bad7972
+size 870793839
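The three added lines are a Git LFS pointer, not the model weights themselves: a spec version, the SHA-256 of the stored object, and its size in bytes. As a minimal sketch, such a pointer could be parsed as follows; `parse_lfs_pointer` is a hypothetical helper name used only for illustration and is not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line is written as "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:375fc5e10946e949c65177a19309cd5b40bd532d576318b99eab0ae59bad7972
size 870793839
"""

pointer = parse_lfs_pointer(POINTER)
print(f"{pointer['size_bytes'] / 1e6:.1f} MB ({pointer['hash_algo']})")
```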
dev.tsv
ADDED
The diff for this file is too large to render.
final-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1f7b3a6d5eb67d89d5951986915899d3f2e2da9da6cce62010b91aac1811106b
+size 870793956
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      21:31:04   0.0002         0.8621      0.1848    0.4414         0.3192      0.3705  0.2558
+2      21:38:35   0.0001         0.1103      0.0892    0.8570         0.7924      0.8234  0.7108
+3      21:45:54   0.0001         0.0679      0.0786    0.8678         0.8275      0.8472  0.7514
+4      21:53:16   0.0001         0.0460      0.1007    0.8920         0.7934      0.8398  0.7335
+5      22:00:21   0.0001         0.0340      0.1116    0.8916         0.7820      0.8332  0.7223
+6      22:07:36   0.0001         0.0237      0.1355    0.8698         0.8140      0.8410  0.7358
+7      22:14:51   0.0001         0.0180      0.1320    0.8920         0.8192      0.8541  0.7538
+8      22:22:17   0.0000         0.0143      0.1547    0.8922         0.8037      0.8457  0.7424
+9      22:29:27   0.0000         0.0105      0.1578    0.8914         0.8140      0.8510  0.7483
+10     22:36:47   0.0000         0.0077      0.1664    0.8894         0.8058      0.8455  0.7400
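The training log in this commit states that the final evaluation uses the model from the best epoch, selected by micro-averaged F1 on the dev set. The sketch below reads the rows of loss.tsv as shown above and picks the epoch with the highest DEV_F1 and the one with the lowest DEV_LOSS. Splitting on whitespace is an assumption made for this copy of the table (the file itself is a .tsv), and `read_rows` is a hypothetical helper.

```python
# Rows copied from the loss.tsv diff above.
LOSS_TABLE = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 21:31:04 0.0002 0.8621 0.1848 0.4414 0.3192 0.3705 0.2558
2 21:38:35 0.0001 0.1103 0.0892 0.8570 0.7924 0.8234 0.7108
3 21:45:54 0.0001 0.0679 0.0786 0.8678 0.8275 0.8472 0.7514
4 21:53:16 0.0001 0.0460 0.1007 0.8920 0.7934 0.8398 0.7335
5 22:00:21 0.0001 0.0340 0.1116 0.8916 0.7820 0.8332 0.7223
6 22:07:36 0.0001 0.0237 0.1355 0.8698 0.8140 0.8410 0.7358
7 22:14:51 0.0001 0.0180 0.1320 0.8920 0.8192 0.8541 0.7538
8 22:22:17 0.0000 0.0143 0.1547 0.8922 0.8037 0.8457 0.7424
9 22:29:27 0.0000 0.0105 0.1578 0.8914 0.8140 0.8510 0.7483
10 22:36:47 0.0000 0.0077 0.1664 0.8894 0.8058 0.8455 0.7400
"""


def read_rows(table: str) -> list:
    """Turn the whitespace-separated table into a list of dicts keyed by column name."""
    header, *lines = table.strip().splitlines()
    names = header.split()
    return [dict(zip(names, line.split())) for line in lines]


rows = read_rows(LOSS_TABLE)
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
lowest_dev_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print("best DEV_F1:    epoch", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest DEV_LOSS: epoch", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

On these rows the two criteria disagree: dev loss is lowest at epoch 3, while dev F1 peaks at epoch 7, which matches the last "saving best model" entry in training.log.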
runs/events.out.tfevents.1697145825.c8b2203b18a8.2408.11
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:42e64ca4b38ca21bc9bbce32f45a15e2eff23ef986b44c3c874f46bd46886e19
+size 808480
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,261 @@
+2023-10-12 21:23:45,902 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,904 Model: "SequenceTagger(
+  (embeddings): ByT5Embeddings(
+    (model): T5EncoderModel(
+      (shared): Embedding(384, 1472)
+      (encoder): T5Stack(
+        (embed_tokens): Embedding(384, 1472)
+        (block): ModuleList(
+          (0): T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                  (relative_attention_bias): Embedding(32, 6)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+          (1-11): 11 x T5Block(
+            (layer): ModuleList(
+              (0): T5LayerSelfAttention(
+                (SelfAttention): T5Attention(
+                  (q): Linear(in_features=1472, out_features=384, bias=False)
+                  (k): Linear(in_features=1472, out_features=384, bias=False)
+                  (v): Linear(in_features=1472, out_features=384, bias=False)
+                  (o): Linear(in_features=384, out_features=1472, bias=False)
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (1): T5LayerFF(
+                (DenseReluDense): T5DenseGatedActDense(
+                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
+                  (dropout): Dropout(p=0.1, inplace=False)
+                  (act): NewGELUActivation()
+                )
+                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+          )
+        )
+        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=1472, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
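The layer shapes printed above are enough to estimate the model's parameter count by hand. The sketch below does that arithmetic under two assumptions that the log does not state: the `shared` and `embed_tokens` embeddings are one tied weight matrix and are counted once, and each `FusedRMSNorm` holds a single scale vector of size 1472. All constant and function names are illustrative.

```python
# Shapes read from the printed module tree above.
D_MODEL = 1472        # hidden size
D_ATTN = 384          # output size of the q/k/v projections
D_FF = 3584           # feed-forward inner size
VOCAB = 384           # Embedding(384, 1472)
N_BLOCKS = 12         # block 0 plus "11 x T5Block"
N_TAGS = 13           # output size of the tagging head
REL_BIAS = 32 * 6     # relative_attention_bias, present in block 0 only


def linear(n_in: int, n_out: int, bias: bool = False) -> int:
    """Parameter count of a dense layer."""
    return n_in * n_out + (n_out if bias else 0)


attention = 3 * linear(D_MODEL, D_ATTN) + linear(D_ATTN, D_MODEL)   # q, k, v, o
feed_forward = 2 * linear(D_MODEL, D_FF) + linear(D_FF, D_MODEL)    # wi_0, wi_1, wo
layer_norms = 2 * D_MODEL                                           # assumed scale-only norms
block = attention + feed_forward + layer_norms

encoder = (
    VOCAB * D_MODEL        # embedding matrix, counted once (assumed tied)
    + N_BLOCKS * block
    + REL_BIAS
    + D_MODEL              # final_layer_norm
)
head = linear(D_MODEL, N_TAGS, bias=True)
total = encoder + head

print(f"encoder: {encoder:,}  head: {head:,}  total: {total:,}")
print(f"as float32: {total * 4:,} bytes")
```

If the checkpoint stores these weights as 4-byte floats (also an assumption), the estimate comes to roughly 870.7 MB, which is close to the 870,793,839-byte size recorded in the best-model.pt pointer.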
+2023-10-12 21:23:45,904 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,904 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+2023-10-12 21:23:45,904 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,904 Train: 5777 sentences
+2023-10-12 21:23:45,905 (train_with_dev=False, train_with_test=False)
+2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,905 Training Params:
+2023-10-12 21:23:45,905 - learning_rate: "0.00016"
+2023-10-12 21:23:45,905 - mini_batch_size: "4"
+2023-10-12 21:23:45,905 - max_epochs: "10"
+2023-10-12 21:23:45,905 - shuffle: "True"
+2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,905 Plugins:
+2023-10-12 21:23:45,905 - TensorboardLogger
+2023-10-12 21:23:45,905 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
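The parameters above (learning rate 0.00016, 10 epochs, warmup fraction 0.1) together with the 1445 iterations per epoch reported later in the log describe a linear warmup followed by a linear decay to zero. The sketch below is one plausible formula for that schedule. Whether Flair's LinearScheduler computes it exactly this way is an assumption, and `linear_schedule_lr` is a hypothetical function, but the values it produces agree with the `lr` column of the log to six decimals at the steps checked.

```python
def linear_schedule_lr(step: int, peak_lr: float, total_steps: int,
                       warmup_fraction: float) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / (total_steps - warmup_steps))


PEAK_LR = 0.00016
STEPS_PER_EPOCH = 1445            # "iter .../1445" in the log
MAX_EPOCHS = 10
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS


def lr_at(epoch: int, iteration: int) -> float:
    """Learning rate at a given (1-based) epoch and iteration."""
    step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    return linear_schedule_lr(step, PEAK_LR, TOTAL_STEPS, warmup_fraction=0.1)


for epoch, iteration in [(1, 144), (1, 1440), (2, 144), (5, 720), (10, 1440)]:
    print(f"epoch {epoch:2d} iter {iteration:4d}: lr {lr_at(epoch, iteration):.6f}")
```

With these settings the warmup covers exactly the first epoch (1445 of 14450 steps), which is why the logged learning rate climbs throughout epoch 1 and falls from epoch 2 onward.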
+2023-10-12 21:23:45,905 Final evaluation on model from best epoch (best-model.pt)
+2023-10-12 21:23:45,905 - metric: "('micro avg', 'f1-score')"
+2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,906 Computation:
+2023-10-12 21:23:45,906 - compute on device: cuda:0
+2023-10-12 21:23:45,906 - embedding storage: none
+2023-10-12 21:23:45,906 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,906 Model training base path: "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
+2023-10-12 21:23:45,906 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,906 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:23:45,906 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-12 21:24:27,641 epoch 1 - iter 144/1445 - loss 2.53329630 - time (sec): 41.73 - samples/sec: 432.73 - lr: 0.000016 - momentum: 0.000000
+2023-10-12 21:25:09,203 epoch 1 - iter 288/1445 - loss 2.36537339 - time (sec): 83.29 - samples/sec: 433.39 - lr: 0.000032 - momentum: 0.000000
+2023-10-12 21:25:49,786 epoch 1 - iter 432/1445 - loss 2.11838959 - time (sec): 123.88 - samples/sec: 422.80 - lr: 0.000048 - momentum: 0.000000
+2023-10-12 21:26:31,250 epoch 1 - iter 576/1445 - loss 1.81958359 - time (sec): 165.34 - samples/sec: 423.75 - lr: 0.000064 - momentum: 0.000000
+2023-10-12 21:27:12,928 epoch 1 - iter 720/1445 - loss 1.54330067 - time (sec): 207.02 - samples/sec: 424.24 - lr: 0.000080 - momentum: 0.000000
+2023-10-12 21:27:54,484 epoch 1 - iter 864/1445 - loss 1.33675258 - time (sec): 248.58 - samples/sec: 421.19 - lr: 0.000096 - momentum: 0.000000
+2023-10-12 21:28:36,755 epoch 1 - iter 1008/1445 - loss 1.17087202 - time (sec): 290.85 - samples/sec: 420.93 - lr: 0.000112 - momentum: 0.000000
+2023-10-12 21:29:17,487 epoch 1 - iter 1152/1445 - loss 1.05166695 - time (sec): 331.58 - samples/sec: 419.84 - lr: 0.000127 - momentum: 0.000000
+2023-10-12 21:29:59,649 epoch 1 - iter 1296/1445 - loss 0.95132545 - time (sec): 373.74 - samples/sec: 419.88 - lr: 0.000143 - momentum: 0.000000
+2023-10-12 21:30:42,244 epoch 1 - iter 1440/1445 - loss 0.86538888 - time (sec): 416.34 - samples/sec: 421.44 - lr: 0.000159 - momentum: 0.000000
+2023-10-12 21:30:43,683 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:30:43,684 EPOCH 1 done: loss 0.8621 - lr: 0.000159
+2023-10-12 21:31:04,241 DEV : loss 0.1847972571849823 - f1-score (micro avg) 0.3705
+2023-10-12 21:31:04,273 saving best model
+2023-10-12 21:31:05,195 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:31:48,840 epoch 2 - iter 144/1445 - loss 0.13150717 - time (sec): 43.64 - samples/sec: 399.70 - lr: 0.000158 - momentum: 0.000000
+2023-10-12 21:32:32,839 epoch 2 - iter 288/1445 - loss 0.12547482 - time (sec): 87.64 - samples/sec: 404.66 - lr: 0.000156 - momentum: 0.000000
+2023-10-12 21:33:15,269 epoch 2 - iter 432/1445 - loss 0.12341173 - time (sec): 130.07 - samples/sec: 399.66 - lr: 0.000155 - momentum: 0.000000
+2023-10-12 21:33:58,085 epoch 2 - iter 576/1445 - loss 0.12210825 - time (sec): 172.89 - samples/sec: 403.62 - lr: 0.000153 - momentum: 0.000000
+2023-10-12 21:34:41,150 epoch 2 - iter 720/1445 - loss 0.11883011 - time (sec): 215.95 - samples/sec: 403.98 - lr: 0.000151 - momentum: 0.000000
+2023-10-12 21:35:27,590 epoch 2 - iter 864/1445 - loss 0.11676973 - time (sec): 262.39 - samples/sec: 399.41 - lr: 0.000149 - momentum: 0.000000
+2023-10-12 21:36:12,186 epoch 2 - iter 1008/1445 - loss 0.11778461 - time (sec): 306.99 - samples/sec: 397.72 - lr: 0.000148 - momentum: 0.000000
+2023-10-12 21:36:53,431 epoch 2 - iter 1152/1445 - loss 0.11553749 - time (sec): 348.23 - samples/sec: 401.78 - lr: 0.000146 - momentum: 0.000000
+2023-10-12 21:37:33,796 epoch 2 - iter 1296/1445 - loss 0.11346210 - time (sec): 388.60 - samples/sec: 407.08 - lr: 0.000144 - momentum: 0.000000
+2023-10-12 21:38:13,915 epoch 2 - iter 1440/1445 - loss 0.11050610 - time (sec): 428.72 - samples/sec: 409.60 - lr: 0.000142 - momentum: 0.000000
+2023-10-12 21:38:15,240 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:38:15,241 EPOCH 2 done: loss 0.1103 - lr: 0.000142
+2023-10-12 21:38:35,365 DEV : loss 0.08923686295747757 - f1-score (micro avg) 0.8234
+2023-10-12 21:38:35,394 saving best model
+2023-10-12 21:38:37,920 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:39:18,782 epoch 3 - iter 144/1445 - loss 0.06834371 - time (sec): 40.86 - samples/sec: 438.20 - lr: 0.000140 - momentum: 0.000000
+2023-10-12 21:40:00,281 epoch 3 - iter 288/1445 - loss 0.06815199 - time (sec): 82.36 - samples/sec: 436.70 - lr: 0.000139 - momentum: 0.000000
+2023-10-12 21:40:40,704 epoch 3 - iter 432/1445 - loss 0.06735091 - time (sec): 122.78 - samples/sec: 435.93 - lr: 0.000137 - momentum: 0.000000
+2023-10-12 21:41:21,257 epoch 3 - iter 576/1445 - loss 0.06974133 - time (sec): 163.33 - samples/sec: 435.20 - lr: 0.000135 - momentum: 0.000000
+2023-10-12 21:42:01,798 epoch 3 - iter 720/1445 - loss 0.06899677 - time (sec): 203.87 - samples/sec: 436.74 - lr: 0.000133 - momentum: 0.000000
+2023-10-12 21:42:42,864 epoch 3 - iter 864/1445 - loss 0.06959133 - time (sec): 244.94 - samples/sec: 439.98 - lr: 0.000132 - momentum: 0.000000
+2023-10-12 21:43:24,863 epoch 3 - iter 1008/1445 - loss 0.06960467 - time (sec): 286.94 - samples/sec: 436.21 - lr: 0.000130 - momentum: 0.000000
+2023-10-12 21:44:06,858 epoch 3 - iter 1152/1445 - loss 0.07023728 - time (sec): 328.93 - samples/sec: 432.23 - lr: 0.000128 - momentum: 0.000000
+2023-10-12 21:44:49,005 epoch 3 - iter 1296/1445 - loss 0.06915927 - time (sec): 371.08 - samples/sec: 427.73 - lr: 0.000126 - momentum: 0.000000
+2023-10-12 21:45:31,391 epoch 3 - iter 1440/1445 - loss 0.06784203 - time (sec): 413.47 - samples/sec: 424.50 - lr: 0.000125 - momentum: 0.000000
+2023-10-12 21:45:32,761 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:45:32,762 EPOCH 3 done: loss 0.0679 - lr: 0.000125
+2023-10-12 21:45:54,049 DEV : loss 0.07864446192979813 - f1-score (micro avg) 0.8472
+2023-10-12 21:45:54,079 saving best model
+2023-10-12 21:45:56,622 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:46:38,942 epoch 4 - iter 144/1445 - loss 0.05236873 - time (sec): 42.32 - samples/sec: 423.83 - lr: 0.000123 - momentum: 0.000000
+2023-10-12 21:47:20,089 epoch 4 - iter 288/1445 - loss 0.05180904 - time (sec): 83.46 - samples/sec: 416.34 - lr: 0.000121 - momentum: 0.000000
+2023-10-12 21:48:02,270 epoch 4 - iter 432/1445 - loss 0.04908008 - time (sec): 125.64 - samples/sec: 417.28 - lr: 0.000119 - momentum: 0.000000
+2023-10-12 21:48:45,487 epoch 4 - iter 576/1445 - loss 0.04796946 - time (sec): 168.86 - samples/sec: 423.21 - lr: 0.000117 - momentum: 0.000000
+2023-10-12 21:49:28,393 epoch 4 - iter 720/1445 - loss 0.04610333 - time (sec): 211.77 - samples/sec: 421.32 - lr: 0.000116 - momentum: 0.000000
+2023-10-12 21:50:09,873 epoch 4 - iter 864/1445 - loss 0.04493061 - time (sec): 253.25 - samples/sec: 419.33 - lr: 0.000114 - momentum: 0.000000
+2023-10-12 21:50:50,880 epoch 4 - iter 1008/1445 - loss 0.04548811 - time (sec): 294.25 - samples/sec: 418.63 - lr: 0.000112 - momentum: 0.000000
+2023-10-12 21:51:32,768 epoch 4 - iter 1152/1445 - loss 0.04476085 - time (sec): 336.14 - samples/sec: 420.71 - lr: 0.000110 - momentum: 0.000000
+2023-10-12 21:52:13,889 epoch 4 - iter 1296/1445 - loss 0.04656794 - time (sec): 377.26 - samples/sec: 421.08 - lr: 0.000109 - momentum: 0.000000
+2023-10-12 21:52:54,707 epoch 4 - iter 1440/1445 - loss 0.04588922 - time (sec): 418.08 - samples/sec: 420.56 - lr: 0.000107 - momentum: 0.000000
+2023-10-12 21:52:55,823 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:52:55,823 EPOCH 4 done: loss 0.0460 - lr: 0.000107
+2023-10-12 21:53:16,107 DEV : loss 0.10065485537052155 - f1-score (micro avg) 0.8398
+2023-10-12 21:53:16,137 ----------------------------------------------------------------------------------------------------
+2023-10-12 21:53:57,924 epoch 5 - iter 144/1445 - loss 0.03754111 - time (sec): 41.79 - samples/sec: 451.73 - lr: 0.000105 - momentum: 0.000000
+2023-10-12 21:54:37,644 epoch 5 - iter 288/1445 - loss 0.03295251 - time (sec): 81.51 - samples/sec: 445.25 - lr: 0.000103 - momentum: 0.000000
+2023-10-12 21:55:16,790 epoch 5 - iter 432/1445 - loss 0.03034951 - time (sec): 120.65 - samples/sec: 429.78 - lr: 0.000101 - momentum: 0.000000
+2023-10-12 21:55:55,747 epoch 5 - iter 576/1445 - loss 0.02962774 - time (sec): 159.61 - samples/sec: 426.38 - lr: 0.000100 - momentum: 0.000000
+2023-10-12 21:56:36,572 epoch 5 - iter 720/1445 - loss 0.03171128 - time (sec): 200.43 - samples/sec: 432.19 - lr: 0.000098 - momentum: 0.000000
+2023-10-12 21:57:16,464 epoch 5 - iter 864/1445 - loss 0.03189659 - time (sec): 240.33 - samples/sec: 432.78 - lr: 0.000096 - momentum: 0.000000
+2023-10-12 21:57:57,974 epoch 5 - iter 1008/1445 - loss 0.03208778 - time (sec): 281.83 - samples/sec: 434.75 - lr: 0.000094 - momentum: 0.000000
+2023-10-12 21:58:38,623 epoch 5 - iter 1152/1445 - loss 0.03196953 - time (sec): 322.48 - samples/sec: 435.07 - lr: 0.000093 - momentum: 0.000000
+2023-10-12 21:59:18,873 epoch 5 - iter 1296/1445 - loss 0.03247366 - time (sec): 362.73 - samples/sec: 435.30 - lr: 0.000091 - momentum: 0.000000
+2023-10-12 21:59:58,904 epoch 5 - iter 1440/1445 - loss 0.03343762 - time (sec): 402.77 - samples/sec: 435.41 - lr: 0.000089 - momentum: 0.000000
+2023-10-12 22:00:00,335 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:00:00,336 EPOCH 5 done: loss 0.0340 - lr: 0.000089
+2023-10-12 22:00:21,562 DEV : loss 0.11156909167766571 - f1-score (micro avg) 0.8332
+2023-10-12 22:00:21,591 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:01:02,121 epoch 6 - iter 144/1445 - loss 0.01887123 - time (sec): 40.53 - samples/sec: 424.57 - lr: 0.000087 - momentum: 0.000000
+2023-10-12 22:01:42,563 epoch 6 - iter 288/1445 - loss 0.02116022 - time (sec): 80.97 - samples/sec: 426.42 - lr: 0.000085 - momentum: 0.000000
+2023-10-12 22:02:23,726 epoch 6 - iter 432/1445 - loss 0.02492081 - time (sec): 122.13 - samples/sec: 429.07 - lr: 0.000084 - momentum: 0.000000
+2023-10-12 22:03:05,157 epoch 6 - iter 576/1445 - loss 0.02312836 - time (sec): 163.56 - samples/sec: 430.48 - lr: 0.000082 - momentum: 0.000000
+2023-10-12 22:03:46,615 epoch 6 - iter 720/1445 - loss 0.02397003 - time (sec): 205.02 - samples/sec: 429.86 - lr: 0.000080 - momentum: 0.000000
+2023-10-12 22:04:29,524 epoch 6 - iter 864/1445 - loss 0.02169080 - time (sec): 247.93 - samples/sec: 429.81 - lr: 0.000078 - momentum: 0.000000
+2023-10-12 22:05:11,526 epoch 6 - iter 1008/1445 - loss 0.02492991 - time (sec): 289.93 - samples/sec: 429.05 - lr: 0.000076 - momentum: 0.000000
+2023-10-12 22:05:52,796 epoch 6 - iter 1152/1445 - loss 0.02363443 - time (sec): 331.20 - samples/sec: 425.58 - lr: 0.000075 - momentum: 0.000000
+2023-10-12 22:06:32,978 epoch 6 - iter 1296/1445 - loss 0.02330251 - time (sec): 371.38 - samples/sec: 424.51 - lr: 0.000073 - momentum: 0.000000
+2023-10-12 22:07:14,891 epoch 6 - iter 1440/1445 - loss 0.02376795 - time (sec): 413.30 - samples/sec: 425.05 - lr: 0.000071 - momentum: 0.000000
+2023-10-12 22:07:16,126 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:07:16,126 EPOCH 6 done: loss 0.0237 - lr: 0.000071
+2023-10-12 22:07:36,356 DEV : loss 0.13551419973373413 - f1-score (micro avg) 0.841
+2023-10-12 22:07:36,386 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:08:18,583 epoch 7 - iter 144/1445 - loss 0.01993712 - time (sec): 42.19 - samples/sec: 418.04 - lr: 0.000069 - momentum: 0.000000
+2023-10-12 22:08:59,833 epoch 7 - iter 288/1445 - loss 0.01779429 - time (sec): 83.45 - samples/sec: 426.34 - lr: 0.000068 - momentum: 0.000000
+2023-10-12 22:09:40,252 epoch 7 - iter 432/1445 - loss 0.01748285 - time (sec): 123.86 - samples/sec: 420.58 - lr: 0.000066 - momentum: 0.000000
+2023-10-12 22:10:20,324 epoch 7 - iter 576/1445 - loss 0.01656374 - time (sec): 163.94 - samples/sec: 418.75 - lr: 0.000064 - momentum: 0.000000
+2023-10-12 22:11:01,416 epoch 7 - iter 720/1445 - loss 0.01878790 - time (sec): 205.03 - samples/sec: 423.33 - lr: 0.000062 - momentum: 0.000000
+2023-10-12 22:11:42,695 epoch 7 - iter 864/1445 - loss 0.01829691 - time (sec): 246.31 - samples/sec: 423.11 - lr: 0.000060 - momentum: 0.000000
+2023-10-12 22:12:23,106 epoch 7 - iter 1008/1445 - loss 0.01767695 - time (sec): 286.72 - samples/sec: 425.26 - lr: 0.000059 - momentum: 0.000000
+2023-10-12 22:13:03,909 epoch 7 - iter 1152/1445 - loss 0.01746928 - time (sec): 327.52 - samples/sec: 424.22 - lr: 0.000057 - momentum: 0.000000
+2023-10-12 22:13:45,103 epoch 7 - iter 1296/1445 - loss 0.01724052 - time (sec): 368.71 - samples/sec: 424.49 - lr: 0.000055 - momentum: 0.000000
+2023-10-12 22:14:27,824 epoch 7 - iter 1440/1445 - loss 0.01810091 - time (sec): 411.44 - samples/sec: 426.53 - lr: 0.000053 - momentum: 0.000000
+2023-10-12 22:14:29,309 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:14:29,310 EPOCH 7 done: loss 0.0180 - lr: 0.000053
+2023-10-12 22:14:51,394 DEV : loss 0.13199612498283386 - f1-score (micro avg) 0.8541
+2023-10-12 22:14:51,426 saving best model
+2023-10-12 22:14:54,014 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:15:35,081 epoch 8 - iter 144/1445 - loss 0.01400894 - time (sec): 41.06 - samples/sec: 452.00 - lr: 0.000052 - momentum: 0.000000
+2023-10-12 22:16:16,384 epoch 8 - iter 288/1445 - loss 0.01190023 - time (sec): 82.36 - samples/sec: 435.97 - lr: 0.000050 - momentum: 0.000000
+2023-10-12 22:16:57,421 epoch 8 - iter 432/1445 - loss 0.01305794 - time (sec): 123.40 - samples/sec: 428.05 - lr: 0.000048 - momentum: 0.000000
+2023-10-12 22:17:40,099 epoch 8 - iter 576/1445 - loss 0.01195701 - time (sec): 166.08 - samples/sec: 433.10 - lr: 0.000046 - momentum: 0.000000
+2023-10-12 22:18:22,708 epoch 8 - iter 720/1445 - loss 0.01228432 - time (sec): 208.69 - samples/sec: 427.92 - lr: 0.000044 - momentum: 0.000000
+2023-10-12 22:19:04,704 epoch 8 - iter 864/1445 - loss 0.01257074 - time (sec): 250.69 - samples/sec: 421.89 - lr: 0.000043 - momentum: 0.000000
+2023-10-12 22:19:47,463 epoch 8 - iter 1008/1445 - loss 0.01349553 - time (sec): 293.44 - samples/sec: 419.32 - lr: 0.000041 - momentum: 0.000000
+2023-10-12 22:20:29,739 epoch 8 - iter 1152/1445 - loss 0.01302670 - time (sec): 335.72 - samples/sec: 415.59 - lr: 0.000039 - momentum: 0.000000
+2023-10-12 22:21:13,081 epoch 8 - iter 1296/1445 - loss 0.01428236 - time (sec): 379.06 - samples/sec: 416.56 - lr: 0.000037 - momentum: 0.000000
+2023-10-12 22:21:55,650 epoch 8 - iter 1440/1445 - loss 0.01424477 - time (sec): 421.63 - samples/sec: 416.77 - lr: 0.000036 - momentum: 0.000000
+2023-10-12 22:21:56,886 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:21:56,887 EPOCH 8 done: loss 0.0143 - lr: 0.000036
+2023-10-12 22:22:17,436 DEV : loss 0.15470005571842194 - f1-score (micro avg) 0.8457
+2023-10-12 22:22:17,465 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:22:59,069 epoch 9 - iter 144/1445 - loss 0.00569141 - time (sec): 41.60 - samples/sec: 442.36 - lr: 0.000034 - momentum: 0.000000
+2023-10-12 22:23:40,717 epoch 9 - iter 288/1445 - loss 0.01301404 - time (sec): 83.25 - samples/sec: 446.83 - lr: 0.000032 - momentum: 0.000000
+2023-10-12 22:24:21,009 epoch 9 - iter 432/1445 - loss 0.01230076 - time (sec): 123.54 - samples/sec: 445.61 - lr: 0.000030 - momentum: 0.000000
+2023-10-12 22:25:00,563 epoch 9 - iter 576/1445 - loss 0.01178494 - time (sec): 163.10 - samples/sec: 436.19 - lr: 0.000028 - momentum: 0.000000
+2023-10-12 22:25:39,478 epoch 9 - iter 720/1445 - loss 0.01121005 - time (sec): 202.01 - samples/sec: 430.67 - lr: 0.000027 - momentum: 0.000000
+2023-10-12 22:26:19,982 epoch 9 - iter 864/1445 - loss 0.01117518 - time (sec): 242.51 - samples/sec: 432.62 - lr: 0.000025 - momentum: 0.000000
+2023-10-12 22:27:00,522 epoch 9 - iter 1008/1445 - loss 0.01142421 - time (sec): 283.05 - samples/sec: 432.90 - lr: 0.000023 - momentum: 0.000000
+2023-10-12 22:27:42,905 epoch 9 - iter 1152/1445 - loss 0.01188551 - time (sec): 325.44 - samples/sec: 434.68 - lr: 0.000021 - momentum: 0.000000
+2023-10-12 22:28:23,774 epoch 9 - iter 1296/1445 - loss 0.01117904 - time (sec): 366.31 - samples/sec: 432.74 - lr: 0.000020 - momentum: 0.000000
+2023-10-12 22:29:04,715 epoch 9 - iter 1440/1445 - loss 0.01054392 - time (sec): 407.25 - samples/sec: 431.36 - lr: 0.000018 - momentum: 0.000000
+2023-10-12 22:29:05,942 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:29:05,943 EPOCH 9 done: loss 0.0105 - lr: 0.000018
+2023-10-12 22:29:27,685 DEV : loss 0.15782958269119263 - f1-score (micro avg) 0.851
+2023-10-12 22:29:27,716 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:30:09,377 epoch 10 - iter 144/1445 - loss 0.00649677 - time (sec): 41.66 - samples/sec: 432.41 - lr: 0.000016 - momentum: 0.000000
+2023-10-12 22:30:48,589 epoch 10 - iter 288/1445 - loss 0.00717470 - time (sec): 80.87 - samples/sec: 417.09 - lr: 0.000014 - momentum: 0.000000
+2023-10-12 22:31:28,912 epoch 10 - iter 432/1445 - loss 0.00796770 - time (sec): 121.19 - samples/sec: 418.26 - lr: 0.000012 - momentum: 0.000000
+2023-10-12 22:32:11,533 epoch 10 - iter 576/1445 - loss 0.00925299 - time (sec): 163.82 - samples/sec: 421.90 - lr: 0.000011 - momentum: 0.000000
+2023-10-12 22:32:53,189 epoch 10 - iter 720/1445 - loss 0.00823255 - time (sec): 205.47 - samples/sec: 420.03 - lr: 0.000009 - momentum: 0.000000
+2023-10-12 22:33:35,725 epoch 10 - iter 864/1445 - loss 0.00732982 - time (sec): 248.01 - samples/sec: 422.20 - lr: 0.000007 - momentum: 0.000000
+2023-10-12 22:34:18,703 epoch 10 - iter 1008/1445 - loss 0.00779755 - time (sec): 290.98 - samples/sec: 423.65 - lr: 0.000005 - momentum: 0.000000
+2023-10-12 22:35:00,516 epoch 10 - iter 1152/1445 - loss 0.00738921 - time (sec): 332.80 - samples/sec: 420.58 - lr: 0.000004 - momentum: 0.000000
+2023-10-12 22:35:42,966 epoch 10 - iter 1296/1445 - loss 0.00811633 - time (sec): 375.25 - samples/sec: 420.22 - lr: 0.000002 - momentum: 0.000000
+2023-10-12 22:36:25,327 epoch 10 - iter 1440/1445 - loss 0.00775884 - time (sec): 417.61 - samples/sec: 420.82 - lr: 0.000000 - momentum: 0.000000
+2023-10-12 22:36:26,519 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:36:26,519 EPOCH 10 done: loss 0.0077 - lr: 0.000000
+2023-10-12 22:36:47,942 DEV : loss 0.16641554236412048 - f1-score (micro avg) 0.8455
+2023-10-12 22:36:48,832 ----------------------------------------------------------------------------------------------------
+2023-10-12 22:36:48,834 Loading model from best epoch ...
+2023-10-12 22:36:52,922 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-12 22:37:13,988
+Results:
+- F-score (micro) 0.8543
+- F-score (macro) 0.7635
+- Accuracy 0.7602
+
+By class:
+              precision    recall  f1-score   support
+
+         PER     0.8577    0.8631    0.8604       482
+         LOC     0.9238    0.8734    0.8979       458
+         ORG     0.5286    0.5362    0.5324        69
+
+   micro avg     0.8634    0.8454    0.8543      1009
+   macro avg     0.7700    0.7576    0.7635      1009
+weighted avg     0.8652    0.8454    0.8550      1009
+
+2023-10-12 22:37:13,988 ----------------------------------------------------------------------------------------------------
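The summary rows of the final report can be cross-checked against the per-class rows. The sketch below recomputes the macro and support-weighted averages from the PER, LOC, and ORG lines, and the micro F1 from the reported micro precision and recall. It works from the four-decimal values printed in the log, so agreement is only expected up to rounding; the helper names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the report above.
PER_CLASS = {
    "PER": (0.8577, 0.8631, 0.8604, 482),
    "LOC": (0.9238, 0.8734, 0.8979, 458),
    "ORG": (0.5286, 0.5362, 0.5324, 69),
}


def macro_avg(rows: dict, idx: int) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows: dict, idx: int) -> float:
    """Mean over classes weighted by support (number of gold entities)."""
    total = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


macro_f1 = macro_avg(PER_CLASS, 2)
weighted_f1 = weighted_avg(PER_CLASS, 2)
micro_f1 = f1(0.8634, 0.8454)  # micro precision and recall from the report

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The gap between the macro and micro scores comes from ORG: with only 69 of the 1009 gold entities and an F1 of 0.5324, it carries a third of the macro average but under 7% of the support-weighted one.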