stefan-it committed
Commit 4124722
1 Parent(s): e56d34a

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f55a702588902480e78d04ab437dba6e98ec2b2f704035a08562a524ca8c2d6
+ size 19045922
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      01:11:58   0.0000         0.7293      0.1804    0.2137         0.1751      0.1925  0.1094
+ 2      01:12:59   0.0000         0.1857      0.1723    0.3455         0.4336      0.3846  0.2474
+ 3      01:14:00   0.0000         0.1578      0.1696    0.3595         0.4931      0.4158  0.2723
+ 4      01:15:01   0.0000         0.1464      0.1666    0.3991         0.5114      0.4483  0.2986
+ 5      01:16:02   0.0000         0.1351      0.1814    0.3782         0.5824      0.4586  0.3092
+ 6      01:17:02   0.0000         0.1264      0.1841    0.3907         0.6030      0.4741  0.3210
+ 7      01:18:03   0.0000         0.1222      0.1875    0.4058         0.5812      0.4779  0.3227
+ 8      01:19:04   0.0000         0.1166      0.1948    0.4165         0.5824      0.4857  0.3299
+ 9      01:20:05   0.0000         0.1135      0.1936    0.4142         0.5801      0.4833  0.3275
+ 10     01:21:05   0.0000         0.1097      0.1968    0.4118         0.5847      0.4832  0.3278
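The per-epoch table in loss.tsv is easy to query once parsed. The following is a minimal sketch, not part of the commit: it embeds the rows shown in the diff, splits columns on any whitespace (an assumption about the file's delimiter), and looks up the epoch with the highest dev F1 and the epoch with the lowest dev loss. The names `LOSS_TSV` and `parse_loss_table` are hypothetical.

```python
# Rows copied from the loss.tsv diff above. Columns are split on any
# whitespace here, which is an assumption about the file's delimiter.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 01:11:58 0.0000 0.7293 0.1804 0.2137 0.1751 0.1925 0.1094
2 01:12:59 0.0000 0.1857 0.1723 0.3455 0.4336 0.3846 0.2474
3 01:14:00 0.0000 0.1578 0.1696 0.3595 0.4931 0.4158 0.2723
4 01:15:01 0.0000 0.1464 0.1666 0.3991 0.5114 0.4483 0.2986
5 01:16:02 0.0000 0.1351 0.1814 0.3782 0.5824 0.4586 0.3092
6 01:17:02 0.0000 0.1264 0.1841 0.3907 0.6030 0.4741 0.3210
7 01:18:03 0.0000 0.1222 0.1875 0.4058 0.5812 0.4779 0.3227
8 01:19:04 0.0000 0.1166 0.1948 0.4165 0.5824 0.4857 0.3299
9 01:20:05 0.0000 0.1135 0.1936 0.4142 0.5801 0.4833 0.3275
10 01:21:05 0.0000 0.1097 0.1968 0.4118 0.5847 0.4832 0.3278
"""


def parse_loss_table(text):
    """Return one dict per epoch, with numeric columns converted."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    records = []
    for line in lines[1:]:
        record = {}
        for key, value in zip(header, line.split()):
            if key == "EPOCH":
                record[key] = int(value)
            elif key == "TIMESTAMP":
                record[key] = value  # keep the wall-clock time as text
            else:
                record[key] = float(value)
        records.append(record)
    return records


records = parse_loss_table(LOSS_TSV)
best_f1 = max(records, key=lambda r: r["DEV_F1"])
lowest_dev_loss = min(records, key=lambda r: r["DEV_LOSS"])

print("best dev F1:", best_f1["EPOCH"], best_f1["DEV_F1"])
print("lowest dev loss:", lowest_dev_loss["EPOCH"], lowest_dev_loss["DEV_LOSS"])
```

On these rows the highest dev F1 (0.4857) falls in epoch 8, while the lowest dev loss (0.1666) falls in epoch 4, so the two criteria would select different checkpoints.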
runs/events.out.tfevents.1697677859.46dc0c540dd0.3802.12 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c38de5dfc6ca4c9f3b0675d7b92556d7e81c895c9077cb17acc37633b2c775ab
+ size 2030580
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,245 @@
+ 2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,040 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 128)
+         (position_embeddings): Embedding(512, 128)
+         (token_type_embeddings): Embedding(2, 128)
+         (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-1): 2 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=128, out_features=128, bias=True)
+                 (key): Linear(in_features=128, out_features=128, bias=True)
+                 (value): Linear(in_features=128, out_features=128, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=128, out_features=128, bias=True)
+                 (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=128, out_features=512, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=512, out_features=128, bias=True)
+               (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=128, out_features=128, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=128, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
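The module shapes printed in the model summary are enough to estimate the number of trainable parameters. The sketch below is an illustration rather than anything taken from the commit: it assumes every `Linear` layer has a bias (as the summary states), that each `LayerNorm` holds a weight and a bias vector, and that no parameters are shared or frozen. The helper names are hypothetical.

```python
def linear(n_in, n_out):
    """Parameters of a Linear layer with bias: weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Parameters of an elementwise-affine LayerNorm: weight plus bias."""
    return 2 * dim


# Sizes read off the model summary in the log.
hidden, intermediate = 128, 512
vocab, max_positions, token_types = 32001, 512, 2
num_layers, num_tags = 2, 13

embeddings = (
    vocab * hidden            # word_embeddings
    + max_positions * hidden  # position_embeddings
    + token_types * hidden    # token_type_embeddings
    + layer_norm(hidden)
)

per_layer = (
    4 * linear(hidden, hidden)      # query, key, value, attention output dense
    + layer_norm(hidden)            # attention output LayerNorm
    + linear(hidden, intermediate)  # intermediate dense
    + linear(intermediate, hidden)  # output dense
    + layer_norm(hidden)            # output LayerNorm
)

pooler = linear(hidden, hidden)
tag_head = linear(hidden, num_tags)  # the SequenceTagger's final linear layer

total = embeddings + num_layers * per_layer + pooler + tag_head
print(f"embeddings: {embeddings:,}")
print(f"encoder:    {num_layers * per_layer:,}")
print(f"total:      {total:,}")
```

Under these assumptions almost all parameters sit in the word embedding matrix. If the weights were stored as 32-bit floats (an assumption), the total would correspond to roughly 18.3 MB, which is the same order of magnitude as the 19045922-byte best-model.pt pointer above.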
+ 2023-10-19 01:10:59,040 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+ 2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,040 Train: 14465 sentences
+ 2023-10-19 01:10:59,040 (train_with_dev=False, train_with_test=False)
+ 2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,040 Training Params:
+ 2023-10-19 01:10:59,040 - learning_rate: "3e-05"
+ 2023-10-19 01:10:59,040 - mini_batch_size: "4"
+ 2023-10-19 01:10:59,040 - max_epochs: "10"
+ 2023-10-19 01:10:59,040 - shuffle: "True"
+ 2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,040 Plugins:
+ 2023-10-19 01:10:59,040 - TensorboardLogger
+ 2023-10-19 01:10:59,040 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-19 01:10:59,040 ----------------------------------------------------------------------------------------------------
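The training parameters and the `LinearScheduler` plugin together determine the `lr` values printed on each iteration line of the log. The sketch below assumes the common shape of such a schedule: a linear ramp from zero to the peak rate over the first `warmup_fraction` of all steps, followed by a linear decay to zero. It is not Flair's implementation, and the function name `linear_warmup_lr` is hypothetical. The step counts come from the log: 3617 iterations per epoch and 10 epochs.

```python
def linear_warmup_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer steps under linear warmup and decay."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


ITERS_PER_EPOCH = 3617  # from "iter 361/3617" in the log
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def lr_at(epoch, iteration):
    """Learning rate at a 1-based epoch and iteration within that epoch."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_warmup_lr(step, PEAK_LR, TOTAL_STEPS, WARMUP_FRACTION)


for epoch, iteration in [(1, 361), (1, 3610), (2, 3610), (5, 361), (10, 3610)]:
    print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

With a warmup fraction of 0.1 and 10 epochs, the warmup covers exactly the first epoch, which is consistent with the logged rate climbing to 0.000030 by the end of epoch 1 and falling afterwards. It also explains why loss.tsv shows 0.0000 in the LEARNING_RATE column: values of this size vanish when rounded to four decimals.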
+ 2023-10-19 01:10:59,040 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-19 01:10:59,040 - metric: "('micro avg', 'f1-score')"
+ 2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,041 Computation:
+ 2023-10-19 01:10:59,041 - compute on device: cuda:0
+ 2023-10-19 01:10:59,041 - embedding storage: none
+ 2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,041 Model training base path: "hmbench-letemps/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
+ 2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,041 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:10:59,041 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-19 01:11:04,907 epoch 1 - iter 361/3617 - loss 3.12252186 - time (sec): 5.87 - samples/sec: 6145.92 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 01:11:10,541 epoch 1 - iter 722/3617 - loss 2.41293036 - time (sec): 11.50 - samples/sec: 6537.22 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 01:11:16,281 epoch 1 - iter 1083/3617 - loss 1.79547173 - time (sec): 17.24 - samples/sec: 6533.93 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 01:11:21,954 epoch 1 - iter 1444/3617 - loss 1.44599180 - time (sec): 22.91 - samples/sec: 6519.84 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 01:11:27,623 epoch 1 - iter 1805/3617 - loss 1.22012401 - time (sec): 28.58 - samples/sec: 6522.30 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 01:11:33,316 epoch 1 - iter 2166/3617 - loss 1.06655197 - time (sec): 34.27 - samples/sec: 6522.49 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 01:11:39,043 epoch 1 - iter 2527/3617 - loss 0.95272625 - time (sec): 40.00 - samples/sec: 6529.96 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 01:11:44,824 epoch 1 - iter 2888/3617 - loss 0.85861658 - time (sec): 45.78 - samples/sec: 6579.78 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 01:11:50,477 epoch 1 - iter 3249/3617 - loss 0.78754238 - time (sec): 51.44 - samples/sec: 6619.51 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 01:11:56,201 epoch 1 - iter 3610/3617 - loss 0.73000898 - time (sec): 57.16 - samples/sec: 6635.44 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 01:11:56,311 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:11:56,311 EPOCH 1 done: loss 0.7293 - lr: 0.000030
+ 2023-10-19 01:11:58,584 DEV : loss 0.1803603321313858 - f1-score (micro avg) 0.1925
+ 2023-10-19 01:11:58,611 saving best model
+ 2023-10-19 01:11:58,640 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:12:04,265 epoch 2 - iter 361/3617 - loss 0.20098825 - time (sec): 5.62 - samples/sec: 6528.17 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 01:12:09,986 epoch 2 - iter 722/3617 - loss 0.20199769 - time (sec): 11.35 - samples/sec: 6570.12 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 01:12:15,686 epoch 2 - iter 1083/3617 - loss 0.19444887 - time (sec): 17.05 - samples/sec: 6553.53 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 01:12:21,265 epoch 2 - iter 1444/3617 - loss 0.19472453 - time (sec): 22.62 - samples/sec: 6574.65 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 01:12:26,799 epoch 2 - iter 1805/3617 - loss 0.19141549 - time (sec): 28.16 - samples/sec: 6668.61 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 01:12:32,506 epoch 2 - iter 2166/3617 - loss 0.18942948 - time (sec): 33.87 - samples/sec: 6651.96 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 01:12:38,393 epoch 2 - iter 2527/3617 - loss 0.19177342 - time (sec): 39.75 - samples/sec: 6633.01 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 01:12:44,107 epoch 2 - iter 2888/3617 - loss 0.18890055 - time (sec): 45.47 - samples/sec: 6637.92 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 01:12:49,797 epoch 2 - iter 3249/3617 - loss 0.18726829 - time (sec): 51.16 - samples/sec: 6664.34 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 01:12:55,492 epoch 2 - iter 3610/3617 - loss 0.18564262 - time (sec): 56.85 - samples/sec: 6673.24 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 01:12:55,588 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:12:55,588 EPOCH 2 done: loss 0.1857 - lr: 0.000027
+ 2023-10-19 01:12:59,509 DEV : loss 0.17232035100460052 - f1-score (micro avg) 0.3846
+ 2023-10-19 01:12:59,536 saving best model
+ 2023-10-19 01:12:59,569 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:13:05,350 epoch 3 - iter 361/3617 - loss 0.15068487 - time (sec): 5.78 - samples/sec: 6657.47 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 01:13:11,111 epoch 3 - iter 722/3617 - loss 0.16512105 - time (sec): 11.54 - samples/sec: 6531.24 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 01:13:16,903 epoch 3 - iter 1083/3617 - loss 0.16734512 - time (sec): 17.33 - samples/sec: 6649.78 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 01:13:22,599 epoch 3 - iter 1444/3617 - loss 0.16745629 - time (sec): 23.03 - samples/sec: 6608.73 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 01:13:28,382 epoch 3 - iter 1805/3617 - loss 0.16513383 - time (sec): 28.81 - samples/sec: 6577.01 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 01:13:34,182 epoch 3 - iter 2166/3617 - loss 0.16581818 - time (sec): 34.61 - samples/sec: 6543.84 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 01:13:40,005 epoch 3 - iter 2527/3617 - loss 0.16204270 - time (sec): 40.44 - samples/sec: 6531.56 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 01:13:45,508 epoch 3 - iter 2888/3617 - loss 0.15884163 - time (sec): 45.94 - samples/sec: 6591.42 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 01:13:51,225 epoch 3 - iter 3249/3617 - loss 0.15690141 - time (sec): 51.66 - samples/sec: 6628.98 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 01:13:56,955 epoch 3 - iter 3610/3617 - loss 0.15791128 - time (sec): 57.38 - samples/sec: 6609.03 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 01:13:57,053 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:13:57,053 EPOCH 3 done: loss 0.1578 - lr: 0.000023
+ 2023-10-19 01:14:00,254 DEV : loss 0.16959495842456818 - f1-score (micro avg) 0.4158
+ 2023-10-19 01:14:00,282 saving best model
+ 2023-10-19 01:14:00,315 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:14:06,037 epoch 4 - iter 361/3617 - loss 0.13799445 - time (sec): 5.72 - samples/sec: 6598.38 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 01:14:11,790 epoch 4 - iter 722/3617 - loss 0.13722792 - time (sec): 11.47 - samples/sec: 6581.33 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 01:14:17,606 epoch 4 - iter 1083/3617 - loss 0.14251973 - time (sec): 17.29 - samples/sec: 6618.47 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 01:14:23,333 epoch 4 - iter 1444/3617 - loss 0.14215649 - time (sec): 23.02 - samples/sec: 6612.71 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 01:14:29,047 epoch 4 - iter 1805/3617 - loss 0.14527769 - time (sec): 28.73 - samples/sec: 6567.33 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 01:14:34,901 epoch 4 - iter 2166/3617 - loss 0.14771726 - time (sec): 34.58 - samples/sec: 6575.48 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 01:14:40,765 epoch 4 - iter 2527/3617 - loss 0.14620802 - time (sec): 40.45 - samples/sec: 6581.63 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 01:14:46,464 epoch 4 - iter 2888/3617 - loss 0.14434453 - time (sec): 46.15 - samples/sec: 6577.15 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 01:14:52,181 epoch 4 - iter 3249/3617 - loss 0.14432907 - time (sec): 51.86 - samples/sec: 6602.27 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 01:14:57,873 epoch 4 - iter 3610/3617 - loss 0.14625128 - time (sec): 57.56 - samples/sec: 6587.12 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 01:14:57,984 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:14:57,984 EPOCH 4 done: loss 0.1464 - lr: 0.000020
+ 2023-10-19 01:15:01,868 DEV : loss 0.16661077737808228 - f1-score (micro avg) 0.4483
+ 2023-10-19 01:15:01,897 saving best model
+ 2023-10-19 01:15:01,929 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:15:07,423 epoch 5 - iter 361/3617 - loss 0.12987237 - time (sec): 5.49 - samples/sec: 6882.43 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 01:15:13,383 epoch 5 - iter 722/3617 - loss 0.12949253 - time (sec): 11.45 - samples/sec: 6532.00 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 01:15:19,029 epoch 5 - iter 1083/3617 - loss 0.13420185 - time (sec): 17.10 - samples/sec: 6506.71 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 01:15:24,691 epoch 5 - iter 1444/3617 - loss 0.13910024 - time (sec): 22.76 - samples/sec: 6542.09 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 01:15:30,487 epoch 5 - iter 1805/3617 - loss 0.13856134 - time (sec): 28.56 - samples/sec: 6558.16 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 01:15:36,168 epoch 5 - iter 2166/3617 - loss 0.13530390 - time (sec): 34.24 - samples/sec: 6550.97 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 01:15:41,913 epoch 5 - iter 2527/3617 - loss 0.13679379 - time (sec): 39.98 - samples/sec: 6590.17 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 01:15:47,672 epoch 5 - iter 2888/3617 - loss 0.13624011 - time (sec): 45.74 - samples/sec: 6604.87 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 01:15:53,434 epoch 5 - iter 3249/3617 - loss 0.13560627 - time (sec): 51.50 - samples/sec: 6616.88 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 01:15:59,190 epoch 5 - iter 3610/3617 - loss 0.13495541 - time (sec): 57.26 - samples/sec: 6624.56 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 01:15:59,302 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:15:59,303 EPOCH 5 done: loss 0.1351 - lr: 0.000017
+ 2023-10-19 01:16:02,561 DEV : loss 0.1814231425523758 - f1-score (micro avg) 0.4586
+ 2023-10-19 01:16:02,588 saving best model
+ 2023-10-19 01:16:02,621 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:16:08,251 epoch 6 - iter 361/3617 - loss 0.13714181 - time (sec): 5.63 - samples/sec: 6642.75 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 01:16:13,926 epoch 6 - iter 722/3617 - loss 0.12680325 - time (sec): 11.30 - samples/sec: 6662.60 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 01:16:19,692 epoch 6 - iter 1083/3617 - loss 0.12500890 - time (sec): 17.07 - samples/sec: 6691.92 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 01:16:25,275 epoch 6 - iter 1444/3617 - loss 0.12471214 - time (sec): 22.65 - samples/sec: 6643.42 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 01:16:30,912 epoch 6 - iter 1805/3617 - loss 0.12052615 - time (sec): 28.29 - samples/sec: 6579.21 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 01:16:36,593 epoch 6 - iter 2166/3617 - loss 0.12279489 - time (sec): 33.97 - samples/sec: 6628.67 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 01:16:42,367 epoch 6 - iter 2527/3617 - loss 0.12224267 - time (sec): 39.74 - samples/sec: 6630.65 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 01:16:48,183 epoch 6 - iter 2888/3617 - loss 0.12426780 - time (sec): 45.56 - samples/sec: 6629.71 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 01:16:53,829 epoch 6 - iter 3249/3617 - loss 0.12488997 - time (sec): 51.21 - samples/sec: 6654.26 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 01:16:59,084 epoch 6 - iter 3610/3617 - loss 0.12639819 - time (sec): 56.46 - samples/sec: 6717.91 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 01:16:59,180 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:16:59,181 EPOCH 6 done: loss 0.1264 - lr: 0.000013
+ 2023-10-19 01:17:02,429 DEV : loss 0.1841202825307846 - f1-score (micro avg) 0.4741
+ 2023-10-19 01:17:02,456 saving best model
+ 2023-10-19 01:17:02,489 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:17:08,157 epoch 7 - iter 361/3617 - loss 0.12363306 - time (sec): 5.67 - samples/sec: 6708.62 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 01:17:13,867 epoch 7 - iter 722/3617 - loss 0.12287275 - time (sec): 11.38 - samples/sec: 6576.34 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 01:17:19,709 epoch 7 - iter 1083/3617 - loss 0.11963241 - time (sec): 17.22 - samples/sec: 6579.90 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 01:17:25,354 epoch 7 - iter 1444/3617 - loss 0.12218097 - time (sec): 22.86 - samples/sec: 6637.46 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 01:17:31,082 epoch 7 - iter 1805/3617 - loss 0.12399375 - time (sec): 28.59 - samples/sec: 6639.53 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 01:17:36,684 epoch 7 - iter 2166/3617 - loss 0.12278498 - time (sec): 34.19 - samples/sec: 6696.17 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 01:17:43,138 epoch 7 - iter 2527/3617 - loss 0.12015701 - time (sec): 40.65 - samples/sec: 6563.75 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 01:17:48,760 epoch 7 - iter 2888/3617 - loss 0.11877865 - time (sec): 46.27 - samples/sec: 6593.24 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 01:17:54,087 epoch 7 - iter 3249/3617 - loss 0.12139837 - time (sec): 51.60 - samples/sec: 6624.35 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 01:17:59,808 epoch 7 - iter 3610/3617 - loss 0.12214986 - time (sec): 57.32 - samples/sec: 6617.30 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 01:17:59,917 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:17:59,917 EPOCH 7 done: loss 0.1222 - lr: 0.000010
+ 2023-10-19 01:18:03,123 DEV : loss 0.18754823505878448 - f1-score (micro avg) 0.4779
+ 2023-10-19 01:18:03,151 saving best model
+ 2023-10-19 01:18:03,186 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:18:09,163 epoch 8 - iter 361/3617 - loss 0.11552861 - time (sec): 5.98 - samples/sec: 6550.53 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 01:18:14,914 epoch 8 - iter 722/3617 - loss 0.11948353 - time (sec): 11.73 - samples/sec: 6616.21 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 01:18:20,735 epoch 8 - iter 1083/3617 - loss 0.11968151 - time (sec): 17.55 - samples/sec: 6671.45 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 01:18:26,269 epoch 8 - iter 1444/3617 - loss 0.12114434 - time (sec): 23.08 - samples/sec: 6666.92 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 01:18:32,065 epoch 8 - iter 1805/3617 - loss 0.11664197 - time (sec): 28.88 - samples/sec: 6667.87 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 01:18:37,854 epoch 8 - iter 2166/3617 - loss 0.11725016 - time (sec): 34.67 - samples/sec: 6598.38 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 01:18:43,593 epoch 8 - iter 2527/3617 - loss 0.11944440 - time (sec): 40.41 - samples/sec: 6575.91 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 01:18:49,269 epoch 8 - iter 2888/3617 - loss 0.11884190 - time (sec): 46.08 - samples/sec: 6587.45 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 01:18:54,994 epoch 8 - iter 3249/3617 - loss 0.11785410 - time (sec): 51.81 - samples/sec: 6583.13 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 01:19:00,793 epoch 8 - iter 3610/3617 - loss 0.11660685 - time (sec): 57.61 - samples/sec: 6587.20 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 01:19:00,903 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:19:00,903 EPOCH 8 done: loss 0.1166 - lr: 0.000007
+ 2023-10-19 01:19:04,160 DEV : loss 0.19483603537082672 - f1-score (micro avg) 0.4857
+ 2023-10-19 01:19:04,188 saving best model
+ 2023-10-19 01:19:04,220 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:19:10,105 epoch 9 - iter 361/3617 - loss 0.12224075 - time (sec): 5.88 - samples/sec: 6742.34 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 01:19:15,778 epoch 9 - iter 722/3617 - loss 0.10932434 - time (sec): 11.56 - samples/sec: 6748.35 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 01:19:21,542 epoch 9 - iter 1083/3617 - loss 0.11140440 - time (sec): 17.32 - samples/sec: 6728.86 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 01:19:27,301 epoch 9 - iter 1444/3617 - loss 0.10874201 - time (sec): 23.08 - samples/sec: 6632.17 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 01:19:32,820 epoch 9 - iter 1805/3617 - loss 0.10872015 - time (sec): 28.60 - samples/sec: 6732.84 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 01:19:38,447 epoch 9 - iter 2166/3617 - loss 0.11133825 - time (sec): 34.23 - samples/sec: 6714.00 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 01:19:44,220 epoch 9 - iter 2527/3617 - loss 0.11266491 - time (sec): 40.00 - samples/sec: 6670.77 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 01:19:50,013 epoch 9 - iter 2888/3617 - loss 0.11290616 - time (sec): 45.79 - samples/sec: 6672.15 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 01:19:55,727 epoch 9 - iter 3249/3617 - loss 0.11269907 - time (sec): 51.51 - samples/sec: 6647.26 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 01:20:01,410 epoch 9 - iter 3610/3617 - loss 0.11350583 - time (sec): 57.19 - samples/sec: 6630.19 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 01:20:01,524 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:20:01,524 EPOCH 9 done: loss 0.1135 - lr: 0.000003
+ 2023-10-19 01:20:05,482 DEV : loss 0.19358040392398834 - f1-score (micro avg) 0.4833
+ 2023-10-19 01:20:05,511 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:20:11,321 epoch 10 - iter 361/3617 - loss 0.11277088 - time (sec): 5.81 - samples/sec: 6246.22 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 01:20:17,167 epoch 10 - iter 722/3617 - loss 0.11290023 - time (sec): 11.66 - samples/sec: 6445.07 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 01:20:22,907 epoch 10 - iter 1083/3617 - loss 0.11198872 - time (sec): 17.40 - samples/sec: 6454.90 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 01:20:28,643 epoch 10 - iter 1444/3617 - loss 0.10861250 - time (sec): 23.13 - samples/sec: 6495.86 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 01:20:33,825 epoch 10 - iter 1805/3617 - loss 0.11091058 - time (sec): 28.31 - samples/sec: 6672.48 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 01:20:39,517 epoch 10 - iter 2166/3617 - loss 0.11258390 - time (sec): 34.01 - samples/sec: 6697.61 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 01:20:45,292 epoch 10 - iter 2527/3617 - loss 0.11155306 - time (sec): 39.78 - samples/sec: 6694.42 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 01:20:51,014 epoch 10 - iter 2888/3617 - loss 0.10931783 - time (sec): 45.50 - samples/sec: 6687.76 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 01:20:56,674 epoch 10 - iter 3249/3617 - loss 0.11004335 - time (sec): 51.16 - samples/sec: 6662.31 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 01:21:02,509 epoch 10 - iter 3610/3617 - loss 0.10947971 - time (sec): 57.00 - samples/sec: 6656.02 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 01:21:02,609 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:21:02,609 EPOCH 10 done: loss 0.1097 - lr: 0.000000
+ 2023-10-19 01:21:05,822 DEV : loss 0.1967521756887436 - f1-score (micro avg) 0.4832
+ 2023-10-19 01:21:05,881 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 01:21:05,881 Loading model from best epoch ...
+ 2023-10-19 01:21:05,977 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+ 2023-10-19 01:21:10,135
+ Results:
+ - F-score (micro) 0.5273
+ - F-score (macro) 0.3519
+ - Accuracy 0.37
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.5467    0.6734    0.6035       591
+         pers     0.4155    0.4958    0.4521       357
+          org     0.0000    0.0000    0.0000        79
+
+    micro avg     0.4983    0.5599    0.5273      1027
+    macro avg     0.3207    0.3897    0.3519      1027
+ weighted avg     0.4590    0.5599    0.5044      1027
+
+ 2023-10-19 01:21:10,135 ----------------------------------------------------------------------------------------------------
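The averaged rows of the final report can be re-derived from the per-class rows, which shows how the three averages differ. This is a sketch under the usual definitions: macro is the unweighted mean over classes, weighted is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. The inputs are the rounded values printed in the log, so the results are only expected to agree to four decimals. The variable names are hypothetical.

```python
# Per-class rows copied from the "By class" report in the log:
# (precision, recall, f1, support)
per_class = {
    "loc": (0.5467, 0.6734, 0.6035, 591),
    "pers": (0.4155, 0.4958, 0.4521, 357),
    "org": (0.0000, 0.0000, 0.0000, 79),
}

total_support = sum(row[3] for row in per_class.values())

# Macro average: every class counts equally, so the empty "org" class
# pulls the score down as much as a large class would.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall
# reported in the "micro avg" row.
micro_precision, micro_recall = 0.4983, 0.5599
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support      {total_support}")
print(f"macro F1     {macro_f1:.4f}")
print(f"weighted F1  {weighted_f1:.4f}")
print(f"micro F1     {micro_f1:.4f}")
```

The gap between the micro score (0.5273) and the macro score (0.3519) comes almost entirely from the `org` class, which scores 0.0000 on all three metrics with a support of 79.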