stefan-it committed
Commit 1c85a06
1 Parent(s): 4da78f7

Upload folder using huggingface_hub
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6fcd804ef8b514ac9cc6d18b4e2811cfe4bc7046a03faba90869cc140ad4fb28
+ size 440966725
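best-model.pt is stored via Git LFS: the three lines above (version, oid, size) are the entire checked-in file, while the ~440 MB payload lives in LFS storage. A minimal sketch of parsing such a pointer file with the stdlib only (the values below are copied from the pointer above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # size is a byte count; oid is "<algorithm>:<hex digest>"
    fields["size"] = int(fields["size"])
    algo, _, digest = fields["oid"].partition(":")
    fields["oid"] = (algo, digest)
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:6fcd804ef8b514ac9cc6d18b4e2811cfe4bc7046a03faba90869cc140ad4fb28
size 440966725"""

info = parse_lfs_pointer(pointer)
```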
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+ 1      15:21:10   0.0000         0.6939      0.1628    0.6980         0.6325      0.6637  0.5088
+ 2      15:22:03   0.0000         0.1388      0.1320    0.7471         0.7435      0.7453  0.6175
+ 3      15:22:57   0.0000         0.0779      0.1567    0.7373         0.7349      0.7361  0.6006
+ 4      15:23:52   0.0000         0.0579      0.1569    0.7407         0.7748      0.7574  0.6260
+ 5      15:24:48   0.0000         0.0342      0.1943    0.7704         0.7686      0.7695  0.6421
+ 6      15:25:43   0.0000         0.0187      0.1980    0.7519         0.7912      0.7710  0.6521
+ 7      15:26:39   0.0000         0.0139      0.2285    0.8031         0.7975      0.8003  0.6823
+ 8      15:27:37   0.0000         0.0095      0.2194    0.7869         0.7998      0.7933  0.6770
+ 9      15:28:34   0.0000         0.0060      0.2406    0.7978         0.7834      0.7905  0.6684
+ 10     15:29:29   0.0000         0.0032      0.2325    0.8028         0.8022      0.8025  0.6849
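loss.tsv holds one row per epoch; the checkpoint saved as best-model.pt corresponds to the row with the highest DEV_F1. A small sketch of recovering that epoch from the file contents (data copied from the table above, whitespace-split rather than tab-split for simplicity):

```python
# loss.tsv contents, copied from the table above
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 15:21:10 0.0000 0.6939 0.1628 0.6980 0.6325 0.6637 0.5088
2 15:22:03 0.0000 0.1388 0.1320 0.7471 0.7435 0.7453 0.6175
3 15:22:57 0.0000 0.0779 0.1567 0.7373 0.7349 0.7361 0.6006
4 15:23:52 0.0000 0.0579 0.1569 0.7407 0.7748 0.7574 0.6260
5 15:24:48 0.0000 0.0342 0.1943 0.7704 0.7686 0.7695 0.6421
6 15:25:43 0.0000 0.0187 0.1980 0.7519 0.7912 0.7710 0.6521
7 15:26:39 0.0000 0.0139 0.2285 0.8031 0.7975 0.8003 0.6823
8 15:27:37 0.0000 0.0095 0.2194 0.7869 0.7998 0.7933 0.6770
9 15:28:34 0.0000 0.0060 0.2406 0.7978 0.7834 0.7905 0.6684
10 15:29:29 0.0000 0.0032 0.2325 0.8028 0.8022 0.8025 0.6849
"""

rows = [line.split() for line in LOSS_TSV.strip().splitlines()]
header, body = rows[0], rows[1:]
epochs = [dict(zip(header, r)) for r in body]

# The best checkpoint is the epoch with the highest dev F1 (epoch 10 here).
best = max(epochs, key=lambda e: float(e["DEV_F1"]))
```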
runs/events.out.tfevents.1697556020.4c6324b99746.1390.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8ebbe7bf18ebce61f88958b951071e8b9b3fc7ab0c5ebbd11bff4e6d4e513457
+ size 253592
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,242 @@
+ 2023-10-17 15:20:20,697 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,699 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=21, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 15:20:20,699 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,699 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
+  - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
+ 2023-10-17 15:20:20,699 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,699 Train: 3575 sentences
+ 2023-10-17 15:20:20,699 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 15:20:20,699 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,699 Training Params:
+ 2023-10-17 15:20:20,700  - learning_rate: "5e-05"
+ 2023-10-17 15:20:20,700  - mini_batch_size: "8"
+ 2023-10-17 15:20:20,700  - max_epochs: "10"
+ 2023-10-17 15:20:20,700  - shuffle: "True"
+ 2023-10-17 15:20:20,700 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,700 Plugins:
+ 2023-10-17 15:20:20,700  - TensorboardLogger
+ 2023-10-17 15:20:20,700  - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 15:20:20,700 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,700 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 15:20:20,700  - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 15:20:20,700 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,700 Computation:
+ 2023-10-17 15:20:20,700  - compute on device: cuda:0
+ 2023-10-17 15:20:20,700  - embedding storage: none
+ 2023-10-17 15:20:20,701 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,701 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-17 15:20:20,701 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,701 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:20,701 Logging anything other than scalars to TensorBoard is currently not supported.
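The LinearScheduler plugin with warmup_fraction '0.1' explains the per-iteration lr values in the epoch logs that follow: lr climbs linearly from 0 to the peak of 5e-05 over the first 10% of the 4470 total steps (447 iterations/epoch × 10 epochs), then decays linearly back to 0. A sketch of that schedule (a generic linear warmup/decay, assumed to match Flair's plugin up to off-by-one details):

```python
def linear_schedule(step: int, total_steps: int, peak_lr: float,
                    warmup_fraction: float) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL_STEPS = 447 * 10   # 447 iterations per epoch, 10 epochs
PEAK_LR = 5e-05

# Matches the log: lr ~0.000005 at iter 44 of epoch 1,
# ~0.000049 at iter 440 of epoch 1, and ~0 by the final step.
```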
+ 2023-10-17 15:20:25,296 epoch 1 - iter 44/447 - loss 3.29322406 - time (sec): 4.59 - samples/sec: 1935.59 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 15:20:29,526 epoch 1 - iter 88/447 - loss 2.20404796 - time (sec): 8.82 - samples/sec: 1932.03 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 15:20:33,812 epoch 1 - iter 132/447 - loss 1.66693701 - time (sec): 13.11 - samples/sec: 1943.12 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:20:37,841 epoch 1 - iter 176/447 - loss 1.34300581 - time (sec): 17.14 - samples/sec: 2013.07 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 15:20:42,001 epoch 1 - iter 220/447 - loss 1.13494006 - time (sec): 21.30 - samples/sec: 2034.17 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 15:20:46,118 epoch 1 - iter 264/447 - loss 0.99498347 - time (sec): 25.42 - samples/sec: 2037.60 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 15:20:50,011 epoch 1 - iter 308/447 - loss 0.90014844 - time (sec): 29.31 - samples/sec: 2045.24 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-17 15:20:54,208 epoch 1 - iter 352/447 - loss 0.82692832 - time (sec): 33.51 - samples/sec: 2030.01 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 15:20:58,668 epoch 1 - iter 396/447 - loss 0.75333912 - time (sec): 37.97 - samples/sec: 2037.44 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 15:21:02,777 epoch 1 - iter 440/447 - loss 0.70068186 - time (sec): 42.07 - samples/sec: 2029.61 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 15:21:03,549 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:21:03,549 EPOCH 1 done: loss 0.6939 - lr: 0.000049
+ 2023-10-17 15:21:10,111 DEV : loss 0.1627715528011322 - f1-score (micro avg) 0.6637
+ 2023-10-17 15:21:10,167 saving best model
+ 2023-10-17 15:21:10,755 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:21:14,984 epoch 2 - iter 44/447 - loss 0.18574466 - time (sec): 4.23 - samples/sec: 2020.81 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 15:21:18,869 epoch 2 - iter 88/447 - loss 0.17587291 - time (sec): 8.11 - samples/sec: 2043.93 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 15:21:22,901 epoch 2 - iter 132/447 - loss 0.18074319 - time (sec): 12.14 - samples/sec: 2045.51 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 15:21:27,252 epoch 2 - iter 176/447 - loss 0.16716021 - time (sec): 16.49 - samples/sec: 2035.17 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 15:21:31,318 epoch 2 - iter 220/447 - loss 0.16088597 - time (sec): 20.56 - samples/sec: 2037.01 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 15:21:35,411 epoch 2 - iter 264/447 - loss 0.15627115 - time (sec): 24.65 - samples/sec: 2064.17 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 15:21:39,518 epoch 2 - iter 308/447 - loss 0.15212403 - time (sec): 28.76 - samples/sec: 2071.36 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 15:21:43,871 epoch 2 - iter 352/447 - loss 0.14647911 - time (sec): 33.11 - samples/sec: 2068.26 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 15:21:47,774 epoch 2 - iter 396/447 - loss 0.14381048 - time (sec): 37.02 - samples/sec: 2063.23 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 15:21:52,104 epoch 2 - iter 440/447 - loss 0.13930114 - time (sec): 41.35 - samples/sec: 2061.40 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 15:21:52,733 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:21:52,733 EPOCH 2 done: loss 0.1388 - lr: 0.000045
+ 2023-10-17 15:22:03,664 DEV : loss 0.13203084468841553 - f1-score (micro avg) 0.7453
+ 2023-10-17 15:22:03,719 saving best model
+ 2023-10-17 15:22:04,360 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:22:08,773 epoch 3 - iter 44/447 - loss 0.07462311 - time (sec): 4.41 - samples/sec: 2144.45 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 15:22:12,960 epoch 3 - iter 88/447 - loss 0.07763574 - time (sec): 8.60 - samples/sec: 2161.08 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 15:22:16,914 epoch 3 - iter 132/447 - loss 0.07588462 - time (sec): 12.55 - samples/sec: 2119.86 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 15:22:20,965 epoch 3 - iter 176/447 - loss 0.07957562 - time (sec): 16.60 - samples/sec: 2065.06 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 15:22:24,832 epoch 3 - iter 220/447 - loss 0.07641351 - time (sec): 20.47 - samples/sec: 2080.13 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 15:22:28,692 epoch 3 - iter 264/447 - loss 0.07516708 - time (sec): 24.33 - samples/sec: 2077.44 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 15:22:32,638 epoch 3 - iter 308/447 - loss 0.07530165 - time (sec): 28.28 - samples/sec: 2075.02 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 15:22:36,871 epoch 3 - iter 352/447 - loss 0.07665390 - time (sec): 32.51 - samples/sec: 2087.45 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 15:22:40,961 epoch 3 - iter 396/447 - loss 0.07644784 - time (sec): 36.60 - samples/sec: 2091.93 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 15:22:45,586 epoch 3 - iter 440/447 - loss 0.07596108 - time (sec): 41.22 - samples/sec: 2072.90 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 15:22:46,244 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:22:46,244 EPOCH 3 done: loss 0.0779 - lr: 0.000039
+ 2023-10-17 15:22:57,932 DEV : loss 0.15669210255146027 - f1-score (micro avg) 0.7361
+ 2023-10-17 15:22:57,990 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:23:01,950 epoch 4 - iter 44/447 - loss 0.11113508 - time (sec): 3.96 - samples/sec: 2079.82 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 15:23:06,160 epoch 4 - iter 88/447 - loss 0.07727697 - time (sec): 8.17 - samples/sec: 2009.83 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 15:23:10,270 epoch 4 - iter 132/447 - loss 0.07094192 - time (sec): 12.28 - samples/sec: 1949.87 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 15:23:14,604 epoch 4 - iter 176/447 - loss 0.06225710 - time (sec): 16.61 - samples/sec: 1963.87 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 15:23:19,493 epoch 4 - iter 220/447 - loss 0.06035341 - time (sec): 21.50 - samples/sec: 1996.90 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 15:23:23,666 epoch 4 - iter 264/447 - loss 0.05722586 - time (sec): 25.67 - samples/sec: 1995.76 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 15:23:27,927 epoch 4 - iter 308/447 - loss 0.05924836 - time (sec): 29.93 - samples/sec: 1999.94 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 15:23:32,357 epoch 4 - iter 352/447 - loss 0.05871739 - time (sec): 34.36 - samples/sec: 1997.60 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 15:23:36,757 epoch 4 - iter 396/447 - loss 0.05849539 - time (sec): 38.77 - samples/sec: 1987.97 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-17 15:23:40,809 epoch 4 - iter 440/447 - loss 0.05804840 - time (sec): 42.82 - samples/sec: 1989.71 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 15:23:41,474 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:23:41,474 EPOCH 4 done: loss 0.0579 - lr: 0.000033
+ 2023-10-17 15:23:52,260 DEV : loss 0.15689311921596527 - f1-score (micro avg) 0.7574
+ 2023-10-17 15:23:52,319 saving best model
+ 2023-10-17 15:23:54,139 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:23:58,437 epoch 5 - iter 44/447 - loss 0.03896987 - time (sec): 4.29 - samples/sec: 2061.67 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 15:24:02,519 epoch 5 - iter 88/447 - loss 0.04068401 - time (sec): 8.38 - samples/sec: 2066.50 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 15:24:06,827 epoch 5 - iter 132/447 - loss 0.03466089 - time (sec): 12.68 - samples/sec: 2079.00 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 15:24:10,950 epoch 5 - iter 176/447 - loss 0.03234991 - time (sec): 16.81 - samples/sec: 2029.72 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 15:24:15,467 epoch 5 - iter 220/447 - loss 0.03486824 - time (sec): 21.32 - samples/sec: 2031.61 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 15:24:19,494 epoch 5 - iter 264/447 - loss 0.03514322 - time (sec): 25.35 - samples/sec: 2028.49 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 15:24:23,617 epoch 5 - iter 308/447 - loss 0.03414630 - time (sec): 29.47 - samples/sec: 2030.22 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 15:24:27,949 epoch 5 - iter 352/447 - loss 0.03379714 - time (sec): 33.81 - samples/sec: 2023.06 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 15:24:32,314 epoch 5 - iter 396/447 - loss 0.03258252 - time (sec): 38.17 - samples/sec: 2009.56 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 15:24:37,011 epoch 5 - iter 440/447 - loss 0.03419784 - time (sec): 42.87 - samples/sec: 1992.60 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 15:24:37,701 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:24:37,702 EPOCH 5 done: loss 0.0342 - lr: 0.000028
+ 2023-10-17 15:24:48,404 DEV : loss 0.19426260888576508 - f1-score (micro avg) 0.7695
+ 2023-10-17 15:24:48,467 saving best model
+ 2023-10-17 15:24:49,151 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:24:53,549 epoch 6 - iter 44/447 - loss 0.01163168 - time (sec): 4.40 - samples/sec: 1887.17 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 15:24:58,088 epoch 6 - iter 88/447 - loss 0.01528518 - time (sec): 8.94 - samples/sec: 1878.66 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 15:25:02,258 epoch 6 - iter 132/447 - loss 0.01991735 - time (sec): 13.10 - samples/sec: 1941.94 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 15:25:06,592 epoch 6 - iter 176/447 - loss 0.02065023 - time (sec): 17.44 - samples/sec: 1960.45 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 15:25:10,877 epoch 6 - iter 220/447 - loss 0.01950593 - time (sec): 21.72 - samples/sec: 1980.94 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 15:25:14,822 epoch 6 - iter 264/447 - loss 0.01844720 - time (sec): 25.67 - samples/sec: 1977.88 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 15:25:19,281 epoch 6 - iter 308/447 - loss 0.01859815 - time (sec): 30.13 - samples/sec: 2007.45 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 15:25:23,292 epoch 6 - iter 352/447 - loss 0.01870907 - time (sec): 34.14 - samples/sec: 2019.09 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 15:25:27,828 epoch 6 - iter 396/447 - loss 0.01881535 - time (sec): 38.68 - samples/sec: 2000.02 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 15:25:31,845 epoch 6 - iter 440/447 - loss 0.01872424 - time (sec): 42.69 - samples/sec: 1997.24 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 15:25:32,465 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:25:32,466 EPOCH 6 done: loss 0.0187 - lr: 0.000022
+ 2023-10-17 15:25:43,025 DEV : loss 0.19804143905639648 - f1-score (micro avg) 0.771
+ 2023-10-17 15:25:43,090 saving best model
+ 2023-10-17 15:25:44,546 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:25:49,064 epoch 7 - iter 44/447 - loss 0.01406927 - time (sec): 4.51 - samples/sec: 2113.97 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 15:25:53,177 epoch 7 - iter 88/447 - loss 0.01378757 - time (sec): 8.63 - samples/sec: 2067.06 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 15:25:57,437 epoch 7 - iter 132/447 - loss 0.01246763 - time (sec): 12.88 - samples/sec: 2063.50 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 15:26:01,968 epoch 7 - iter 176/447 - loss 0.01393283 - time (sec): 17.42 - samples/sec: 2068.23 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 15:26:05,992 epoch 7 - iter 220/447 - loss 0.01344518 - time (sec): 21.44 - samples/sec: 2076.84 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 15:26:10,031 epoch 7 - iter 264/447 - loss 0.01257828 - time (sec): 25.48 - samples/sec: 2061.24 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 15:26:14,132 epoch 7 - iter 308/447 - loss 0.01318801 - time (sec): 29.58 - samples/sec: 2051.47 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 15:26:18,294 epoch 7 - iter 352/447 - loss 0.01344369 - time (sec): 33.74 - samples/sec: 2051.06 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 15:26:22,814 epoch 7 - iter 396/447 - loss 0.01372082 - time (sec): 38.26 - samples/sec: 2025.99 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 15:26:27,255 epoch 7 - iter 440/447 - loss 0.01396394 - time (sec): 42.70 - samples/sec: 1996.18 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 15:26:27,962 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:26:27,963 EPOCH 7 done: loss 0.0139 - lr: 0.000017
+ 2023-10-17 15:26:38,984 DEV : loss 0.22845597565174103 - f1-score (micro avg) 0.8003
+ 2023-10-17 15:26:39,038 saving best model
+ 2023-10-17 15:26:40,462 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:26:44,896 epoch 8 - iter 44/447 - loss 0.00921936 - time (sec): 4.43 - samples/sec: 1841.71 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 15:26:49,151 epoch 8 - iter 88/447 - loss 0.00912078 - time (sec): 8.68 - samples/sec: 1880.14 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 15:26:53,847 epoch 8 - iter 132/447 - loss 0.01106636 - time (sec): 13.38 - samples/sec: 1809.62 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:26:58,466 epoch 8 - iter 176/447 - loss 0.01077249 - time (sec): 18.00 - samples/sec: 1830.19 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:27:02,928 epoch 8 - iter 220/447 - loss 0.00985768 - time (sec): 22.46 - samples/sec: 1859.16 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 15:27:07,409 epoch 8 - iter 264/447 - loss 0.00933209 - time (sec): 26.94 - samples/sec: 1856.19 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 15:27:11,913 epoch 8 - iter 308/447 - loss 0.00948564 - time (sec): 31.45 - samples/sec: 1848.78 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 15:27:16,212 epoch 8 - iter 352/447 - loss 0.00988733 - time (sec): 35.75 - samples/sec: 1856.93 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 15:27:20,656 epoch 8 - iter 396/447 - loss 0.00993039 - time (sec): 40.19 - samples/sec: 1886.26 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 15:27:25,141 epoch 8 - iter 440/447 - loss 0.00960367 - time (sec): 44.67 - samples/sec: 1907.98 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 15:27:25,768 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:27:25,769 EPOCH 8 done: loss 0.0095 - lr: 0.000011
+ 2023-10-17 15:27:37,592 DEV : loss 0.21943919360637665 - f1-score (micro avg) 0.7933
+ 2023-10-17 15:27:37,660 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:27:41,957 epoch 9 - iter 44/447 - loss 0.00566894 - time (sec): 4.30 - samples/sec: 1769.19 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 15:27:45,995 epoch 9 - iter 88/447 - loss 0.00637583 - time (sec): 8.33 - samples/sec: 1919.87 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 15:27:50,104 epoch 9 - iter 132/447 - loss 0.00607137 - time (sec): 12.44 - samples/sec: 1954.16 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 15:27:54,416 epoch 9 - iter 176/447 - loss 0.00515606 - time (sec): 16.75 - samples/sec: 1948.74 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 15:28:00,031 epoch 9 - iter 220/447 - loss 0.00413243 - time (sec): 22.37 - samples/sec: 1922.67 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 15:28:04,550 epoch 9 - iter 264/447 - loss 0.00424171 - time (sec): 26.89 - samples/sec: 1929.66 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 15:28:09,004 epoch 9 - iter 308/447 - loss 0.00528104 - time (sec): 31.34 - samples/sec: 1914.50 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 15:28:13,713 epoch 9 - iter 352/447 - loss 0.00636826 - time (sec): 36.05 - samples/sec: 1888.42 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 15:28:18,158 epoch 9 - iter 396/447 - loss 0.00660453 - time (sec): 40.50 - samples/sec: 1892.15 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 15:28:22,745 epoch 9 - iter 440/447 - loss 0.00609731 - time (sec): 45.08 - samples/sec: 1883.24 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 15:28:23,448 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:28:23,449 EPOCH 9 done: loss 0.0060 - lr: 0.000006
+ 2023-10-17 15:28:34,773 DEV : loss 0.24061691761016846 - f1-score (micro avg) 0.7905
+ 2023-10-17 15:28:34,835 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:28:39,324 epoch 10 - iter 44/447 - loss 0.00465839 - time (sec): 4.49 - samples/sec: 1899.47 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 15:28:43,753 epoch 10 - iter 88/447 - loss 0.00384135 - time (sec): 8.92 - samples/sec: 1850.06 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 15:28:48,498 epoch 10 - iter 132/447 - loss 0.00456888 - time (sec): 13.66 - samples/sec: 1924.22 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 15:28:52,819 epoch 10 - iter 176/447 - loss 0.00488622 - time (sec): 17.98 - samples/sec: 1899.89 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 15:28:57,213 epoch 10 - iter 220/447 - loss 0.00398089 - time (sec): 22.38 - samples/sec: 1920.45 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 15:29:01,484 epoch 10 - iter 264/447 - loss 0.00334650 - time (sec): 26.65 - samples/sec: 1942.12 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 15:29:05,562 epoch 10 - iter 308/447 - loss 0.00323001 - time (sec): 30.72 - samples/sec: 1952.19 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 15:29:09,820 epoch 10 - iter 352/447 - loss 0.00363184 - time (sec): 34.98 - samples/sec: 1967.59 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 15:29:13,771 epoch 10 - iter 396/447 - loss 0.00344856 - time (sec): 38.93 - samples/sec: 1965.60 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 15:29:18,176 epoch 10 - iter 440/447 - loss 0.00328569 - time (sec): 43.34 - samples/sec: 1969.55 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 15:29:18,816 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:29:18,816 EPOCH 10 done: loss 0.0032 - lr: 0.000000
+ 2023-10-17 15:29:29,893 DEV : loss 0.23245255649089813 - f1-score (micro avg) 0.8025
+ 2023-10-17 15:29:29,948 saving best model
+ 2023-10-17 15:29:32,004 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:29:32,006 Loading model from best epoch ...
+ 2023-10-17 15:29:35,209 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
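The tag dictionary above is a BIOES scheme: S- marks a single-token entity, B-/I-/E- mark the begin, inside, and end tokens of a multi-token entity, and O is outside any entity. A small sketch of decoding such a tag sequence into spans, to illustrate the scheme (not Flair's own decoder):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":          # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i + 1))
            start = None
        elif prefix == "O":        # outside any entity
            start = None
    return spans
```

For example, `["O", "B-loc", "I-loc", "E-loc", "S-pers"]` decodes to a three-token loc span followed by a single-token pers span.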
+ 2023-10-17 15:29:41,022
+ Results:
+ - F-score (micro) 0.762
+ - F-score (macro) 0.6737
+ - Accuracy 0.6314
+
+ By class:
+               precision    recall  f1-score   support
+
+          loc     0.8550    0.8607    0.8579       596
+         pers     0.6850    0.7838    0.7311       333
+          org     0.5431    0.4773    0.5081       132
+         prod     0.6071    0.5152    0.5574        66
+         time     0.7143    0.7143    0.7143        49
+
+    micro avg     0.7537    0.7704    0.7620      1176
+    macro avg     0.6809    0.6702    0.6737      1176
+ weighted avg     0.7521    0.7704    0.7599      1176
+
+ 2023-10-17 15:29:41,023 ----------------------------------------------------------------------------------------------------
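The micro-averaged scores in the table above can be recovered from the per-class precision, recall, and support columns: per class, TP = recall × support and predicted = TP / precision; micro precision and recall are the sums over classes. A quick consistency check of the reported numbers:

```python
# (precision, recall, support) per class, copied from the table above
per_class = {
    "loc":  (0.8550, 0.8607, 596),
    "pers": (0.6850, 0.7838, 333),
    "org":  (0.5431, 0.4773, 132),
    "prod": (0.6071, 0.5152, 66),
    "time": (0.7143, 0.7143, 49),
}

# Recover integer counts: TP = recall * support, predicted = TP / precision.
tp = sum(round(r * s) for p, r, s in per_class.values())
predicted = sum(round(round(r * s) / p) for p, r, s in per_class.values())
gold = sum(s for _, _, s in per_class.values())

micro_p = tp / predicted
micro_r = tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
```

These reproduce the reported micro avg row (0.7537 / 0.7704 / 0.7620) and the F-score (micro) of 0.762.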