stefan-it committed
Commit 295fa69 (1 parent: bacce6a)

Upload ./training.log with huggingface_hub

training.log ADDED (243 lines)
2023-10-25 17:12:05,512 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,513 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 17:12:05,514 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,514 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 17:12:05,514 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,514 Train: 7142 sentences
2023-10-25 17:12:05,514 (train_with_dev=False, train_with_test=False)
2023-10-25 17:12:05,514 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,514 Training Params:
2023-10-25 17:12:05,514 - learning_rate: "5e-05"
2023-10-25 17:12:05,514 - mini_batch_size: "8"
2023-10-25 17:12:05,514 - max_epochs: "10"
2023-10-25 17:12:05,514 - shuffle: "True"
2023-10-25 17:12:05,514 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,514 Plugins:
2023-10-25 17:12:05,514 - TensorboardLogger
2023-10-25 17:12:05,514 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 17:12:05,515 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,515 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 17:12:05,515 - metric: "('micro avg', 'f1-score')"
2023-10-25 17:12:05,515 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,515 Computation:
2023-10-25 17:12:05,515 - compute on device: cuda:0
2023-10-25 17:12:05,515 - embedding storage: none
2023-10-25 17:12:05,515 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,515 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 17:12:05,515 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,515 ----------------------------------------------------------------------------------------------------
2023-10-25 17:12:05,515 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 17:12:11,754 epoch 1 - iter 89/893 - loss 1.86966354 - time (sec): 6.24 - samples/sec: 4048.40 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:12:18,158 epoch 1 - iter 178/893 - loss 1.17037592 - time (sec): 12.64 - samples/sec: 4015.73 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:12:24,252 epoch 1 - iter 267/893 - loss 0.89779536 - time (sec): 18.74 - samples/sec: 3990.87 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:12:30,281 epoch 1 - iter 356/893 - loss 0.72799361 - time (sec): 24.77 - samples/sec: 4021.65 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:12:36,084 epoch 1 - iter 445/893 - loss 0.62359043 - time (sec): 30.57 - samples/sec: 4037.40 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:12:41,987 epoch 1 - iter 534/893 - loss 0.54978542 - time (sec): 36.47 - samples/sec: 4059.93 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:12:48,656 epoch 1 - iter 623/893 - loss 0.49086057 - time (sec): 43.14 - samples/sec: 4008.81 - lr: 0.000035 - momentum: 0.000000
2023-10-25 17:12:54,781 epoch 1 - iter 712/893 - loss 0.44720365 - time (sec): 49.27 - samples/sec: 4036.10 - lr: 0.000040 - momentum: 0.000000
2023-10-25 17:13:00,618 epoch 1 - iter 801/893 - loss 0.41503176 - time (sec): 55.10 - samples/sec: 4063.39 - lr: 0.000045 - momentum: 0.000000
2023-10-25 17:13:06,507 epoch 1 - iter 890/893 - loss 0.38844168 - time (sec): 60.99 - samples/sec: 4060.90 - lr: 0.000050 - momentum: 0.000000
2023-10-25 17:13:06,717 ----------------------------------------------------------------------------------------------------
2023-10-25 17:13:06,717 EPOCH 1 done: loss 0.3873 - lr: 0.000050
2023-10-25 17:13:09,813 DEV : loss 0.10254143178462982 - f1-score (micro avg)  0.7386
2023-10-25 17:13:09,836 saving best model
2023-10-25 17:13:10,384 ----------------------------------------------------------------------------------------------------
2023-10-25 17:13:16,410 epoch 2 - iter 89/893 - loss 0.11297432 - time (sec): 6.02 - samples/sec: 4100.95 - lr: 0.000049 - momentum: 0.000000
2023-10-25 17:13:22,388 epoch 2 - iter 178/893 - loss 0.10095221 - time (sec): 12.00 - samples/sec: 4091.37 - lr: 0.000049 - momentum: 0.000000
2023-10-25 17:13:28,578 epoch 2 - iter 267/893 - loss 0.09904465 - time (sec): 18.19 - samples/sec: 4123.74 - lr: 0.000048 - momentum: 0.000000
2023-10-25 17:13:34,706 epoch 2 - iter 356/893 - loss 0.10304399 - time (sec): 24.32 - samples/sec: 4161.23 - lr: 0.000048 - momentum: 0.000000
2023-10-25 17:13:40,891 epoch 2 - iter 445/893 - loss 0.10290863 - time (sec): 30.50 - samples/sec: 4163.32 - lr: 0.000047 - momentum: 0.000000
2023-10-25 17:13:46,799 epoch 2 - iter 534/893 - loss 0.10253728 - time (sec): 36.41 - samples/sec: 4138.20 - lr: 0.000047 - momentum: 0.000000
2023-10-25 17:13:52,856 epoch 2 - iter 623/893 - loss 0.10517292 - time (sec): 42.47 - samples/sec: 4127.97 - lr: 0.000046 - momentum: 0.000000
2023-10-25 17:13:59,058 epoch 2 - iter 712/893 - loss 0.10448830 - time (sec): 48.67 - samples/sec: 4121.70 - lr: 0.000046 - momentum: 0.000000
2023-10-25 17:14:05,695 epoch 2 - iter 801/893 - loss 0.10541126 - time (sec): 55.31 - samples/sec: 4030.63 - lr: 0.000045 - momentum: 0.000000
2023-10-25 17:14:11,612 epoch 2 - iter 890/893 - loss 0.10497282 - time (sec): 61.23 - samples/sec: 4053.12 - lr: 0.000044 - momentum: 0.000000
2023-10-25 17:14:11,796 ----------------------------------------------------------------------------------------------------
2023-10-25 17:14:11,797 EPOCH 2 done: loss 0.1048 - lr: 0.000044
2023-10-25 17:14:15,732 DEV : loss 0.09835401922464371 - f1-score (micro avg)  0.7484
2023-10-25 17:14:15,753 saving best model
2023-10-25 17:14:17,077 ----------------------------------------------------------------------------------------------------
2023-10-25 17:14:22,720 epoch 3 - iter 89/893 - loss 0.06670513 - time (sec): 5.64 - samples/sec: 4160.09 - lr: 0.000044 - momentum: 0.000000
2023-10-25 17:14:28,626 epoch 3 - iter 178/893 - loss 0.06908344 - time (sec): 11.55 - samples/sec: 4278.75 - lr: 0.000043 - momentum: 0.000000
2023-10-25 17:14:34,039 epoch 3 - iter 267/893 - loss 0.06482755 - time (sec): 16.96 - samples/sec: 4378.26 - lr: 0.000043 - momentum: 0.000000
2023-10-25 17:14:39,692 epoch 3 - iter 356/893 - loss 0.06663973 - time (sec): 22.61 - samples/sec: 4381.76 - lr: 0.000042 - momentum: 0.000000
2023-10-25 17:14:45,211 epoch 3 - iter 445/893 - loss 0.06653035 - time (sec): 28.13 - samples/sec: 4430.59 - lr: 0.000042 - momentum: 0.000000
2023-10-25 17:14:50,834 epoch 3 - iter 534/893 - loss 0.06616108 - time (sec): 33.75 - samples/sec: 4434.83 - lr: 0.000041 - momentum: 0.000000
2023-10-25 17:14:56,338 epoch 3 - iter 623/893 - loss 0.06517711 - time (sec): 39.26 - samples/sec: 4441.20 - lr: 0.000041 - momentum: 0.000000
2023-10-25 17:15:01,626 epoch 3 - iter 712/893 - loss 0.06547935 - time (sec): 44.55 - samples/sec: 4424.27 - lr: 0.000040 - momentum: 0.000000
2023-10-25 17:15:07,582 epoch 3 - iter 801/893 - loss 0.06531523 - time (sec): 50.50 - samples/sec: 4426.12 - lr: 0.000039 - momentum: 0.000000
2023-10-25 17:15:13,138 epoch 3 - iter 890/893 - loss 0.06609427 - time (sec): 56.06 - samples/sec: 4424.89 - lr: 0.000039 - momentum: 0.000000
2023-10-25 17:15:13,306 ----------------------------------------------------------------------------------------------------
2023-10-25 17:15:13,306 EPOCH 3 done: loss 0.0663 - lr: 0.000039
2023-10-25 17:15:18,728 DEV : loss 0.10818858444690704 - f1-score (micro avg)  0.78
2023-10-25 17:15:18,750 saving best model
2023-10-25 17:15:19,433 ----------------------------------------------------------------------------------------------------
2023-10-25 17:15:25,434 epoch 4 - iter 89/893 - loss 0.04360221 - time (sec): 6.00 - samples/sec: 4325.89 - lr: 0.000038 - momentum: 0.000000
2023-10-25 17:15:31,204 epoch 4 - iter 178/893 - loss 0.04758612 - time (sec): 11.77 - samples/sec: 4382.46 - lr: 0.000038 - momentum: 0.000000
2023-10-25 17:15:36,707 epoch 4 - iter 267/893 - loss 0.04849685 - time (sec): 17.27 - samples/sec: 4342.52 - lr: 0.000037 - momentum: 0.000000
2023-10-25 17:15:42,413 epoch 4 - iter 356/893 - loss 0.04765784 - time (sec): 22.98 - samples/sec: 4307.89 - lr: 0.000037 - momentum: 0.000000
2023-10-25 17:15:48,360 epoch 4 - iter 445/893 - loss 0.04585792 - time (sec): 28.92 - samples/sec: 4301.93 - lr: 0.000036 - momentum: 0.000000
2023-10-25 17:15:53,987 epoch 4 - iter 534/893 - loss 0.04678008 - time (sec): 34.55 - samples/sec: 4339.31 - lr: 0.000036 - momentum: 0.000000
2023-10-25 17:15:59,524 epoch 4 - iter 623/893 - loss 0.04623631 - time (sec): 40.09 - samples/sec: 4325.05 - lr: 0.000035 - momentum: 0.000000
2023-10-25 17:16:05,227 epoch 4 - iter 712/893 - loss 0.04616789 - time (sec): 45.79 - samples/sec: 4335.78 - lr: 0.000034 - momentum: 0.000000
2023-10-25 17:16:10,897 epoch 4 - iter 801/893 - loss 0.04654488 - time (sec): 51.46 - samples/sec: 4351.22 - lr: 0.000034 - momentum: 0.000000
2023-10-25 17:16:16,408 epoch 4 - iter 890/893 - loss 0.04626882 - time (sec): 56.97 - samples/sec: 4340.98 - lr: 0.000033 - momentum: 0.000000
2023-10-25 17:16:16,693 ----------------------------------------------------------------------------------------------------
2023-10-25 17:16:16,694 EPOCH 4 done: loss 0.0460 - lr: 0.000033
2023-10-25 17:16:21,128 DEV : loss 0.14430440962314606 - f1-score (micro avg)  0.7763
2023-10-25 17:16:21,148 ----------------------------------------------------------------------------------------------------
2023-10-25 17:16:26,833 epoch 5 - iter 89/893 - loss 0.03076065 - time (sec): 5.68 - samples/sec: 4079.73 - lr: 0.000033 - momentum: 0.000000
2023-10-25 17:16:32,613 epoch 5 - iter 178/893 - loss 0.03387457 - time (sec): 11.46 - samples/sec: 4189.45 - lr: 0.000032 - momentum: 0.000000
2023-10-25 17:16:38,412 epoch 5 - iter 267/893 - loss 0.03857465 - time (sec): 17.26 - samples/sec: 4219.15 - lr: 0.000032 - momentum: 0.000000
2023-10-25 17:16:44,270 epoch 5 - iter 356/893 - loss 0.03668496 - time (sec): 23.12 - samples/sec: 4226.11 - lr: 0.000031 - momentum: 0.000000
2023-10-25 17:16:50,915 epoch 5 - iter 445/893 - loss 0.03779554 - time (sec): 29.77 - samples/sec: 4131.62 - lr: 0.000031 - momentum: 0.000000
2023-10-25 17:16:56,745 epoch 5 - iter 534/893 - loss 0.03736583 - time (sec): 35.60 - samples/sec: 4160.67 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:17:02,359 epoch 5 - iter 623/893 - loss 0.03619574 - time (sec): 41.21 - samples/sec: 4183.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:17:08,388 epoch 5 - iter 712/893 - loss 0.03616854 - time (sec): 47.24 - samples/sec: 4165.28 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:17:14,180 epoch 5 - iter 801/893 - loss 0.03605795 - time (sec): 53.03 - samples/sec: 4207.84 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:17:19,715 epoch 5 - iter 890/893 - loss 0.03609116 - time (sec): 58.57 - samples/sec: 4231.73 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:17:19,915 ----------------------------------------------------------------------------------------------------
2023-10-25 17:17:19,915 EPOCH 5 done: loss 0.0361 - lr: 0.000028
2023-10-25 17:17:23,906 DEV : loss 0.1808791607618332 - f1-score (micro avg)  0.7874
2023-10-25 17:17:23,926 saving best model
2023-10-25 17:17:24,575 ----------------------------------------------------------------------------------------------------
2023-10-25 17:17:30,427 epoch 6 - iter 89/893 - loss 0.03174096 - time (sec): 5.85 - samples/sec: 4049.45 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:17:36,184 epoch 6 - iter 178/893 - loss 0.02710066 - time (sec): 11.61 - samples/sec: 4011.60 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:17:42,186 epoch 6 - iter 267/893 - loss 0.02807563 - time (sec): 17.61 - samples/sec: 4098.40 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:17:48,150 epoch 6 - iter 356/893 - loss 0.02692728 - time (sec): 23.57 - samples/sec: 4121.36 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:17:54,081 epoch 6 - iter 445/893 - loss 0.02735570 - time (sec): 29.50 - samples/sec: 4159.09 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:18:00,130 epoch 6 - iter 534/893 - loss 0.02805044 - time (sec): 35.55 - samples/sec: 4171.09 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:18:06,195 epoch 6 - iter 623/893 - loss 0.02725646 - time (sec): 41.62 - samples/sec: 4157.30 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:18:12,254 epoch 6 - iter 712/893 - loss 0.02769298 - time (sec): 47.68 - samples/sec: 4162.52 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:18:18,094 epoch 6 - iter 801/893 - loss 0.02746510 - time (sec): 53.52 - samples/sec: 4161.49 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:18:24,001 epoch 6 - iter 890/893 - loss 0.02725677 - time (sec): 59.42 - samples/sec: 4178.69 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:18:24,190 ----------------------------------------------------------------------------------------------------
2023-10-25 17:18:24,190 EPOCH 6 done: loss 0.0274 - lr: 0.000022
2023-10-25 17:18:29,212 DEV : loss 0.18829816579818726 - f1-score (micro avg)  0.8008
2023-10-25 17:18:29,234 saving best model
2023-10-25 17:18:29,906 ----------------------------------------------------------------------------------------------------
2023-10-25 17:18:35,941 epoch 7 - iter 89/893 - loss 0.01515087 - time (sec): 6.03 - samples/sec: 3972.69 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:18:41,986 epoch 7 - iter 178/893 - loss 0.02051512 - time (sec): 12.08 - samples/sec: 4024.69 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:18:48,051 epoch 7 - iter 267/893 - loss 0.01998553 - time (sec): 18.14 - samples/sec: 4115.71 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:18:54,021 epoch 7 - iter 356/893 - loss 0.01986934 - time (sec): 24.11 - samples/sec: 4124.11 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:18:59,887 epoch 7 - iter 445/893 - loss 0.02129780 - time (sec): 29.98 - samples/sec: 4178.30 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:19:05,771 epoch 7 - iter 534/893 - loss 0.02089146 - time (sec): 35.86 - samples/sec: 4208.17 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:19:11,666 epoch 7 - iter 623/893 - loss 0.02180007 - time (sec): 41.76 - samples/sec: 4201.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:19:17,316 epoch 7 - iter 712/893 - loss 0.02130021 - time (sec): 47.41 - samples/sec: 4179.67 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:19:23,114 epoch 7 - iter 801/893 - loss 0.02092378 - time (sec): 53.21 - samples/sec: 4187.33 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:19:29,000 epoch 7 - iter 890/893 - loss 0.02059414 - time (sec): 59.09 - samples/sec: 4200.88 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:19:29,168 ----------------------------------------------------------------------------------------------------
2023-10-25 17:19:29,168 EPOCH 7 done: loss 0.0206 - lr: 0.000017
2023-10-25 17:19:33,135 DEV : loss 0.20971202850341797 - f1-score (micro avg)  0.7835
2023-10-25 17:19:33,158 ----------------------------------------------------------------------------------------------------
2023-10-25 17:19:39,046 epoch 8 - iter 89/893 - loss 0.01764633 - time (sec): 5.89 - samples/sec: 4379.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:19:45,066 epoch 8 - iter 178/893 - loss 0.01628569 - time (sec): 11.91 - samples/sec: 4235.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:19:51,714 epoch 8 - iter 267/893 - loss 0.01625886 - time (sec): 18.55 - samples/sec: 4031.94 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:19:57,412 epoch 8 - iter 356/893 - loss 0.01628296 - time (sec): 24.25 - samples/sec: 4045.87 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:20:03,072 epoch 8 - iter 445/893 - loss 0.01501426 - time (sec): 29.91 - samples/sec: 4086.07 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:20:08,856 epoch 8 - iter 534/893 - loss 0.01465639 - time (sec): 35.70 - samples/sec: 4134.41 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:20:14,511 epoch 8 - iter 623/893 - loss 0.01504650 - time (sec): 41.35 - samples/sec: 4158.41 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:20:20,232 epoch 8 - iter 712/893 - loss 0.01458875 - time (sec): 47.07 - samples/sec: 4166.37 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:20:26,137 epoch 8 - iter 801/893 - loss 0.01570324 - time (sec): 52.98 - samples/sec: 4186.22 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:20:32,101 epoch 8 - iter 890/893 - loss 0.01581706 - time (sec): 58.94 - samples/sec: 4207.71 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:20:32,280 ----------------------------------------------------------------------------------------------------
2023-10-25 17:20:32,281 EPOCH 8 done: loss 0.0158 - lr: 0.000011
2023-10-25 17:20:36,350 DEV : loss 0.21289943158626556 - f1-score (micro avg)  0.8
2023-10-25 17:20:36,375 ----------------------------------------------------------------------------------------------------
2023-10-25 17:20:42,196 epoch 9 - iter 89/893 - loss 0.00765889 - time (sec): 5.82 - samples/sec: 4349.70 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:20:48,053 epoch 9 - iter 178/893 - loss 0.00998425 - time (sec): 11.68 - samples/sec: 4306.03 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:20:54,019 epoch 9 - iter 267/893 - loss 0.01244333 - time (sec): 17.64 - samples/sec: 4202.39 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:20:59,745 epoch 9 - iter 356/893 - loss 0.01212745 - time (sec): 23.37 - samples/sec: 4282.98 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:21:05,354 epoch 9 - iter 445/893 - loss 0.01178149 - time (sec): 28.98 - samples/sec: 4326.46 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:21:10,965 epoch 9 - iter 534/893 - loss 0.01170529 - time (sec): 34.59 - samples/sec: 4302.74 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:21:16,851 epoch 9 - iter 623/893 - loss 0.01136873 - time (sec): 40.47 - samples/sec: 4324.76 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:21:22,488 epoch 9 - iter 712/893 - loss 0.01142673 - time (sec): 46.11 - samples/sec: 4298.22 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:21:28,182 epoch 9 - iter 801/893 - loss 0.01136731 - time (sec): 51.81 - samples/sec: 4291.71 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:21:34,076 epoch 9 - iter 890/893 - loss 0.01120456 - time (sec): 57.70 - samples/sec: 4294.92 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:21:34,259 ----------------------------------------------------------------------------------------------------
2023-10-25 17:21:34,259 EPOCH 9 done: loss 0.0112 - lr: 0.000006
2023-10-25 17:21:39,355 DEV : loss 0.22147664427757263 - f1-score (micro avg)  0.7981
2023-10-25 17:21:39,378 ----------------------------------------------------------------------------------------------------
2023-10-25 17:21:44,817 epoch 10 - iter 89/893 - loss 0.00613679 - time (sec): 5.44 - samples/sec: 4461.62 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:21:50,598 epoch 10 - iter 178/893 - loss 0.00627718 - time (sec): 11.22 - samples/sec: 4222.26 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:21:56,574 epoch 10 - iter 267/893 - loss 0.00759554 - time (sec): 17.19 - samples/sec: 4268.40 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:22:02,673 epoch 10 - iter 356/893 - loss 0.00807461 - time (sec): 23.29 - samples/sec: 4232.88 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:22:08,727 epoch 10 - iter 445/893 - loss 0.00789473 - time (sec): 29.35 - samples/sec: 4155.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:22:14,934 epoch 10 - iter 534/893 - loss 0.00783961 - time (sec): 35.55 - samples/sec: 4162.38 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:22:20,987 epoch 10 - iter 623/893 - loss 0.00764283 - time (sec): 41.61 - samples/sec: 4158.54 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:22:26,725 epoch 10 - iter 712/893 - loss 0.00703458 - time (sec): 47.35 - samples/sec: 4144.99 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:22:32,754 epoch 10 - iter 801/893 - loss 0.00689931 - time (sec): 53.37 - samples/sec: 4157.08 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:22:38,960 epoch 10 - iter 890/893 - loss 0.00666490 - time (sec): 59.58 - samples/sec: 4161.77 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:22:39,141 ----------------------------------------------------------------------------------------------------
2023-10-25 17:22:39,141 EPOCH 10 done: loss 0.0066 - lr: 0.000000
2023-10-25 17:22:43,875 DEV : loss 0.23105905950069427 - f1-score (micro avg)  0.8
2023-10-25 17:22:44,387 ----------------------------------------------------------------------------------------------------
2023-10-25 17:22:44,388 Loading model from best epoch ...
2023-10-25 17:22:46,202 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 17:22:58,838 
Results:
- F-score (micro) 0.6773
- F-score (macro) 0.588
- Accuracy 0.5304

By class:
              precision    recall  f1-score   support

         LOC     0.6839    0.6877    0.6858      1095
         PER     0.7644    0.7500    0.7571      1012
         ORG     0.4379    0.5434    0.4850       357
   HumanProd     0.3182    0.6364    0.4242        33

   micro avg     0.6635    0.6916    0.6773      2497
   macro avg     0.5511    0.6544    0.5880      2497
weighted avg     0.6765    0.6916    0.6825      2497

2023-10-25 17:22:58,839 ----------------------------------------------------------------------------------------------------
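
Note: the per-iteration lines in this log follow a fixed field layout (epoch, iteration, running loss, elapsed time, throughput, learning rate). As an aside, they can be pulled into numeric records with a small regex; this is a hedged sketch based only on the format seen in this log, not part of Flair itself:

```python
import re

# Field layout taken from the iteration lines in this training.log.
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+)"
    r" - time \(sec\): (?P<time>[\d.]+)"
    r" - samples/sec: (?P<sps>[\d.]+)"
    r" - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line: str) -> dict:
    """Return the numeric fields of one 'epoch ... - iter ...' log line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"not an iteration line: {line!r}")
    d = m.groupdict()
    return {
        "epoch": int(d["epoch"]),
        "iter": int(d["iter"]),
        "total": int(d["total"]),
        "loss": float(d["loss"]),
        "time_sec": float(d["time"]),
        "samples_per_sec": float(d["sps"]),
        "lr": float(d["lr"]),
    }

# First iteration line of epoch 1, copied from the log above.
sample = ("2023-10-25 17:12:11,754 epoch 1 - iter 89/893 - loss 1.86966354 "
          "- time (sec): 6.24 - samples/sec: 4048.40 - lr: 0.000005 "
          "- momentum: 0.000000")
print(parse_iter_line(sample))
```

Applied line by line over the file, this yields per-iteration loss and learning-rate curves matching what the TensorboardLogger plugin records.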