stefan-it committed
Commit 84c3e98
1 Parent(s): a24edb9

Upload folder using huggingface_hub

Files changed (5)
  1. best-model.pt +3 -0
  2. dev.tsv +0 -0
  3. loss.tsv +11 -0
  4. test.tsv +0 -0
  5. training.log +242 -0
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:61c515014e37d9b79d144a997e551014b6e0b3a43130731521d5209676e98f87
+ size 443311111
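The three `+` lines above are not the model weights themselves but a Git LFS pointer file (spec v1): plain "key value" lines recording the hash and byte size of the real object stored out-of-band. A minimal sketch of parsing such a pointer, using the pointer text from this diff:

```python
# Parse a Git LFS pointer file (spec v1): each non-empty line is "key value".
# The text below mirrors the best-model.pt pointer in the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:61c515014e37d9b79d144a997e551014b6e0b3a43130731521d5209676e98f87
size 443311111
"""

def parse_lfs_pointer(text: str) -> dict:
    """Split each non-empty line at the first space into a key/value pair."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    fields["size"] = int(fields["size"])  # size of the real object, in bytes
    return fields

pointer = parse_lfs_pointer(pointer_text)
print(pointer["oid"])         # sha256:61c51501...
print(pointer["size"] / 1e6)  # the checkpoint is ~443 MB; the pointer is 3 lines
```

The `parse_lfs_pointer` helper is illustrative, not part of any Git LFS library; the field names (`version`, `oid`, `size`) are the ones the pointer file itself uses.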
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 00:35:57 0.0000 0.2920 0.1220 0.4659 0.7414 0.5722 0.4125
+ 2 00:37:55 0.0000 0.0814 0.1171 0.5546 0.7323 0.6312 0.4685
+ 3 00:39:56 0.0000 0.0574 0.1580 0.5224 0.8009 0.6323 0.4736
+ 4 00:41:55 0.0000 0.0412 0.2665 0.5267 0.8021 0.6358 0.4772
+ 5 00:43:55 0.0000 0.0281 0.2983 0.5413 0.8169 0.6512 0.4934
+ 6 00:45:54 0.0000 0.0201 0.3269 0.5494 0.8021 0.6521 0.4940
+ 7 00:47:51 0.0000 0.0136 0.3592 0.5404 0.8043 0.6464 0.4892
+ 8 00:49:49 0.0000 0.0096 0.3747 0.5515 0.7906 0.6497 0.4901
+ 9 00:51:49 0.0000 0.0068 0.3671 0.5541 0.7792 0.6476 0.4878
+ 10 00:53:48 0.0000 0.0044 0.3939 0.5558 0.8089 0.6589 0.5004
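The loss.tsv above is the per-epoch summary the trainer writes alongside training.log: one row per epoch with train loss and dev metrics. The "best model" checkpoint tracks the highest dev F1, which a short stdlib sketch can recover from the table (rows copied from the diff above, tab-separated as in the real file):

```python
import csv
import io

# Rows copied from the loss.tsv diff above; the actual file is tab-separated.
rows = [
    "EPOCH\tTIMESTAMP\tLEARNING_RATE\tTRAIN_LOSS\tDEV_LOSS\tDEV_PRECISION\tDEV_RECALL\tDEV_F1\tDEV_ACCURACY",
    "1\t00:35:57\t0.0000\t0.2920\t0.1220\t0.4659\t0.7414\t0.5722\t0.4125",
    "2\t00:37:55\t0.0000\t0.0814\t0.1171\t0.5546\t0.7323\t0.6312\t0.4685",
    "3\t00:39:56\t0.0000\t0.0574\t0.1580\t0.5224\t0.8009\t0.6323\t0.4736",
    "4\t00:41:55\t0.0000\t0.0412\t0.2665\t0.5267\t0.8021\t0.6358\t0.4772",
    "5\t00:43:55\t0.0000\t0.0281\t0.2983\t0.5413\t0.8169\t0.6512\t0.4934",
    "6\t00:45:54\t0.0000\t0.0201\t0.3269\t0.5494\t0.8021\t0.6521\t0.4940",
    "7\t00:47:51\t0.0000\t0.0136\t0.3592\t0.5404\t0.8043\t0.6464\t0.4892",
    "8\t00:49:49\t0.0000\t0.0096\t0.3747\t0.5515\t0.7906\t0.6497\t0.4901",
    "9\t00:51:49\t0.0000\t0.0068\t0.3671\t0.5541\t0.7792\t0.6476\t0.4878",
    "10\t00:53:48\t0.0000\t0.0044\t0.3939\t0.5558\t0.8089\t0.6589\t0.5004",
]

reader = csv.DictReader(io.StringIO("\n".join(rows)), delimiter="\t")
best = max(reader, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # epoch 10 scores highest on dev F1
```

Consistent with this, the training.log below prints "saving best model" after every epoch whose dev F1 beats the previous best (and skips it after epochs 7–9, where F1 dips).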
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,242 @@
+ 2023-10-15 00:34:00,672 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,673 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=13, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-15 00:34:00,673 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,673 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 Train: 14465 sentences
+ 2023-10-15 00:34:00,674 (train_with_dev=False, train_with_test=False)
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 Training Params:
+ 2023-10-15 00:34:00,674 - learning_rate: "3e-05"
+ 2023-10-15 00:34:00,674 - mini_batch_size: "8"
+ 2023-10-15 00:34:00,674 - max_epochs: "10"
+ 2023-10-15 00:34:00,674 - shuffle: "True"
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 Plugins:
+ 2023-10-15 00:34:00,674 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-15 00:34:00,674 - metric: "('micro avg', 'f1-score')"
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 Computation:
+ 2023-10-15 00:34:00,674 - compute on device: cuda:0
+ 2023-10-15 00:34:00,674 - embedding storage: none
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:00,674 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:34:12,020 epoch 1 - iter 180/1809 - loss 1.69578880 - time (sec): 11.35 - samples/sec: 3329.84 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-15 00:34:23,157 epoch 1 - iter 360/1809 - loss 0.96774144 - time (sec): 22.48 - samples/sec: 3335.91 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-15 00:34:34,271 epoch 1 - iter 540/1809 - loss 0.70357347 - time (sec): 33.60 - samples/sec: 3365.18 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-15 00:34:45,056 epoch 1 - iter 720/1809 - loss 0.56603696 - time (sec): 44.38 - samples/sec: 3386.28 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-15 00:34:56,166 epoch 1 - iter 900/1809 - loss 0.47915304 - time (sec): 55.49 - samples/sec: 3400.37 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-15 00:35:07,591 epoch 1 - iter 1080/1809 - loss 0.41979242 - time (sec): 66.92 - samples/sec: 3381.28 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-15 00:35:18,537 epoch 1 - iter 1260/1809 - loss 0.37659632 - time (sec): 77.86 - samples/sec: 3380.44 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-15 00:35:29,412 epoch 1 - iter 1440/1809 - loss 0.34144756 - time (sec): 88.74 - samples/sec: 3392.55 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-15 00:35:40,488 epoch 1 - iter 1620/1809 - loss 0.31444292 - time (sec): 99.81 - samples/sec: 3405.87 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-15 00:35:51,789 epoch 1 - iter 1800/1809 - loss 0.29260619 - time (sec): 111.11 - samples/sec: 3405.01 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-15 00:35:52,293 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:35:52,293 EPOCH 1 done: loss 0.2920 - lr: 0.000030
+ 2023-10-15 00:35:56,993 DEV : loss 0.12198404222726822 - f1-score (micro avg) 0.5722
+ 2023-10-15 00:35:57,022 saving best model
+ 2023-10-15 00:35:57,380 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:36:08,464 epoch 2 - iter 180/1809 - loss 0.09620289 - time (sec): 11.08 - samples/sec: 3499.09 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-15 00:36:19,654 epoch 2 - iter 360/1809 - loss 0.08888247 - time (sec): 22.27 - samples/sec: 3425.10 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-15 00:36:30,545 epoch 2 - iter 540/1809 - loss 0.08750531 - time (sec): 33.16 - samples/sec: 3430.82 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-15 00:36:41,703 epoch 2 - iter 720/1809 - loss 0.08597305 - time (sec): 44.32 - samples/sec: 3446.10 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-15 00:36:52,772 epoch 2 - iter 900/1809 - loss 0.08644518 - time (sec): 55.39 - samples/sec: 3439.68 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-15 00:37:04,106 epoch 2 - iter 1080/1809 - loss 0.08478312 - time (sec): 66.73 - samples/sec: 3445.62 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-15 00:37:15,095 epoch 2 - iter 1260/1809 - loss 0.08491575 - time (sec): 77.71 - samples/sec: 3434.67 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-15 00:37:26,060 epoch 2 - iter 1440/1809 - loss 0.08292218 - time (sec): 88.68 - samples/sec: 3420.93 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-15 00:37:37,259 epoch 2 - iter 1620/1809 - loss 0.08203160 - time (sec): 99.88 - samples/sec: 3418.03 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-15 00:37:48,250 epoch 2 - iter 1800/1809 - loss 0.08159459 - time (sec): 110.87 - samples/sec: 3412.35 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-15 00:37:48,794 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:37:48,794 EPOCH 2 done: loss 0.0814 - lr: 0.000027
+ 2023-10-15 00:37:55,062 DEV : loss 0.11705530434846878 - f1-score (micro avg) 0.6312
+ 2023-10-15 00:37:55,091 saving best model
+ 2023-10-15 00:37:55,626 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:38:06,588 epoch 3 - iter 180/1809 - loss 0.05232030 - time (sec): 10.96 - samples/sec: 3449.48 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-15 00:38:18,020 epoch 3 - iter 360/1809 - loss 0.05666857 - time (sec): 22.39 - samples/sec: 3430.40 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-15 00:38:28,583 epoch 3 - iter 540/1809 - loss 0.05594377 - time (sec): 32.95 - samples/sec: 3441.67 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-15 00:38:39,674 epoch 3 - iter 720/1809 - loss 0.05871684 - time (sec): 44.05 - samples/sec: 3425.15 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-15 00:38:51,076 epoch 3 - iter 900/1809 - loss 0.05950701 - time (sec): 55.45 - samples/sec: 3374.92 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-15 00:39:02,741 epoch 3 - iter 1080/1809 - loss 0.05920344 - time (sec): 67.11 - samples/sec: 3358.55 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-15 00:39:14,716 epoch 3 - iter 1260/1809 - loss 0.05784277 - time (sec): 79.09 - samples/sec: 3328.74 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-15 00:39:26,435 epoch 3 - iter 1440/1809 - loss 0.05840880 - time (sec): 90.81 - samples/sec: 3318.84 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-15 00:39:37,770 epoch 3 - iter 1620/1809 - loss 0.05800859 - time (sec): 102.14 - samples/sec: 3315.52 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-15 00:39:49,682 epoch 3 - iter 1800/1809 - loss 0.05721469 - time (sec): 114.05 - samples/sec: 3318.60 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-15 00:39:50,192 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:39:50,192 EPOCH 3 done: loss 0.0574 - lr: 0.000023
+ 2023-10-15 00:39:56,519 DEV : loss 0.15803837776184082 - f1-score (micro avg) 0.6323
+ 2023-10-15 00:39:56,553 saving best model
+ 2023-10-15 00:39:57,023 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:40:07,803 epoch 4 - iter 180/1809 - loss 0.03675662 - time (sec): 10.78 - samples/sec: 3513.50 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-15 00:40:18,830 epoch 4 - iter 360/1809 - loss 0.03782429 - time (sec): 21.80 - samples/sec: 3425.01 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-15 00:40:30,431 epoch 4 - iter 540/1809 - loss 0.04025441 - time (sec): 33.40 - samples/sec: 3433.46 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-15 00:40:41,758 epoch 4 - iter 720/1809 - loss 0.03972038 - time (sec): 44.73 - samples/sec: 3398.75 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-15 00:40:52,956 epoch 4 - iter 900/1809 - loss 0.04064138 - time (sec): 55.93 - samples/sec: 3390.54 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-15 00:41:04,080 epoch 4 - iter 1080/1809 - loss 0.04035289 - time (sec): 67.05 - samples/sec: 3388.44 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-15 00:41:15,319 epoch 4 - iter 1260/1809 - loss 0.04089726 - time (sec): 78.29 - samples/sec: 3388.14 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-15 00:41:26,299 epoch 4 - iter 1440/1809 - loss 0.04115836 - time (sec): 89.27 - samples/sec: 3390.11 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-15 00:41:37,035 epoch 4 - iter 1620/1809 - loss 0.04108914 - time (sec): 100.01 - samples/sec: 3402.00 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-15 00:41:47,994 epoch 4 - iter 1800/1809 - loss 0.04098637 - time (sec): 110.97 - samples/sec: 3408.61 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-15 00:41:48,505 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:41:48,506 EPOCH 4 done: loss 0.0412 - lr: 0.000020
+ 2023-10-15 00:41:55,067 DEV : loss 0.26651689410209656 - f1-score (micro avg) 0.6358
+ 2023-10-15 00:41:55,097 saving best model
+ 2023-10-15 00:41:55,617 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:42:06,584 epoch 5 - iter 180/1809 - loss 0.03222780 - time (sec): 10.96 - samples/sec: 3384.38 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-15 00:42:17,751 epoch 5 - iter 360/1809 - loss 0.02816632 - time (sec): 22.13 - samples/sec: 3432.68 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-15 00:42:28,823 epoch 5 - iter 540/1809 - loss 0.02611782 - time (sec): 33.20 - samples/sec: 3445.57 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-15 00:42:39,839 epoch 5 - iter 720/1809 - loss 0.02595299 - time (sec): 44.22 - samples/sec: 3431.75 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-15 00:42:51,400 epoch 5 - iter 900/1809 - loss 0.02604561 - time (sec): 55.78 - samples/sec: 3405.01 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-15 00:43:02,924 epoch 5 - iter 1080/1809 - loss 0.02815177 - time (sec): 67.30 - samples/sec: 3393.94 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-15 00:43:14,004 epoch 5 - iter 1260/1809 - loss 0.02856667 - time (sec): 78.38 - samples/sec: 3416.12 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-15 00:43:25,118 epoch 5 - iter 1440/1809 - loss 0.02810936 - time (sec): 89.50 - samples/sec: 3410.79 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-15 00:43:36,034 epoch 5 - iter 1620/1809 - loss 0.02857657 - time (sec): 100.41 - samples/sec: 3407.81 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-15 00:43:47,204 epoch 5 - iter 1800/1809 - loss 0.02823526 - time (sec): 111.58 - samples/sec: 3387.43 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-15 00:43:47,831 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:43:47,831 EPOCH 5 done: loss 0.0281 - lr: 0.000017
+ 2023-10-15 00:43:54,965 DEV : loss 0.29827266931533813 - f1-score (micro avg) 0.6512
+ 2023-10-15 00:43:55,016 saving best model
+ 2023-10-15 00:43:55,433 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:44:06,476 epoch 6 - iter 180/1809 - loss 0.02021871 - time (sec): 11.04 - samples/sec: 3386.99 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-15 00:44:17,804 epoch 6 - iter 360/1809 - loss 0.02162851 - time (sec): 22.37 - samples/sec: 3390.81 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-15 00:44:29,501 epoch 6 - iter 540/1809 - loss 0.02039145 - time (sec): 34.07 - samples/sec: 3338.77 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-15 00:44:40,865 epoch 6 - iter 720/1809 - loss 0.02010225 - time (sec): 45.43 - samples/sec: 3354.54 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-15 00:44:51,991 epoch 6 - iter 900/1809 - loss 0.01961907 - time (sec): 56.56 - samples/sec: 3351.96 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-15 00:45:03,033 epoch 6 - iter 1080/1809 - loss 0.02020129 - time (sec): 67.60 - samples/sec: 3361.18 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-15 00:45:13,853 epoch 6 - iter 1260/1809 - loss 0.02013371 - time (sec): 78.42 - samples/sec: 3371.39 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-15 00:45:24,859 epoch 6 - iter 1440/1809 - loss 0.01963702 - time (sec): 89.42 - samples/sec: 3378.59 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-15 00:45:35,850 epoch 6 - iter 1620/1809 - loss 0.01979049 - time (sec): 100.41 - samples/sec: 3391.33 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-15 00:45:46,868 epoch 6 - iter 1800/1809 - loss 0.02010880 - time (sec): 111.43 - samples/sec: 3393.44 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-15 00:45:47,411 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:45:47,412 EPOCH 6 done: loss 0.0201 - lr: 0.000013
+ 2023-10-15 00:45:54,097 DEV : loss 0.326910138130188 - f1-score (micro avg) 0.6521
+ 2023-10-15 00:45:54,136 saving best model
+ 2023-10-15 00:45:54,554 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:46:05,343 epoch 7 - iter 180/1809 - loss 0.01170096 - time (sec): 10.79 - samples/sec: 3427.57 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-15 00:46:16,264 epoch 7 - iter 360/1809 - loss 0.01374452 - time (sec): 21.71 - samples/sec: 3385.53 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-15 00:46:27,180 epoch 7 - iter 540/1809 - loss 0.01405741 - time (sec): 32.62 - samples/sec: 3422.57 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-15 00:46:38,027 epoch 7 - iter 720/1809 - loss 0.01354319 - time (sec): 43.47 - samples/sec: 3423.22 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-15 00:46:49,010 epoch 7 - iter 900/1809 - loss 0.01402736 - time (sec): 54.45 - samples/sec: 3432.65 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-15 00:46:59,959 epoch 7 - iter 1080/1809 - loss 0.01405428 - time (sec): 65.40 - samples/sec: 3441.56 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-15 00:47:11,928 epoch 7 - iter 1260/1809 - loss 0.01369459 - time (sec): 77.37 - samples/sec: 3413.33 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-15 00:47:23,332 epoch 7 - iter 1440/1809 - loss 0.01300601 - time (sec): 88.78 - samples/sec: 3393.45 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-15 00:47:34,578 epoch 7 - iter 1620/1809 - loss 0.01324269 - time (sec): 100.02 - samples/sec: 3412.43 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-15 00:47:45,549 epoch 7 - iter 1800/1809 - loss 0.01359864 - time (sec): 110.99 - samples/sec: 3407.83 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-15 00:47:46,058 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:47:46,058 EPOCH 7 done: loss 0.0136 - lr: 0.000010
+ 2023-10-15 00:47:51,722 DEV : loss 0.35920804738998413 - f1-score (micro avg) 0.6464
+ 2023-10-15 00:47:51,760 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:48:04,052 epoch 8 - iter 180/1809 - loss 0.00809107 - time (sec): 12.29 - samples/sec: 3016.59 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-15 00:48:15,154 epoch 8 - iter 360/1809 - loss 0.00997925 - time (sec): 23.39 - samples/sec: 3223.35 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-15 00:48:26,207 epoch 8 - iter 540/1809 - loss 0.00950308 - time (sec): 34.44 - samples/sec: 3264.79 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-15 00:48:37,453 epoch 8 - iter 720/1809 - loss 0.01017881 - time (sec): 45.69 - samples/sec: 3318.90 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-15 00:48:48,337 epoch 8 - iter 900/1809 - loss 0.01019771 - time (sec): 56.57 - samples/sec: 3331.55 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-15 00:48:59,241 epoch 8 - iter 1080/1809 - loss 0.00985664 - time (sec): 67.48 - samples/sec: 3352.87 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-15 00:49:10,421 epoch 8 - iter 1260/1809 - loss 0.01007201 - time (sec): 78.66 - samples/sec: 3348.79 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-15 00:49:21,548 epoch 8 - iter 1440/1809 - loss 0.01003360 - time (sec): 89.79 - samples/sec: 3367.68 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-15 00:49:32,423 epoch 8 - iter 1620/1809 - loss 0.00994408 - time (sec): 100.66 - samples/sec: 3376.73 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-15 00:49:43,635 epoch 8 - iter 1800/1809 - loss 0.00961262 - time (sec): 111.87 - samples/sec: 3377.27 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-15 00:49:44,234 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:49:44,235 EPOCH 8 done: loss 0.0096 - lr: 0.000007
+ 2023-10-15 00:49:49,924 DEV : loss 0.3747365176677704 - f1-score (micro avg) 0.6497
+ 2023-10-15 00:49:49,967 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:50:01,635 epoch 9 - iter 180/1809 - loss 0.00978635 - time (sec): 11.67 - samples/sec: 3241.63 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-15 00:50:13,437 epoch 9 - iter 360/1809 - loss 0.00794210 - time (sec): 23.47 - samples/sec: 3243.38 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-15 00:50:25,022 epoch 9 - iter 540/1809 - loss 0.00746988 - time (sec): 35.05 - samples/sec: 3250.84 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-15 00:50:35,745 epoch 9 - iter 720/1809 - loss 0.00720058 - time (sec): 45.78 - samples/sec: 3297.68 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-15 00:50:47,088 epoch 9 - iter 900/1809 - loss 0.00675225 - time (sec): 57.12 - samples/sec: 3316.81 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-15 00:50:59,223 epoch 9 - iter 1080/1809 - loss 0.00625992 - time (sec): 69.25 - samples/sec: 3281.87 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-15 00:51:10,232 epoch 9 - iter 1260/1809 - loss 0.00599461 - time (sec): 80.26 - samples/sec: 3295.86 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-15 00:51:21,369 epoch 9 - iter 1440/1809 - loss 0.00630740 - time (sec): 91.40 - samples/sec: 3316.86 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-15 00:51:32,748 epoch 9 - iter 1620/1809 - loss 0.00639991 - time (sec): 102.78 - samples/sec: 3316.07 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-15 00:51:43,587 epoch 9 - iter 1800/1809 - loss 0.00679754 - time (sec): 113.62 - samples/sec: 3327.65 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-15 00:51:44,107 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:51:44,107 EPOCH 9 done: loss 0.0068 - lr: 0.000003
+ 2023-10-15 00:51:49,804 DEV : loss 0.36713123321533203 - f1-score (micro avg) 0.6476
+ 2023-10-15 00:51:49,848 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:52:01,561 epoch 10 - iter 180/1809 - loss 0.00501353 - time (sec): 11.71 - samples/sec: 3294.40 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-15 00:52:12,931 epoch 10 - iter 360/1809 - loss 0.00564503 - time (sec): 23.08 - samples/sec: 3312.41 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-15 00:52:24,067 epoch 10 - iter 540/1809 - loss 0.00553930 - time (sec): 34.22 - samples/sec: 3314.09 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-15 00:52:35,168 epoch 10 - iter 720/1809 - loss 0.00474944 - time (sec): 45.32 - samples/sec: 3360.42 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-15 00:52:46,049 epoch 10 - iter 900/1809 - loss 0.00441452 - time (sec): 56.20 - samples/sec: 3374.49 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-15 00:52:56,767 epoch 10 - iter 1080/1809 - loss 0.00439160 - time (sec): 66.92 - samples/sec: 3382.57 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-15 00:53:08,047 epoch 10 - iter 1260/1809 - loss 0.00399218 - time (sec): 78.20 - samples/sec: 3396.59 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-15 00:53:18,818 epoch 10 - iter 1440/1809 - loss 0.00419350 - time (sec): 88.97 - samples/sec: 3407.77 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-15 00:53:29,758 epoch 10 - iter 1620/1809 - loss 0.00416085 - time (sec): 99.91 - samples/sec: 3411.06 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-15 00:53:41,863 epoch 10 - iter 1800/1809 - loss 0.00440667 - time (sec): 112.01 - samples/sec: 3371.59 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-15 00:53:42,494 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:53:42,494 EPOCH 10 done: loss 0.0044 - lr: 0.000000
+ 2023-10-15 00:53:48,144 DEV : loss 0.3939391076564789 - f1-score (micro avg) 0.6589
+ 2023-10-15 00:53:48,186 saving best model
+ 2023-10-15 00:53:49,083 ----------------------------------------------------------------------------------------------------
+ 2023-10-15 00:53:49,084 Loading model from best epoch ...
+ 2023-10-15 00:53:50,690 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+ 2023-10-15 00:53:58,307
+ Results:
+ - F-score (micro) 0.6652
+ - F-score (macro) 0.5525
+ - Accuracy 0.5115
+
+ By class:
+ precision recall f1-score support
+
+ loc 0.6331 0.8088 0.7103 591
+ pers 0.5735 0.7871 0.6635 357
+ org 0.2895 0.2785 0.2839 79
+
+ micro avg 0.5912 0.7605 0.6652 1027
+ macro avg 0.4987 0.6248 0.5525 1027
+ weighted avg 0.5859 0.7605 0.6612 1027
+
+ 2023-10-15 00:53:58,308 ----------------------------------------------------------------------------------------------------
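As a sanity check on the final results table, each reported f1-score is the harmonic mean of the precision and recall in the same row. A minimal sketch recomputing two of them from the values in the log above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro-averaged test scores from the end of training.log above.
print(round(f1(0.5912, 0.7605), 4))  # 0.6652, the reported micro F-score

# Per-class check, e.g. the pers row.
print(round(f1(0.5735, 0.7871), 4))  # 0.6635, as reported
```

Small last-digit discrepancies are possible for other rows, since the log rounds precision and recall to four decimals before we recombine them.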