2024-08-06 03:39:40,339 INFO [trainer.py:870] (7/8) Training started
2024-08-06 03:39:40,340 INFO [trainer.py:889] (7/8) Device: cuda:7
2024-08-06 03:39:40,340 INFO [trainer.py:890] (7/8) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '279b0c87015a615b81b147251814d737a548f397', 'k2-git-date': 'Wed May 24 22:24:09 2023', 'lhotse-version': '1.26.0', 'torch-version': '2.0.1+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.10', 'icefall-git-branch': 'main', 'icefall-git-sha1': '7d2e5f4-dirty', 'icefall-git-date': 'Tue Aug 6 02:59:12 2024', 'icefall-path': '/workspace/icefall_llm', 'k2-path': '/usr/local/lib/python3.10/dist-packages/k2/__init__.py', 'lhotse-path': '/usr/local/lib/python3.10/dist-packages/lhotse/__init__.py', 'hostname': '6865771', 'IP address': '0.104.195.107'}, 'world_size': 8, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.03, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 1000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'bfloat16', 'filter_min_duration': 0.5, 'filter_max_duration': 14.0, 'train_stage': 1, 'visualize': False, 'oom_check': False, 'model_name': 'valle', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 1, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False, 'manifest_dir': PosixPath('data/tokenized'), 'max_duration': 320, 'bucketing_sampler': True, 'num_buckets': 6, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 0.1, 'on_the_fly_feats': False, 'shuffle': True, 'buffer_size': 40000, 'shuffle_buffer_size': 100000, 'drop_last': False, 'return_cuts': True, 'num_workers': 8, 'enable_spec_aug': False, 'spec_aug_time_warp_factor': 80, 'input_strategy': 'PrecomputedFeatures', 'dataset': 'libritts', 'text_tokens': 'data/tokenized/unique_text_tokens.k2symbols', 'sampling_rate': 24000}
2024-08-06 03:39:40,341 INFO [trainer.py:892] (7/8) About to create model
2024-08-06 03:39:41,122 INFO [trainer.py:899] (7/8) Number of model parameters: 367386628
2024-08-06 03:39:41,926 INFO [trainer.py:914] (7/8) Using DDP
2024-08-06 03:39:43,998 INFO [datamodule.py:427] (7/8) About to get train cuts
2024-08-06 03:39:44,000 INFO [datamodule.py:434] (7/8) About to get dev cuts
2024-08-06 03:39:44,001 INFO [datamodule.py:292] (7/8) Disable SpecAugment
2024-08-06 03:39:44,002 INFO [datamodule.py:294] (7/8) About to create train dataset
2024-08-06 03:39:44,002 INFO [datamodule.py:323] (7/8) Using DynamicBucketingSampler
2024-08-06 03:39:44,610 INFO [datamodule.py:344] (7/8) About to create train dataloader
2024-08-06 03:39:44,611 INFO [datamodule.py:367] (7/8) About to create dev dataset
2024-08-06 03:39:44,941 INFO [datamodule.py:388] (7/8) About to create dev dataloader
2024-08-06 03:40:39,571 INFO [trainer.py:765] (7/8) Epoch 1, batch 100, train_loss[loss=4.129, ArTop10Accuracy=0.5083, over 14809.00 frames. ], tot_loss[loss=4.784, ArTop10Accuracy=0.3964, over 4792.93 frames. ], batch size: 61, lr: 2.25e-02
2024-08-06 03:41:16,923 INFO [trainer.py:765] (7/8) Epoch 1, batch 200, train_loss[loss=3.934, ArTop10Accuracy=0.5439, over 13819.00 frames. ], tot_loss[loss=4.308, ArTop10Accuracy=0.4752, over 7794.05 frames. ], batch size: 34, lr: 3.00e-02
2024-08-06 03:41:57,951 INFO [trainer.py:765] (7/8) Epoch 1, batch 300, train_loss[loss=3.846, ArTop10Accuracy=0.5427, over 14222.00 frames. ], tot_loss[loss=4.09, ArTop10Accuracy=0.5104, over 9413.19 frames. ], batch size: 44, lr: 3.00e-02
2024-08-06 03:42:33,081 INFO [trainer.py:765] (7/8) Epoch 1, batch 400, train_loss[loss=3.626, ArTop10Accuracy=0.5841, over 11074.00 frames. ], tot_loss[loss=3.945, ArTop10Accuracy=0.5338, over 10341.79 frames. ], batch size: 15, lr: 3.00e-02
2024-08-06 03:43:11,270 INFO [trainer.py:765] (7/8) Epoch 1, batch 500, train_loss[loss=3.737, ArTop10Accuracy=0.5661, over 12277.00 frames. ], tot_loss[loss=3.829, ArTop10Accuracy=0.5533, over 10907.39 frames. ], batch size: 22, lr: 2.99e-02
2024-08-06 03:43:46,592 INFO [trainer.py:765] (7/8) Epoch 1, batch 600, train_loss[loss=3.518, ArTop10Accuracy=0.6106, over 11558.00 frames. ], tot_loss[loss=3.75, ArTop10Accuracy=0.5673, over 11441.86 frames. ], batch size: 18, lr: 2.99e-02
2024-08-06 03:44:27,900 INFO [trainer.py:765] (7/8) Epoch 1, batch 700, train_loss[loss=3.401, ArTop10Accuracy=0.6251, over 10127.00 frames. ], tot_loss[loss=3.686, ArTop10Accuracy=0.5785, over 11584.80 frames. ], batch size: 12, lr: 2.99e-02
2024-08-06 03:45:01,513 INFO [trainer.py:765] (7/8) Epoch 1, batch 800, train_loss[loss=3.423, ArTop10Accuracy=0.625, over 10089.00 frames. ], tot_loss[loss=3.634, ArTop10Accuracy=0.5881, over 11711.73 frames. ], batch size: 12, lr: 2.98e-02
2024-08-06 03:45:32,557 INFO [trainer.py:765] (7/8) Epoch 1, batch 900, train_loss[loss=3.579, ArTop10Accuracy=0.5984, over 12938.00 frames. ], tot_loss[loss=3.586, ArTop10Accuracy=0.5969, over 11738.21 frames. ], batch size: 27, lr: 2.98e-02
2024-08-06 03:46:03,648 INFO [trainer.py:765] (7/8) Epoch 1, batch 1000, train_loss[loss=3.551, ArTop10Accuracy=0.6069, over 12928.00 frames. ], tot_loss[loss=3.554, ArTop10Accuracy=0.603, over 11938.35 frames. ], batch size: 27, lr: 2.97e-02
2024-08-06 03:46:07,988 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 8.169e+01 1.565e+02 2.239e+02 3.485e+02 9.105e+03, threshold=4.478e+02, percent-clipped=0.0
2024-08-06 03:46:38,611 INFO [trainer.py:765] (7/8) Epoch 1, batch 1100, train_loss[loss=3.42, ArTop10Accuracy=0.623, over 13609.00 frames. ], tot_loss[loss=3.527, ArTop10Accuracy=0.6078, over 12007.78 frames. ], batch size: 34, lr: 2.96e-02
2024-08-06 03:47:08,744 INFO [trainer.py:765] (7/8) Epoch 1, batch 1200, train_loss[loss=3.49, ArTop10Accuracy=0.6182, over 12024.00 frames. ], tot_loss[loss=3.504, ArTop10Accuracy=0.6121, over 11959.08 frames. ], batch size: 97, lr: 2.96e-02
2024-08-06 03:47:33,901 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 03:48:38,675 INFO [trainer.py:765] (7/8) Epoch 2, batch 100, train_loss[loss=3.47, ArTop10Accuracy=0.6212, over 14221.00 frames. ], tot_loss[loss=3.466, ArTop10Accuracy=0.6196, over 4767.00 frames. ], batch size: 61, lr: 2.90e-02
2024-08-06 03:49:14,596 INFO [trainer.py:765] (7/8) Epoch 2, batch 200, train_loss[loss=3.427, ArTop10Accuracy=0.6234, over 13745.00 frames. ], tot_loss[loss=3.439, ArTop10Accuracy=0.6243, over 7776.46 frames. ], batch size: 34, lr: 2.89e-02
2024-08-06 03:49:56,519 INFO [trainer.py:765] (7/8) Epoch 2, batch 300, train_loss[loss=3.483, ArTop10Accuracy=0.6213, over 14230.00 frames. ], tot_loss[loss=3.416, ArTop10Accuracy=0.6294, over 9397.98 frames. ], batch size: 44, lr: 2.89e-02
2024-08-06 03:50:31,999 INFO [trainer.py:765] (7/8) Epoch 2, batch 400, train_loss[loss=3.402, ArTop10Accuracy=0.6343, over 10467.00 frames. ], tot_loss[loss=3.41, ArTop10Accuracy=0.6307, over 10298.79 frames. ], batch size: 14, lr: 2.88e-02
2024-08-06 03:51:17,109 INFO [trainer.py:765] (7/8) Epoch 2, batch 500, train_loss[loss=3.401, ArTop10Accuracy=0.6375, over 12143.00 frames. ], tot_loss[loss=3.404, ArTop10Accuracy=0.632, over 10872.04 frames. ], batch size: 22, lr: 2.87e-02
2024-08-06 03:51:53,203 INFO [trainer.py:765] (7/8) Epoch 2, batch 600, train_loss[loss=3.297, ArTop10Accuracy=0.6514, over 11641.00 frames. ], tot_loss[loss=3.401, ArTop10Accuracy=0.6323, over 11425.09 frames. ], batch size: 18, lr: 2.86e-02
2024-08-06 03:52:38,994 INFO [trainer.py:765] (7/8) Epoch 2, batch 700, train_loss[loss=3.344, ArTop10Accuracy=0.6342, over 9135.00 frames. ], tot_loss[loss=3.394, ArTop10Accuracy=0.6336, over 11561.78 frames. ], batch size: 11, lr: 2.85e-02
2024-08-06 03:52:47,091 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 03:52:56,023 INFO [trainer.py:811] (7/8) Epoch 2, validation: loss=3.327, ArTop10Accuracy=0.6492, over 1829298.00 frames.
2024-08-06 03:52:56,024 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 28662MB
2024-08-06 03:52:56,542 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 8.181e+01 1.431e+02 1.849e+02 2.730e+02 2.344e+03, threshold=3.697e+02, percent-clipped=7.2
2024-08-06 03:53:21,882 INFO [trainer.py:765] (7/8) Epoch 2, batch 800, train_loss[loss=3.318, ArTop10Accuracy=0.6498, over 9988.00 frames. ], tot_loss[loss=3.387, ArTop10Accuracy=0.635, over 11683.76 frames. ], batch size: 12, lr: 2.84e-02
2024-08-06 03:53:53,300 INFO [trainer.py:765] (7/8) Epoch 2, batch 900, train_loss[loss=3.401, ArTop10Accuracy=0.63, over 12975.00 frames. ], tot_loss[loss=3.367, ArTop10Accuracy=0.6386, over 11740.71 frames. ], batch size: 27, lr: 2.83e-02
2024-08-06 03:54:24,809 INFO [trainer.py:765] (7/8) Epoch 2, batch 1000, train_loss[loss=3.42, ArTop10Accuracy=0.635, over 12892.00 frames. ], tot_loss[loss=3.367, ArTop10Accuracy=0.6392, over 11931.42 frames. ], batch size: 27, lr: 2.82e-02
2024-08-06 03:54:56,007 INFO [trainer.py:765] (7/8) Epoch 2, batch 1100, train_loss[loss=3.335, ArTop10Accuracy=0.6454, over 13818.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.64, over 12003.56 frames. ], batch size: 34, lr: 2.81e-02
2024-08-06 03:55:26,229 INFO [trainer.py:765] (7/8) Epoch 2, batch 1200, train_loss[loss=3.372, ArTop10Accuracy=0.6381, over 11813.00 frames. ], tot_loss[loss=3.356, ArTop10Accuracy=0.641, over 11933.31 frames. ], batch size: 97, lr: 2.80e-02
2024-08-06 03:55:51,139 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 03:57:04,102 INFO [trainer.py:765] (7/8) Epoch 3, batch 100, train_loss[loss=3.371, ArTop10Accuracy=0.6374, over 14601.00 frames. ], tot_loss[loss=3.314, ArTop10Accuracy=0.6503, over 4779.56 frames. ], batch size: 62, lr: 2.67e-02
2024-08-06 03:57:50,980 INFO [trainer.py:765] (7/8) Epoch 3, batch 200, train_loss[loss=3.284, ArTop10Accuracy=0.6563, over 13693.00 frames. ], tot_loss[loss=3.296, ArTop10Accuracy=0.6538, over 7785.32 frames. ], batch size: 34, lr: 2.66e-02
2024-08-06 03:58:26,075 INFO [trainer.py:765] (7/8) Epoch 3, batch 300, train_loss[loss=3.175, ArTop10Accuracy=0.6722, over 14066.00 frames. ], tot_loss[loss=3.281, ArTop10Accuracy=0.6564, over 9410.41 frames. ], batch size: 44, lr: 2.64e-02
2024-08-06 03:59:11,254 INFO [trainer.py:765] (7/8) Epoch 3, batch 400, train_loss[loss=3.326, ArTop10Accuracy=0.6447, over 10334.00 frames. ], tot_loss[loss=3.27, ArTop10Accuracy=0.6584, over 10329.08 frames. ], batch size: 14, lr: 2.63e-02
2024-08-06 03:59:29,675 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 8.720e+01 1.461e+02 1.775e+02 2.344e+02 9.150e+02, threshold=3.550e+02, percent-clipped=5.2
2024-08-06 03:59:49,304 INFO [trainer.py:765] (7/8) Epoch 3, batch 500, train_loss[loss=3.121, ArTop10Accuracy=0.682, over 12193.00 frames. ], tot_loss[loss=3.255, ArTop10Accuracy=0.6604, over 10904.15 frames. ], batch size: 22, lr: 2.62e-02
2024-08-06 04:00:35,096 INFO [trainer.py:765] (7/8) Epoch 3, batch 600, train_loss[loss=3.192, ArTop10Accuracy=0.6709, over 11744.00 frames. ], tot_loss[loss=3.241, ArTop10Accuracy=0.6632, over 11426.58 frames. ], batch size: 18, lr: 2.61e-02
2024-08-06 04:01:22,059 INFO [trainer.py:765] (7/8) Epoch 3, batch 700, train_loss[loss=2.996, ArTop10Accuracy=0.7045, over 10156.00 frames. ], tot_loss[loss=3.238, ArTop10Accuracy=0.664, over 11562.24 frames. ], batch size: 12, lr: 2.60e-02
2024-08-06 04:01:56,269 INFO [trainer.py:765] (7/8) Epoch 3, batch 800, train_loss[loss=3.181, ArTop10Accuracy=0.6888, over 9863.00 frames. ], tot_loss[loss=3.232, ArTop10Accuracy=0.6653, over 11678.21 frames. ], batch size: 12, lr: 2.59e-02
2024-08-06 04:02:27,740 INFO [trainer.py:765] (7/8) Epoch 3, batch 900, train_loss[loss=3.18, ArTop10Accuracy=0.6827, over 12994.00 frames. ], tot_loss[loss=3.212, ArTop10Accuracy=0.6696, over 11732.46 frames. ], batch size: 27, lr: 2.57e-02
2024-08-06 04:02:59,284 INFO [trainer.py:765] (7/8) Epoch 3, batch 1000, train_loss[loss=3.077, ArTop10Accuracy=0.7003, over 13318.00 frames. ], tot_loss[loss=3.197, ArTop10Accuracy=0.6723, over 11939.63 frames. ], batch size: 28, lr: 2.56e-02
2024-08-06 04:03:30,942 INFO [trainer.py:765] (7/8) Epoch 3, batch 1100, train_loss[loss=3.118, ArTop10Accuracy=0.6931, over 13902.00 frames. ], tot_loss[loss=3.195, ArTop10Accuracy=0.6724, over 12001.98 frames. ], batch size: 34, lr: 2.55e-02
2024-08-06 04:04:01,314 INFO [trainer.py:765] (7/8) Epoch 3, batch 1200, train_loss[loss=3.266, ArTop10Accuracy=0.6626, over 12006.00 frames. ], tot_loss[loss=3.18, ArTop10Accuracy=0.6756, over 11951.26 frames. ], batch size: 98, lr: 2.54e-02
2024-08-06 04:04:26,531 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:05:43,369 INFO [trainer.py:765] (7/8) Epoch 4, batch 100, train_loss[loss=3.199, ArTop10Accuracy=0.6694, over 14536.00 frames. ], tot_loss[loss=3.132, ArTop10Accuracy=0.6861, over 4773.15 frames. ], batch size: 61, lr: 2.38e-02
2024-08-06 04:06:07,077 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 04:06:16,404 INFO [trainer.py:811] (7/8) Epoch 4, validation: loss=3.063, ArTop10Accuracy=0.7031, over 1829298.00 frames.
2024-08-06 04:06:16,405 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 29190MB
2024-08-06 04:06:16,746 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.091e+02 1.493e+02 1.709e+02 2.068e+02 7.969e+02, threshold=3.418e+02, percent-clipped=2.9
2024-08-06 04:06:31,827 INFO [trainer.py:765] (7/8) Epoch 4, batch 200, train_loss[loss=3.208, ArTop10Accuracy=0.6659, over 13285.00 frames. ], tot_loss[loss=3.122, ArTop10Accuracy=0.6878, over 7793.19 frames. ], batch size: 33, lr: 2.37e-02
2024-08-06 04:07:18,544 INFO [trainer.py:765] (7/8) Epoch 4, batch 300, train_loss[loss=3.184, ArTop10Accuracy=0.6809, over 14081.00 frames. ], tot_loss[loss=3.12, ArTop10Accuracy=0.6883, over 9423.25 frames. ], batch size: 44, lr: 2.36e-02
2024-08-06 04:08:01,911 INFO [trainer.py:765] (7/8) Epoch 4, batch 400, train_loss[loss=3.047, ArTop10Accuracy=0.7063, over 10851.00 frames. ], tot_loss[loss=3.114, ArTop10Accuracy=0.6892, over 10352.23 frames. ], batch size: 15, lr: 2.34e-02
2024-08-06 04:08:45,345 INFO [trainer.py:765] (7/8) Epoch 4, batch 500, train_loss[loss=3.135, ArTop10Accuracy=0.691, over 12305.00 frames. ], tot_loss[loss=3.107, ArTop10Accuracy=0.6901, over 10907.45 frames. ], batch size: 22, lr: 2.33e-02
2024-08-06 04:09:37,072 INFO [trainer.py:765] (7/8) Epoch 4, batch 600, train_loss[loss=3.077, ArTop10Accuracy=0.6936, over 11653.00 frames. ], tot_loss[loss=3.108, ArTop10Accuracy=0.6902, over 11427.22 frames. ], batch size: 18, lr: 2.32e-02
2024-08-06 04:10:13,502 INFO [trainer.py:765] (7/8) Epoch 4, batch 700, train_loss[loss=2.914, ArTop10Accuracy=0.7315, over 10206.00 frames. ], tot_loss[loss=3.109, ArTop10Accuracy=0.69, over 11569.47 frames. ], batch size: 12, lr: 2.31e-02
2024-08-06 04:10:51,960 INFO [trainer.py:765] (7/8) Epoch 4, batch 800, train_loss[loss=3.06, ArTop10Accuracy=0.7008, over 10219.00 frames. ], tot_loss[loss=3.11, ArTop10Accuracy=0.6898, over 11675.66 frames. ], batch size: 12, lr: 2.30e-02
2024-08-06 04:11:23,334 INFO [trainer.py:765] (7/8) Epoch 4, batch 900, train_loss[loss=3.236, ArTop10Accuracy=0.6636, over 12924.00 frames. ], tot_loss[loss=3.1, ArTop10Accuracy=0.6916, over 11746.39 frames. ], batch size: 27, lr: 2.29e-02
2024-08-06 04:11:54,827 INFO [trainer.py:765] (7/8) Epoch 4, batch 1000, train_loss[loss=3.025, ArTop10Accuracy=0.7075, over 13048.00 frames. ], tot_loss[loss=3.105, ArTop10Accuracy=0.6909, over 11936.66 frames. ], batch size: 27, lr: 2.28e-02
2024-08-06 04:12:25,961 INFO [trainer.py:765] (7/8) Epoch 4, batch 1100, train_loss[loss=3.122, ArTop10Accuracy=0.6878, over 13697.00 frames. ], tot_loss[loss=3.105, ArTop10Accuracy=0.6906, over 11990.53 frames. ], batch size: 34, lr: 2.26e-02
2024-08-06 04:12:48,545 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.106e+02 1.440e+02 1.608e+02 1.893e+02 7.925e+02, threshold=3.216e+02, percent-clipped=2.0
2024-08-06 04:12:58,828 INFO [trainer.py:765] (7/8) Epoch 4, batch 1200, train_loss[loss=3.22, ArTop10Accuracy=0.6702, over 11768.00 frames. ], tot_loss[loss=3.1, ArTop10Accuracy=0.6917, over 11957.42 frames. ], batch size: 98, lr: 2.25e-02
2024-08-06 04:13:23,901 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:14:38,685 INFO [trainer.py:765] (7/8) Epoch 5, batch 100, train_loss[loss=3.133, ArTop10Accuracy=0.686, over 14554.00 frames. ], tot_loss[loss=3.073, ArTop10Accuracy=0.697, over 4789.25 frames. ], batch size: 61, lr: 2.10e-02
2024-08-06 04:15:26,826 INFO [trainer.py:765] (7/8) Epoch 5, batch 200, train_loss[loss=3.021, ArTop10Accuracy=0.7045, over 13785.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7005, over 7796.18 frames. ], batch size: 34, lr: 2.09e-02
2024-08-06 04:16:08,011 INFO [trainer.py:765] (7/8) Epoch 5, batch 300, train_loss[loss=3.121, ArTop10Accuracy=0.6867, over 14160.00 frames. ], tot_loss[loss=3.054, ArTop10Accuracy=0.701, over 9403.92 frames. ], batch size: 44, lr: 2.08e-02
2024-08-06 04:16:53,134 INFO [trainer.py:765] (7/8) Epoch 5, batch 400, train_loss[loss=3.106, ArTop10Accuracy=0.6987, over 10195.00 frames. ], tot_loss[loss=3.055, ArTop10Accuracy=0.7011, over 10331.16 frames. ], batch size: 14, lr: 2.07e-02
2024-08-06 04:17:36,638 INFO [trainer.py:765] (7/8) Epoch 5, batch 500, train_loss[loss=3.018, ArTop10Accuracy=0.7048, over 12351.00 frames. ], tot_loss[loss=3.053, ArTop10Accuracy=0.7015, over 10902.94 frames. ], batch size: 22, lr: 2.06e-02
2024-08-06 04:18:22,114 INFO [trainer.py:765] (7/8) Epoch 5, batch 600, train_loss[loss=2.915, ArTop10Accuracy=0.7154, over 11679.00 frames. ], tot_loss[loss=3.051, ArTop10Accuracy=0.7017, over 11434.96 frames. ], batch size: 18, lr: 2.05e-02
2024-08-06 04:19:17,033 INFO [trainer.py:765] (7/8) Epoch 5, batch 700, train_loss[loss=2.761, ArTop10Accuracy=0.7566, over 10083.00 frames. ], tot_loss[loss=3.056, ArTop10Accuracy=0.7005, over 11586.81 frames. ], batch size: 12, lr: 2.04e-02
2024-08-06 04:19:51,067 INFO [trainer.py:765] (7/8) Epoch 5, batch 800, train_loss[loss=3.05, ArTop10Accuracy=0.707, over 9901.00 frames. ], tot_loss[loss=3.061, ArTop10Accuracy=0.6994, over 11699.42 frames. ], batch size: 12, lr: 2.03e-02
2024-08-06 04:20:18,215 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 04:20:27,476 INFO [trainer.py:811] (7/8) Epoch 5, validation: loss=2.998, ArTop10Accuracy=0.7157, over 1829298.00 frames.
2024-08-06 04:20:27,476 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 32945MB
2024-08-06 04:20:27,781 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.057e+02 1.385e+02 1.542e+02 1.759e+02 7.741e+02, threshold=3.083e+02, percent-clipped=0.7
2024-08-06 04:20:31,767 INFO [trainer.py:765] (7/8) Epoch 5, batch 900, train_loss[loss=3.11, ArTop10Accuracy=0.6938, over 13011.00 frames. ], tot_loss[loss=3.06, ArTop10Accuracy=0.6998, over 11721.33 frames. ], batch size: 27, lr: 2.02e-02
2024-08-06 04:21:03,306 INFO [trainer.py:765] (7/8) Epoch 5, batch 1000, train_loss[loss=3.075, ArTop10Accuracy=0.7002, over 12775.00 frames. ], tot_loss[loss=3.061, ArTop10Accuracy=0.6997, over 11923.59 frames. ], batch size: 27, lr: 2.01e-02
2024-08-06 04:21:34,453 INFO [trainer.py:765] (7/8) Epoch 5, batch 1100, train_loss[loss=3.032, ArTop10Accuracy=0.7089, over 13742.00 frames. ], tot_loss[loss=3.059, ArTop10Accuracy=0.7, over 11985.07 frames. ], batch size: 34, lr: 2.00e-02
2024-08-06 04:22:04,752 INFO [trainer.py:765] (7/8) Epoch 5, batch 1200, train_loss[loss=3.175, ArTop10Accuracy=0.6769, over 12739.00 frames. ], tot_loss[loss=3.058, ArTop10Accuracy=0.7004, over 11931.24 frames. ], batch size: 97, lr: 1.99e-02
2024-08-06 04:22:30,466 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:23:46,282 INFO [trainer.py:765] (7/8) Epoch 6, batch 100, train_loss[loss=3.137, ArTop10Accuracy=0.6858, over 14525.00 frames. ], tot_loss[loss=3.028, ArTop10Accuracy=0.7069, over 4764.51 frames. ], batch size: 62, lr: 1.85e-02
2024-08-06 04:24:35,255 INFO [trainer.py:765] (7/8) Epoch 6, batch 200, train_loss[loss=3.05, ArTop10Accuracy=0.7099, over 13371.00 frames. ], tot_loss[loss=3.019, ArTop10Accuracy=0.709, over 7788.62 frames. ], batch size: 34, lr: 1.84e-02
2024-08-06 04:25:16,676 INFO [trainer.py:765] (7/8) Epoch 6, batch 300, train_loss[loss=3.058, ArTop10Accuracy=0.7028, over 14106.00 frames. ], tot_loss[loss=3.013, ArTop10Accuracy=0.7099, over 9415.75 frames. ], batch size: 44, lr: 1.83e-02
2024-08-06 04:26:08,924 INFO [trainer.py:765] (7/8) Epoch 6, batch 400, train_loss[loss=3.02, ArTop10Accuracy=0.7105, over 10473.00 frames. ], tot_loss[loss=3.011, ArTop10Accuracy=0.7102, over 10331.60 frames. ], batch size: 14, lr: 1.83e-02
2024-08-06 04:26:51,485 INFO [trainer.py:765] (7/8) Epoch 6, batch 500, train_loss[loss=2.853, ArTop10Accuracy=0.7278, over 12190.00 frames. ], tot_loss[loss=3.009, ArTop10Accuracy=0.7097, over 10890.24 frames. ], batch size: 22, lr: 1.82e-02
2024-08-06 04:27:39,298 INFO [trainer.py:765] (7/8) Epoch 6, batch 600, train_loss[loss=3.056, ArTop10Accuracy=0.7117, over 11631.00 frames. ], tot_loss[loss=3.011, ArTop10Accuracy=0.709, over 11412.90 frames. ], batch size: 18, lr: 1.81e-02
2024-08-06 04:27:46,369 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.054e+02 1.343e+02 1.474e+02 1.660e+02 8.574e+02, threshold=2.947e+02, percent-clipped=0.6
2024-08-06 04:28:33,240 INFO [trainer.py:765] (7/8) Epoch 6, batch 700, train_loss[loss=2.941, ArTop10Accuracy=0.7306, over 10133.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7074, over 11568.97 frames. ], batch size: 12, lr: 1.80e-02
2024-08-06 04:29:11,216 INFO [trainer.py:765] (7/8) Epoch 6, batch 800, train_loss[loss=2.968, ArTop10Accuracy=0.7185, over 10059.00 frames. ], tot_loss[loss=3.025, ArTop10Accuracy=0.7067, over 11665.19 frames. ], batch size: 12, lr: 1.79e-02
2024-08-06 04:29:42,751 INFO [trainer.py:765] (7/8) Epoch 6, batch 900, train_loss[loss=3.023, ArTop10Accuracy=0.7067, over 13535.00 frames. ], tot_loss[loss=3.024, ArTop10Accuracy=0.7071, over 11723.71 frames. ], batch size: 28, lr: 1.78e-02
2024-08-06 04:30:14,306 INFO [trainer.py:765] (7/8) Epoch 6, batch 1000, train_loss[loss=2.997, ArTop10Accuracy=0.7111, over 12926.00 frames. ], tot_loss[loss=3.023, ArTop10Accuracy=0.7072, over 11907.02 frames. ], batch size: 27, lr: 1.77e-02
2024-08-06 04:30:45,384 INFO [trainer.py:765] (7/8) Epoch 6, batch 1100, train_loss[loss=3.086, ArTop10Accuracy=0.6944, over 13694.00 frames. ], tot_loss[loss=3.021, ArTop10Accuracy=0.7074, over 11972.20 frames. ], batch size: 34, lr: 1.77e-02
2024-08-06 04:31:15,673 INFO [trainer.py:765] (7/8) Epoch 6, batch 1200, train_loss[loss=3.2, ArTop10Accuracy=0.6732, over 12349.00 frames. ], tot_loss[loss=3.023, ArTop10Accuracy=0.7069, over 11929.01 frames. ], batch size: 97, lr: 1.76e-02
2024-08-06 04:31:40,504 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:32:52,405 INFO [trainer.py:765] (7/8) Epoch 7, batch 100, train_loss[loss=3.005, ArTop10Accuracy=0.7082, over 14471.00 frames. ], tot_loss[loss=2.997, ArTop10Accuracy=0.7128, over 4777.01 frames. ], batch size: 61, lr: 1.64e-02
2024-08-06 04:33:38,223 INFO [trainer.py:765] (7/8) Epoch 7, batch 200, train_loss[loss=2.982, ArTop10Accuracy=0.7117, over 13792.00 frames. ], tot_loss[loss=2.994, ArTop10Accuracy=0.7141, over 7788.20 frames. ], batch size: 34, lr: 1.64e-02
2024-08-06 04:34:22,609 INFO [trainer.py:765] (7/8) Epoch 7, batch 300, train_loss[loss=3.132, ArTop10Accuracy=0.6902, over 14476.00 frames. ], tot_loss[loss=2.99, ArTop10Accuracy=0.7145, over 9408.60 frames. ], batch size: 44, lr: 1.63e-02
2024-08-06 04:34:36,848 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 04:34:45,809 INFO [trainer.py:811] (7/8) Epoch 7, validation: loss=2.963, ArTop10Accuracy=0.7233, over 1829298.00 frames.
2024-08-06 04:34:45,810 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 32945MB
2024-08-06 04:34:46,125 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.009e+02 1.306e+02 1.435e+02 1.599e+02 8.689e+02, threshold=2.871e+02, percent-clipped=0.9
2024-08-06 04:35:17,146 INFO [trainer.py:765] (7/8) Epoch 7, batch 400, train_loss[loss=2.922, ArTop10Accuracy=0.7247, over 10307.00 frames. ], tot_loss[loss=2.988, ArTop10Accuracy=0.7148, over 10321.44 frames. ], batch size: 14, lr: 1.62e-02
2024-08-06 04:36:01,711 INFO [trainer.py:765] (7/8) Epoch 7, batch 500, train_loss[loss=3.02, ArTop10Accuracy=0.706, over 12451.00 frames. ], tot_loss[loss=2.987, ArTop10Accuracy=0.7146, over 10901.64 frames. ], batch size: 22, lr: 1.61e-02
2024-08-06 04:36:48,811 INFO [trainer.py:765] (7/8) Epoch 7, batch 600, train_loss[loss=2.976, ArTop10Accuracy=0.7147, over 11378.00 frames. ], tot_loss[loss=2.992, ArTop10Accuracy=0.7136, over 11426.53 frames. ], batch size: 18, lr: 1.61e-02
2024-08-06 04:37:34,800 INFO [trainer.py:765] (7/8) Epoch 7, batch 700, train_loss[loss=2.91, ArTop10Accuracy=0.7316, over 9960.00 frames. ], tot_loss[loss=2.994, ArTop10Accuracy=0.7128, over 11559.44 frames. ], batch size: 12, lr: 1.60e-02
2024-08-06 04:38:13,614 INFO [trainer.py:765] (7/8) Epoch 7, batch 800, train_loss[loss=2.98, ArTop10Accuracy=0.7089, over 9939.00 frames. ], tot_loss[loss=2.998, ArTop10Accuracy=0.712, over 11680.97 frames. ], batch size: 12, lr: 1.59e-02
2024-08-06 04:38:45,111 INFO [trainer.py:765] (7/8) Epoch 7, batch 900, train_loss[loss=2.98, ArTop10Accuracy=0.7173, over 13145.00 frames. ], tot_loss[loss=2.989, ArTop10Accuracy=0.7137, over 11735.27 frames. ], batch size: 27, lr: 1.59e-02
2024-08-06 04:39:16,576 INFO [trainer.py:765] (7/8) Epoch 7, batch 1000, train_loss[loss=3.062, ArTop10Accuracy=0.697, over 12817.00 frames. ], tot_loss[loss=2.988, ArTop10Accuracy=0.7139, over 11930.57 frames. ], batch size: 27, lr: 1.58e-02
2024-08-06 04:39:47,572 INFO [trainer.py:765] (7/8) Epoch 7, batch 1100, train_loss[loss=2.984, ArTop10Accuracy=0.7187, over 13764.00 frames. ], tot_loss[loss=2.998, ArTop10Accuracy=0.7119, over 11985.69 frames. ], batch size: 34, lr: 1.57e-02
2024-08-06 04:40:17,990 INFO [trainer.py:765] (7/8) Epoch 7, batch 1200, train_loss[loss=3.153, ArTop10Accuracy=0.6813, over 12201.00 frames. ], tot_loss[loss=2.997, ArTop10Accuracy=0.7123, over 11940.08 frames. ], batch size: 97, lr: 1.57e-02
2024-08-06 04:40:43,324 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:41:37,492 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 9.816e+01 1.295e+02 1.411e+02 1.574e+02 4.953e+02, threshold=2.821e+02, percent-clipped=1.1
2024-08-06 04:41:58,371 INFO [trainer.py:765] (7/8) Epoch 8, batch 100, train_loss[loss=2.961, ArTop10Accuracy=0.7196, over 14413.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.7179, over 4806.07 frames. ], batch size: 61, lr: 1.47e-02
2024-08-06 04:42:44,986 INFO [trainer.py:765] (7/8) Epoch 8, batch 200, train_loss[loss=3.018, ArTop10Accuracy=0.7182, over 13428.00 frames. ], tot_loss[loss=2.967, ArTop10Accuracy=0.7192, over 7816.68 frames. ], batch size: 34, lr: 1.46e-02
2024-08-06 04:43:28,045 INFO [trainer.py:765] (7/8) Epoch 8, batch 300, train_loss[loss=3.064, ArTop10Accuracy=0.6965, over 14189.00 frames. ], tot_loss[loss=2.957, ArTop10Accuracy=0.721, over 9440.02 frames. ], batch size: 44, lr: 1.46e-02
2024-08-06 04:44:14,462 INFO [trainer.py:765] (7/8) Epoch 8, batch 400, train_loss[loss=2.895, ArTop10Accuracy=0.7349, over 10381.00 frames. ], tot_loss[loss=2.958, ArTop10Accuracy=0.7208, over 10338.67 frames. ], batch size: 14, lr: 1.45e-02
2024-08-06 04:45:00,692 INFO [trainer.py:765] (7/8) Epoch 8, batch 500, train_loss[loss=2.869, ArTop10Accuracy=0.7261, over 12318.00 frames. ], tot_loss[loss=2.956, ArTop10Accuracy=0.7209, over 10903.17 frames. ], batch size: 22, lr: 1.45e-02
2024-08-06 04:45:45,393 INFO [trainer.py:765] (7/8) Epoch 8, batch 600, train_loss[loss=2.875, ArTop10Accuracy=0.7403, over 11602.00 frames. ], tot_loss[loss=2.962, ArTop10Accuracy=0.7196, over 11414.72 frames. ], batch size: 18, lr: 1.44e-02
2024-08-06 04:46:34,038 INFO [trainer.py:765] (7/8) Epoch 8, batch 700, train_loss[loss=2.968, ArTop10Accuracy=0.7134, over 10095.00 frames. ], tot_loss[loss=2.972, ArTop10Accuracy=0.7179, over 11574.87 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 04:47:10,208 INFO [trainer.py:765] (7/8) Epoch 8, batch 800, train_loss[loss=2.839, ArTop10Accuracy=0.7445, over 10143.00 frames. ], tot_loss[loss=2.974, ArTop10Accuracy=0.7172, over 11683.39 frames. ], batch size: 12, lr: 1.43e-02
2024-08-06 04:47:41,605 INFO [trainer.py:765] (7/8) Epoch 8, batch 900, train_loss[loss=2.96, ArTop10Accuracy=0.718, over 12929.00 frames. ], tot_loss[loss=2.96, ArTop10Accuracy=0.7195, over 11725.69 frames. ], batch size: 27, lr: 1.42e-02
2024-08-06 04:48:13,032 INFO [trainer.py:765] (7/8) Epoch 8, batch 1000, train_loss[loss=2.924, ArTop10Accuracy=0.7384, over 13394.00 frames. ], tot_loss[loss=2.971, ArTop10Accuracy=0.7176, over 11930.37 frames. ], batch size: 28, lr: 1.42e-02
2024-08-06 04:48:28,827 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 04:48:37,663 INFO [trainer.py:811] (7/8) Epoch 8, validation: loss=2.946, ArTop10Accuracy=0.7266, over 1829298.00 frames.
2024-08-06 04:48:37,664 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 32945MB
2024-08-06 04:48:37,951 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.035e+02 1.289e+02 1.393e+02 1.532e+02 3.557e+02, threshold=2.786e+02, percent-clipped=0.2
2024-08-06 04:48:52,932 INFO [trainer.py:765] (7/8) Epoch 8, batch 1100, train_loss[loss=2.9, ArTop10Accuracy=0.7299, over 14090.00 frames. ], tot_loss[loss=2.98, ArTop10Accuracy=0.716, over 11976.62 frames. ], batch size: 34, lr: 1.41e-02
2024-08-06 04:49:23,202 INFO [trainer.py:765] (7/8) Epoch 8, batch 1200, train_loss[loss=3.102, ArTop10Accuracy=0.6899, over 12060.00 frames. ], tot_loss[loss=2.977, ArTop10Accuracy=0.7162, over 11934.38 frames. ], batch size: 98, lr: 1.40e-02
2024-08-06 04:49:48,360 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:51:01,547 INFO [trainer.py:765] (7/8) Epoch 9, batch 100, train_loss[loss=3.039, ArTop10Accuracy=0.7035, over 14529.00 frames. ], tot_loss[loss=2.939, ArTop10Accuracy=0.7243, over 4770.76 frames. ], batch size: 61, lr: 1.32e-02
2024-08-06 04:51:45,414 INFO [trainer.py:765] (7/8) Epoch 9, batch 200, train_loss[loss=2.903, ArTop10Accuracy=0.7298, over 13628.00 frames. ], tot_loss[loss=2.935, ArTop10Accuracy=0.7257, over 7790.94 frames. ], batch size: 34, lr: 1.32e-02
2024-08-06 04:52:29,082 INFO [trainer.py:765] (7/8) Epoch 9, batch 300, train_loss[loss=2.973, ArTop10Accuracy=0.7197, over 14170.00 frames. ], tot_loss[loss=2.934, ArTop10Accuracy=0.7255, over 9409.58 frames. ], batch size: 44, lr: 1.31e-02
2024-08-06 04:53:16,431 INFO [trainer.py:765] (7/8) Epoch 9, batch 400, train_loss[loss=2.927, ArTop10Accuracy=0.7286, over 10517.00 frames. ], tot_loss[loss=2.933, ArTop10Accuracy=0.7258, over 10308.35 frames. ], batch size: 14, lr: 1.31e-02
2024-08-06 04:53:58,143 INFO [trainer.py:765] (7/8) Epoch 9, batch 500, train_loss[loss=2.974, ArTop10Accuracy=0.7274, over 12342.00 frames. ], tot_loss[loss=2.94, ArTop10Accuracy=0.7238, over 10883.91 frames. ], batch size: 22, lr: 1.30e-02
2024-08-06 04:54:51,077 INFO [trainer.py:765] (7/8) Epoch 9, batch 600, train_loss[loss=2.987, ArTop10Accuracy=0.7169, over 11449.00 frames. ], tot_loss[loss=2.947, ArTop10Accuracy=0.7225, over 11422.54 frames. ], batch size: 18, lr: 1.30e-02
2024-08-06 04:55:34,399 INFO [trainer.py:765] (7/8) Epoch 9, batch 700, train_loss[loss=2.88, ArTop10Accuracy=0.7346, over 10007.00 frames. ], tot_loss[loss=2.949, ArTop10Accuracy=0.7221, over 11559.69 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 04:56:04,575 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.029e+02 1.257e+02 1.367e+02 1.507e+02 8.820e+02, threshold=2.735e+02, percent-clipped=0.5
2024-08-06 04:56:13,597 INFO [trainer.py:765] (7/8) Epoch 9, batch 800, train_loss[loss=2.917, ArTop10Accuracy=0.7269, over 10347.00 frames. ], tot_loss[loss=2.954, ArTop10Accuracy=0.7211, over 11670.55 frames. ], batch size: 12, lr: 1.29e-02
2024-08-06 04:56:44,975 INFO [trainer.py:765] (7/8) Epoch 9, batch 900, train_loss[loss=2.797, ArTop10Accuracy=0.7471, over 13184.00 frames. ], tot_loss[loss=2.952, ArTop10Accuracy=0.7214, over 11717.30 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 04:57:16,491 INFO [trainer.py:765] (7/8) Epoch 9, batch 1000, train_loss[loss=2.87, ArTop10Accuracy=0.7419, over 12925.00 frames. ], tot_loss[loss=2.959, ArTop10Accuracy=0.7203, over 11922.27 frames. ], batch size: 27, lr: 1.28e-02
2024-08-06 04:57:47,657 INFO [trainer.py:765] (7/8) Epoch 9, batch 1100, train_loss[loss=2.933, ArTop10Accuracy=0.7229, over 13975.00 frames. ], tot_loss[loss=2.963, ArTop10Accuracy=0.7195, over 11978.59 frames. ], batch size: 35, lr: 1.27e-02
2024-08-06 04:58:18,094 INFO [trainer.py:765] (7/8) Epoch 9, batch 1200, train_loss[loss=3.094, ArTop10Accuracy=0.6954, over 11733.00 frames. ], tot_loss[loss=2.96, ArTop10Accuracy=0.7199, over 11924.05 frames. ], batch size: 99, lr: 1.27e-02
2024-08-06 04:58:43,766 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 04:59:52,749 INFO [trainer.py:765] (7/8) Epoch 10, batch 100, train_loss[loss=2.94, ArTop10Accuracy=0.7291, over 14618.00 frames. ], tot_loss[loss=2.932, ArTop10Accuracy=0.7274, over 4805.03 frames. ], batch size: 61, lr: 1.20e-02
2024-08-06 05:00:43,730 INFO [trainer.py:765] (7/8) Epoch 10, batch 200, train_loss[loss=3.014, ArTop10Accuracy=0.7121, over 13804.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.7279, over 7796.69 frames. ], batch size: 34, lr: 1.20e-02
2024-08-06 05:01:20,592 INFO [trainer.py:765] (7/8) Epoch 10, batch 300, train_loss[loss=3.029, ArTop10Accuracy=0.7081, over 13831.00 frames. ], tot_loss[loss=2.927, ArTop10Accuracy=0.7273, over 9427.24 frames. ], batch size: 44, lr: 1.19e-02
2024-08-06 05:02:10,048 INFO [trainer.py:765] (7/8) Epoch 10, batch 400, train_loss[loss=2.787, ArTop10Accuracy=0.7534, over 10233.00 frames. ], tot_loss[loss=2.925, ArTop10Accuracy=0.7278, over 10324.53 frames. ], batch size: 14, lr: 1.19e-02
2024-08-06 05:02:46,488 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 05:02:55,377 INFO [trainer.py:811] (7/8) Epoch 10, validation: loss=2.927, ArTop10Accuracy=0.7304, over 1829298.00 frames.
2024-08-06 05:02:55,378 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 32945MB
2024-08-06 05:02:55,728 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.023e+02 1.269e+02 1.367e+02 1.518e+02 4.405e+02, threshold=2.733e+02, percent-clipped=0.4
2024-08-06 05:02:58,361 INFO [trainer.py:765] (7/8) Epoch 10, batch 500, train_loss[loss=2.93, ArTop10Accuracy=0.7277, over 12306.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7279, over 10904.75 frames. ], batch size: 22, lr: 1.19e-02
2024-08-06 05:03:48,229 INFO [trainer.py:765] (7/8) Epoch 10, batch 600, train_loss[loss=2.851, ArTop10Accuracy=0.7395, over 11544.00 frames. ], tot_loss[loss=2.92, ArTop10Accuracy=0.7277, over 11443.00 frames. ], batch size: 18, lr: 1.18e-02
2024-08-06 05:04:36,716 INFO [trainer.py:765] (7/8) Epoch 10, batch 700, train_loss[loss=2.832, ArTop10Accuracy=0.739, over 9305.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7271, over 11578.54 frames. ], batch size: 11, lr: 1.18e-02
2024-08-06 05:05:10,725 INFO [trainer.py:765] (7/8) Epoch 10, batch 800, train_loss[loss=2.897, ArTop10Accuracy=0.7411, over 9992.00 frames. ], tot_loss[loss=2.931, ArTop10Accuracy=0.7256, over 11678.93 frames. ], batch size: 12, lr: 1.17e-02
2024-08-06 05:05:42,245 INFO [trainer.py:765] (7/8) Epoch 10, batch 900, train_loss[loss=2.902, ArTop10Accuracy=0.7292, over 13011.00 frames. ], tot_loss[loss=2.924, ArTop10Accuracy=0.7268, over 11719.54 frames. ], batch size: 27, lr: 1.17e-02
2024-08-06 05:06:13,843 INFO [trainer.py:765] (7/8) Epoch 10, batch 1000, train_loss[loss=2.879, ArTop10Accuracy=0.7321, over 12733.00 frames. ], tot_loss[loss=2.928, ArTop10Accuracy=0.726, over 11953.11 frames. ], batch size: 27, lr: 1.16e-02
2024-08-06 05:06:45,055 INFO [trainer.py:765] (7/8) Epoch 10, batch 1100, train_loss[loss=2.942, ArTop10Accuracy=0.719, over 13572.00 frames. ], tot_loss[loss=2.937, ArTop10Accuracy=0.7246, over 12003.87 frames. ], batch size: 34, lr: 1.16e-02
2024-08-06 05:07:15,483 INFO [trainer.py:765] (7/8) Epoch 10, batch 1200, train_loss[loss=3.163, ArTop10Accuracy=0.6777, over 12281.00 frames. ], tot_loss[loss=2.938, ArTop10Accuracy=0.7245, over 11927.34 frames. ], batch size: 97, lr: 1.16e-02
2024-08-06 05:07:40,800 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 05:08:52,967 INFO [trainer.py:765] (7/8) Epoch 11, batch 100, train_loss[loss=2.933, ArTop10Accuracy=0.723, over 14427.00 frames. ], tot_loss[loss=2.911, ArTop10Accuracy=0.7303, over 4794.99 frames. ], batch size: 61, lr: 1.10e-02
2024-08-06 05:09:41,278 INFO [trainer.py:765] (7/8) Epoch 11, batch 200, train_loss[loss=3.016, ArTop10Accuracy=0.71, over 13705.00 frames. ], tot_loss[loss=2.906, ArTop10Accuracy=0.7316, over 7799.16 frames. ], batch size: 34, lr: 1.10e-02
2024-08-06 05:09:51,176 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.001e+02 1.278e+02 1.371e+02 1.502e+02 3.785e+02, threshold=2.743e+02, percent-clipped=0.3
2024-08-06 05:10:24,721 INFO [trainer.py:765] (7/8) Epoch 11, batch 300, train_loss[loss=3.012, ArTop10Accuracy=0.7111, over 14181.00 frames. ], tot_loss[loss=2.908, ArTop10Accuracy=0.7309, over 9412.18 frames. ], batch size: 44, lr: 1.09e-02
2024-08-06 05:11:11,784 INFO [trainer.py:765] (7/8) Epoch 11, batch 400, train_loss[loss=2.876, ArTop10Accuracy=0.7361, over 11002.00 frames. ], tot_loss[loss=2.909, ArTop10Accuracy=0.7308, over 10335.77 frames. ], batch size: 15, lr: 1.09e-02
2024-08-06 05:11:52,692 INFO [trainer.py:765] (7/8) Epoch 11, batch 500, train_loss[loss=2.877, ArTop10Accuracy=0.7305, over 12402.00 frames. ], tot_loss[loss=2.902, ArTop10Accuracy=0.732, over 10917.64 frames. ], batch size: 22, lr: 1.09e-02
2024-08-06 05:12:40,288 INFO [trainer.py:765] (7/8) Epoch 11, batch 600, train_loss[loss=2.898, ArTop10Accuracy=0.7318, over 11560.00 frames. ], tot_loss[loss=2.903, ArTop10Accuracy=0.7313, over 11450.34 frames. ], batch size: 18, lr: 1.08e-02
2024-08-06 05:13:25,709 INFO [trainer.py:765] (7/8) Epoch 11, batch 700, train_loss[loss=2.809, ArTop10Accuracy=0.7486, over 10109.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7297, over 11582.10 frames. ], batch size: 12, lr: 1.08e-02
2024-08-06 05:14:04,206 INFO [trainer.py:765] (7/8) Epoch 11, batch 800, train_loss[loss=2.855, ArTop10Accuracy=0.737, over 9140.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7295, over 11680.97 frames. ], batch size: 11, lr: 1.07e-02
2024-08-06 05:14:35,667 INFO [trainer.py:765] (7/8) Epoch 11, batch 900, train_loss[loss=2.956, ArTop10Accuracy=0.7271, over 12898.00 frames. ], tot_loss[loss=2.904, ArTop10Accuracy=0.7314, over 11739.77 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 05:15:07,264 INFO [trainer.py:765] (7/8) Epoch 11, batch 1000, train_loss[loss=2.996, ArTop10Accuracy=0.7159, over 12925.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7297, over 11935.65 frames. ], batch size: 27, lr: 1.07e-02
2024-08-06 05:15:38,260 INFO [trainer.py:765] (7/8) Epoch 11, batch 1100, train_loss[loss=2.956, ArTop10Accuracy=0.7185, over 13898.00 frames. ], tot_loss[loss=2.921, ArTop10Accuracy=0.7277, over 12000.99 frames. ], batch size: 35, lr: 1.06e-02
2024-08-06 05:16:08,498 INFO [trainer.py:765] (7/8) Epoch 11, batch 1200, train_loss[loss=3.11, ArTop10Accuracy=0.6932, over 12329.00 frames. ], tot_loss[loss=2.923, ArTop10Accuracy=0.7273, over 11935.02 frames. ], batch size: 97, lr: 1.06e-02
2024-08-06 05:16:12,697 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 05:16:21,622 INFO [trainer.py:811] (7/8) Epoch 11, validation: loss=2.923, ArTop10Accuracy=0.7318, over 1829298.00 frames.
2024-08-06 05:16:21,623 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 32945MB
2024-08-06 05:16:21,949 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.076e+02 1.268e+02 1.368e+02 1.481e+02 4.790e+02, threshold=2.736e+02, percent-clipped=0.6
2024-08-06 05:16:42,778 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 05:18:03,005 INFO [trainer.py:765] (7/8) Epoch 12, batch 100, train_loss[loss=2.937, ArTop10Accuracy=0.724, over 14685.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7355, over 4788.97 frames. ], batch size: 61, lr: 1.01e-02
2024-08-06 05:18:46,004 INFO [trainer.py:765] (7/8) Epoch 12, batch 200, train_loss[loss=2.843, ArTop10Accuracy=0.7439, over 13985.00 frames. ], tot_loss[loss=2.884, ArTop10Accuracy=0.7357, over 7789.85 frames. ], batch size: 34, lr: 1.01e-02
2024-08-06 05:19:31,946 INFO [trainer.py:765] (7/8) Epoch 12, batch 300, train_loss[loss=2.984, ArTop10Accuracy=0.7123, over 14488.00 frames. ], tot_loss[loss=2.883, ArTop10Accuracy=0.7359, over 9409.31 frames. ], batch size: 45, lr: 1.01e-02
2024-08-06 05:20:12,430 INFO [trainer.py:765] (7/8) Epoch 12, batch 400, train_loss[loss=2.722, ArTop10Accuracy=0.7559, over 10436.00 frames. ], tot_loss[loss=2.882, ArTop10Accuracy=0.7353, over 10324.29 frames. ], batch size: 14, lr: 1.00e-02
2024-08-06 05:21:00,640 INFO [trainer.py:765] (7/8) Epoch 12, batch 500, train_loss[loss=2.804, ArTop10Accuracy=0.7499, over 12250.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7365, over 10895.36 frames. ], batch size: 22, lr: 9.99e-03
2024-08-06 05:21:43,915 INFO [trainer.py:765] (7/8) Epoch 12, batch 600, train_loss[loss=2.829, ArTop10Accuracy=0.7489, over 11619.00 frames. ], tot_loss[loss=2.881, ArTop10Accuracy=0.7353, over 11433.62 frames. ], batch size: 18, lr: 9.96e-03
2024-08-06 05:22:32,207 INFO [trainer.py:765] (7/8) Epoch 12, batch 700, train_loss[loss=2.936, ArTop10Accuracy=0.7328, over 10181.00 frames. ], tot_loss[loss=2.887, ArTop10Accuracy=0.7342, over 11563.75 frames. ], batch size: 12, lr: 9.93e-03
2024-08-06 05:23:08,911 INFO [trainer.py:765] (7/8) Epoch 12, batch 800, train_loss[loss=2.758, ArTop10Accuracy=0.7528, over 9162.00 frames. ], tot_loss[loss=2.898, ArTop10Accuracy=0.7319, over 11669.36 frames. ], batch size: 11, lr: 9.90e-03
2024-08-06 05:23:40,460 INFO [trainer.py:765] (7/8) Epoch 12, batch 900, train_loss[loss=2.834, ArTop10Accuracy=0.7366, over 12971.00 frames. ], tot_loss[loss=2.892, ArTop10Accuracy=0.7334, over 11729.80 frames. ], batch size: 27, lr: 9.87e-03
2024-08-06 05:23:54,576 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.067e+02 1.273e+02 1.376e+02 1.503e+02 4.050e+02, threshold=2.752e+02, percent-clipped=0.4
2024-08-06 05:24:14,346 INFO [trainer.py:765] (7/8) Epoch 12, batch 1000, train_loss[loss=2.917, ArTop10Accuracy=0.7296, over 13072.00 frames. ], tot_loss[loss=2.9, ArTop10Accuracy=0.732, over 11918.96 frames. ], batch size: 27, lr: 9.84e-03
2024-08-06 05:24:45,504 INFO [trainer.py:765] (7/8) Epoch 12, batch 1100, train_loss[loss=2.956, ArTop10Accuracy=0.7238, over 13653.00 frames. ], tot_loss[loss=2.912, ArTop10Accuracy=0.7296, over 11992.21 frames. ], batch size: 34, lr: 9.81e-03
2024-08-06 05:25:15,882 INFO [trainer.py:765] (7/8) Epoch 12, batch 1200, train_loss[loss=3.062, ArTop10Accuracy=0.6972, over 11811.00 frames. ], tot_loss[loss=2.913, ArTop10Accuracy=0.7297, over 11939.20 frames. ], batch size: 97, lr: 9.78e-03
2024-08-06 05:25:41,190 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 05:26:46,787 INFO [trainer.py:765] (7/8) Epoch 13, batch 100, train_loss[loss=2.928, ArTop10Accuracy=0.7275, over 14297.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7376, over 4779.82 frames. ], batch size: 61, lr: 9.36e-03
2024-08-06 05:27:32,553 INFO [trainer.py:765] (7/8) Epoch 13, batch 200, train_loss[loss=2.95, ArTop10Accuracy=0.7271, over 13915.00 frames. ], tot_loss[loss=2.869, ArTop10Accuracy=0.7389, over 7787.10 frames. ], batch size: 34, lr: 9.34e-03
2024-08-06 05:28:16,036 INFO [trainer.py:765] (7/8) Epoch 13, batch 300, train_loss[loss=3.065, ArTop10Accuracy=0.7023, over 14336.00 frames. ], tot_loss[loss=2.863, ArTop10Accuracy=0.7397, over 9410.31 frames. ], batch size: 44, lr: 9.31e-03
2024-08-06 05:29:00,149 INFO [trainer.py:765] (7/8) Epoch 13, batch 400, train_loss[loss=2.885, ArTop10Accuracy=0.732, over 10352.00 frames. ], tot_loss[loss=2.862, ArTop10Accuracy=0.7392, over 10334.89 frames. ], batch size: 14, lr: 9.28e-03
2024-08-06 05:29:43,967 INFO [trainer.py:765] (7/8) Epoch 13, batch 500, train_loss[loss=2.745, ArTop10Accuracy=0.7553, over 12225.00 frames. ], tot_loss[loss=2.865, ArTop10Accuracy=0.7388, over 10899.49 frames. ], batch size: 22, lr: 9.26e-03
2024-08-06 05:30:24,247 INFO [trainer.py:765] (7/8) Epoch 13, batch 600, train_loss[loss=2.854, ArTop10Accuracy=0.7456, over 11463.00 frames. ], tot_loss[loss=2.874, ArTop10Accuracy=0.7369, over 11438.36 frames. ], batch size: 18, lr: 9.23e-03
2024-08-06 05:30:58,110 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 05:31:07,054 INFO [trainer.py:811] (7/8) Epoch 13, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames.
2024-08-06 05:31:07,054 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 33330MB
2024-08-06 05:31:07,351 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.049e+02 1.283e+02 1.389e+02 1.496e+02 2.729e+02, threshold=2.779e+02, percent-clipped=0.0
2024-08-06 05:31:24,043 INFO [trainer.py:765] (7/8) Epoch 13, batch 700, train_loss[loss=2.72, ArTop10Accuracy=0.7559, over 10121.00 frames. ], tot_loss[loss=2.88, ArTop10Accuracy=0.7358, over 11564.55 frames. ], batch size: 12, lr: 9.20e-03
2024-08-06 05:32:00,147 INFO [trainer.py:765] (7/8) Epoch 13, batch 800, train_loss[loss=2.683, ArTop10Accuracy=0.7698, over 10204.00 frames. ], tot_loss[loss=2.881, ArTop10Accuracy=0.7357, over 11662.77 frames. ], batch size: 12, lr: 9.18e-03
2024-08-06 05:32:31,521 INFO [trainer.py:765] (7/8) Epoch 13, batch 900, train_loss[loss=2.813, ArTop10Accuracy=0.7501, over 13011.00 frames. ], tot_loss[loss=2.877, ArTop10Accuracy=0.7366, over 11735.97 frames. ], batch size: 27, lr: 9.15e-03
2024-08-06 05:33:03,043 INFO [trainer.py:765] (7/8) Epoch 13, batch 1000, train_loss[loss=2.809, ArTop10Accuracy=0.7527, over 12993.00 frames. ], tot_loss[loss=2.886, ArTop10Accuracy=0.7348, over 11942.54 frames. ], batch size: 27, lr: 9.13e-03
2024-08-06 05:33:34,232 INFO [trainer.py:765] (7/8) Epoch 13, batch 1100, train_loss[loss=3.034, ArTop10Accuracy=0.7081, over 13594.00 frames. ], tot_loss[loss=2.893, ArTop10Accuracy=0.7332, over 12002.91 frames. ], batch size: 34, lr: 9.10e-03
2024-08-06 05:34:04,519 INFO [trainer.py:765] (7/8) Epoch 13, batch 1200, train_loss[loss=2.917, ArTop10Accuracy=0.7239, over 12703.00 frames. ], tot_loss[loss=2.891, ArTop10Accuracy=0.7337, over 11938.74 frames. ], batch size: 98, lr: 9.07e-03
2024-08-06 05:34:29,769 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 05:35:39,198 INFO [trainer.py:765] (7/8) Epoch 14, batch 100, train_loss[loss=3.021, ArTop10Accuracy=0.7104, over 14762.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7414, over 4786.81 frames. ], batch size: 61, lr: 8.71e-03
2024-08-06 05:36:23,063 INFO [trainer.py:765] (7/8) Epoch 14, batch 200, train_loss[loss=2.909, ArTop10Accuracy=0.7284, over 13763.00 frames. ], tot_loss[loss=2.86, ArTop10Accuracy=0.7409, over 7808.29 frames. ], batch size: 34, lr: 8.68e-03
2024-08-06 05:37:09,309 INFO [trainer.py:765] (7/8) Epoch 14, batch 300, train_loss[loss=2.872, ArTop10Accuracy=0.7372, over 14461.00 frames. ], tot_loss[loss=2.861, ArTop10Accuracy=0.7405, over 9434.89 frames. ], batch size: 44, lr: 8.66e-03
2024-08-06 05:37:46,030 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.097e+02 1.304e+02 1.410e+02 1.531e+02 2.912e+02, threshold=2.820e+02, percent-clipped=0.2
2024-08-06 05:37:55,139 INFO [trainer.py:765] (7/8) Epoch 14, batch 400, train_loss[loss=2.817, ArTop10Accuracy=0.744, over 10288.00 frames. ], tot_loss[loss=2.86, ArTop10Accuracy=0.7402, over 10351.66 frames. ], batch size: 14, lr: 8.64e-03
2024-08-06 05:38:42,025 INFO [trainer.py:765] (7/8) Epoch 14, batch 500, train_loss[loss=2.833, ArTop10Accuracy=0.7431, over 12302.00 frames. ], tot_loss[loss=2.857, ArTop10Accuracy=0.7405, over 10918.81 frames. ], batch size: 22, lr: 8.61e-03
2024-08-06 05:39:22,374 INFO [trainer.py:765] (7/8) Epoch 14, batch 600, train_loss[loss=2.904, ArTop10Accuracy=0.7352, over 11537.00 frames. ], tot_loss[loss=2.86, ArTop10Accuracy=0.7397, over 11438.29 frames. ], batch size: 18, lr: 8.59e-03
2024-08-06 05:40:15,143 INFO [trainer.py:765] (7/8) Epoch 14, batch 700, train_loss[loss=2.907, ArTop10Accuracy=0.7416, over 10214.00 frames. ], tot_loss[loss=2.87, ArTop10Accuracy=0.738, over 11584.81 frames. ], batch size: 12, lr: 8.57e-03
2024-08-06 05:40:49,135 INFO [trainer.py:765] (7/8) Epoch 14, batch 800, train_loss[loss=2.774, ArTop10Accuracy=0.7546, over 10079.00 frames. ], tot_loss[loss=2.872, ArTop10Accuracy=0.738, over 11701.95 frames. ], batch size: 12, lr: 8.55e-03
2024-08-06 05:41:20,466 INFO [trainer.py:765] (7/8) Epoch 14, batch 900, train_loss[loss=2.781, ArTop10Accuracy=0.7551, over 12897.00 frames. ], tot_loss[loss=2.868, ArTop10Accuracy=0.7382, over 11750.67 frames. ], batch size: 27, lr: 8.52e-03
2024-08-06 05:41:51,996 INFO [trainer.py:765] (7/8) Epoch 14, batch 1000, train_loss[loss=2.865, ArTop10Accuracy=0.7385, over 13192.00 frames. ], tot_loss[loss=2.873, ArTop10Accuracy=0.7378, over 11959.68 frames. ], batch size: 27, lr: 8.50e-03
2024-08-06 05:42:23,216 INFO [trainer.py:765] (7/8) Epoch 14, batch 1100, train_loss[loss=2.838, ArTop10Accuracy=0.747, over 13818.00 frames. ], tot_loss[loss=2.88, ArTop10Accuracy=0.7362, over 12008.10 frames. ], batch size: 34, lr: 8.48e-03
2024-08-06 05:42:53,549 INFO [trainer.py:765] (7/8) Epoch 14, batch 1200, train_loss[loss=3.06, ArTop10Accuracy=0.7085, over 11500.00 frames. ], tot_loss[loss=2.881, ArTop10Accuracy=0.7361, over 11952.39 frames. ], batch size: 99, lr: 8.46e-03
2024-08-06 05:43:19,099 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 05:44:28,571 INFO [trainer.py:765] (7/8) Epoch 15, batch 100, train_loss[loss=2.855, ArTop10Accuracy=0.7372, over 14711.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7443, over 4779.25 frames. ], batch size: 62, lr: 8.14e-03
2024-08-06 05:44:29,213 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 05:44:38,024 INFO [trainer.py:811] (7/8) Epoch 15, validation: loss=2.913, ArTop10Accuracy=0.7339, over 1829298.00 frames.
2024-08-06 05:44:38,024 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 33330MB
2024-08-06 05:44:38,413 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.100e+02 1.307e+02 1.417e+02 1.528e+02 2.981e+02, threshold=2.833e+02, percent-clipped=0.1
2024-08-06 05:45:20,184 INFO [trainer.py:765] (7/8) Epoch 15, batch 200, train_loss[loss=2.772, ArTop10Accuracy=0.755, over 13671.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7434, over 7798.34 frames. ], batch size: 34, lr: 8.11e-03
2024-08-06 05:46:04,647 INFO [trainer.py:765] (7/8) Epoch 15, batch 300, train_loss[loss=2.874, ArTop10Accuracy=0.7398, over 14197.00 frames. ], tot_loss[loss=2.847, ArTop10Accuracy=0.7434, over 9419.25 frames. ], batch size: 44, lr: 8.09e-03
2024-08-06 05:46:51,902 INFO [trainer.py:765] (7/8) Epoch 15, batch 400, train_loss[loss=2.894, ArTop10Accuracy=0.7382, over 11023.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7431, over 10331.10 frames. ], batch size: 15, lr: 8.07e-03
2024-08-06 05:47:36,911 INFO [trainer.py:765] (7/8) Epoch 15, batch 500, train_loss[loss=2.803, ArTop10Accuracy=0.7451, over 11984.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7431, over 10898.05 frames. ], batch size: 22, lr: 8.05e-03
2024-08-06 05:48:24,723 INFO [trainer.py:765] (7/8) Epoch 15, batch 600, train_loss[loss=2.866, ArTop10Accuracy=0.7361, over 11603.00 frames. ], tot_loss[loss=2.848, ArTop10Accuracy=0.7421, over 11431.42 frames. ], batch size: 18, lr: 8.03e-03
2024-08-06 05:49:11,855 INFO [trainer.py:765] (7/8) Epoch 15, batch 700, train_loss[loss=2.791, ArTop10Accuracy=0.7493, over 10176.00 frames. ], tot_loss[loss=2.853, ArTop10Accuracy=0.7411, over 11571.28 frames. ], batch size: 12, lr: 8.01e-03
2024-08-06 05:49:45,778 INFO [trainer.py:765] (7/8) Epoch 15, batch 800, train_loss[loss=2.948, ArTop10Accuracy=0.7084, over 9865.00 frames. ], tot_loss[loss=2.859, ArTop10Accuracy=0.7398, over 11676.56 frames. ], batch size: 12, lr: 7.99e-03
2024-08-06 05:50:17,210 INFO [trainer.py:765] (7/8) Epoch 15, batch 900, train_loss[loss=2.927, ArTop10Accuracy=0.7297, over 13065.00 frames. ], tot_loss[loss=2.856, ArTop10Accuracy=0.7408, over 11719.16 frames. ], batch size: 27, lr: 7.97e-03
2024-08-06 05:50:48,829 INFO [trainer.py:765] (7/8) Epoch 15, batch 1000, train_loss[loss=2.937, ArTop10Accuracy=0.7292, over 12817.00 frames. ], tot_loss[loss=2.864, ArTop10Accuracy=0.7392, over 11931.16 frames. ], batch size: 27, lr: 7.95e-03
2024-08-06 05:51:20,069 INFO [trainer.py:765] (7/8) Epoch 15, batch 1100, train_loss[loss=2.935, ArTop10Accuracy=0.7279, over 13793.00 frames. ], tot_loss[loss=2.875, ArTop10Accuracy=0.7372, over 11987.27 frames. ], batch size: 34, lr: 7.93e-03
2024-08-06 05:51:23,515 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.123e+02 1.337e+02 1.431e+02 1.541e+02 2.784e+02, threshold=2.862e+02, percent-clipped=0.0
2024-08-06 05:51:53,082 INFO [trainer.py:765] (7/8) Epoch 15, batch 1200, train_loss[loss=3.025, ArTop10Accuracy=0.7066, over 12796.00 frames. ], tot_loss[loss=2.878, ArTop10Accuracy=0.7364, over 11924.70 frames. ], batch size: 98, lr: 7.91e-03
2024-08-06 05:52:18,170 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 05:53:29,263 INFO [trainer.py:765] (7/8) Epoch 16, batch 100, train_loss[loss=2.91, ArTop10Accuracy=0.7335, over 14452.00 frames. ], tot_loss[loss=2.841, ArTop10Accuracy=0.7447, over 4792.27 frames. ], batch size: 61, lr: 7.63e-03
2024-08-06 05:54:12,877 INFO [trainer.py:765] (7/8) Epoch 16, batch 200, train_loss[loss=2.824, ArTop10Accuracy=0.7481, over 13704.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.745, over 7808.32 frames. ], batch size: 34, lr: 7.61e-03
2024-08-06 05:54:59,737 INFO [trainer.py:765] (7/8) Epoch 16, batch 300, train_loss[loss=2.967, ArTop10Accuracy=0.7217, over 14224.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7446, over 9417.88 frames. ], batch size: 44, lr: 7.59e-03
2024-08-06 05:55:41,931 INFO [trainer.py:765] (7/8) Epoch 16, batch 400, train_loss[loss=2.81, ArTop10Accuracy=0.7495, over 10345.00 frames. ], tot_loss[loss=2.836, ArTop10Accuracy=0.7451, over 10344.76 frames. ], batch size: 14, lr: 7.58e-03
2024-08-06 05:56:27,680 INFO [trainer.py:765] (7/8) Epoch 16, batch 500, train_loss[loss=2.813, ArTop10Accuracy=0.7427, over 12488.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7445, over 10892.21 frames. ], batch size: 22, lr: 7.56e-03
2024-08-06 05:57:12,439 INFO [trainer.py:765] (7/8) Epoch 16, batch 600, train_loss[loss=2.741, ArTop10Accuracy=0.7625, over 11649.00 frames. ], tot_loss[loss=2.836, ArTop10Accuracy=0.7445, over 11409.96 frames. ], batch size: 18, lr: 7.54e-03
2024-08-06 05:58:00,040 INFO [trainer.py:765] (7/8) Epoch 16, batch 700, train_loss[loss=2.781, ArTop10Accuracy=0.7502, over 9260.00 frames. ], tot_loss[loss=2.843, ArTop10Accuracy=0.7433, over 11558.03 frames. ], batch size: 11, lr: 7.52e-03
2024-08-06 05:58:34,024 INFO [trainer.py:765] (7/8) Epoch 16, batch 800, train_loss[loss=2.638, ArTop10Accuracy=0.7731, over 10124.00 frames. ], tot_loss[loss=2.849, ArTop10Accuracy=0.7422, over 11686.87 frames. ], batch size: 12, lr: 7.50e-03
2024-08-06 05:58:41,569 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 05:58:50,426 INFO [trainer.py:811] (7/8) Epoch 16, validation: loss=2.915, ArTop10Accuracy=0.7338, over 1829298.00 frames.
2024-08-06 05:58:50,427 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 33330MB
2024-08-06 05:58:50,730 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.121e+02 1.335e+02 1.445e+02 1.570e+02 3.252e+02, threshold=2.890e+02, percent-clipped=0.1
2024-08-06 05:59:14,321 INFO [trainer.py:765] (7/8) Epoch 16, batch 900, train_loss[loss=2.922, ArTop10Accuracy=0.7276, over 13025.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7438, over 11734.24 frames. ], batch size: 27, lr: 7.49e-03
2024-08-06 05:59:45,915 INFO [trainer.py:765] (7/8) Epoch 16, batch 1000, train_loss[loss=2.942, ArTop10Accuracy=0.728, over 12972.00 frames. ], tot_loss[loss=2.846, ArTop10Accuracy=0.7424, over 11936.83 frames. ], batch size: 27, lr: 7.47e-03
2024-08-06 06:00:17,092 INFO [trainer.py:765] (7/8) Epoch 16, batch 1100, train_loss[loss=2.712, ArTop10Accuracy=0.7669, over 13554.00 frames. ], tot_loss[loss=2.854, ArTop10Accuracy=0.7412, over 11992.15 frames. ], batch size: 34, lr: 7.45e-03
2024-08-06 06:00:47,465 INFO [trainer.py:765] (7/8) Epoch 16, batch 1200, train_loss[loss=3.043, ArTop10Accuracy=0.707, over 11286.00 frames. ], tot_loss[loss=2.853, ArTop10Accuracy=0.7415, over 11929.58 frames. ], batch size: 97, lr: 7.43e-03
2024-08-06 06:01:12,505 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 06:02:27,261 INFO [trainer.py:765] (7/8) Epoch 17, batch 100, train_loss[loss=2.849, ArTop10Accuracy=0.7423, over 14226.00 frames. ], tot_loss[loss=2.835, ArTop10Accuracy=0.746, over 4787.10 frames. ], batch size: 61, lr: 7.18e-03
2024-08-06 06:03:11,850 INFO [trainer.py:765] (7/8) Epoch 17, batch 200, train_loss[loss=2.721, ArTop10Accuracy=0.7681, over 13637.00 frames. ], tot_loss[loss=2.824, ArTop10Accuracy=0.748, over 7790.35 frames. ], batch size: 34, lr: 7.17e-03
2024-08-06 06:03:57,502 INFO [trainer.py:765] (7/8) Epoch 17, batch 300, train_loss[loss=2.877, ArTop10Accuracy=0.7331, over 14629.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7482, over 9438.25 frames. ], batch size: 44, lr: 7.15e-03
2024-08-06 06:04:42,837 INFO [trainer.py:765] (7/8) Epoch 17, batch 400, train_loss[loss=2.743, ArTop10Accuracy=0.7543, over 10921.00 frames. ], tot_loss[loss=2.821, ArTop10Accuracy=0.7481, over 10342.37 frames. ], batch size: 15, lr: 7.13e-03
2024-08-06 06:05:29,004 INFO [trainer.py:765] (7/8) Epoch 17, batch 500, train_loss[loss=2.778, ArTop10Accuracy=0.7533, over 12312.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.749, over 10913.72 frames. ], batch size: 22, lr: 7.12e-03
2024-08-06 06:05:49,551 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.142e+02 1.359e+02 1.445e+02 1.551e+02 2.741e+02, threshold=2.891e+02, percent-clipped=0.0
2024-08-06 06:06:20,723 INFO [trainer.py:765] (7/8) Epoch 17, batch 600, train_loss[loss=2.677, ArTop10Accuracy=0.7716, over 11626.00 frames. ], tot_loss[loss=2.824, ArTop10Accuracy=0.7473, over 11448.56 frames. ], batch size: 18, lr: 7.10e-03
2024-08-06 06:07:04,694 INFO [trainer.py:765] (7/8) Epoch 17, batch 700, train_loss[loss=2.844, ArTop10Accuracy=0.7481, over 10138.00 frames. ], tot_loss[loss=2.833, ArTop10Accuracy=0.7456, over 11579.32 frames. ], batch size: 12, lr: 7.09e-03
2024-08-06 06:07:44,896 INFO [trainer.py:765] (7/8) Epoch 17, batch 800, train_loss[loss=2.5, ArTop10Accuracy=0.7955, over 10062.00 frames. ], tot_loss[loss=2.837, ArTop10Accuracy=0.7444, over 11683.64 frames. ], batch size: 12, lr: 7.07e-03
2024-08-06 06:08:16,384 INFO [trainer.py:765] (7/8) Epoch 17, batch 900, train_loss[loss=2.784, ArTop10Accuracy=0.7562, over 12850.00 frames. ], tot_loss[loss=2.828, ArTop10Accuracy=0.7462, over 11737.83 frames. ], batch size: 27, lr: 7.05e-03
2024-08-06 06:08:47,994 INFO [trainer.py:765] (7/8) Epoch 17, batch 1000, train_loss[loss=2.828, ArTop10Accuracy=0.7479, over 13056.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.7461, over 11930.06 frames. ], batch size: 27, lr: 7.04e-03
2024-08-06 06:09:19,134 INFO [trainer.py:765] (7/8) Epoch 17, batch 1100, train_loss[loss=2.878, ArTop10Accuracy=0.7387, over 13781.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.743, over 12004.86 frames. ], batch size: 34, lr: 7.02e-03
2024-08-06 06:09:49,444 INFO [trainer.py:765] (7/8) Epoch 17, batch 1200, train_loss[loss=3.044, ArTop10Accuracy=0.707, over 11693.00 frames. ], tot_loss[loss=2.845, ArTop10Accuracy=0.7427, over 11935.28 frames. ], batch size: 99, lr: 7.01e-03
2024-08-06 06:10:14,194 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 06:11:23,103 INFO [trainer.py:765] (7/8) Epoch 18, batch 100, train_loss[loss=2.881, ArTop10Accuracy=0.7371, over 14889.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.7493, over 4799.69 frames. ], batch size: 61, lr: 6.78e-03
2024-08-06 06:12:16,260 INFO [trainer.py:765] (7/8) Epoch 18, batch 200, train_loss[loss=2.763, ArTop10Accuracy=0.7598, over 13617.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7497, over 7821.13 frames. ], batch size: 34, lr: 6.77e-03
2024-08-06 06:12:40,318 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 06:12:48,991 INFO [trainer.py:811] (7/8) Epoch 18, validation: loss=2.916, ArTop10Accuracy=0.7343, over 1829298.00 frames.
2024-08-06 06:12:48,992 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 33330MB
2024-08-06 06:12:49,335 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.163e+02 1.377e+02 1.476e+02 1.588e+02 2.450e+02, threshold=2.952e+02, percent-clipped=0.0
2024-08-06 06:13:07,116 INFO [trainer.py:765] (7/8) Epoch 18, batch 300, train_loss[loss=2.901, ArTop10Accuracy=0.7325, over 14308.00 frames. ], tot_loss[loss=2.815, ArTop10Accuracy=0.7496, over 9430.32 frames. ], batch size: 44, lr: 6.75e-03
2024-08-06 06:13:54,097 INFO [trainer.py:765] (7/8) Epoch 18, batch 400, train_loss[loss=2.773, ArTop10Accuracy=0.7506, over 10390.00 frames. ], tot_loss[loss=2.809, ArTop10Accuracy=0.7505, over 10345.82 frames. ], batch size: 14, lr: 6.74e-03
2024-08-06 06:14:38,488 INFO [trainer.py:765] (7/8) Epoch 18, batch 500, train_loss[loss=2.731, ArTop10Accuracy=0.7656, over 12024.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7503, over 10901.56 frames. ], batch size: 22, lr: 6.73e-03
2024-08-06 06:15:23,628 INFO [trainer.py:765] (7/8) Epoch 18, batch 600, train_loss[loss=2.73, ArTop10Accuracy=0.7583, over 11567.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.7492, over 11428.91 frames. ], batch size: 18, lr: 6.71e-03
2024-08-06 06:16:17,342 INFO [trainer.py:765] (7/8) Epoch 18, batch 700, train_loss[loss=2.82, ArTop10Accuracy=0.7657, over 10137.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.7481, over 11563.74 frames. ], batch size: 12, lr: 6.70e-03
2024-08-06 06:16:51,428 INFO [trainer.py:765] (7/8) Epoch 18, batch 800, train_loss[loss=2.826, ArTop10Accuracy=0.7416, over 10036.00 frames. ], tot_loss[loss=2.825, ArTop10Accuracy=0.7467, over 11673.34 frames. ], batch size: 12, lr: 6.68e-03
2024-08-06 06:17:22,913 INFO [trainer.py:765] (7/8) Epoch 18, batch 900, train_loss[loss=2.732, ArTop10Accuracy=0.7624, over 12841.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.748, over 11713.59 frames. ], batch size: 27, lr: 6.67e-03
2024-08-06 06:17:54,528 INFO [trainer.py:765] (7/8) Epoch 18, batch 1000, train_loss[loss=2.8, ArTop10Accuracy=0.7475, over 13072.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7464, over 11924.02 frames. ], batch size: 27, lr: 6.65e-03
2024-08-06 06:18:25,663 INFO [trainer.py:765] (7/8) Epoch 18, batch 1100, train_loss[loss=2.814, ArTop10Accuracy=0.7458, over 13834.00 frames. ], tot_loss[loss=2.838, ArTop10Accuracy=0.7444, over 11968.67 frames. ], batch size: 34, lr: 6.64e-03
2024-08-06 06:18:55,971 INFO [trainer.py:765] (7/8) Epoch 18, batch 1200, train_loss[loss=2.987, ArTop10Accuracy=0.7172, over 12370.00 frames. ], tot_loss[loss=2.84, ArTop10Accuracy=0.7439, over 11941.77 frames. ], batch size: 98, lr: 6.63e-03
2024-08-06 06:19:19,163 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.178e+02 1.387e+02 1.492e+02 1.607e+02 2.982e+02, threshold=2.983e+02, percent-clipped=0.1
2024-08-06 06:19:23,796 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 06:20:29,729 INFO [trainer.py:765] (7/8) Epoch 19, batch 100, train_loss[loss=2.899, ArTop10Accuracy=0.7365, over 14522.00 frames. ], tot_loss[loss=2.813, ArTop10Accuracy=0.7499, over 4776.03 frames. ], batch size: 61, lr: 6.43e-03
2024-08-06 06:21:11,275 INFO [trainer.py:765] (7/8) Epoch 19, batch 200, train_loss[loss=2.826, ArTop10Accuracy=0.7503, over 14068.00 frames. ], tot_loss[loss=2.8, ArTop10Accuracy=0.7525, over 7785.96 frames. ], batch size: 35, lr: 6.41e-03
2024-08-06 06:21:56,079 INFO [trainer.py:765] (7/8) Epoch 19, batch 300, train_loss[loss=2.851, ArTop10Accuracy=0.7433, over 14342.00 frames. ], tot_loss[loss=2.798, ArTop10Accuracy=0.753, over 9419.81 frames. ], batch size: 44, lr: 6.40e-03
2024-08-06 06:22:36,013 INFO [trainer.py:765] (7/8) Epoch 19, batch 400, train_loss[loss=2.761, ArTop10Accuracy=0.7579, over 10993.00 frames. ], tot_loss[loss=2.797, ArTop10Accuracy=0.7528, over 10328.28 frames. ], batch size: 15, lr: 6.39e-03
2024-08-06 06:23:18,998 INFO [trainer.py:765] (7/8) Epoch 19, batch 500, train_loss[loss=2.733, ArTop10Accuracy=0.7586, over 12113.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7532, over 10891.54 frames. ], batch size: 22, lr: 6.37e-03
2024-08-06 06:24:03,685 INFO [trainer.py:765] (7/8) Epoch 19, batch 600, train_loss[loss=2.733, ArTop10Accuracy=0.7635, over 11651.00 frames. ], tot_loss[loss=2.802, ArTop10Accuracy=0.7516, over 11428.00 frames. ], batch size: 18, lr: 6.36e-03
2024-08-06 06:24:46,186 INFO [trainer.py:765] (7/8) Epoch 19, batch 700, train_loss[loss=2.761, ArTop10Accuracy=0.7629, over 10146.00 frames. ], tot_loss[loss=2.806, ArTop10Accuracy=0.7508, over 11564.64 frames. ], batch size: 12, lr: 6.35e-03
2024-08-06 06:25:22,355 INFO [trainer.py:765] (7/8) Epoch 19, batch 800, train_loss[loss=2.826, ArTop10Accuracy=0.7441, over 10221.00 frames. ], tot_loss[loss=2.816, ArTop10Accuracy=0.7485, over 11678.87 frames. ], batch size: 12, lr: 6.33e-03
2024-08-06 06:25:53,625 INFO [trainer.py:765] (7/8) Epoch 19, batch 900, train_loss[loss=2.86, ArTop10Accuracy=0.7408, over 13085.00 frames. ], tot_loss[loss=2.812, ArTop10Accuracy=0.7493, over 11728.87 frames. ], batch size: 27, lr: 6.32e-03
2024-08-06 06:26:21,773 INFO [trainer.py:803] (7/8) Computing validation loss
2024-08-06 06:26:30,765 INFO [trainer.py:811] (7/8) Epoch 19, validation: loss=2.918, ArTop10Accuracy=0.733, over 1829298.00 frames.
2024-08-06 06:26:30,766 INFO [trainer.py:814] (7/8) Maximum memory allocated so far is 33330MB
2024-08-06 06:26:31,053 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.198e+02 1.416e+02 1.525e+02 1.662e+02 2.849e+02, threshold=3.050e+02, percent-clipped=0.0
2024-08-06 06:26:34,030 INFO [trainer.py:765] (7/8) Epoch 19, batch 1000, train_loss[loss=2.896, ArTop10Accuracy=0.733, over 13050.00 frames. ], tot_loss[loss=2.819, ArTop10Accuracy=0.7481, over 11956.93 frames. ], batch size: 27, lr: 6.31e-03
2024-08-06 06:27:05,190 INFO [trainer.py:765] (7/8) Epoch 19, batch 1100, train_loss[loss=2.718, ArTop10Accuracy=0.7668, over 13817.00 frames. ], tot_loss[loss=2.827, ArTop10Accuracy=0.7463, over 12009.26 frames. ], batch size: 34, lr: 6.30e-03
2024-08-06 06:27:35,453 INFO [trainer.py:765] (7/8) Epoch 19, batch 1200, train_loss[loss=2.966, ArTop10Accuracy=0.7175, over 11568.00 frames. ], tot_loss[loss=2.829, ArTop10Accuracy=0.7461, over 11933.03 frames. ], batch size: 99, lr: 6.28e-03
2024-08-06 06:28:00,587 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 06:29:08,985 INFO [trainer.py:765] (7/8) Epoch 20, batch 100, train_loss[loss=2.844, ArTop10Accuracy=0.7501, over 14479.00 frames. ], tot_loss[loss=2.801, ArTop10Accuracy=0.7528, over 4787.27 frames. ], batch size: 61, lr: 6.10e-03
2024-08-06 06:29:50,318 INFO [trainer.py:765] (7/8) Epoch 20, batch 200, train_loss[loss=2.834, ArTop10Accuracy=0.7514, over 13578.00 frames. ], tot_loss[loss=2.788, ArTop10Accuracy=0.7551, over 7764.62 frames. ], batch size: 34, lr: 6.09e-03
2024-08-06 06:30:37,106 INFO [trainer.py:765] (7/8) Epoch 20, batch 300, train_loss[loss=2.862, ArTop10Accuracy=0.7425, over 14228.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.754, over 9402.11 frames. ], batch size: 44, lr: 6.08e-03
2024-08-06 06:31:16,353 INFO [trainer.py:765] (7/8) Epoch 20, batch 400, train_loss[loss=2.766, ArTop10Accuracy=0.7557, over 11066.00 frames. ], tot_loss[loss=2.792, ArTop10Accuracy=0.7543, over 10323.43 frames. ], batch size: 15, lr: 6.07e-03
2024-08-06 06:32:03,759 INFO [trainer.py:765] (7/8) Epoch 20, batch 500, train_loss[loss=2.76, ArTop10Accuracy=0.7643, over 12218.00 frames. ], tot_loss[loss=2.789, ArTop10Accuracy=0.7548, over 10904.40 frames. ], batch size: 22, lr: 6.05e-03
2024-08-06 06:32:43,357 INFO [trainer.py:765] (7/8) Epoch 20, batch 600, train_loss[loss=2.783, ArTop10Accuracy=0.7616, over 11421.00 frames. ], tot_loss[loss=2.795, ArTop10Accuracy=0.7531, over 11422.83 frames. ], batch size: 18, lr: 6.04e-03
2024-08-06 06:33:36,752 INFO [trainer.py:765] (7/8) Epoch 20, batch 700, train_loss[loss=2.64, ArTop10Accuracy=0.7865, over 9276.00 frames. ], tot_loss[loss=2.805, ArTop10Accuracy=0.7511, over 11564.03 frames. ], batch size: 11, lr: 6.03e-03
2024-08-06 06:33:43,829 INFO [optim.py:386] (7/8) Clipping_scale=2.0, grad-norm quartiles 1.196e+02 1.417e+02 1.526e+02 1.639e+02 3.791e+02, threshold=3.052e+02, percent-clipped=0.1
2024-08-06 06:34:13,304 INFO [trainer.py:765] (7/8) Epoch 20, batch 800, train_loss[loss=2.722, ArTop10Accuracy=0.7663, over 10072.00 frames. ], tot_loss[loss=2.81, ArTop10Accuracy=0.7496, over 11684.37 frames. ], batch size: 12, lr: 6.02e-03
2024-08-06 06:34:44,580 INFO [trainer.py:765] (7/8) Epoch 20, batch 900, train_loss[loss=2.875, ArTop10Accuracy=0.7347, over 12869.00 frames. ], tot_loss[loss=2.806, ArTop10Accuracy=0.7507, over 11714.01 frames. ], batch size: 27, lr: 6.01e-03
2024-08-06 06:35:16,139 INFO [trainer.py:765] (7/8) Epoch 20, batch 1000, train_loss[loss=2.824, ArTop10Accuracy=0.7471, over 12908.00 frames. ], tot_loss[loss=2.812, ArTop10Accuracy=0.7495, over 11907.52 frames. ], batch size: 27, lr: 6.00e-03
2024-08-06 06:35:47,214 INFO [trainer.py:765] (7/8) Epoch 20, batch 1100, train_loss[loss=2.749, ArTop10Accuracy=0.7582, over 13606.00 frames. ], tot_loss[loss=2.817, ArTop10Accuracy=0.7485, over 11969.14 frames. ], batch size: 34, lr: 5.99e-03
2024-08-06 06:36:17,439 INFO [trainer.py:765] (7/8) Epoch 20, batch 1200, train_loss[loss=2.948, ArTop10Accuracy=0.7227, over 12958.00 frames. ], tot_loss[loss=2.818, ArTop10Accuracy=0.7484, over 11919.73 frames. ], batch size: 98, lr: 5.97e-03
2024-08-06 06:36:42,404 INFO [trainer.py:650] (7/8) Reaches end of dataloader.
2024-08-06 06:36:42,406 INFO [trainer.py:1069] (7/8) Done!