2024-07-29 08:02:54,900 ----------------------------------------------------------------------------------------------------
2024-07-29 08:02:54,900 Training Model
2024-07-29 08:02:54,900 ----------------------------------------------------------------------------------------------------
2024-07-29 08:02:54,900 Translator(
  (encoder): EncoderLSTM(
    (embedding): Embedding(114, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 512, batch_first=True, bidirectional=True)
  )
  (decoder): DecoderLSTM(
    (embedding): Embedding(112, 300, padding_idx=0)
    (dropout): Dropout(p=0.1, inplace=False)
    (lstm): LSTM(300, 1024, batch_first=True)
    (hidden2vocab): Linear(in_features=1024, out_features=112, bias=True)
    (log_softmax): LogSoftmax(dim=-1)
  )
)
2024-07-29 08:02:54,900 ----------------------------------------------------------------------------------------------------
2024-07-29 08:02:54,900 Training Hyperparameters:
2024-07-29 08:02:54,900 - max_epochs: 10
2024-07-29 08:02:54,900 - learning_rate: 0.001
2024-07-29 08:02:54,900 - batch_size: 128
2024-07-29 08:02:54,900 - patience: 5
2024-07-29 08:02:54,900 - scheduler_patience: 3
2024-07-29 08:02:54,900 - teacher_forcing_ratio: 0.5
2024-07-29 08:02:54,900 ----------------------------------------------------------------------------------------------------
2024-07-29 08:02:54,900 Computational Parameters:
2024-07-29 08:02:54,900 - num_workers: 4
2024-07-29 08:02:54,900 - device: device(type='cuda', index=0)
2024-07-29 08:02:54,900 ----------------------------------------------------------------------------------------------------
2024-07-29 08:02:54,900 Dataset Splits:
2024-07-29 08:02:54,900 - train: 133623 data points
2024-07-29 08:02:54,900 - dev: 19090 data points
2024-07-29 08:02:54,900 - test: 38179 data points
2024-07-29 08:02:54,900 ----------------------------------------------------------------------------------------------------
2024-07-29 08:02:54,900 EPOCH 1
2024-07-29 08:03:53,216 batch 104/1044 - loss 2.73587249 - lr 0.0010 - time 58.32s
2024-07-29 08:04:49,000 batch 208/1044 - loss 2.60424047 - lr 0.0010 - time 114.10s
2024-07-29 08:05:41,466 batch 312/1044 - loss 2.53508188 - lr 0.0010 - time 166.57s
2024-07-29 08:06:40,078 batch 416/1044 - loss 2.48822718 - lr 0.0010 - time 225.18s
2024-07-29 08:07:39,192 batch 520/1044 - loss 2.45172488 - lr 0.0010 - time 284.29s
2024-07-29 08:08:34,794 batch 624/1044 - loss 2.42237450 - lr 0.0010 - time 339.89s
2024-07-29 08:09:31,742 batch 728/1044 - loss 2.39725696 - lr 0.0010 - time 396.84s
2024-07-29 08:10:29,289 batch 832/1044 - loss 2.37658248 - lr 0.0010 - time 454.39s
2024-07-29 08:11:26,901 batch 936/1044 - loss 2.35841719 - lr 0.0010 - time 512.00s
2024-07-29 08:12:23,384 batch 1040/1044 - loss 2.34286929 - lr 0.0010 - time 568.48s
2024-07-29 08:12:25,463 ----------------------------------------------------------------------------------------------------
2024-07-29 08:12:25,466 EPOCH 1 DONE
2024-07-29 08:12:53,761 TRAIN Loss: 2.3424
2024-07-29 08:12:53,761 DEV Loss: 3.0900
2024-07-29 08:12:53,761 DEV Perplexity: 21.9763
2024-07-29 08:12:53,761 New best score!
2024-07-29 08:12:53,762 ----------------------------------------------------------------------------------------------------
2024-07-29 08:12:53,762 EPOCH 2
2024-07-29 08:13:52,690 batch 104/1044 - loss 2.18471333 - lr 0.0010 - time 58.93s
2024-07-29 08:14:49,182 batch 208/1044 - loss 2.17075370 - lr 0.0010 - time 115.42s
2024-07-29 08:15:46,935 batch 312/1044 - loss 2.16476290 - lr 0.0010 - time 173.17s
2024-07-29 08:16:43,859 batch 416/1044 - loss 2.15894632 - lr 0.0010 - time 230.10s
2024-07-29 08:17:40,100 batch 520/1044 - loss 2.15486173 - lr 0.0010 - time 286.34s
2024-07-29 08:18:36,802 batch 624/1044 - loss 2.15043282 - lr 0.0010 - time 343.04s
2024-07-29 08:19:30,048 batch 728/1044 - loss 2.14694826 - lr 0.0010 - time 396.29s
2024-07-29 08:20:28,026 batch 832/1044 - loss 2.14419594 - lr 0.0010 - time 454.26s
2024-07-29 08:21:25,600 batch 936/1044 - loss 2.14010675 - lr 0.0010 - time 511.84s
2024-07-29 08:22:25,420 batch 1040/1044 - loss 2.13670866 - lr 0.0010 - time 571.66s
2024-07-29 08:22:27,758 ----------------------------------------------------------------------------------------------------
2024-07-29 08:22:27,762 EPOCH 2 DONE
2024-07-29 08:22:55,713 TRAIN Loss: 2.1365
2024-07-29 08:22:55,714 DEV Loss: 3.1892
2024-07-29 08:22:55,714 DEV Perplexity: 24.2695
2024-07-29 08:22:55,714 No improvement for 1 epoch(s)
2024-07-29 08:22:55,714 ----------------------------------------------------------------------------------------------------
2024-07-29 08:22:55,714 EPOCH 3
2024-07-29 08:23:53,091 batch 104/1044 - loss 2.08751571 - lr 0.0010 - time 57.38s
2024-07-29 08:24:50,619 batch 208/1044 - loss 2.08733297 - lr 0.0010 - time 114.90s
2024-07-29 08:25:48,704 batch 312/1044 - loss 2.08495532 - lr 0.0010 - time 172.99s
2024-07-29 08:26:46,137 batch 416/1044 - loss 2.08294034 - lr 0.0010 - time 230.42s
2024-07-29 08:27:41,812 batch 520/1044 - loss 2.08286387 - lr 0.0010 - time 286.10s
2024-07-29 08:28:37,415 batch 624/1044 - loss 2.07837076 - lr 0.0010 - time 341.70s
2024-07-29 08:29:35,773 batch 728/1044 - loss 2.07550259 - lr 0.0010 - time 400.06s
2024-07-29 08:30:32,773 batch 832/1044 - loss 2.07277058 - lr 0.0010 - time 457.06s
2024-07-29 08:31:31,914 batch 936/1044 - loss 2.06922043 - lr 0.0010 - time 516.20s
2024-07-29 08:32:27,776 batch 1040/1044 - loss 2.06737263 - lr 0.0010 - time 572.06s
2024-07-29 08:32:29,985 ----------------------------------------------------------------------------------------------------
2024-07-29 08:32:29,987 EPOCH 3 DONE
2024-07-29 08:32:58,150 TRAIN Loss: 2.0673
2024-07-29 08:32:58,150 DEV Loss: 3.2107
2024-07-29 08:32:58,150 DEV Perplexity: 24.7975
2024-07-29 08:32:58,150 No improvement for 2 epoch(s)
2024-07-29 08:32:58,150 ----------------------------------------------------------------------------------------------------
2024-07-29 08:32:58,150 EPOCH 4
2024-07-29 08:33:55,013 batch 104/1044 - loss 2.04089623 - lr 0.0010 - time 56.86s
2024-07-29 08:34:52,898 batch 208/1044 - loss 2.03903778 - lr 0.0010 - time 114.75s
2024-07-29 08:35:51,119 batch 312/1044 - loss 2.03777666 - lr 0.0010 - time 172.97s
2024-07-29 08:36:48,063 batch 416/1044 - loss 2.03265216 - lr 0.0010 - time 229.91s
2024-07-29 08:37:43,123 batch 520/1044 - loss 2.03068389 - lr 0.0010 - time 284.97s
2024-07-29 08:38:42,281 batch 624/1044 - loss 2.02925459 - lr 0.0010 - time 344.13s
2024-07-29 08:39:38,619 batch 728/1044 - loss 2.02635143 - lr 0.0010 - time 400.47s
2024-07-29 08:40:34,110 batch 832/1044 - loss 2.02490569 - lr 0.0010 - time 455.96s
2024-07-29 08:41:30,332 batch 936/1044 - loss 2.02244815 - lr 0.0010 - time 512.18s
2024-07-29 08:42:26,605 batch 1040/1044 - loss 2.02155263 - lr 0.0010 - time 568.45s
2024-07-29 08:42:28,905 ----------------------------------------------------------------------------------------------------
2024-07-29 08:42:28,908 EPOCH 4 DONE
2024-07-29 08:42:56,907 TRAIN Loss: 2.0215
2024-07-29 08:42:56,908 DEV Loss: 3.3884
2024-07-29 08:42:56,908 DEV Perplexity: 29.6186
2024-07-29 08:42:56,908 No improvement for 3 epoch(s)
2024-07-29 08:42:56,908 ----------------------------------------------------------------------------------------------------
2024-07-29 08:42:56,908 EPOCH 5
2024-07-29 08:43:54,221 batch 104/1044 - loss 1.99417387 - lr 0.0010 - time 57.31s
2024-07-29 08:44:52,997 batch 208/1044 - loss 1.98792041 - lr 0.0010 - time 116.09s
2024-07-29 08:45:48,150 batch 312/1044 - loss 1.99154850 - lr 0.0010 - time 171.24s
2024-07-29 08:46:45,419 batch 416/1044 - loss 1.99533101 - lr 0.0010 - time 228.51s
2024-07-29 08:47:44,326 batch 520/1044 - loss 1.99671145 - lr 0.0010 - time 287.42s
2024-07-29 08:48:42,269 batch 624/1044 - loss 1.99625001 - lr 0.0010 - time 345.36s
2024-07-29 08:49:37,222 batch 728/1044 - loss 1.99431187 - lr 0.0010 - time 400.31s
2024-07-29 08:50:32,593 batch 832/1044 - loss 1.99355745 - lr 0.0010 - time 455.68s
2024-07-29 08:51:28,854 batch 936/1044 - loss 1.99387271 - lr 0.0010 - time 511.95s
2024-07-29 08:52:26,219 batch 1040/1044 - loss 1.99341333 - lr 0.0010 - time 569.31s
2024-07-29 08:52:28,341 ----------------------------------------------------------------------------------------------------
2024-07-29 08:52:28,343 EPOCH 5 DONE
2024-07-29 08:52:56,407 TRAIN Loss: 1.9933
2024-07-29 08:52:56,407 DEV Loss: 3.4417
2024-07-29 08:52:56,407 DEV Perplexity: 31.2411
2024-07-29 08:52:56,408 No improvement for 4 epoch(s)
2024-07-29 08:52:56,408 ----------------------------------------------------------------------------------------------------
2024-07-29 08:52:56,408 EPOCH 6
2024-07-29 08:53:53,603 batch 104/1044 - loss 1.94269003 - lr 0.0001 - time 57.20s
2024-07-29 08:54:47,873 batch 208/1044 - loss 1.94458400 - lr 0.0001 - time 111.47s
2024-07-29 08:55:46,532 batch 312/1044 - loss 1.94421510 - lr 0.0001 - time 170.12s
2024-07-29 08:56:43,219 batch 416/1044 - loss 1.94502676 - lr 0.0001 - time 226.81s
2024-07-29 08:57:40,699 batch 520/1044 - loss 1.94408302 - lr 0.0001 - time 284.29s
2024-07-29 08:58:39,775 batch 624/1044 - loss 1.94322916 - lr 0.0001 - time 343.37s
2024-07-29 08:59:37,572 batch 728/1044 - loss 1.94293834 - lr 0.0001 - time 401.16s
2024-07-29 09:00:35,569 batch 832/1044 - loss 1.94358721 - lr 0.0001 - time 459.16s
2024-07-29 09:01:31,771 batch 936/1044 - loss 1.94235871 - lr 0.0001 - time 515.36s
2024-07-29 09:02:29,148 batch 1040/1044 - loss 1.94160419 - lr 0.0001 - time 572.74s
2024-07-29 09:02:31,339 ----------------------------------------------------------------------------------------------------
2024-07-29 09:02:31,341 EPOCH 6 DONE
2024-07-29 09:02:59,432 TRAIN Loss: 1.9417
2024-07-29 09:02:59,433 DEV Loss: 3.2860
2024-07-29 09:02:59,433 DEV Perplexity: 26.7353
2024-07-29 09:02:59,433 No improvement for 5 epoch(s)
2024-07-29 09:02:59,433 Patience reached: Terminating model training due to early stopping
2024-07-29 09:02:59,433 ----------------------------------------------------------------------------------------------------
2024-07-29 09:02:59,433 Finished Training
2024-07-29 09:03:55,129 TEST Perplexity: 21.9740
2024-07-29 09:14:15,480 TEST BLEU = 4.83 42.3/11.5/2.1/0.5 (BP = 1.000 ratio = 1.000 hyp_len = 97 ref_len = 97)
|