Before load dataset, RAM used: 0.36 GB | Avaiable: 200.44 GB | Left: 200.08 GB
After load dataset, RAM used: 0.61 GB | Avaiable: 200.42 GB | Left: 199.80 GB
After Prepare Dataloader, RAM used: 6.94 GB | Avaiable: 203.21 GB | Left: 196.27 GB
After epoch 1, RAM used: 27.96 GB | Avaiable: 206.69 GB | Left: 178.73 GB
>>> Epoch 1: Perplexity: 8.71756492102604 Loss: 1.987068991905017
Loss improved inf -> 1.987068991905017
Saved training checkpoint
After epoch 2, RAM used: 32.61 GB | Avaiable: 217.10 GB | Left: 184.49 GB
>>> Epoch 2: Perplexity: 6.334900481327126 Loss: 1.7609998571673966
Loss improved 1.987068991905017 -> 1.7609998571673966
Saved training checkpoint
After epoch 3, RAM used: 32.60 GB | Avaiable: 220.86 GB | Left: 188.26 GB
>>> Epoch 3: Perplexity: 5.783759349968259 Loss: 1.714263437903529
Loss improved 1.7609998571673966 -> 1.714263437903529
Saved training checkpoint
After epoch 4, RAM used: 32.33 GB | Avaiable: 192.09 GB | Left: 159.77 GB
>>> Epoch 4: Perplexity: 4.530374708434244 Loss: 1.5004713556098934
Loss improved 1.714263437903529 -> 1.5004713556098934
Saved training checkpoint
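
The RAM lines in the log follow a fixed pattern, and the "Left" figure appears to equal the available figure minus the used figure, up to rounding (for example 200.44 - 0.36 = 200.08). A minimal sketch for pulling those figures out of the log and seeing how much used RAM grows between stages is given below. The names `RAM_LINE`, `parse_ram_log` and `used_deltas` are hypothetical helpers written for this illustration, not part of the training script that produced the log, and the pattern matches the log's own spelling "Avaiable".

```python
import re

# Matches lines such as:
#   After epoch 1, RAM used: 27.96 GB | Avaiable: 206.69 GB | Left: 178.73 GB
# "Avaiable" is spelled exactly as it appears in the log output.
RAM_LINE = re.compile(
    r"(?P<stage>(?:Before|After) [^,\n]+), RAM used: (?P<used>[\d.]+) GB"
    r" \| Avaiable: (?P<avail>[\d.]+) GB \| Left: (?P<left>[\d.]+) GB"
)


def parse_ram_log(text):
    """Return a list of (stage, used_gb, available_gb, left_gb) tuples."""
    return [
        (m["stage"], float(m["used"]), float(m["avail"]), float(m["left"]))
        for m in RAM_LINE.finditer(text)
    ]


def used_deltas(rows):
    """Return (stage, growth in used RAM since the previous stage) pairs."""
    return [
        (cur[0], round(cur[1] - prev[1], 2))
        for prev, cur in zip(rows, rows[1:])
    ]


# First four RAM lines copied from the log above.
sample = (
    "Before load dataset, RAM used: 0.36 GB | Avaiable: 200.44 GB | Left: 200.08 GB\n"
    "After load dataset, RAM used: 0.61 GB | Avaiable: 200.42 GB | Left: 199.80 GB\n"
    "After Prepare Dataloader, RAM used: 6.94 GB | Avaiable: 203.21 GB | Left: 196.27 GB\n"
    "After epoch 1, RAM used: 27.96 GB | Avaiable: 206.69 GB | Left: 178.73 GB\n"
)

rows = parse_ram_log(sample)
for stage, delta in used_deltas(rows):
    print(f"{stage}: {delta:+.2f} GB")
```

On these lines the largest jump is during epoch 1 (27.96 - 6.94 = 21.02 GB). In the full log, used RAM then stays close to 32 GB from epoch 2 onward.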