Before load dataset, RAM used: 0.36 GB | Avaiable: 233.95 GB | Left: 233.59 GB
Dataset({
    features: ['text'],
    num_rows: 112759113
})
After load dataset, RAM used: 6.51 GB | Avaiable: 233.75 GB | Left: 227.23 GB
After Prepare Dataloader, RAM used: 37.51 GB | Avaiable: 234.01 GB | Left: 196.51 GB
After epoch 1, RAM used: 38.33 GB | Avaiable: 214.05 GB | Left: 175.72 GB
>>> Epoch 1: Perplexity: 12.33546676560462 Loss: 2.109314857722502
Loss improved inf -> 2.109314857722502
Saved training checkpoint
After epoch 2, RAM used: 38.32 GB | Avaiable: 211.02 GB | Left: 172.71 GB
>>> Epoch 2: Perplexity: 9.42495984722383 Loss: 1.9333453324274112
Loss improved 2.109314857722502 -> 1.9333453324274112
Saved training checkpoint
After epoch 3, RAM used: 38.32 GB | Avaiable: 214.43 GB | Left: 176.11 GB
>>> Epoch 3: Perplexity: 7.812616130125877 Loss: 1.8017513724170475
Loss improved 1.9333453324274112 -> 1.8017513724170475
Saved training checkpoint
After epoch 4, RAM used: 38.32 GB | Avaiable: 213.09 GB | Left: 174.77 GB
>>> Epoch 4: Perplexity: 5.710954100841271 Loss: 1.6808245233119454
Loss improved 1.8017513724170475 -> 1.6808245233119454
Saved training checkpoint
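
For reference, a minimal sketch of a helper that could produce memory lines in the shape shown in the log above. The log itself only shows that "Left" equals "Avaiable" minus "RAM used". Everything else here is an assumption: that "RAM used" is the resident set size of the training process, that "Avaiable" is system-wide available memory, that the figures are in GiB, and that the third-party psutil package is installed. The names format_ram and log_ram are hypothetical and are not taken from the original script.

```python
import os

GIB = 1024 ** 3  # assumption: the log reports binary gigabytes


def format_ram(stage, used_bytes, available_bytes):
    """Build one memory line in the same shape as the log above."""
    used_gb = used_bytes / GIB
    available_gb = available_bytes / GIB
    # In the log, "Left" matches available minus used.
    left_gb = available_gb - used_gb
    # "Avaiable" is spelled as in the log so the lines stay comparable.
    return (
        f"{stage}, RAM used: {used_gb:.2f} GB | "
        f"Avaiable: {available_gb:.2f} GB | Left: {left_gb:.2f} GB"
    )


def log_ram(stage):
    """Print the current memory line (hypothetical helper, needs psutil)."""
    import psutil  # third-party; assumed to be installed

    # Assumption: "RAM used" means the RSS of the current process.
    used = psutil.Process(os.getpid()).memory_info().rss
    # Assumption: "Avaiable" means system-wide available memory.
    available = psutil.virtual_memory().available
    print(format_ram(stage, used, available))
```

Calling log_ram("After epoch 1") at the end of each epoch would give one line per epoch. Because RSS only covers the calling process, memory held by dataloader worker processes would not show up in "RAM used" under these assumptions, only as a drop in available memory.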