Before load dataset, RAM used: 0.38 GB | Available: 51.32 GB | Left: 50.94 GB
Dataset({ features: ['text'], num_rows: 7625355 })
After load dataset, RAM used: 1.78 GB | Available: 50.99 GB | Left: 49.22 GB
After Prepare Dataloader, RAM used: 2.30 GB | Available: 46.91 GB | Left: 44.61 GB
After epoch 1, RAM used: 3.57 GB | Available: 51.76 GB | Left: 48.19 GB
>>> Epoch 1: Perplexity: 21.590424136414132 Loss: 2.6214573416621008
Loss improved inf -> 2.6214573416621008
Saved training checkpoint
After epoch 2, RAM used: 3.53 GB | Available: 57.51 GB | Left: 53.98 GB
>>> Epoch 2: Perplexity: 14.599685600614704 Loss: 2.4090022345374362
Loss improved 2.6214573416621008 -> 2.4090022345374362
Saved training checkpoint
After epoch 3, RAM used: 3.57 GB | Available: 57.47 GB | Left: 53.89 GB
>>> Epoch 3: Perplexity: 11.945347969346386 Loss: 2.3160537090725533
Loss improved 2.4090022345374362 -> 2.3160537090725533
Saved training checkpoint
After epoch 4, RAM used: 3.58 GB | Available: 57.47 GB | Left: 53.90 GB
>>> Epoch 4: Perplexity: 11.61526404278414 Loss: 2.2554892432450773
Loss improved 2.3160537090725533 -> 2.2554892432450773
Saved training checkpoint
After epoch 5, RAM used: 3.58 GB | Available: 52.30 GB | Left: 48.72 GB
>>> Epoch 5: Perplexity: 10.940166697585614 Loss: 2.211717551305325
Loss improved 2.2554892432450773 -> 2.211717551305325
Saved training checkpoint
After epoch 6, RAM used: 3.58 GB | Available: 57.48 GB | Left: 53.90 GB
>>> Epoch 6: Perplexity: 9.703375475135896 Loss: 2.1727879395655756
Loss improved 2.211717551305325 -> 2.1727879395655756
Saved training checkpoint
After epoch 7, RAM used: 3.58 GB | Available: 57.46 GB | Left: 53.89 GB
>>> Epoch 7: Perplexity: 9.611460056753156 Loss: 2.139744746335077
Loss improved 2.1727879395655756 -> 2.139744746335077
Saved training checkpoint
After epoch 8, RAM used: 3.58 GB | Available: 57.45 GB | Left: 53.88 GB
>>> Epoch 8: Perplexity: 9.253905521907615 Loss: 2.112046389352141
Loss improved 2.139744746335077 -> 2.112046389352141
Saved training checkpoint
After epoch 9, RAM used: 3.58 GB | Available: 57.45 GB | Left: 53.87 GB
>>> Epoch 9: Perplexity: 8.987782076156853 Loss: 2.088010660353402
Loss improved 2.112046389352141 -> 2.088010660353402
Saved training checkpoint
After epoch 10, RAM used: 3.58 GB | Available: 57.46 GB | Left: 53.89 GB
>>> Epoch 10: Perplexity: 8.989515812427134 Loss: 2.0606724782881045
Loss improved 2.088010660353402 -> 2.0606724782881045
Saved training checkpoint
After epoch 11, RAM used: 3.58 GB | Available: 57.48 GB | Left: 53.91 GB
>>> Epoch 11: Perplexity: 8.596087054176957 Loss: 2.0449221819049317
Loss improved 2.0606724782881045 -> 2.0449221819049317
Saved training checkpoint
After epoch 12, RAM used: 3.58 GB | Available: 57.43 GB | Left: 53.85 GB
>>> Epoch 12: Perplexity: 8.133701984722778 Loss: 2.0227558158141825
Loss improved 2.0449221819049317 -> 2.0227558158141825
Saved training checkpoint
After epoch 13, RAM used: 3.58 GB | Available: 57.47 GB | Left: 53.89 GB
>>> Epoch 13: Perplexity: 8.120926713195095 Loss: 1.9996797158441224
Loss improved 2.0227558158141825 -> 1.9996797158441224
Saved training checkpoint
After epoch 14, RAM used: 3.58 GB | Available: 57.47 GB | Left: 53.90 GB
>>> Epoch 14: Perplexity: 7.887189857322398 Loss: 1.986047686118474
Loss improved 1.9996797158441224 -> 1.986047686118474
Saved training checkpoint
After epoch 15, RAM used: 3.58 GB | Available: 55.84 GB | Left: 52.26 GB
>>> Epoch 15: Perplexity: 7.687336654327518 Loss: 1.9684794586581946
Loss improved 1.986047686118474 -> 1.9684794586581946
Saved training checkpoint
After epoch 16, RAM used: 3.58 GB | Available: 57.46 GB | Left: 53.89 GB
>>> Epoch 16: Perplexity: 7.549170162424603 Loss: 1.9527537560472294
Loss improved 1.9684794586581946 -> 1.9527537560472294
Saved training checkpoint
After epoch 17, RAM used: 3.58 GB | Available: 57.46 GB | Left: 53.88 GB
>>> Epoch 17: Perplexity: 7.614053464896555 Loss: 1.9367966973686213
Loss improved 1.9527537560472294 -> 1.9367966973686213
Saved training checkpoint
After epoch 18, RAM used: 3.58 GB | Available: 57.47 GB | Left: 53.89 GB
>>> Epoch 18: Perplexity: 7.137354318384541 Loss: 1.9238058941700293
Loss improved 1.9367966973686213 -> 1.9238058941700293
Saved training checkpoint
After epoch 19, RAM used: 3.58 GB | Available: 51.08 GB | Left: 47.50 GB
>>> Epoch 19: Perplexity: 7.229089215165024 Loss: 1.9130641898584901
Loss improved 1.9238058941700293 -> 1.9130641898584901
Saved training checkpoint
After epoch 20, RAM used: 3.56 GB | Available: 51.37 GB | Left: 47.80 GB
>>> Epoch 20: Perplexity: 7.165172113154145 Loss: 1.9069529029687642
Loss improved 1.9130641898584901 -> 1.9069529029687642
Saved training checkpoint
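The log lines above could be produced by a small logging helper along the following lines. This is a minimal sketch, not the actual training script: the function names (`format_ram_line`, `epoch_summary`) are hypothetical, and in a real run the memory figures would come from something like `psutil.virtual_memory()` rather than being passed in. Note that the "Left" column appears to be simply Available minus used (computed before rounding, since a few epochs differ in the last digit), and that the logged perplexities are not `exp` of the printed losses, which suggests the perplexity is computed on a separate evaluation pass while the printed loss is the mean training loss.

```python
import math


def format_ram_line(stage: str, used_gb: float, available_gb: float) -> str:
    """Build a memory line in the log's format; 'Left' is available minus used."""
    return (
        f"{stage}, RAM used: {used_gb:.2f} GB | "
        f"Available: {available_gb:.2f} GB | "
        f"Left: {available_gb - used_gb:.2f} GB"
    )


def epoch_summary(epoch: int, eval_loss: float) -> str:
    """Perplexity is conventionally exp(mean cross-entropy loss) on the eval set."""
    return f">>> Epoch {epoch}: Perplexity: {math.exp(eval_loss)} Loss: {eval_loss}"


# Best-loss tracking that would emit the 'Loss improved ... -> ...' lines.
# Starting from infinity makes the first epoch always count as an improvement,
# matching the 'Loss improved inf -> ...' entry for epoch 1.
best_loss = float("inf")
for epoch, loss in enumerate([2.6214573416621008, 2.4090022345374362], start=1):
    if loss < best_loss:
        print(f"Loss improved {best_loss} -> {loss}")
        best_loss = loss
        # A real run would write a checkpoint here (the exact save call is
        # not shown in the log, so it is omitted from this sketch).
        print("Saved training checkpoint")
```

Checkpointing only on improvement explains why every epoch in this run saved a checkpoint: the loss decreased monotonically from 2.62 to 1.91 across all 20 epochs.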