data: cnn_dailymail
split: 0.001