WARNING:root:Dropping 0 rows
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
/baie/nfs-cluster-1/data1/raid1/homedirs/eliot.maes/multimodal-itmodels/experiments/./ES_corlec is already a clone of https://huggingface.co/maesneako/ES_corlec. Make sure you pull the latest changes with `repo.git_pull()`.
WARNING:huggingface_hub.repository:/baie/nfs-cluster-1/data1/raid1/homedirs/eliot.maes/multimodal-itmodels/experiments/./ES_corlec is already a clone of https://huggingface.co/maesneako/ES_corlec. Make sure you pull the latest changes with `repo.git_pull()`.
The following columns in the training set don't have a corresponding argument in `GPT2LMHeadModel.forward` and have been ignored: text_input_ids, text_u_full, index, text, text_u, start_idx, speaker, text_input_ids_full, __index_level_0__, length, file. If text_input_ids, text_u_full, index, text, text_u, start_idx, speaker, text_input_ids_full, __index_level_0__, length, file are not expected by `GPT2LMHeadModel.forward`, you can safely ignore this message.
/baie/nfs-cluster-1/data1/raid1/homedirs/eliot.maes/env/lib/python3.6/site-packages/transformers/optimization.py:309: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  FutureWarning,
***** Running training *****
  Num examples = 80691
  Num Epochs = 10
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 50440
  0%| | 0/50440 [00:00
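None of the messages above are fatal: the repo-clone warning only means the output directory already tracks the maesneako/ES_corlec Hub repo, and the column message reflects the Trainer's default remove_unused_columns=True. The truncation and AdamW warnings are actionable, though. Below is a minimal sketch of how they could be silenced, assuming the run uses the Hugging Face Trainer with a GPT-2 checkpoint; the checkpoint name, toy dataset, learning rate, and max length are illustrative placeholders, not the settings of this run.

import torch
from datasets import Dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
    get_linear_schedule_with_warmup,
)

model = GPT2LMHeadModel.from_pretrained("gpt2")        # placeholder checkpoint
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token              # GPT-2 has no pad token by default

# "Asking to truncate to max_length but no maximum length is provided":
# give the tokenizer an explicit bound so truncation is actually applied.
tokenizer.model_max_length = 1024

# Tiny illustrative dataset; the real run had 80,691 examples.
toy = Dataset.from_dict({"text": ["hola", "buenos días", "¿qué tal?"]})
train_dataset = toy.map(
    lambda batch: tokenizer(batch["text"], truncation=True),
    batched=True,
)
# The leftover "text" column is what triggers the "columns ... have been
# ignored" message; the Trainer drops it automatically.

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="./ES_corlec",            # matches the output directory in the log
    num_train_epochs=10,
    per_device_train_batch_size=16,
)

# "This implementation of AdamW is deprecated": build the PyTorch optimizer
# and scheduler yourself and pass them to the Trainer instead of letting it
# create the deprecated transformers AdamW.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)   # placeholder lr
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=50440  # steps from the log
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=collator,
    optimizers=(optimizer, scheduler),
)
trainer.train()

With the optimizer supplied through the optimizers argument, the FutureWarning from transformers/optimization.py no longer appears; alternatively, the run could keep the built-in optimizer and pass no_deprecation_warning=True as the warning suggests.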