fix(internlm): Prevent errors by padding the dimensions of wrap tokens.

#2
by yun - opened

The `text_input` in a batch can contain texts of different lengths. In that case, the per-sample wrap tokens (and thus the tensors in `wrap_embeds_list`) end up with different sequence lengths, and `torch.cat` raises an error because the sizes do not match along the non-concatenation dimension.
I added padding to resolve this issue; an example of the error is shown below.
I would appreciate it if you could review this PR.
Error example:

```
    ret_val = func(*args, **kwargs)
  File "/tmp/ray/session_2024-02-05_14-30-33_881744_3780/runtime_resources/pip/40ae4806a7327971d7c077068c4b0a3019a14611/virtualenv/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1801, in forward
    loss = self.module(*inputs, **kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/pytorch_lightning/overrides/base.py", line 98, in forward
    output = self._forward_module.training_step(*inputs, **kwargs)
  File "/tmp/ray/session_2024-02-05_14-30-33_881744_3780/runtime_resources/py_modules_files/_ray_pkg_33670290aabc83b3/ml/model/application/vlm/place_vlm/llava/system_stage2_internlm.py", line 56, in training_step
    outputs = self.model(samples=batch)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/internlm2-7b/modeling_internlm_xcomposer2.py", line 337, in forward
    to_regress_embeds, attention_mask, targets, im_mask = self.interleav_wrap(
  File "/root/.cache/huggingface/modules/transformers_modules/internlm2-7b/modeling_internlm_xcomposer2.py", line 266, in interleav_wrap
    wrap_embeds = torch.cat(wrap_embeds_list)
RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 1230 but got size 3546 for tensor number 1 in the list.
```
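For reference, the fix boils down to right-padding each sample's tensors to the batch-wide maximum sequence length before concatenating them along the batch dimension. Below is a minimal sketch of that idea; only `wrap_embeds_list` appears in the traceback, while `pad_and_cat`, the other list names, and the pad values are illustrative assumptions rather than the exact diff:

```python
import torch

def pad_and_cat(tensor_list, pad_value=0.0):
    """Right-pad a list of (1, seq_len, ...) tensors to a common
    seq_len, then concatenate them along the batch dimension."""
    max_len = max(t.shape[1] for t in tensor_list)
    padded = []
    for t in tensor_list:
        pad_len = max_len - t.shape[1]
        if pad_len > 0:
            # Build a pad block that matches every dim except the sequence dim.
            pad_shape = list(t.shape)
            pad_shape[1] = pad_len
            pad = t.new_full(pad_shape, pad_value)
            t = torch.cat([t, pad], dim=1)
        padded.append(t)
    # All tensors now share the same seq_len, so the dim-0 cat succeeds.
    return torch.cat(padded, dim=0)

# Hypothetical usage inside interleav_wrap (names other than
# wrap_embeds_list are assumed for illustration):
# wrap_embeds = pad_and_cat(wrap_embeds_list)                   # embeddings: pad with 0.0
# wrap_atts = pad_and_cat(wrap_atts_list, pad_value=0)          # attention mask: 0 = ignore
# wrap_target = pad_and_cat(wrap_target_list, pad_value=-100)   # -100 = loss ignore index
```

Padding the attention mask with 0 and the targets with -100 (the default `ignore_index` of `CrossEntropyLoss`) keeps the padded positions out of both attention and the loss, assuming the standard conventions apply here.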
