ryanzhangfan
add support for batch multimodal understanding
fd16886 verified