CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations · 11 authors