CaRe
This collection contains the CaReBench data, the CaRe models, and all the contrastively trained MLLMs (including InternVL2, MiniCPM-V 2.6, LLaVA-NeXT-Video, Qwen2-VL, and Tarsier).
Yifan Xu, Xinhao Li, Yichun Yang, Desen Meng, Rui Huang, Limin Wang
This is the CaRe checkpoint after Stage-I training. It can only handle video captioning tasks; refer to our paper for details.
Loading from the Hugging Face remote path has not been tested. We recommend downloading this checkpoint to your local environment to avoid potential issues.
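For example, the checkpoint can be fetched with `huggingface_hub.snapshot_download`; the repository ID and target directory below are placeholders, so substitute the actual values for this checkpoint:

from huggingface_hub import snapshot_download

# Download the full checkpoint to a local directory.
# NOTE: repo_id is a placeholder; replace it with the actual
# Hugging Face repository name for CaRe-7B-Stage-1.
snapshot_download(
    repo_id='<org>/CaRe-7B-Stage-1',
    local_dir='path/to/checkpoints/CaRe-7B-Stage-1',
)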
from utils.video import read_frames_decord
from models.modeling_captioners import AutoCaptioner

# Load the Stage-I captioner from the local checkpoint directory
captioner = AutoCaptioner.from_pretrained('path/to/checkpoints/CaRe-7B-Stage-1')
# Sample 32 frames from the demo video
frames = read_frames_decord(video_path='assets/demo.mp4', num_frames=32)
# Add a batch dimension and generate one caption per video in the batch
description = captioner.describe(frames.unsqueeze(0))
print(description[0])
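If `describe` accepts a batch along the first dimension, as the `unsqueeze(0)` and `description[0]` usage above suggests, several videos can be captioned in one call. The following is only a sketch under that assumption, and the second video path is a placeholder:

import torch

# Stack per-video frame tensors into one batch (assumes each call to
# read_frames_decord returns a tensor of identical shape).
video_paths = ['assets/demo.mp4', 'path/to/another_video.mp4']  # placeholder paths
batch = torch.stack([
    read_frames_decord(video_path=p, num_frames=32) for p in video_paths
])
descriptions = captioner.describe(batch)
for path, caption in zip(video_paths, descriptions):
    print(f'{path}: {caption}')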