---
license: mit
pipeline_tag: video-text-to-text
datasets:
- liuhaotian/LLaVA-Instruct-150K
- OpenGVLab/VideoChat2-IT
language:
- en
---
Paper: https://huggingface.co/papers/2409.01071