DAMO-NLP-SG/VideoLLaMA2-8x7B-Base

merve (HF staff) committed on Aug 24

Commit 6fed7d6 • 1 Parent(s): bab71d8

fix task tag

Files changed (1)

README.md +2 -2

@@ -9,7 +9,7 @@ language:
 metrics:
 - accuracy
 library_name: transformers
-pipeline_tag: visual-question-answering
+pipeline_tag: video-text-to-text
 tags:
 - multimodal large language model
 - large video-language model
@@ -103,4 +103,4 @@ If you find VideoLLaMA useful for your research and applications, please cite us
   year = {2023},
   url = {https://arxiv.org/abs/2306.02858}
 }
-```
+```
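
For readers who want to reproduce the tag change locally, here is a minimal sketch of the same edit applied to a model card string. It is not part of the commit. `set_pipeline_tag` is a hypothetical helper, and the sample card assumes the metadata sits in a `---`-delimited YAML front matter block at the top of README.md, which the diff above only shows in part.

```python
import re


def set_pipeline_tag(readme_text: str, new_tag: str) -> str:
    """Return readme_text with pipeline_tag in the YAML front matter set to new_tag.

    Hypothetical helper for illustration; assumes the card starts with a
    '---' delimited front matter block.
    """
    match = re.match(r"^---\n(.*?)\n---\n", readme_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no YAML front matter found")
    front = match.group(1)
    # Replace an existing pipeline_tag line, or append one if none exists.
    updated, count = re.subn(
        r"^pipeline_tag:.*$",
        f"pipeline_tag: {new_tag}",
        front,
        flags=re.MULTILINE,
    )
    if count == 0:
        updated = front + f"\npipeline_tag: {new_tag}"
    # Splice the edited front matter back, leaving the card body untouched.
    return readme_text[: match.start(1)] + updated + readme_text[match.end(1):]


# Sample card built from the metadata lines visible in the diff (assumed layout).
example = """---
metrics:
- accuracy
library_name: transformers
pipeline_tag: visual-question-answering
tags:
- multimodal large language model
- large video-language model
---
# Model card body
"""

result = set_pipeline_tag(example, "video-text-to-text")
print(result)
```

Only the `pipeline_tag` line changes; the other metadata keys and the card body pass through unmodified, mirroring the first hunk of the diff.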