DAMO-NLP-SG/VideoLLaMA2-7B-16F-Base


ClownRat committed on Aug 13

Commit 28913a8 • 1 Parent(s): bbb42dc

Update config.json

Files changed (1)

config.json +1 -1


@@ -21,7 +21,7 @@
   "mm_vision_select_feature": "patch",
   "mm_vision_select_layer": -2,
   "mm_vision_tower": "openai/clip-vit-large-patch14-336",
-  "model_type": "mistral",
+  "model_type": "videollama2_mistral",
   "num_attention_heads": 32,
   "num_frames": 16,
   "num_hidden_layers": 32,
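
The commit does not state its motivation. Loaders that dispatch on `model_type` would treat the old value as a plain Mistral checkpoint, so the change presumably lets them select the VideoLLaMA2 variant instead. A minimal sketch for checking a local copy of the config follows. It uses only the keys visible in the diff, and `check_model_type` is a hypothetical helper, not part of the repository or of any library.

```python
import json

# Fragment of config.json restricted to the keys shown in the diff;
# the real file contains more fields that are not reproduced here.
CONFIG_AFTER = """
{
  "mm_vision_select_feature": "patch",
  "mm_vision_select_layer": -2,
  "mm_vision_tower": "openai/clip-vit-large-patch14-336",
  "model_type": "videollama2_mistral",
  "num_attention_heads": 32,
  "num_frames": 16,
  "num_hidden_layers": 32
}
"""


def check_model_type(config_text: str, expected: str = "videollama2_mistral") -> bool:
    """Return True when the config's model_type equals the expected value.

    Hypothetical helper for illustration only.
    """
    config = json.loads(config_text)
    return config.get("model_type") == expected


if __name__ == "__main__":
    # Post-commit fragment: model_type matches the new value.
    print(check_model_type(CONFIG_AFTER))
    # Pre-commit value for comparison.
    before = CONFIG_AFTER.replace('"videollama2_mistral"', '"mistral"')
    print(check_model_type(before))
```

To check a downloaded file rather than the inline fragment, read it with `open(path).read()` and pass the text to the same function.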