DAMO-NLP-SG/VideoLLaMA2-8x7B-Base



ClownRat committed on Aug 13, 2024

Commit bab71d8 (verified) · 1 Parent(s): a2c9800

Update config.json

Files changed (1)

config.json +1 -1


@@ -21,7 +21,7 @@
   "mm_vision_select_feature": "patch",
   "mm_vision_select_layer": -2,
   "mm_vision_tower": "openai/clip-vit-large-patch14-336",
-  "model_type": "mixtral",
+  "model_type": "videollama2_mixtral",
   "num_attention_heads": 32,
   "num_experts_per_tok": 2,
   "num_frames": 8,
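
The commit message does not say why `model_type` was changed. One plausible reading, assuming the loading code picks a model class by looking up the `model_type` string from config.json, is that the old value `"mixtral"` would select a plain text-only Mixtral class while the new value selects the multimodal one. The minimal sketch below illustrates that kind of lookup using only the standard library. `MODEL_REGISTRY`, `resolve_model`, and the description strings are hypothetical names introduced for illustration; they are not taken from the repository or from any library.

```python
import json

# Hypothetical registry (an assumption for illustration): maps the "model_type"
# string found in config.json to a description of the class a loader would pick.
MODEL_REGISTRY = {
    "mixtral": "text-only Mixtral model",
    "videollama2_mixtral": "VideoLLaMA2 Mixtral model with a vision tower",
}


def resolve_model(config_text: str) -> str:
    """Return the registry entry selected by the config's model_type field."""
    config = json.loads(config_text)
    model_type = config["model_type"]
    if model_type not in MODEL_REGISTRY:
        raise KeyError(f"unregistered model_type: {model_type!r}")
    return MODEL_REGISTRY[model_type]


# Trimmed-down stand-ins for config.json before and after the commit.
OLD_CONFIG = '{"model_type": "mixtral", "num_frames": 8}'
NEW_CONFIG = '{"model_type": "videollama2_mixtral", "num_frames": 8}'

if __name__ == "__main__":
    print("before:", resolve_model(OLD_CONFIG))
    print("after: ", resolve_model(NEW_CONFIG))
```

Under this assumption, a one-line config change is enough to switch which class gets instantiated, which would explain why the diff touches nothing else.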