DAMO-NLP-SG/VideoLLaMA2-72B

lixin4ever committed on Aug 14

Commit b6fb50e (1 parent: d239839)

Update config.json

Files changed (1)

config.json +1 -1

@@ -22,7 +22,7 @@
   "mm_vision_tower": "openai/clip-vit-large-patch14-336",
   "model_type": "videollama2_qwen2",
   "num_attention_heads": 64,
-  "num_frames": 8,
+  "num_frames": 16,
   "num_hidden_layers": 80,
   "num_key_value_heads": 8,
   "rms_norm_eps": 1e-06,
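
The only change in this commit is `num_frames`, which goes from 8 to 16. As a rough illustration of what that setting could mean at preprocessing time, here is a minimal sketch that reads the value from a config fragment and picks that many evenly spaced frame indices from a clip. It assumes frames are sampled uniformly across the video, which is a common convention but is not confirmed by this commit. `sample_frame_indices` and the inline `CONFIG_FRAGMENT` are hypothetical names for illustration and are not part of the VideoLLaMA2 code base.

```python
import json

# Hypothetical fragment mirroring the fields touched by this commit.
CONFIG_FRAGMENT = '{"model_type": "videollama2_qwen2", "num_frames": 16}'


def sample_frame_indices(total_frames: int, num_frames: int) -> list:
    """Return num_frames indices spread evenly over [0, total_frames - 1].

    Assumption: uniform sampling. The real preprocessing may differ.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    if num_frames == 1:
        # A single frame: take the middle of the clip.
        return [total_frames // 2]
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]


config = json.loads(CONFIG_FRAGMENT)
indices = sample_frame_indices(total_frames=160, num_frames=config["num_frames"])
print(len(indices), indices[0], indices[-1])
```

Under this assumption, doubling `num_frames` halves the gap between sampled frames, so the model would see the clip at a finer temporal resolution at the cost of more visual tokens per video.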