Question about LLaVA-Video-32B-Qwen: Performance issues

#7
by RachelZhou - opened

I have a few questions about the 32B video model and its performance:

  1. Could you clarify which model is newer: lmms-lab/LLaVA-NeXT-Video-32B-Qwen or lmms-lab/LLaVA-Video-32B-Qwen? It’s unclear which one should be used for current evaluations.

  2. In practice, I’ve noticed that the 32B model appears to perform worse than both the 7B and 72B models. Do you have any idea why this might be the case?

  3. I also noticed that the 32B model has not been evaluated on the latest benchmarks. Is this due to a particular issue with the model, or has it simply not been prioritized for testing?
