metadata

title: README
emoji: 🌍
colorFrom: blue
colorTo: blue
sdk: static
pinned: false

On Path to Multimodal Generalist: Levels and Benchmarks

Does higher performance across tasks indicate a more capable MLLM, and one that is closer to AGI?
NO! Synergy does.

This project introduces:

General-Level, a five-level evaluation system that sets a new norm for assessing multimodal generalists (multimodal LLMs/agents). Its core is the use of Synergy as the evaluative criterion, categorizing capabilities according to whether MLLMs preserve synergy across comprehension and generation, as well as across multimodal interactions.
General-Bench, a companion large-scale multimodal benchmark dataset that covers a broad spectrum of skills, modalities, formats, and capabilities, with over 700 tasks and 325K instances.