Our GitHub Page:

Our Spaces

Many thanks for the research GPU grants!

  • Q-Align (Most Powerful Visual Scorer): Open in Hugging Face Spaces
  • Q-Instruct (Low-level Vision-Language Assistant/Chatbot, supports 1-4 images): Open in Hugging Face Spaces
  • Q-Bench (Benchmark for General-Purpose MLLMs): Open in Hugging Face Spaces

Our Mainstream Models

  • q-future/one-align: AutoModel for Visual Scoring. Trained on a mixture of existing datasets; see GitHub for details.
  • q-future/co-instruct: AutoModel for Low-level Visual Dialog (Description, Comparison, Question Answering). Trained on the scaled Co-Instruct-562K dataset (to be released soon!).
  • q-future/q-instruct-mplug-owl2-1031: Older version of Q-Instruct, as reported in the paper. Trained on the released Q-Instruct-200K dataset.

Although we have released other model variants so that the community can replicate our results, please use the models listed above, as they have proven to perform more stably.
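
As a rough illustration of what "AutoModel" means for the two checkpoints described that way above, the sketch below loads one of them through the `transformers` Auto classes. It rests on several assumptions not stated on this page: that `AutoModelForCausalLM` is the appropriate Auto class, that the repos ship custom modeling code (hence `trust_remote_code=True`), and that half precision is acceptable. The names `AUTOMODEL_REPOS` and `load_model` are hypothetical helpers introduced only for this example.

```python
# Minimal sketch, not the authors' official usage. See GitHub for the
# supported scoring and dialog calls, which come from each repo's own code.

# Hypothetical lookup table built from the model list above.
AUTOMODEL_REPOS = {
    "q-future/one-align": "visual scoring",
    "q-future/co-instruct": "low-level visual dialog",
}


def load_model(repo_id: str):
    """Load one of the AutoModel checkpoints listed above from the Hub."""
    if repo_id not in AUTOMODEL_REPOS:
        raise ValueError(f"{repo_id!r} is not one of {sorted(AUTOMODEL_REPOS)}")

    # Imported lazily so the heavy dependencies are only needed when loading.
    import torch
    from transformers import AutoModelForCausalLM  # assumed Auto class

    return AutoModelForCausalLM.from_pretrained(
        repo_id,
        trust_remote_code=True,      # assumption: repo provides custom model code
        torch_dtype=torch.float16,   # assumption: half precision is acceptable
    )
```

Calling `load_model("q-future/one-align")` downloads the checkpoint on first use, so it needs network access and enough memory for the weights.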