- MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
  Paper • 2407.21770 • Published • 23
- VILA^2: VILA Augmented VILA
  Paper • 2407.17453 • Published • 42
- The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
  Paper • 2407.08583 • Published • 13
- Vision language models are blind
  Paper • 2407.06581 • Published • 83
RainningXY
xxyyy123
AI & ML interests
None yet
Recent Activity
upvoted a paper 18 days ago
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning
upvoted a paper 19 days ago
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging
liked a model 19 days ago
AIDC-AI/Ovis2-34B-GPTQ-Int8
Organizations
Collections
3
- Internal Consistency and Self-Feedback in Large Language Models: A Survey
  Paper • 2407.14507 • Published • 48
- New Desiderata for Direct Preference Optimization
  Paper • 2407.09072 • Published • 11
- Self-Recognition in Language Models
  Paper • 2407.06946 • Published • 27
- MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
  Paper • 2407.04842 • Published • 57
models
None public yet
datasets
None public yet