Scaling Language-Free Visual Representation Learning • Paper • 2504.01017 • Published 10 days ago • 25 • 4
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published Jan 28 • 120
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models • Paper • 2410.10139 • Published Oct 14, 2024 • 53
tsbpp/llava-vicuna-7b-diffusion-sd2_1-p16-res512-737k-bs512 • Text Generation • Updated Jul 31, 2024 • 4