MM-RLHF: The Next Step Forward in Multimodal LLM Alignment • Paper • 2502.10391 • Published 21 days ago
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging • Paper • 2407.01470 • Published Jul 1, 2024