Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning Paper • 2410.10801 • Published Oct 14
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20 • 41
LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives Paper • 2407.01490 • Published Jul 1 • 1
The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm Paper • 2406.18682 • Published Jun 26
M-RewardBench: Evaluating Reward Models in Multilingual Settings Paper • 2410.15522 • Published Oct 20 • 10