58 Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences · 6 authors
27 No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance · 8 authors
22 AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent · 11 authors