-
  - mDPO: Conditional Preference Optimization for Multimodal Large Language Models
    Paper • 2406.11839 • Published • 37
  - Pandora: Towards General World Model with Natural Language Actions and Video States
    Paper • 2406.09455 • Published • 14
  - WPO: Enhancing RLHF with Weighted Preference Optimization
    Paper • 2406.11827 • Published • 14
  - In-Context Editing: Learning Knowledge from Self-Induced Distributions
    Paper • 2406.11194 • Published • 15
Collections including paper arxiv:2404.12318
-
  - Fine-Tuning Language Models from Human Preferences
    Paper • 1909.08593 • Published • 3
  - Transforming and Combining Rewards for Aligning Large Language Models
    Paper • 2402.00742 • Published • 11
  - Leverage the Average: an Analysis of KL Regularization in RL
    Paper • 2003.14089 • Published • 2
  - Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
    Paper • 2404.01258 • Published • 10
-
  - Lumiere: A Space-Time Diffusion Model for Video Generation
    Paper • 2401.12945 • Published • 86
  - Long-form factuality in large language models
    Paper • 2403.18802 • Published • 24
  - ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion
    Paper • 2403.18818 • Published • 25
  - TC4D: Trajectory-Conditioned Text-to-4D Generation
    Paper • 2403.17920 • Published • 16
-
  - One-step Diffusion with Distribution Matching Distillation
    Paper • 2311.18828 • Published • 3
  - The Unreasonable Ineffectiveness of the Deeper Layers
    Paper • 2403.17887 • Published • 78
  - Condition-Aware Neural Network for Controlled Image Generation
    Paper • 2404.01143 • Published • 11
  - Locating and Editing Factual Associations in GPT
    Paper • 2202.05262 • Published • 1