Efficient Exact Optimization - a ehzoah Collection

updated Jun 24

Supervised fine-tuned (SFT) models and reward models used in the experiments of the ICML 2024 paper "Towards Efficient Exact Optimization of Language Model Alignment".