Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time • Paper • 2502.19230 • Published Feb 26 • 1
RoleMRC • Collection • A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following • 6 items • Updated Mar 7 • 1
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models • Paper • 2501.13629 • Published Jan 23 • 48
Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence • Paper • 2406.10957 • Published Jun 16, 2024 • 1
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring • Paper • 2406.19949 • Published Jun 28, 2024 • 1
AERA • Collection • Resources for EMNLP 2023 Paper: Distilling ChatGPT for Explainable Automated Student Answer Assessment • 3 items • Updated Oct 14, 2024 • 1
MCTS with Preference Optimisation • Collection • Resources for EMNLP 2024 Paper: Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring • 8 items • Updated Oct 14, 2024 • 2
SamPO • Collection • Resources for EMNLP 2024 Paper: Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence • 4 items • Updated Oct 14, 2024 • 2