Exciting Papers
Our curated list of AI papers @Temus AI
Paper • 2310.04406 • Published • 8 • Note: Top reasoning trick on HumanEval: MCTS + LLM + Feedback + Reflection @UIUC
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 90 • Note: Our re-implementation: https://github.com/fangyuan-ksgk/CoT-Reasoning-without-Prompting Insight: decoding-time reasoning is cheap and effective, and can bring out the 'inherent' reasoning capacity of a pre-trained LLM. Drawback: identifying the answer tokens and their location in the generated text remains the million-dollar question.
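A minimal sketch of the path-selection idea behind this note, assuming the decoding paths have already been generated (for example by branching on the top-k first tokens and continuing greedily) and that the answer tokens in each path have already been located, which is exactly the open problem flagged above. The function names and toy probabilities are hypothetical and are not taken from the linked repository.

```python
from typing import Dict, List, Sequence


def answer_confidence(answer_token_probs: Sequence[Sequence[float]]) -> float:
    """Mean gap between the top-1 and top-2 probabilities over the answer tokens.

    Each inner sequence is the model's probability distribution (or its top
    few entries, at least two) at one answer-token position.
    """
    gaps = []
    for probs in answer_token_probs:
        top = sorted(probs, reverse=True)
        gaps.append(top[0] - top[1])
    return sum(gaps) / len(gaps)


def pick_path(paths: List[Dict]) -> Dict:
    """Return the decoding path whose answer tokens have the largest confidence gap."""
    return max(paths, key=lambda p: answer_confidence(p["answer_token_probs"]))


# Toy, made-up data: two candidate paths for the same question.
paths = [
    {"text": "5 apples", "answer_token_probs": [[0.55, 0.40, 0.05]]},
    {"text": "3 + 5 = 8, so the answer is 8", "answer_token_probs": [[0.95, 0.03, 0.02]]},
]
best = pick_path(paths)
print(best["text"])
```

The path that reasons before answering tends to be more confident in its answer tokens, so it wins the comparison here; the sketch says nothing about how to find those tokens in free-form text.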
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 • Note: In-context-learning-based preference alignment, with performance on par with supervised fine-tuning (SFT). Can be used to generate optimal preference pairs or to augment a preference dataset.
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 102 • Note: Self-Discover approaches a task in three steps: picking a reasoning structure, designing a stepwise reasoning plan, then implementing the thinking process to get the answer. Significant performance improvement is observed.
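The three steps in this note can be sketched as a chain of three LLM calls. This is an illustration only: `llm` is a hypothetical callable that maps a prompt string to a completion string, and the prompt wording is invented here rather than taken from the paper.

```python
from typing import Callable, List


def self_discover(task: str, modules: List[str], llm: Callable[[str], str]) -> str:
    """Three chained calls: pick modules, write a plan, follow the plan."""
    # Step 1: pick the reasoning modules that look useful for this task.
    selected = llm(
        "Select the reasoning modules that are useful for this task.\n"
        f"Task: {task}\nModules:\n" + "\n".join(modules)
    )
    # Step 2: turn the selected modules into a stepwise reasoning plan.
    plan = llm(
        "Turn the selected modules into a step-by-step reasoning plan.\n"
        f"Task: {task}\nSelected modules:\n{selected}"
    )
    # Step 3: follow the plan to produce the final answer.
    return llm(
        "Follow the plan step by step and give the final answer.\n"
        f"Task: {task}\nPlan:\n{plan}"
    )
```

Any function with the signature `str -> str` can stand in for `llm`, such as a thin wrapper around a chat-completion client.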
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 135 • Note: Meta's work on iterative self-improvement of LLMs.
Direct Language Model Alignment from Online AI Feedback
Paper • 2402.04792 • Published • 25 • Note: A simplification of Meta's self-rewarding LLM. It relies on an LLM's innate capacity to understand the preferences shown in the original labeled dataset and uses it to give a thumbs up or down; these judgments are then fed back into the model weights through DPO.
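A rough sketch of the loop this note describes, under stated assumptions: `judge` is a hypothetical callable standing in for the annotating LLM and returns "a" or "b", and the log-probabilities passed to the loss are plain floats that a real setup would compute from the policy and a frozen reference model. The loss is the standard per-pair DPO objective; `beta=0.1` is an arbitrary illustrative value.

```python
import math
from typing import Callable, Dict


def make_preference_pair(prompt: str, resp_a: str, resp_b: str,
                         judge: Callable[[str, str, str], str]) -> Dict[str, str]:
    """Ask the (hypothetical) LLM judge which response it prefers."""
    choice = judge(prompt, resp_a, resp_b)  # expected to return "a" or "b"
    chosen, rejected = (resp_a, resp_b) if choice == "a" else (resp_b, resp_a)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one.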
Matryoshka Representation Learning
Paper • 2205.13147 • Published • 7