PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking
- PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking (Paper • 2410.12375)
- lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024 (Text Generation)
- lamm-mit/PRefLexOR_ORPO_DPO_EXO_REFLECT_10222024 (Text Generation)
- lamm-mit/meta-llama-Meta-Llama-3.2-3B-Instruct-Reasoning-Tokenizer