ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition Paper • 2503.21248 • Published 27 days ago • 20
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 116
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 120
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published Jan 8 • 37
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning Paper • 2411.18203 • Published Nov 27, 2024 • 37
Logical Reasoning over Natural Language as Knowledge Representation: A Survey Paper • 2303.12023 • Published Mar 21, 2023 • 2
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses Paper • 2410.07076 • Published Oct 9, 2024 • 2
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 76