- MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
  Paper • 2403.14624 • Published • 51
- We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
  Paper • 2407.01284 • Published • 75
- MAVIS: Mathematical Visual Instruction Tuning
  Paper • 2407.08739 • Published • 30
Collections
Collections including paper arxiv:2407.01284

- AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions
  Paper • 2312.08472 • Published • 2
- MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
  Paper • 2403.14624 • Published • 51
- ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
  Paper • 2404.02893 • Published • 20
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 86

- FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
  Paper • 2403.06775 • Published • 3
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  Paper • 2010.11929 • Published • 6
- Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
  Paper • 2110.07040 • Published • 2
- A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
  Paper • 1811.00056 • Published • 2

- MathScale: Scaling Instruction Tuning for Mathematical Reasoning
  Paper • 2403.02884 • Published • 15
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 72
- Improving Small Language Models' Mathematical Reasoning via Mix Thoughts Distillation
  Paper • 2401.11864 • Published • 2
- Common 7B Language Models Already Possess Strong Math Capabilities
  Paper • 2403.04706 • Published • 16

- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
  Paper • 2402.15627 • Published • 34
- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 49
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 44
- Stealing Part of a Production Language Model
  Paper • 2403.06634 • Published • 90