metadata
title: Math Question Validator
emoji: 🧮
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
Math Question Validator
A web-based tool that validates mathematical questions and their reference answers using AI models.
Features
- Multiple AI Models: Support for o3-mini, GPT-5, Claude 4, Grok 4, DeepSeek, and more
- Parallel Processing: Process hundreds of questions simultaneously
- Detailed Analytics: Track accuracy, timeouts, and errors in real-time
- LaTeX Reconciliation: Generate detailed comparison documents for mismatched answers
- Image Support: Handle questions with diagrams and figures
- Progress Tracking: Real-time statistics and progress monitoring
Quick Start
- Upload your Excel file containing math questions
- Select models for solving and reconciliation
- Configure processing options (parallel processes, batch size)
- Start validation and monitor progress
- Download results with detailed analysis
Setting Up API Keys
This app requires API keys to function. Add them in the Spaces Settings:
- Go to Settings → Variables and secrets
- Add your API keys:
  - OPENAI_API_KEY: For OpenAI models (o3-mini, GPT-5, GPT-4o)
  - OPENROUTER_API_KEY: For Claude, Grok, Gemini, and other models
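Secrets added in the Space settings are exposed to the running app as environment variables. The following is a minimal sketch of how the two keys named above could be read and checked at startup. The helper `load_api_keys` and the constant `EXPECTED_KEYS` are hypothetical names used for illustration; the app's actual loading code may differ.

```python
import os

# Key names are taken from this README; everything else here is illustrative.
EXPECTED_KEYS = ("OPENAI_API_KEY", "OPENROUTER_API_KEY")


def load_api_keys(env=None):
    """Return (found, missing) for the keys listed in EXPECTED_KEYS.

    An unset or empty variable counts as missing.
    """
    env = os.environ if env is None else env
    found = {name: env[name] for name in EXPECTED_KEYS if env.get(name)}
    missing = [name for name in EXPECTED_KEYS if name not in found]
    return found, missing
```

Calling `load_api_keys()` with no argument reads from the process environment; a non-empty `missing` list indicates which provider's models would be unavailable.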
Input Format
Your Excel file should have a "Data" sheet with these columns:
- question: The math question text
- correct_answer or answer: The reference answer
- raw_subject: Subject classification (optional, for filtering)
- file_url: Image URL if question has a diagram (optional)
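Before uploading, it can help to confirm that the sheet has the expected columns. The sketch below checks a list of column names against the layout described above. The function name `check_columns` and the wording of the messages are assumptions for illustration, not part of the app.

```python
# Column names come from this README; the checker itself is a hypothetical helper.
ANSWER_ALIASES = ("correct_answer", "answer")
OPTIONAL_COLUMNS = ("raw_subject", "file_url")


def check_columns(columns):
    """Return a list of problems; an empty list means the layout looks valid."""
    cols = {str(c).strip() for c in columns}
    problems = []
    if "question" not in cols:
        problems.append("missing column: question")
    if not any(alias in cols for alias in ANSWER_ALIASES):
        problems.append("missing column: correct_answer or answer")
    return problems
```

With pandas installed, the column names could come from `pd.read_excel(path, sheet_name="Data").columns`, which also fails early if the "Data" sheet is absent.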
Output
The validator generates:
- Validated Excel file with model answers and match results
- LaTeX reconciliation documents for mismatched answers
- Model answer files with complete solutions
- Statistics summary with accuracy metrics
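As a rough illustration of what a statistics summary might contain, the sketch below tallies per-question outcomes. The status labels (`match`, `mismatch`, `timeout`, `error`) and the choice to compute accuracy only over questions that produced an answer are assumptions; the app's own definitions may differ.

```python
from collections import Counter


def summarize(statuses):
    """Summarize a list of per-question status strings (labels are assumed)."""
    counts = Counter(statuses)
    answered = counts["match"] + counts["mismatch"]
    return {
        "total": len(statuses),
        "match": counts["match"],
        "mismatch": counts["mismatch"],
        "timeout": counts["timeout"],
        "error": counts["error"],
        # Accuracy over answered questions only; 0.0 when nothing was answered.
        "accuracy": counts["match"] / answered if answered else 0.0,
    }
```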
Model Recommendations
For Best Accuracy
- Solver: o3-mini
- Reconciliation: gpt-4o
For Speed
- Solver: gpt-4o
- Reconciliation: gpt-4o
- Use 4-6 parallel processes
For Cost-Effectiveness
- Solver: Claude 3.5 Sonnet
- Reconciliation: Claude 3.5 Sonnet
Advanced Features
Parallel Processing
- Automatically splits large datasets across multiple processes
- Merges results seamlessly
- Optimal for 100+ questions
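The split-and-merge idea can be sketched as follows: divide the questions into contiguous chunks, hand each chunk to a worker process, and concatenate the results in the original order. All names here (`split_into_chunks`, `merge_results`, `run_parallel`) are hypothetical, and the sketch uses the standard library's `ProcessPoolExecutor`, which may not be what the app uses internally.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    n_chunks = min(n_chunks, len(items)) or 1
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def merge_results(chunk_results):
    """Concatenate per-chunk result lists, preserving chunk order."""
    return [result for chunk in chunk_results for result in chunk]


def run_parallel(items, worker, n_processes=4):
    """Run worker(chunk) -> list in separate processes and merge the output.

    `worker` must be a module-level function so it can be pickled.
    """
    chunks = split_into_chunks(items, n_processes)
    with ProcessPoolExecutor(max_workers=n_processes) as pool:
        # pool.map yields results in the same order as the input chunks.
        return merge_results(pool.map(worker, chunks))
```

Because chunks are contiguous and `map` preserves order, the merged list lines up row for row with the input, which keeps the output sheet aligned with the original questions.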
Custom Ranges
- Process specific question ranges
- Useful for testing or resuming interrupted runs
LaTeX Compilation
- Optional PDF generation from LaTeX reconciliation documents
- Requires pdflatex (not available in HF Spaces)
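Since pdflatex is not available in HF Spaces, the downloaded .tex files can be compiled on a local machine. A minimal sketch, assuming pdflatex is installed locally; the helper names and the example file name are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdflatex_command(tex_path, out_dir):
    """Build the pdflatex invocation as an argument list."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not stop for input on errors
        f"-output-directory={out_dir}",
        str(tex_path),
    ]


def compile_tex(tex_path, out_dir="."):
    """Compile tex_path; return the PDF path, or None if pdflatex is not on PATH."""
    if shutil.which("pdflatex") is None:
        return None
    subprocess.run(build_pdflatex_command(tex_path, out_dir), check=True)
    return Path(out_dir) / (Path(tex_path).stem + ".pdf")
```

For example, `compile_tex("reconciliation.tex", "out")` would write `out/reconciliation.pdf` when compilation succeeds; the output directory must already exist.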
Limitations
- Maximum file size: 200 MB
- Image support requires URLs (local images not supported in HF Spaces)
- LaTeX PDF compilation not available (use .tex files locally)
Support
For issues or questions:
- Check the Configuration tab in the app
- Review error messages in the output log
- Ensure API keys are correctly set
License
MIT License - Free to use and modify
Credits
Built with:
- Gradio for the web interface
- OpenAI, Anthropic, and other AI providers for models
- pandas for data processing