Bangla Math Solver - Fine-tuned Gemma 2 2B Model

Model Overview

This model is a fine-tuned version of unsloth/gemma-2-2b-it-bnb-4bit, designed for solving mathematical problems written in Bengali. It is intended to read Bengali mathematical word problems and produce step-by-step solutions.

Model Details

  • Model Name: Bangla Math Solver
  • Base Model: unsloth/gemma-2-2b-it-bnb-4bit
  • Language: Bengali (বাংলা)
  • Domain: Mathematics Problem Solving
  • License: Apache 2.0
  • Developed by: ajoysr
  • Training Framework: Unsloth + Hugging Face TRL
  • Hardware: Google Colab T4 GPU (Free Tier)

Dataset Information

  • Source Dataset: hamim-87/Ashrafur_bangla_math
  • Training Samples: 5,000 samples (first 5K rows)
  • Data Format: Problem-solution pairs in Bengali
  • Problem Types: Mathematical word problems, arithmetic, algebra, combinatorics

Sample Data Structure

Problem: 5 জন ছাত্র 3টি খেলার প্রতিযোগিতায় অংশগ্রহণের জন্য সাইন আপ করছে...
Solution: এই সমস্যা সমাধান করার জন্য, আমরা গুণন নিয়ম ব্যবহার করে গণনা নীতি প্রয়োগ করি...

Training Details

Training Configuration

  • Epochs: Optimized for convergence
  • Learning Rate: Adaptive scheduling
  • Batch Size: Optimized for T4 GPU memory
  • Sequence Length: 2048 tokens
  • Training Speed: ~2x faster with Unsloth optimization (relative to training without it)
  • Memory Optimization: 4-bit quantization (BNB)

Training Environment

  • Platform: Google Colab (Free Tier)
  • GPU: NVIDIA T4 (16GB VRAM)
  • Memory Management: Gradient checkpointing enabled
  • Mixed Precision: Automatic mixed precision (AMP)
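
To illustrate why 4-bit quantization matters on a 16GB T4, the sketch below estimates the memory taken by the model weights alone at different precisions. The parameter count is an assumption (roughly 2.6 billion; check the base model card for the exact figure), and the estimate ignores quantization metadata, activations, LoRA adapters, and optimizer state.

```python
def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / (1024 ** 3)


# Assumption: approximate parameter count, not taken from this model card.
ASSUMED_PARAMS = 2_600_000_000

for bits in (16, 8, 4):
    gib = weight_memory_gib(ASSUMED_PARAMS, bits)
    print(f"{bits:>2}-bit weights: {gib:.2f} GiB")
```

Under these assumptions the 4-bit weights need about a quarter of the memory of 16-bit weights, which leaves more of the T4's VRAM for activations and adapter training.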

Model Performance

The model excels at:

  • Understanding Bengali mathematical terminology
  • Breaking down complex word problems into steps
  • Providing detailed mathematical explanations in Bengali
  • Handling various mathematical domains (arithmetic, algebra, combinatorics)
  • Maintaining mathematical accuracy while explaining in Bengali

Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "ajoysr/bangla-math-solver"  # Replace with your actual model name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
problem = "একটি ত্রিভুজের তিনটি বাহুর দৈর্ঘ্য 3, 4, এবং 5 একক। এর ক্ষেত্রফল কত?"

# Format input
input_text = f"Problem: {problem}\nSolution:"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)  # keep inputs on the model's device

# Generate solution
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,  # limits generated tokens only; max_length would also count the prompt
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

solution = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Solution: {solution}")

Input Format

The model expects input in the following format:

Problem: [Bengali mathematical problem]
Solution:
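
The format above can be wrapped in two small helpers. This is a minimal sketch: the function names `build_prompt` and `extract_solution` are hypothetical and not part of the model's repository.

```python
def build_prompt(problem: str) -> str:
    """Wrap a Bengali problem statement in the Problem/Solution template."""
    problem = problem.strip()
    if not problem:
        raise ValueError("problem must be a non-empty string")
    return f"Problem: {problem}\nSolution:"


def extract_solution(generated: str) -> str:
    """Return the text after the first 'Solution:' marker, if present."""
    _, marker, tail = generated.partition("Solution:")
    return tail.strip() if marker else generated.strip()


prompt = build_prompt("একটি ত্রিভুজের তিনটি বাহুর দৈর্ঘ্য 3, 4, এবং 5 একক। এর ক্ষেত্রফল কত?")
print(prompt)
```

`extract_solution` is useful when the full decoded text (prompt plus completion) is available rather than only the newly generated tokens.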

Technical Implementation

Key Features

  • Efficient Training: Leveraged Unsloth for 2x faster training
  • Memory Optimization: 4-bit quantization for resource efficiency
  • Bengali Language Support: Specialized tokenization for Bengali text
  • Mathematical Reasoning: Step-by-step problem-solving approach

Optimization Techniques

  • LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
  • Gradient checkpointing for memory efficiency
  • Dynamic padding for optimal batch processing
  • Learning rate scheduling for stable convergence
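
As an illustration of dynamic padding, the sketch below pads each batch only to the length of its longest sequence rather than to the full 2048-token limit. It is a plain-Python stand-in for what a padding data collator does, not the training code used for this model; `pad_batch` and the pad ID of 0 are assumptions.

```python
from typing import Dict, List


def pad_batch(sequences: List[List[int]], pad_id: int = 0) -> Dict[str, List[List[int]]]:
    """Right-pad token ID lists to the longest sequence in this batch."""
    if not sequences:
        return {"input_ids": [], "attention_mask": []}
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding that attention should ignore.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch([[5, 6, 7], [8]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Because short batches stay short, less compute is spent on padding tokens than with a fixed maximum length.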

Methodology

Data Preprocessing

  1. Data Loading: Used the first 5K samples from the hamim-87/Ashrafur_bangla_math dataset
  2. Text Formatting: Structured problem-solution pairs with clear delimiters
  3. Tokenization: Applied Bengali-aware tokenization with appropriate padding
  4. Quality Control: Implemented data validation to ensure problem-solution alignment
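
The preprocessing steps above can be sketched as follows. The field names `problem` and `solution`, the `<eos>` placeholder, and the helper names are assumptions for illustration; the dataset's real column names and the tokenizer's actual end-of-sequence token should be used in practice.

```python
from typing import Dict, List


def is_valid(record: Dict[str, str]) -> bool:
    """Keep only records where both fields are non-empty strings."""
    return all(
        isinstance(record.get(key), str) and record[key].strip()
        for key in ("problem", "solution")
    )


def format_example(record: Dict[str, str], eos_token: str = "<eos>") -> str:
    """Join a pair into one training string with clear delimiters."""
    problem = record["problem"].strip()
    solution = record["solution"].strip()
    return f"Problem: {problem}\nSolution: {solution}{eos_token}"


def prepare(records: List[Dict[str, str]], limit: int = 5000) -> List[str]:
    """Take the first `limit` rows, drop invalid ones, and format the rest."""
    return [format_example(r) for r in records[:limit] if is_valid(r)]


rows = [
    {"problem": "1 + 1 = ?", "solution": "2"},
    {"problem": "   ", "solution": "missing problem text"},
]
print(prepare(rows))
```

Appending an end-of-sequence token to each training string is a common way to teach the model where a solution ends.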

Fine-tuning Strategy

  1. Base Model Selection: Chose Gemma 2 2B for its balance of performance and efficiency
  2. Parameter-Efficient Training: Applied LoRA adapters to reduce trainable parameters
  3. Hyperparameter Optimization: Tuned learning rate, batch size, and sequence length
  4. Convergence Monitoring: Implemented loss tracking and validation metrics
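
To show why LoRA reduces the number of trainable parameters, the sketch below compares a full weight matrix with its low-rank update. LoRA replaces the update to a `d_out x d_in` matrix with two factors of shapes `d_out x r` and `r x d_in`. The dimensions and rank used here are illustrative assumptions, not the configuration used for this model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: (d_out, rank) and (rank, d_in)."""
    return rank * (d_out + d_in)


# Illustrative values only: a square 2048 x 2048 projection with rank 16.
d_out, d_in, rank = 2048, 2048, 16
dense = full_params(d_out, d_in)
adapter = lora_params(d_out, d_in, rank)
print(f"dense: {dense}, LoRA: {adapter}, ratio: {adapter / dense:.4%}")
```

Only the adapter factors receive gradients, so optimizer state scales with the adapter size rather than with the frozen base weights.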

Evaluation Approach

  • Mathematical accuracy assessment
  • Bengali language fluency evaluation
  • Step-by-step reasoning quality analysis
  • Computational efficiency measurement
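
The card does not describe how mathematical accuracy was measured. One common approach, assumed here purely for illustration, is to compare the last number in the generated solution with the last number in the reference, after converting Bengali digits to ASCII. The helper names are hypothetical.

```python
import re
from typing import List, Optional

# Map Bengali digits (U+09E6 to U+09EF) to ASCII 0-9.
BENGALI_TO_ASCII = {0x09E6 + i: str(i) for i in range(10)}
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def final_number(text: str) -> Optional[float]:
    """Return the last number found in the text, or None if there is none."""
    matches = NUMBER.findall(text.translate(BENGALI_TO_ASCII))
    return float(matches[-1]) if matches else None


def accuracy(predictions: List[str], references: List[str], tol: float = 1e-6) -> float:
    """Fraction of pairs whose final numbers agree within `tol`."""
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        p, r = final_number(pred), final_number(ref)
        if p is not None and r is not None and abs(p - r) <= tol:
            correct += 1
    return correct / len(references)


print(accuracy(["ক্ষেত্রফল = ৬ বর্গ একক", "উত্তর 10"], ["6", "12"]))
```

This metric checks only the final answer; it says nothing about the quality of the intermediate reasoning or the fluency of the Bengali text, which need separate review.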

Limitations

  • Trained only on Bengali mathematical problems, so other languages and non-mathematical tasks are out of scope
  • Performance may vary on mathematical domains not well-represented in training data
  • Limited to text-based mathematical problems (no geometric diagrams)
  • Optimized for the Google Colab T4 environment

Future Improvements

  • Expand training dataset to include more diverse mathematical problems
  • Add support for geometric problems with diagram interpretation
  • Implement multi-turn conversation capability for clarifying questions
  • Optimize for deployment on edge devices

Citation

If you use this model in your research, please cite:

@misc{bangla-math-solver-2025,
  title={Bangla Math Solver: Fine-tuned Gemma-2B for Bengali Mathematical Problem Solving},
  author={ajoysr},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/ajoysr/bangla-math-solver}
}

Acknowledgments

  • Unsloth Team: For providing an efficient fine-tuning framework
  • Hugging Face: For model hosting and TRL library
  • Dataset Creator: hamim-87 for the Ashrafur_bangla_math dataset
  • Google Colab: For providing free GPU access for research

Contact

For questions, suggestions, or collaborations, please reach out through the Hugging Face model page or create an issue in the associated repository.


Note: This model is designed for educational and research purposes. Always verify mathematical solutions for critical applications.
