Bangla Math Solver - Fine-tuned Gemma 2 2B Model

Model Overview

This model is a fine-tuned version of unsloth/gemma-2-2b-it-bnb-4bit, designed to solve mathematical problems written in Bengali. It targets Bengali mathematical word problems and produces step-by-step solutions.

Model Details

Model Name: Bangla Math Solver
Base Model: unsloth/gemma-2-2b-it-bnb-4bit
Language: Bengali (বাংলা)
Domain: Mathematics Problem Solving
License: Apache 2.0
Developed by: ajoysr
Training Framework: Unsloth + Hugging Face TRL
Hardware: Google Colab T4 GPU (Free Tier)

Dataset Information

Source Dataset: hamim-87/Ashrafur_bangla_math
Training Samples: 5,000 samples (first 5K rows)
Data Format: Problem-solution pairs in Bengali
Problem Types: Mathematical word problems, arithmetic, algebra, combinatorics

Sample Data Structure

Problem: 5 জন ছাত্র 3টি খেলার প্রতিযোগিতায় অংশগ্রহণের জন্য সাইন আপ করছে...
Solution: এই সমস্যা সমাধান করার জন্য, আমরা গুণন নিয়ম ব্যবহার করে গণনা নীতি প্রয়োগ করি...

Training Details

Training Configuration

Epochs: Optimized for convergence
Learning Rate: Adaptive scheduling
Batch Size: Optimized for T4 GPU memory
Sequence Length: 2048 tokens
Training Speed: ~2x faster with Unsloth optimization
Memory Optimization: 4-bit quantization (BNB)
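
As a rough illustration of why 4-bit quantization matters on a 16 GB T4, the sketch below estimates weight memory at different precisions. The parameter count of about 2.6 billion is an approximate figure for Gemma 2 2B and is an assumption here, not a number from this card. The helper name `weight_memory_gb` is also introduced only for illustration. The estimate covers weights alone and ignores quantization metadata, activations, optimizer state, and LoRA adapters.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumed, approximate parameter count for Gemma 2 2B (not taken from this card).
APPROX_PARAMS = 2.6e9

for label, bits in [("fp32", 32), ("fp16", 16), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gb(APPROX_PARAMS, bits):.1f} GB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the fp16 footprint, which leaves more of the T4's memory for activations and adapter training.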

Training Environment

Platform: Google Colab (Free Tier)
GPU: NVIDIA T4 (16GB VRAM)
Memory Management: Gradient checkpointing enabled
Mixed Precision: Automatic mixed precision (AMP)

Model Performance

The model excels at:

Understanding Bengali mathematical terminology
Breaking down complex word problems into steps
Providing detailed mathematical explanations in Bengali
Handling various mathematical domains (arithmetic, algebra, combinatorics)
Maintaining mathematical accuracy while explaining in Bengali

Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer.
# The base checkpoint is 4-bit (bitsandbytes), so this assumes a CUDA GPU with
# the bitsandbytes and accelerate packages installed.
# Replace with your actual model name (the model page lists ajoysr/bangla-math-gemma).
model_name = "ajoysr/bangla-math-solver"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Example usage
problem = "একটি ত্রিভুজের তিনটি বাহুর দৈর্ঘ্য 3, 4, এবং 5 একক। এর ক্ষেত্রফল কত?"

# Format input and move the tensors to the model's device
input_text = f"Problem: {problem}\nSolution:"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate solution
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,  # token budget for the solution only, excluding the prompt
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, skipping the prompt
solution = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Solution: {solution}")
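
The expected answer for the example problem can be checked independently of the model. A minimal sketch using Heron's formula follows; the helper name `heron_area` is introduced here for illustration and is not part of the model or its training code.

```python
import math


def heron_area(a: float, b: float, c: float) -> float:
    """Area of a triangle from its three side lengths (Heron's formula)."""
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


print(heron_area(3, 4, 5))  # 6.0
```

For sides 3, 4, and 5 this gives 6 square units, which is the value a correct generated solution should reach.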

Input Format

The model expects input in the following format:

Problem: [Bengali mathematical problem]
Solution:
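
The template above can be wrapped in two small helpers so that prompts are built the same way every time. This is a sketch only: the names `build_prompt` and `extract_solution` are hypothetical and do not come from the model's repository.

```python
PROMPT_TEMPLATE = "Problem: {problem}\nSolution:"


def build_prompt(problem: str) -> str:
    """Wrap a Bengali problem statement in the Problem/Solution template."""
    return PROMPT_TEMPLATE.format(problem=problem.strip())


def extract_solution(full_text: str) -> str:
    """Return the text after the first 'Solution:' marker.

    If the marker is missing, the stripped input is returned unchanged.
    """
    _, marker, tail = full_text.partition("Solution:")
    return tail.strip() if marker else full_text.strip()
```

`extract_solution` is useful when the full decoded sequence, prompt included, is available instead of only the newly generated tokens.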

Technical Implementation

Key Features

Efficient Training: Leveraged Unsloth for 2x faster training
Memory Optimization: 4-bit quantization for resource efficiency
Bengali Language Support: Specialized tokenization for Bengali text
Mathematical Reasoning: Step-by-step problem-solving approach

Optimization Techniques

LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
Gradient checkpointing for memory efficiency
Dynamic padding for optimal batch processing
Learning rate scheduling for stable convergence
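
Dynamic padding, listed above, pads each batch only to the length of its longest sequence instead of a fixed maximum. The sketch below shows the idea in plain Python. The function name `pad_batch`, the default `pad_id` of 0, and the choice of right-padding are assumptions for illustration; the actual training code may rely on a library collator instead.

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token id lists to the longest length in this batch.

    Returns (input_ids, attention_mask), where the mask is 1 for real
    tokens and 0 for padding.
    """
    longest = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
    attention_mask = [
        [1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences
    ]
    return input_ids, attention_mask
```

A batch of short problems is then padded to a few dozen tokens instead of the full 2048-token sequence length, which saves memory and compute.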

Methodology

Data Preprocessing

Data Loading: Used the first 5,000 samples of the hamim-87/Ashrafur_bangla_math dataset
Text Formatting: Structured problem-solution pairs with clear delimiters
Tokenization: Applied Bengali-aware tokenization with appropriate padding
Quality Control: Implemented data validation to ensure problem-solution alignment
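
The preprocessing steps above can be sketched as follows. The column names `problem` and `solution`, the `<eos>` end-of-sequence string, and all function names are assumptions for illustration. The dataset's real schema and the tokenizer's actual EOS token should be checked before use.

```python
from itertools import islice
from typing import Iterable, List


def is_valid(row: dict) -> bool:
    """Keep rows whose problem and solution are both non-empty strings."""
    return all(
        isinstance(row.get(key), str) and row[key].strip()
        for key in ("problem", "solution")
    )


def format_example(row: dict, eos_token: str = "<eos>") -> str:
    """Join one problem-solution pair using the Problem/Solution delimiters."""
    return (
        f"Problem: {row['problem'].strip()}\n"
        f"Solution: {row['solution'].strip()}{eos_token}"
    )


def prepare(rows: Iterable[dict], limit: int = 5000, eos_token: str = "<eos>") -> List[str]:
    """Take the first `limit` rows, drop invalid ones, and format the rest."""
    return [format_example(r, eos_token) for r in islice(rows, limit) if is_valid(r)]
```

Appending the EOS token to each training example is a common way to teach the model where a solution ends, so generation stops instead of running on.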

Fine-tuning Strategy

Base Model Selection: Chose Gemma 2 2B for its balance of performance and efficiency
Parameter-Efficient Training: Applied LoRA adapters to reduce trainable parameters
Hyperparameter Optimization: Tuned learning rate, batch size, and sequence length
Convergence Monitoring: Implemented loss tracking and validation metrics
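
To make the parameter savings from LoRA concrete: for a frozen weight matrix of shape d_out x d_in, LoRA trains two low-rank factors of shapes d_out x r and r x d_in, so the trainable count is r * (d_out + d_in) instead of d_out * d_in. The sketch below computes both. The dimensions and rank used are illustrative assumptions, not the configuration used to train this model.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the full weight matrix that LoRA leaves frozen."""
    return d_out * d_in


# Illustrative values only; the real layer sizes and rank are not given in this card.
d_out, d_in, rank = 2048, 2048, 16
ratio = lora_trainable_params(d_out, d_in, rank) / full_params(d_out, d_in)
print(f"trainable fraction for this layer: {ratio:.4%}")
```

With these example values, the adapter holds 65,536 parameters against 4,194,304 in the full matrix, about 1.56% of the layer.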

Evaluation Approach

Mathematical accuracy assessment
Bengali language fluency evaluation
Step-by-step reasoning quality analysis
Computational efficiency measurement
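
One simple way to approach the mathematical accuracy assessment is to compare the last number in a generated solution with the last number in the reference. The sketch below is an assumed heuristic, not the evaluation procedure actually used for this model. The function names are hypothetical, and the regex can misread expressions such as "5-3" as containing a negative number.

```python
import re
from typing import List, Optional

# Map Bengali digits to ASCII so that one regex handles both scripts.
BENGALI_TO_ASCII = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")
NUMBER = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")


def final_number(text: str) -> Optional[float]:
    """Return the last number found in the text, or None if there is none."""
    matches = NUMBER.findall(text.translate(BENGALI_TO_ASCII))
    return float(matches[-1]) if matches else None


def final_answer_accuracy(
    predictions: List[str], references: List[str], tol: float = 1e-6
) -> float:
    """Fraction of pairs whose final numbers agree within `tol`."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        p, r = final_number(pred), final_number(ref)
        if p is not None and r is not None and abs(p - r) <= tol:
            correct += 1
    return correct / len(references)
```

This checks only the final answer. Fluency and reasoning quality, also listed above, would still need human or rubric-based review.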

Limitations

Trained specifically on Bengali mathematical problems
Performance may vary on mathematical domains not well-represented in training data
Limited to text-based mathematical problems (no geometric diagrams)
Optimized for Google Colab T4 environment

Future Improvements

Expand training dataset to include more diverse mathematical problems
Add support for geometric problems with diagram interpretation
Implement multi-turn conversation capability for clarifying questions
Optimize for deployment on edge devices

Citation

If you use this model in your research, please cite:

@misc{bangla-math-solver-2025,
  title={Bangla Math Solver: Fine-tuned Gemma-2B for Bengali Mathematical Problem Solving},
  author={ajoysr},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/ajoysr/bangla-math-solver}
}

Acknowledgments

Unsloth Team: For providing an efficient fine-tuning framework
Hugging Face: For model hosting and the TRL library
Dataset Creator: hamim-87 for the Ashrafur_bangla_math dataset
Google Colab: For providing free GPU access for research

Contact

For questions, suggestions, or collaborations, please reach out through the Hugging Face model page or create an issue in the associated repository.

Note: This model is designed for educational and research purposes. Always verify mathematical solutions for critical applications.

Model tree for ajoysr/bangla-math-gemma: base model unsloth/gemma-2-2b-it-bnb-4bit, fine-tuned into this model.