🧪 Testing Guide for Multilingual Emotion Classifier
This guide explains how to test the rmtariq/multilingual-emotion-classifier model with the test_model.py script.
🚀 Quick Start
Installation
# Install requirements
pip install -r requirements_testing.txt
# Or install manually
pip install torch transformers numpy pandas scikit-learn
Basic Usage
# Quick test (recommended for first-time users)
python test_model.py --test-type quick
# Comprehensive test
python test_model.py --test-type comprehensive
# Interactive testing
python test_model.py --test-type interactive
# Performance benchmark
python test_model.py --test-type benchmark
# Run all tests
python test_model.py --test-type all
📋 Test Types
1. 🚀 Quick Test
Purpose: Fast validation of core functionality
Duration: ~30 seconds
Coverage: 13 essential test cases (English + Malay)
python test_model.py --test-type quick
What it tests:
- ✅ Basic English emotions (6 cases)
- ✅ Basic Malay emotions (4 cases)
- ✅ Previously problematic cases (3 cases)
Expected Results: >90% accuracy
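The accuracy figure is the share of test cases whose predicted label matches the expected one. The sketch below shows that calculation under stated assumptions: `accuracy` and `keyword_predict` are hypothetical names, and the keyword-based predictor is only a stand-in so the example runs without downloading the model. It is not the logic inside test_model.py.

```python
from typing import Callable, Iterable, Tuple


def accuracy(cases: Iterable[Tuple[str, str]], predict: Callable[[str], str]) -> float:
    """Return the fraction of (text, expected_label) pairs that predict() gets right."""
    cases = list(cases)
    if not cases:
        raise ValueError("no test cases supplied")
    correct = sum(1 for text, expected in cases if predict(text) == expected)
    return correct / len(cases)


def keyword_predict(text: str) -> str:
    """Stand-in predictor (assumption): keyword lookup instead of the real model."""
    lowered = text.lower()
    if "happy" in lowered or "gembira" in lowered:
        return "happy"
    if "angry" in lowered or "marah" in lowered:
        return "anger"
    return "sadness"


# A few of the English and Malay cases listed in this guide.
sample_cases = [
    ("I am so happy today!", "happy"),
    ("This makes me really angry!", "anger"),
    ("Saya sangat gembira!", "happy"),
    ("Aku marah dengan keadaan ini", "anger"),
]

quick_accuracy = accuracy(sample_cases, keyword_predict)
print(f"Accuracy: {quick_accuracy:.1%} (target: >90%)")
```

Swapping `keyword_predict` for a function that calls the real model gives the same kind of figure the quick test reports.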
2. 🔬 Comprehensive Test
Purpose: Thorough validation across all categories
Duration: ~2 minutes
Coverage: 24 test cases across multiple categories
python test_model.py --test-type comprehensive
Test Categories:
- English Basic: Core English emotion expressions
- Malay Basic: Core Malay emotion expressions
- Malay Fixed Issues: Previously problematic cases (now fixed)
- Edge Cases: Boundary and special cases
Expected Results: >85% overall accuracy
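A per-category breakdown makes it easier to see which group of cases pulls the overall figure down. The following is a minimal sketch, assuming each case is tagged with one of the category names above; `accuracy_by_category` and `stub_predict` are hypothetical helpers, not part of test_model.py.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Case = Tuple[str, str, str]  # (category, text, expected_label)


def accuracy_by_category(cases: List[Case], predict: Callable[[str], str]) -> Dict[str, float]:
    """Return accuracy per category as hits / total for that category."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for category, text, expected in cases:
        totals[category] += 1
        if predict(text) == expected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


def stub_predict(text: str) -> str:
    """Stand-in predictor (assumption): always answers 'happy'."""
    return "happy"


cases = [
    ("English Basic", "I am so happy today!", "happy"),
    ("English Basic", "I'm scared of spiders", "fear"),
    ("Malay Basic", "Saya sangat gembira!", "happy"),
    ("Malay Fixed Issues", "Terbaik!", "happy"),
]

report = accuracy_by_category(cases, stub_predict)
for category, value in report.items():
    print(f"{category}: {value:.1%}")
```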
3. 🎮 Interactive Test
Purpose: Manual testing with custom inputs
Duration: User-controlled
Coverage: Unlimited custom test cases
python test_model.py --test-type interactive
Features:
- Real-time emotion classification
- Confidence scoring
- Emoji visualization
- Easy exit (type 'quit')
Example Session:
💬 Your text: I am so excited!
🎭 Result: 😊 happy
📊 Confidence: 99.8%
💪 High confidence!
💬 Your text: Saya gembira!
🎭 Result: 😊 happy
📊 Confidence: 99.9%
💪 High confidence!
4. ⚡ Benchmark Test
Purpose: Performance and speed evaluation
Duration: ~1 minute
Coverage: 100 predictions for timing analysis
python test_model.py --test-type benchmark
Metrics Measured:
- Total processing time
- Average time per prediction
- Predictions per second
- Performance classification
Expected Results: >5 predictions/second
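The listed metrics all follow from one wall-clock measurement over a fixed number of predictions. Here is a rough sketch of that arithmetic, assuming a simple timing loop; `benchmark` and `rate_speed` are hypothetical names, the "OK"/"SLOW" labels are illustrative, and the lambda stands in for the real model call.

```python
import time
from typing import Callable, Dict


def benchmark(predict: Callable[[str], object], text: str, runs: int = 100) -> Dict[str, float]:
    """Time `runs` calls to predict() and derive the three timing metrics."""
    start = time.perf_counter()
    for _ in range(runs):
        predict(text)
    total = time.perf_counter() - start
    return {
        "total_seconds": total,
        "seconds_per_prediction": total / runs,
        # Guard against a zero reading from a very fast stand-in predictor.
        "predictions_per_second": runs / total if total > 0 else float("inf"),
    }


def rate_speed(predictions_per_second: float, target: float = 5.0) -> str:
    """Classify throughput against the >5 predictions/second target."""
    return "OK" if predictions_per_second > target else "SLOW"


stats = benchmark(lambda text: "happy", "I am so happy today!", runs=100)
print(f"Total time: {stats['total_seconds']:.4f}s")
print(f"Per prediction: {stats['seconds_per_prediction'] * 1000:.3f}ms")
print(f"Throughput: {stats['predictions_per_second']:.1f}/s ({rate_speed(stats['predictions_per_second'])})")
```

With a real model, a warm-up call before the timed loop keeps one-time loading costs out of the average.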
🎯 Supported Emotions
The model classifies text into 6 emotion categories:
| Emotion | Emoji | Description | Example (English) | Example (Malay) |
|---|---|---|---|---|
| anger | 😠 | Frustration, rage | "I'm so angry!" | "Marah betul!" |
| fear | 😨 | Anxiety, worry | "I'm scared!" | "Takut sangat!" |
| happy | 😊 | Joy, excitement | "I'm so happy!" | "Gembira sangat!" |
| love | ❤️ | Affection, care | "I love you!" | "Sayang kamu!" |
| sadness | 😢 | Sorrow, grief | "I'm so sad" | "Sedih betul" |
| surprise | 😲 | Amazement, shock | "What a surprise!" | "Terkejut betul!" |
🔧 Advanced Usage
Custom Model Testing
# Test a different model
python test_model.py --model "your-model-name" --test-type quick
# Test local model
python test_model.py --model "./path/to/local/model" --test-type comprehensive
Programmatic Usage
from test_model import EmotionModelTester
# Initialize tester
tester = EmotionModelTester("rmtariq/multilingual-emotion-classifier")
# Run specific tests
quick_accuracy = tester.quick_test()
comprehensive_accuracy = tester.comprehensive_test()
speed = tester.benchmark_test()
print(f"Quick test accuracy: {quick_accuracy:.1%}")
print(f"Comprehensive accuracy: {comprehensive_accuracy:.1%}")
print(f"Speed: {speed:.1f} predictions/second")
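For a spot check outside the test script, the model can probably also be queried through the standard transformers `pipeline` API. The sketch below assumes the model loads under the "text-classification" task and that its labels match the six emotions in this guide; neither is confirmed here. `load_classifier` and `top_prediction` are hypothetical helper names, and the example scores are hand-written, not real model output.

```python
from typing import Dict, List, Tuple


def load_classifier(model_name: str = "rmtariq/multilingual-emotion-classifier"):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so top_prediction() below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_name)


def top_prediction(scores: List[Dict[str, object]]) -> Tuple[str, float]:
    """Pick the highest-scoring entry from a list of {'label', 'score'} dicts."""
    best = max(scores, key=lambda item: item["score"])
    return str(best["label"]), float(best["score"])


# Hand-written scores in the pipeline's output shape (illustrative only).
example_scores = [
    {"label": "happy", "score": 0.9},
    {"label": "surprise", "score": 0.1},
]
label, confidence = top_prediction(example_scores)
print(f"{label} ({confidence:.1%})")
```

With the model available, `top_prediction(load_classifier()("Saya gembira!"))` would return the label and score for a single input.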
📊 Expected Performance
Accuracy Targets
- Quick Test: >90% accuracy
- Comprehensive Test: >85% accuracy
- English Performance: >95% accuracy
- Malay Performance: >85% accuracy
Speed Targets
- CPU Performance: >5 predictions/second
- GPU Performance: >20 predictions/second
Confidence Levels
- High Confidence: >90% (💪)
- Good Confidence: 70-90% (👍)
- Low Confidence: <70% (⚠️)
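These bands map directly onto a small lookup. The sketch below is one way to express them, assuming confidence is given as a fraction between 0 and 1 and that both boundary values (exactly 70% and exactly 90%) fall in the "good" band, as the ranges above suggest. `confidence_level` is a hypothetical name.

```python
def confidence_level(score: float) -> str:
    """Map a confidence score in [0, 1] to 'high', 'good', or 'low'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score > 0.90:
        return "high"  # above 90%
    if score >= 0.70:
        return "good"  # 70% to 90% inclusive
    return "low"       # below 70%


for value in (0.998, 0.85, 0.42):
    print(f"{value:.1%} -> {confidence_level(value)}")
```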
🐛 Troubleshooting
Common Issues
1. Model Loading Errors
❌ Error loading model: ...
Solutions:
- Check internet connection
- Verify model name spelling
- Try: pip install --upgrade transformers
2. CUDA/GPU Issues
CUDA out of memory
Solutions:
- The model automatically falls back to CPU
- Reduce batch size if using custom code
- Use the --device cpu flag if available
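In custom code, a common way to get the same CPU fallback is to check for CUDA before choosing a device. This is a minimal sketch of that pattern, not a description of how test_model.py selects its device; `pick_device` is a hypothetical helper.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return 'cuda' when a GPU is usable, otherwise 'cpu'."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch
    except ImportError:
        # Without torch there is no GPU backend to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

This only decides the device up front. It does not recover from an out-of-memory error raised mid-run, which still calls for a smaller batch size or `pick_device(prefer_gpu=False)`.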
3. Slow Performance
⚠️ SLOW. Consider optimization.
Solutions:
- Use GPU if available
- Close other applications
- Consider model quantization for production
Getting Help
If you encounter issues:
- Check Requirements: Ensure all dependencies are installed
- Update Libraries: pip install --upgrade transformers torch
- Check Model Status: Visit the model page at https://huggingface.co/rmtariq/multilingual-emotion-classifier
- Report Issues: Create an issue on the repository
🎯 Test Case Examples
English Test Cases
# Basic emotions
"I am so happy today!"          # → happy
"This makes me really angry!"   # → anger
"I love you so much!"           # → love
"I'm scared of spiders"         # → fear
"This news makes me sad"        # → sadness
"What a surprise!"              # → surprise
Malay Test Cases
# Basic emotions
"Saya sangat gembira!"          # → happy
"Aku marah dengan keadaan ini"  # → anger
"Aku sayang kamu"               # → love
"Saya takut dengan ini"         # → fear
"Sedih betul dengan berita"     # → sadness
"Terkejut dengan kejadian"      # → surprise
# Fixed issues (previously problematic)
"Ini adalah hari jadi terbaik"  # → happy (was: anger)
"Terbaik!"                      # → happy (was: surprise)
"Ini adalah hari yang baik"     # → happy (was: anger)
📈 Performance History
Version 2.1 (Current)
- ✅ Overall Accuracy: 85.0%
- ✅ English Performance: 100%
- ✅ Malay Performance: 100% (fixed issues)
- ✅ Speed: 5-20 predictions/second
Key Improvements
- 🔧 Fixed Malay birthday context classification
- 🔧 Fixed "baik/terbaik" positive expression recognition
- 🔧 Improved confidence scores
- 🔧 Enhanced robustness
🎉 Success Criteria
A successful test run should show:
- ✅ Quick Test: >90% accuracy
- ✅ No Critical Failures: All basic emotions working
- ✅ Malay Fixes Verified: Birthday/positive contexts → happy
- ✅ Reasonable Speed: >5 predictions/second
- ✅ High Confidence: Most predictions >90%
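The checklist can be turned into a single pass/fail summary. The sketch below is one possible encoding, assuming "most predictions" means more than half and that the caller supplies the failure count and the Malay-fix result from its own test run. `check_success` and its parameters are hypothetical names.

```python
from typing import Dict, List


def check_success(
    quick_accuracy: float,
    predictions_per_second: float,
    confidences: List[float],
    basic_failures: int = 0,
    malay_fixes_ok: bool = True,
) -> Dict[str, bool]:
    """Evaluate each success criterion and return a name -> passed mapping."""
    high = sum(1 for score in confidences if score > 0.90)
    return {
        "quick_test": quick_accuracy > 0.90,
        "no_critical_failures": basic_failures == 0,
        "malay_fixes_verified": malay_fixes_ok,
        "reasonable_speed": predictions_per_second > 5.0,
        # Assumption: "most" is read as strictly more than half.
        "high_confidence": bool(confidences) and high > len(confidences) / 2,
    }


result = check_success(
    quick_accuracy=0.92,
    predictions_per_second=8.0,
    confidences=[0.99, 0.95, 0.60],
)
for name, passed in result.items():
    print(f"{name}: {'pass' if passed else 'FAIL'}")
print("Overall:", "pass" if all(result.values()) else "FAIL")
```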
Model Repository: https://huggingface.co/rmtariq/multilingual-emotion-classifier
Author: rmtariq
Last Updated: June 2024