Spaces:
				
			
			
	
			
			
		Sleeping
		
	
	
	
			
			
	
	
	
	
		
		
		Sleeping
		
	A newer version of the Streamlit SDK is available:
									1.50.0
metadata
			title: ASL Recognition App
sdk: streamlit
emoji: π
colorFrom: blue
colorTo: green
app_file: streamlit_app.py
pinned: false
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/67bc2842593452cc18976b31/bUJ1gK4YPzTvhoh3KKt_z.webp
license: mit
sdk_version: 1.45.1
π€ Automatic Sign Language Recognition - Complete Project
A comprehensive, production-ready American Sign Language (ASL) alphabet recognition system using state-of-the-art deep learning techniques, transfer learning, and real-time detection capabilities.
π― Project Overview
This project implements an end-to-end ASL recognition system with:
- Multiple CNN Architectures: VGG16, ResNet50, InceptionV3, EfficientNet, MobileNet
- Transfer Learning: Pre-trained models fine-tuned for ASL recognition
- Real-time Detection: MediaPipe + OpenCV integration for live recognition
- Web Interfaces: FastAPI REST API and Streamlit web app
- Comprehensive Evaluation: Detailed metrics, visualizations, and model comparison
- Production Ready: Deployment packages and configuration files
π Dataset Information
- Source: ASL Alphabet Dataset on Kaggle
- Classes: 29 total (A-Z + SPACE, DELETE, NOTHING)
- Images: ~87,000 training images
- Format: 200x200 RGB images organized by class folders
π Quick Start
1. Installation
# Clone the repository
git clone <repository-url>
cd asl-recognition-project
# Install dependencies
pip install -r requirements.txt
2. Download Dataset
- Download the ASL Alphabet dataset from Kaggle
- Extract to your desired location
- Ensure the structure matches:
dataset/
βββ asl_alphabet_train/
β   βββ A/
β   βββ B/
β   βββ ...
β   βββ NOTHING/
βββ asl_alphabet_test/
    βββ A/
    βββ B/
    βββ ...
    βββ NOTHING/
3. Training Models
# Create configuration file
python main_training.py --create-config
# Edit training_config.json with your paths
# Then run training
python main_training.py --data-dir /path/to/dataset --epochs 30
4. Real-time Detection
# After training, use the best model for real-time detection
python real_time_detection.py
5. Web Interfaces
# FastAPI REST API
python app.py
# Streamlit Web App
streamlit run streamlit_app.py
π Project Structure
asl_recognition_project/
βββ π Core Modules
β   βββ data_preprocessing.py      # Data loading and augmentation
β   βββ model_architectures.py    # CNN models and transfer learning
β   βββ train_compare_models.py   # Training and model comparison
β   βββ evaluate_models.py        # Comprehensive evaluation
β   βββ real_time_detection.py    # Live ASL recognition
βββ π Deployment
β   βββ app.py                     # FastAPI REST API
β   βββ streamlit_app.py          # Streamlit web interface
βββ π― Main Scripts
β   βββ main_training.py          # Complete training pipeline
β   βββ training_config.json      # Configuration file
βββ π Documentation
β   βββ requirements.txt          # Dependencies
β   βββ asl-project-structure.md  # Detailed project info
β   βββ README.md                 # This file
βββ π Generated Outputs
    βββ models/                   # Trained models
    βββ logs/                     # Training logs
    βββ results/                  # Evaluation results
    βββ deployment/               # Deployment package
π§ Core Components
	
		
	
	
		1. Data Preprocessing (data_preprocessing.py)
	
- Advanced data augmentation techniques
- MediaPipe hand detection integration
- Albumentations transformations
- Dataset analysis and visualization
	
		
	
	
		2. Model Architectures (model_architectures.py)
	
- Transfer learning implementations
- Multiple CNN architectures (VGG16, ResNet50, InceptionV3, EfficientNet, MobileNet)
- Custom CNN architectures
- Model factory for easy instantiation
	
		
	
	
		3. Training Pipeline (train_compare_models.py)
	
- Multi-model training and comparison
- Early stopping and learning rate scheduling
- TensorBoard integration
- Comprehensive training logs
	
		
	
	
		4. Model Evaluation (evaluate_models.py)
	
- Detailed metrics (accuracy, precision, recall, F1)
- Confusion matrix visualization
- Per-class performance analysis
- Model comparison charts
	
		
	
	
		5. Real-time Detection (real_time_detection.py)
	
- Live webcam ASL recognition
- MediaPipe hand tracking
- Prediction smoothing
- Word building interface
- Video file processing
6. Web Deployment
- FastAPI API (app.py): RESTful API with batch processing
- Streamlit App (streamlit_app.py): Interactive web interface
π― Usage Examples
Training Custom Models
from main_training import ASLTrainingPipeline
config = {
    'data_dir': '/path/to/dataset',
    'train_dir': '/path/to/dataset/asl_alphabet_train',
    'output_dir': 'my_training_results',
    'model_types': ['resnet50', 'efficientnet_b0'],
    'epochs': 25,
    'batch_size': 64
}
pipeline = ASLTrainingPipeline(config)
results = pipeline.run_complete_pipeline()
Real-time Recognition
from real_time_detection import RealTimeASLDetector
# ASL class names
asl_classes = ['A', 'B', 'C', ..., 'SPACE', 'DELETE', 'NOTHING']
# Initialize detector
detector = RealTimeASLDetector(
    model_path='models/best_model.h5',
    class_names=asl_classes,
    confidence_threshold=0.7
)
# Run detection
detector.run_detection()
API Usage
import requests
# Upload image for prediction
files = {'file': open('test_image.jpg', 'rb')}
response = requests.post('http://localhost:8000/predict', files=files)
result = response.json()
print(f"Predicted: {result['predicted_class']}")
print(f"Confidence: {result['confidence']}")
π Performance Results
Based on research and implementation:
| Model | Accuracy | Parameters | Training Time | 
|---|---|---|---|
| EfficientNet-B0 | 99.2% | 5.3M | ~45 min | 
| ResNet50 | 98.8% | 25.6M | ~60 min | 
| InceptionV3 | 98.5% | 23.9M | ~55 min | 
| VGG16 | 97.9% | 138.4M | ~75 min | 
| MobileNetV2 | 96.7% | 3.5M | ~35 min | 
π οΈ Configuration
	
		
	
	
		Training Configuration (training_config.json)
	
{
  "data_dir": "/path/to/asl/dataset",
  "train_dir": "/path/to/asl/dataset/asl_alphabet_train", 
  "test_dir": "/path/to/asl/dataset/asl_alphabet_test",
  "output_dir": "training_output",
  "model_types": ["vgg16", "resnet50", "inceptionv3", "efficientnet_b0"],
  "validation_split": 0.2,
  "batch_size": 32,
  "epochs": 30,
  "fine_tune": true
}
π Deployment Options
1. Local Development
# Real-time detection
python real_time_detection.py
# API server
python app.py
# Web interface  
streamlit run streamlit_app.py
2. Docker Deployment
FROM python:3.9-slim
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "app.py"]
3. Cloud Deployment
- AWS EC2/Lambda
- Google Cloud Platform
- Azure Container Instances
- Heroku
π Evaluation Metrics
The system provides comprehensive evaluation including:
- Accuracy Metrics: Overall, top-3, top-5 accuracy
- Per-class Metrics: Precision, recall, F1-score for each ASL sign
- Confusion Matrices: Detailed error analysis
- ROC Curves: Performance visualization
- Training History: Loss and accuracy curves
π€ Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
π Requirements
Hardware
- Minimum: 8GB RAM, 4-core CPU
- Recommended: 16GB RAM, 8-core CPU, GPU (NVIDIA with CUDA)
- Storage: 10GB free space
Software
- Python 3.8+
- TensorFlow 2.13+
- OpenCV 4.8+
- MediaPipe 0.10+
π References
- Transfer Learning for Sign Language Recognition
- MediaPipe Hands Documentation
- EfficientNet: Rethinking Model Scaling for CNNs
- ASL Alphabet Dataset on Kaggle
π License
This project is licensed under the MIT License - see the LICENSE file for details.
β Acknowledgments
- Kaggle for providing the ASL Alphabet dataset
- Google for MediaPipe hand tracking
- TensorFlow/Keras teams for deep learning frameworks
- OpenCV community for computer vision tools
Ready to recognize ASL signs? Start with the quick start guide above! π€# ASL-AI
