Smart Money Concepts (SMC) Trading Signal Generator for XAUUSD
This project processes hourly XAUUSD (Gold) price data to generate trading signals based on Smart Money Concepts (SMC) patterns, including Order Blocks (OB), Fair Value Gaps (FVG), and Change of Character (CHOCH). It uses a Long Short-Term Memory (LSTM) neural network to predict trading signals (Buy=1, Sell=0, Hold=2) and includes take-profit (TP) and stop-loss (SL) calculations for trading strategies.
Features
- Data Preprocessing: Processes a large dataset (~150,000 rows) of hourly XAUUSD OHLCV data.
- SMC Signal Generation: Identifies SMC patterns using vectorized operations for efficiency.
- Feature Engineering: Extracts SMC-derived features like higher highs (hh), lower lows (ll), trend, body size, and volume ratio.
- Sequence Creation: Generates sequences (window size=60) for time-series modeling.
- LSTM Model: Trains an LSTM model with class-weighted loss to handle imbalanced classes (many Hold signals).
- Evaluation: Provides a classification report with precision, recall, and F1-score for Sell, Buy, and Hold classes.
- Optimization: Handles large datasets with chunked sequence creation, memory-efficient data types, and early stopping.
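The README does not spell out how the SMC-derived features are computed. The sketch below shows one plausible vectorized way to derive `hh`, `ll`, `trend`, `body_size`, and `volume_ratio` with pandas. The function name `add_smc_features`, the 20-bar lookback, and every definition in the comments are illustrative assumptions, not the project's actual rules.

```python
import numpy as np
import pandas as pd


def add_smc_features(df: pd.DataFrame, lookback: int = 20) -> pd.DataFrame:
    """Add illustrative SMC-style features (assumed definitions)."""
    out = df.copy()

    # Highest high / lowest low of the previous `lookback` bars,
    # excluding the current bar (hence the shift).
    prev_high = out["High"].shift(1).rolling(lookback).max()
    prev_low = out["Low"].shift(1).rolling(lookback).min()

    # hh / ll: 1.0 when the bar breaks the prior range, else 0.0.
    # Comparisons against NaN (warm-up rows) evaluate to False.
    out["hh"] = (out["High"] > prev_high).astype("float32")
    out["ll"] = (out["Low"] < prev_low).astype("float32")

    # trend: +1 when the close is above its rolling mean, -1 below, 0 if undefined.
    sma = out["Close"].rolling(lookback).mean()
    out["trend"] = np.sign(out["Close"] - sma).fillna(0).astype("float32")

    # body_size: absolute candle body.
    out["body_size"] = (out["Close"] - out["Open"]).abs().astype("float32")

    # volume_ratio: volume relative to its rolling mean (1.0 where undefined).
    vol_mean = out["Volume"].rolling(lookback).mean()
    ratio = (out["Volume"] / vol_mean).replace([np.inf, -np.inf], np.nan)
    out["volume_ratio"] = ratio.fillna(1.0).astype("float32")
    return out


# Synthetic, steadily rising prices used only to exercise the function.
n = 30
close = np.arange(n, dtype="float64") + 100.0
demo = pd.DataFrame({
    "Open": close - 0.5,
    "High": close + 1.0,
    "Low": close - 1.0,
    "Close": close,
    "Volume": np.full(n, 100.0),
})
features = add_smc_features(demo, lookback=5)
```

Because every step is a column-wise operation, no Python loop over rows is needed, which matches the vectorized approach described above.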
Requirements
- Python 3.8+
- Libraries:
  - pandas==2.0.3
  - numpy==1.24.3
  - scikit-learn==1.2.2
  - datasets==2.14.4
  - torch==2.0.1
- Hardware:
  - CPU (multi-core recommended for parallel processing)
  - GPU (optional, for faster training with CUDA)
  - At least 16 GB RAM for handling ~150,000 rows
- Input Data:
  - XAUUSD_H1.csv: Hourly OHLCV data with columns Time, Open, High, Low, Close, Volume (tab-separated).
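For reference, a tab-separated file with these columns can be read with `pandas.read_csv` and `sep="\t"`. The helper name `load_xauusd` and the two sample rows are made up for illustration; the real preprocessing script may load the file differently.

```python
import io

import pandas as pd

PRICE_COLUMNS = ["Open", "High", "Low", "Close", "Volume"]


def load_xauusd(source) -> pd.DataFrame:
    """Read tab-separated OHLCV data from a path or file-like object."""
    df = pd.read_csv(source, sep="\t", parse_dates=["Time"])
    # float32 halves the memory used by the numeric columns.
    df = df.astype({col: "float32" for col in PRICE_COLUMNS})
    # Ensure chronological order before any rolling or sequence logic.
    return df.sort_values("Time").reset_index(drop=True)


# Two made-up rows (deliberately out of order) standing in for XAUUSD_H1.csv.
sample = io.StringIO(
    "Time\tOpen\tHigh\tLow\tClose\tVolume\n"
    "2024-01-02 01:00\t2063.1\t2065.4\t2062.0\t2064.8\t1520\n"
    "2024-01-02 00:00\t2062.5\t2063.9\t2061.7\t2063.1\t1310\n"
)
df = load_xauusd(sample)
```

With the real file, the call would be `load_xauusd("XAUUSD_H1.csv")`.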
Installation
- Clone the repository:
  git clone <repository-url>
  cd <repository-directory>
- Install dependencies:
  pip install -r requirements.txt
- Ensure the dataset (XAUUSD_H1.csv) is placed in the project directory.
Usage
Preprocess Data: Run the preprocessing script to generate SMC signals and create a Hugging Face Dataset:
python preprocess_smc_optimized.py
- Input: XAUUSD_H1.csv (~150,000 rows)
- Output: /xauusd_smc_dataset (saved dataset with sequences and labels)
- Features: Open, High, Low, Close, Volume, hh, ll, trend, body_size, volume_ratio
- Labels: Buy=1, Sell=0, Hold=2
- Sequence length: 60 hours
- Train-test split: 80/20
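The sequence step can be sketched as a sliding window built in chunks so that only part of the index matrix is in memory at once. This is a minimal sketch under assumptions: the function name `make_sequences`, labeling each window with the row that immediately follows it, and a chronological (unshuffled) 80/20 split are all illustrative choices, and the random arrays stand in for real features and labels.

```python
import numpy as np


def make_sequences(features, labels, window=60, chunk_size=10_000):
    """Yield (X, y) chunks; X has shape (n, window, n_features)."""
    n_sequences = len(features) - window
    for start in range(0, n_sequences, chunk_size):
        stop = min(start + chunk_size, n_sequences)
        # Row i of idx holds the indices i, i+1, ..., i+window-1.
        idx = np.arange(start, stop)[:, None] + np.arange(window)[None, :]
        X = features[idx].astype(np.float32)
        # Assumption: each window is labeled with the row right after it.
        y = labels[start + window: stop + window].astype(np.int64)
        yield X, y


# Random stand-in data: 500 rows, 10 features, labels in {0, 1, 2}.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 10)).astype(np.float32)
labels = rng.integers(0, 3, size=500)

chunks = list(make_sequences(features, labels, window=60, chunk_size=200))
X = np.concatenate([c[0] for c in chunks])
y = np.concatenate([c[1] for c in chunks])

# Chronological 80/20 split: earlier windows train, later windows test.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```

A chronological split keeps the test windows strictly later than the training windows; a shuffled split would mix heavily overlapping windows between the two sets.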
Train and Evaluate Model: Run the training script to train an LSTM model and generate a classification report:
python train_evaluate_lstm_smc.py
- Loads /xauusd_smc_dataset
- Trains an LSTM model with class-weighted loss and early stopping
- Outputs training logs (train loss, test loss, test accuracy) and a classification report
- Saves the best model to /best_lstm_model.pth
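To make the training step concrete, here is a minimal PyTorch sketch of an LSTM classifier with a class-weighted cross-entropy loss. The input shape (60 steps, 10 features), the three classes, and the weights `[1.0, 1.0, 0.1]` come from this README; the class name `SignalLSTM`, the hidden size, the Adam optimizer, and the learning rate are assumptions, and the random batch only demonstrates one optimization step.

```python
import torch
import torch.nn as nn


class SignalLSTM(nn.Module):
    """Single-layer LSTM followed by a linear classification head."""

    def __init__(self, n_features=10, hidden_size=64, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        out, _ = self.lstm(x)         # (batch, seq_len, hidden_size)
        return self.head(out[:, -1])  # logits from the last time step


torch.manual_seed(0)
model = SignalLSTM()

# Class order is Sell=0, Buy=1, Hold=2; Hold is down-weighted.
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 1.0, 0.1]))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step on a random batch of 8 sequences.
x = torch.randn(8, 60, 10)
y = torch.randint(0, 3, (8,))

logits = model(x)
loss = criterion(logits, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With these weights, a misclassified Hold sample contributes a tenth as much to the loss as a misclassified Buy or Sell sample.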
Example Output:
Epoch 1/100, Train Loss: 0.571448, Test Loss: 0.568098, Test Accuracy: 0.9834
...
Epoch 100/100, Train Loss: 0.232735, Test Loss: 0.235532, Test Accuracy: 0.9650
Classification Report:
              precision    recall  f1-score   support
        Sell       0.XX      0.XX      0.XX      XXXX
         Buy       0.XX      0.XX      0.XX      XXXX
        Hold       0.XX      0.XX      0.XX     XXXXX
    accuracy                           0.XX     29988
   macro avg       0.XX      0.XX      0.XX     29988
weighted avg       0.XX      0.XX      0.XX     29988
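A report in this layout can be produced with scikit-learn's `classification_report`. The label arrays below are made up purely to show the call; they are not results from this project.

```python
import numpy as np
from sklearn.metrics import classification_report

# Made-up labels and predictions (Sell=0, Buy=1, Hold=2).
y_true = np.array([2, 2, 2, 2, 1, 0, 2, 1, 0, 2])
y_pred = np.array([2, 2, 2, 1, 1, 0, 2, 2, 0, 2])

report = classification_report(
    y_true,
    y_pred,
    labels=[0, 1, 2],                      # fixes the row order
    target_names=["Sell", "Buy", "Hold"],  # names matching that order
    digits=2,
    zero_division=0,                       # avoid warnings for unpredicted classes
)
print(report)
```

Since Hold dominates the dataset, the per-class rows for Sell and Buy are usually more informative than the overall accuracy.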
Project Structure
- preprocess_smc_optimized.py: Generates SMC signals, creates sequences, and saves the dataset.
- train_evaluate_lstm_smc.py: Trains an LSTM model and evaluates performance with a classification report.
- XAUUSD_H1.csv: Input dataset (not included; provide your own).
- /xauusd_smc_dataset: Saved Hugging Face dataset.
- /best_lstm_model.pth: Saved best model weights.
Performance Optimizations
- Vectorized SMC Calculations: Uses pandas boolean indexing instead of loops for signal generation.
- Chunked Sequence Creation: Processes sequences in chunks (10,000 rows) to manage memory for ~150,000 rows.
- Class Imbalance Handling: Applies class weights ([1.0, 1.0, 0.1]) to prioritize Buy/Sell signals over Hold.
- Early Stopping: Stops training if test loss doesn’t improve for 10 epochs.
- Memory Efficiency: Uses float32 for features, int64 for labels, and saves dataset to disk.
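The early-stopping rule above can be sketched as a small helper that counts epochs without improvement. The class name `EarlyStopping`, the `min_delta` parameter, and the loss values in the demo are hypothetical; the demo uses a patience of 3 instead of 10 only to keep the example short.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without a lower loss."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0
        self.improved = False

    def step(self, loss):
        """Record one epoch's loss; return True when training should stop."""
        if loss < self.best_loss - self.min_delta:
            self.best_loss = loss
            self.bad_epochs = 0
            self.improved = True   # caller can save a checkpoint here
        else:
            self.bad_epochs += 1
            self.improved = False
        return self.bad_epochs >= self.patience


# Made-up per-epoch losses for demonstration.
stopper = EarlyStopping(patience=3)
losses = [0.57, 0.50, 0.48, 0.49, 0.48, 0.50, 0.47]
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a training loop, `stopper.improved` would be the natural place to save the best weights, for example with `torch.save(model.state_dict(), path)`.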
Future Improvements
- Threshold Tuning: Adjust SMC thresholds (e.g., SL to 0.7%) for XAUUSD volatility.
- Additional Features: Add OB/FVG/CHOCH flags as features.
- Alternative Models: Experiment with Transformer models for better sequence modeling.
- Parallel Processing: Use dask for larger datasets.
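As a worked example of the threshold-tuning idea, the sketch below derives stop-loss and take-profit prices from a percentage stop. Only the 0.7% SL figure comes from this README; the function name `tp_sl_levels`, the 2:1 reward-to-risk ratio, and the entry price are assumptions chosen for illustration. With an entry of 2000.0 and a 0.7% stop, the risk is about 14.0 per unit, so a Buy would place SL near 1986 and TP near 2028.

```python
def tp_sl_levels(entry, signal, sl_pct=0.007, reward_risk=2.0):
    """Return (take_profit, stop_loss) for a signal, or None for Hold.

    signal: 1 = Buy, 0 = Sell, 2 = Hold (same encoding as the labels).
    sl_pct: stop distance as a fraction of the entry price.
    reward_risk: assumed ratio of TP distance to SL distance.
    """
    risk = entry * sl_pct
    if signal == 1:   # Buy: profit above entry, stop below
        return entry + reward_risk * risk, entry - risk
    if signal == 0:   # Sell: profit below entry, stop above
        return entry - reward_risk * risk, entry + risk
    return None       # Hold: no position, so no levels


buy_tp, buy_sl = tp_sl_levels(2000.0, signal=1)
sell_tp, sell_sl = tp_sl_levels(2000.0, signal=0)
hold_levels = tp_sl_levels(2000.0, signal=2)
```

Expressing the stop as a fraction of the entry price lets the same setting scale with the gold price level instead of using a fixed dollar distance.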
License
MIT License
Contact
For issues or contributions, please open an issue or contact semsankken.
Generated on September 8, 2025, 16:15 +07