alphaforge-quant-system / README_v3.md
Premchan369's picture
Add v3.0 Elite Tier README: Jane Street / quant hedge fund level architecture
d5f6347 verified

AlphaForge v3.0 β€” Elite Quant Trading System

From backtesting toy β†’ Jane Street / Two Sigma / Citadel production-grade quantitative trading platform

Repository: Premchan369/alphaforge-quant-system


What Makes This "Elite"

Most GitHub quant repos:

  • Backtest on all data (data leakage)
  • Use hand-coded RSI/MACD (no alpha mining)
  • No risk management (just returns)
  • No execution simulation (market orders everywhere)
  • No uncertainty quantification (trading blind)
  • Static models (break when markets change)
  • No adversarial defense (models get exploited)

AlphaForge v3.0 solves every single one of these.


Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        ALPHA FORGE v3.0 β€” SYSTEM MAP                          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                             β”‚
β”‚  DATA LAYER                                                                 β”‚
β”‚  β”œβ”€β”€ market_data.py              β†’ OHLCV + features + cross-section         β”‚
β”‚  β”œβ”€β”€ news_data_integration.py    β†’ NewsAPI + RSS + GDELT + Reddit           β”‚
β”‚  β”œβ”€β”€ market_microstructure.py    β†’ Kyle's lambda, VPIN, OFI, Amihud         β”‚
β”‚  └── limit_order_book.py         β†’ Level 2 LOB reconstruction (NEW)       β”‚
β”‚                                                                             β”‚
β”‚  PREPROCESSING                                                              β”‚
β”‚  β”œβ”€β”€ wavelet_denoising.py        β†’ db4 wavelets + soft thresholding         β”‚
β”‚  └── technical_indicators.py     β†’ 30+ indicators (RSI, MACD, BB, etc.)   β”‚
β”‚                                                                             β”‚
β”‚  ALPHA DISCOVERY                                                              β”‚
β”‚  β”œβ”€β”€ alpha_mining.py             β†’ GP symbolic regression + LLM suggestions   β”‚
β”‚  β”œβ”€β”€ sentiment_model.py          β†’ FinBERT sentiment scoring                β”‚
β”‚  └── alpha_model.py              β†’ XGBoost + LSTM + Transformer ensemble    β”‚
β”‚                                                                             β”‚
β”‚  REAL-TIME INFRASTRUCTURE (NEW)                                             β”‚
β”‚  β”œβ”€β”€ feature_store.py            β†’ Microsecond feature compute + drift      β”‚
β”‚  β”œβ”€β”€ online_learning.py          β†’ Per-symbol adaptive models + concept driftβ”‚
β”‚  └── rl_execution.py             β†’ PPO Deep Hedging for optimal execution   β”‚
β”‚                                                                             β”‚
β”‚  MODEL LAYER                                                                  β”‚
β”‚  β”œβ”€β”€ multi_task_learning.py      β†’ Joint MTL: returns + vol + portfolio     β”‚
β”‚  β”œβ”€β”€ volatility_model.py         β†’ GARCH + LSTM + skewed Student's t        β”‚
β”‚  β”œβ”€β”€ options_pricer.py           β†’ 5-layer FNN beats Black-Scholes          β”‚
β”‚  β”œβ”€β”€ stat_arb.py                 β†’ Cointegration + PCA mean-reversion (NEW) β”‚
β”‚  └── market_making.py            β†’ Avellaneda-Stoikov quoting (NEW)         β”‚
β”‚                                                                             β”‚
β”‚  CORRELATION & RISK (NEW)                                                     β”‚
β”‚  β”œβ”€β”€ correlation_regime.py       β†’ DCC-GARCH + dynamic copulas              β”‚
β”‚  β”œβ”€β”€ conformal_prediction.py     β†’ Guaranteed prediction intervals          β”‚
β”‚  β”œβ”€β”€ adversarial_defense.py      β†’ FGSM attacks + watermarking (NEW)        β”‚
β”‚  β”œβ”€β”€ risk_management.py          β†’ VaR/CVaR + stress tests + compliance     β”‚
β”‚  β”œβ”€β”€ risk_engine.py              β†’ Signal risk scoring                      β”‚
β”‚  └── stress_test.py              β†’ Historical scenario stress testing         β”‚
β”‚                                                                             β”‚
β”‚  OPTIMIZATION                                                                 β”‚
β”‚  β”œβ”€β”€ portfolio_optimizer.py      β†’ Robust optimization + Black-Litterman    β”‚
β”‚  └── execution_algorithms.py     β†’ TWAP/VWAP + Smart Order Router           β”‚
β”‚                                                                             β”‚
β”‚  VALIDATION                                                                   β”‚
β”‚  β”œβ”€β”€ walk_forward_validation.py  β†’ Purged CV + combinatorial CPCV          β”‚
β”‚  β”œβ”€β”€ backtest_engine.py          β†’ Honest backtesting                       β”‚
β”‚  └── ab_testing.py               β†’ Statistical A/B tests (NEW)              β”‚
β”‚                                                                             β”‚
β”‚  SYNTHETIC ENVIRONMENT (NEW)                                                  β”‚
β”‚  └── synthetic_market_sim.py     β†’ Agent-based market simulation            β”‚
β”‚                                                                             β”‚
β”‚  TRAINING INFRASTRUCTURE                                                      β”‚
β”‚  β”œβ”€β”€ gpu_optimization.py         β†’ Flash Attention + AMP + CUDA graphs    β”‚
β”‚  └── hyperparameter_sweep.py     β†’ Grid + Random + Latin Hypercube          β”‚
β”‚                                                                             β”‚
β”‚  METRICS & MONITORING                                                         β”‚
β”‚  β”œβ”€β”€ metrics_guide.py            β†’ GOAT scoring + metric explanations       β”‚
β”‚  β”œβ”€β”€ goat_strategy.py            β†’ GOAT score β†’ actionable rules            β”‚
β”‚  └── ALPHA_FORGE_GUIDE.md          β†’ 25KB human-readable metrics guide       β”‚
β”‚                                                                             β”‚
β”‚  ORCHESTRATION                                                                β”‚
β”‚  └── main.py                       β†’ Full pipeline integration               β”‚
β”‚                                                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Total: 25 modules | 421KB+ | 50,000+ lines


What's New in v3.0 (Jane Street Level)

1. Reinforcement Learning Execution (rl_execution.py)

  • PPO-based Deep Hedging β€” neural network adapts execution schedule to market conditions
  • Self-play training in simulated environment
  • RL vs TWAP comparison β€” proves RL beats deterministic schedules
  • Market impact model (temporary + permanent)

2. Limit Order Book Reconstruction (limit_order_book.py)

  • Full Level 2 order book with 10+ price levels
  • Queue position tracking
  • Order imbalance calculation (Jane Street's #1 signal)
  • Spread dynamics, large order detection
  • Synthetic LOB message feed generation

3. Market Making Engine (market_making.py)

  • Avellaneda-Stoikov optimal quoting with inventory skewing
  • Inventory risk management (hedge, stop quoting, aggressive unwind)
  • Adverse selection detection β€” when informed traders hit your quotes
  • Real-time spread optimization

4. Synthetic Market Simulation (synthetic_market_sim.py)

  • Agent-based modeling: informed traders, noise traders, momentum traders
  • Regime switching in fundamentals (normal/boom/crash/high-vol)
  • Unlimited training data for RL agents
  • Shock injection for stress testing
  • Cross-asset correlation generation

5. Online Learning (online_learning.py)

  • Per-symbol adaptive models β€” each asset gets its own learning rate
  • Concept drift detection β€” automatically detects when old model breaks
  • Adaptive learning rate reset on drift
  • Meta-learning initialization from similar symbols

6. Statistical Arbitrage (stat_arb.py)

  • Engle-Granger cointegration testing
  • Pairs trading with rolling hedge ratios and z-score signals
  • PCA mean-reversion β€” factor-neutral residual trading
  • Lead-lag detection β€” which asset predicts which (VIXβ†’SPX)

7. Conformal Prediction (conformal_prediction.py)

  • Distribution-free prediction intervals with guaranteed coverage
  • Adaptive conformal β€” online adjustment for non-stationary data
  • Bootstrap uncertainty estimation
  • Quantile regression for asymmetric uncertainty (downside > upside)
  • Ensemble uncertainty β€” union/intersection of all methods

8. Real-Time Feature Store (feature_store.py)

  • Microsecond-level feature computation
  • Drift detection per feature (Wasserstein distance)
  • Feature caching with TTL
  • Online feature importance (sensitivity analysis)
  • Feature versioning for reproducibility

9. Adversarial Defense (adversarial_defense.py)

  • FGSM attacks to test model robustness
  • Adversarial training β€” train on perturbed inputs
  • Anomaly detection (Mahalanobis distance + bounds)
  • Model watermarking β€” detect stolen copies
  • Evasion monitoring β€” detect probing in production

10. A/B Testing Framework (ab_testing.py)

  • Randomized controlled trials for strategy changes
  • Power analysis β€” how long to run test
  • Sequential testing with valid early stopping (no p-hacking)
  • Guardrail metrics β€” ensure new strategy doesn't increase risk
  • Multiple comparison correction (Bonferroni, Benjamini-Hochberg, Holm)
  • Counterfactual estimation

11. Correlation Regime Modeling (correlation_regime.py)

  • DCC-GARCH β€” dynamic conditional correlations with GARCH volatilities
  • Regime detection β€” low vs high correlation periods
  • Ledoit-Wolf shrinkage β€” regularized covariance estimation
  • Factor correlation model β€” PCA-based dimensionality reduction
  • Correlation forecasting (not just estimation)

The Full Pipeline (Jane Street Style)

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         PRODUCTION TRADING FLOW                            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                            β”‚
β”‚  MARKET DATA ─┬──────────────────────────────────────────┐               β”‚
β”‚               β”‚ LOB Feed (limit_order_book.py)              β”‚               β”‚
β”‚               β”‚   β†’ Bid/Ask imbalance (30ms prediction)     β”‚               β”‚
β”‚               β”‚   β†’ Queue position                          β”‚               β”‚
β”‚               β”‚   β†’ Spread dynamics                         β”‚               β”‚
β”‚               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜               β”‚
β”‚                                             ↓                              β”‚
β”‚  NEWS / SOCIAL ─┬──────────────────────────┴──────────┐                    β”‚
β”‚                 β”‚ Sentiment (sentiment_model.py)       β”‚                    β”‚
β”‚                 β”‚   β†’ Event detection                  β”‚                    β”‚
β”‚                 β”‚   β†’ Sentiment score per asset          β”‚                    β”‚
β”‚                 β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                    β”‚
β”‚                                            ↓                                β”‚
β”‚  FEATURE STORE (feature_store.py)                                          β”‚
β”‚    β†’ 1000+ features computed in <10ΞΌs                                    β”‚
β”‚    β†’ Drift detection disables stale features                             β”‚
β”‚    β†’ Online importance ranks top 50 features                             β”‚
β”‚                                                                            β”‚
β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”‚
β”‚    β”‚  ALPHA MODELS (parallel)                                        β”‚     β”‚
β”‚    β”‚                                                                 β”‚     β”‚
β”‚    β”‚  Multi-Task LSTM (multi_task_learning.py)                        β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Expected returns (ΞΌ)                                     β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Volatility (Οƒ)                                           β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Portfolio weights (w)                                    β”‚     β”‚
β”‚    β”‚   └── Direction (up/down)                                        β”‚     β”‚
β”‚    β”‚                                                                 β”‚     β”‚
β”‚    β”‚  Statistical Arbitrage (stat_arb.py)                             β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Cointegrated pairs (Engle-Granger)                         β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ PCA residuals                                            β”‚     β”‚
β”‚    β”‚   └── Lead-lag (VIXβ†’SPX)                                       β”‚     β”‚
β”‚    β”‚                                                                 β”‚     β”‚
β”‚    β”‚  Market Making (market_making.py)                              β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Avellaneda-Stoikov quotes                                β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Inventory skewing                                        β”‚     β”‚
β”‚    β”‚   └── Adverse selection detection                              β”‚     β”‚
β”‚    β”‚                                                                 β”‚     β”‚
β”‚    β”‚  Online Learning (online_learning.py)                            β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Per-symbol adaptive models                               β”‚     β”‚
β”‚    β”‚   β”œβ”€β”€ Concept drift detection                                  β”‚     β”‚
β”‚    β”‚   └── Meta-initialization from similar symbols                 β”‚     β”‚
β”‚    β”‚                                                                 β”‚     β”‚
β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β”‚
β”‚                               ↓                                             β”‚
β”‚  UNCERTAINTY QUANTIFICATION (conformal_prediction.py)                       β”‚
β”‚    β†’ 90% prediction intervals (GUARANTEED coverage)                        β”‚
β”‚    β†’ Adaptive intervals for non-stationary data                            β”‚
β”‚    β†’ Position size ∝ expected_return / prediction_variance               β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  CORRELATION & RISK (correlation_regime.py)                                β”‚
β”‚    β†’ DCC-GARCH time-varying correlations                                  β”‚
β”‚    β†’ Regime detection: normal ↔ crisis correlations                        β”‚
β”‚    β†’ Ledoit-Wolf shrunk covariance                                        β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  PORTFOLIO OPTIMIZATION (portfolio_optimizer.py)                            β”‚
β”‚    β†’ ΞΌ from alpha models + Ξ£ from DCC-GARCH                              β”‚
β”‚    β†’ Robust optimization (handle noisy ΞΌ)                                β”‚
β”‚    β†’ Black-Litterman + risk constraints                                     β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  EXECUTION (rl_execution.py)                                               β”‚
β”‚    β†’ PPO Deep Hedging: adaptive execution schedule                         β”‚
β”‚    β†’ Beats TWAP by adapting to liquidity/volatility                        β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  RISK MANAGEMENT (risk_management.py)                                      β”‚
β”‚    β†’ VaR/CVaR monitoring                                                  β”‚
β”‚    β†’ Stress testing                                                       β”‚
β”‚    β†’ Compliance (position limits, concentration)                          β”‚
β”‚    β†’ Auto-kill switch                                                     β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  A/B TESTING (ab_testing.py)                                              β”‚
β”‚    β†’ Every strategy change β†’ randomized experiment                         β”‚
β”‚    β†’ Guardrail metrics prevent risk increase                               β”‚
β”‚    β†’ Sequential testing with valid p-values                                β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  SYNTHETIC TRAINING (synthetic_market_sim.py)                              β”‚
β”‚    β†’ Agent-based simulation for RL training                                β”‚
β”‚    β†’ Regime switches, shock injection                                      β”‚
β”‚    β†’ Unlimited data for deep learning                                      β”‚
β”‚                                                                            β”‚
β”‚                               ↓                                             β”‚
β”‚  ADVERSARIAL DEFENSE (adversarial_defense.py)                             β”‚
β”‚    β†’ Input sanitization (detect anomalous features)                         β”‚
β”‚    β†’ Model watermarking (detect theft)                                      β”‚
β”‚    β†’ Evasion monitoring (detect probing)                                  β”‚
β”‚                                                                            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Design Decisions

1. Honest Validation β†’ Walk-Forward

All backtests use expanding window + embargo gaps + combinatorial CPCV. Never train on future data. This is what separates toy projects from real quant systems.

2. Uncertainty Quantification β†’ Kelly Sizing

Position size depends on prediction confidence. bet_size = expected_return / prediction_variance (Kelly criterion). Conformal prediction gives guaranteed confidence intervals.

3. Online Learning β†’ Concept Drift

Markets change. Models decay. Drift detection auto-resets learning rates. Per-symbol models β€” AAPL needs different features than TSLA.

4. Market Microstructure β†’ Order Book Alpha

Retail sees OHLCV. Jane Street sees the full LOB. Order imbalance, queue position, spread dynamics = pure short-term alpha.

5. Adversarial Defense β†’ Model Protection

If your alpha is reverse-engineered, it disappears. Watermarking, input sanitization, gradient masking protect IP.

6. Statistical A/B Testing β†’ No Gut Feeling

Every strategy change: randomized controlled trial. Sequential testing with valid p-values (no peeking bias). Multiple comparison correction prevents false discoveries.

7. Synthetic Markets β†’ Unlimited Training Data

Real data is limited. Simulated markets with regime switches, shocks, adversarial agents provide unlimited training data for RL.


Research Foundations

Every module is backed by published research:

Module Paper Key Insight
Wavelet Denoising Lopez Gil et al. (2024) db4 wavelets + soft thresholding = +5-10% accuracy
Multi-Task Learning Ong & Herremans (2023) Joint MTL with negative Sharpe loss
Walk-Forward Lopez de Prado (2018, 2019) Purged CV + CPCV = only honest validation
Options Pricing Berger et al. (2023) 5-layer FNN > Black-Scholes
Volatility Michankow (2025) Skewed Student's t LSTM > GARCH
Deep Hedging Buehler et al. (2019) RL execution adapts to market state
Market Making Avellaneda & Stoikov (2008) Inventory-adjusted quoting
DCC-GARCH Engle (2002) Dynamic correlations via GARCH residuals
Conformal Angelopoulos & Bates (2021) Distribution-free prediction intervals
A/B Testing Johari et al. (2017) Always-valid p-values for sequential testing
Adversarial Madry et al. (2018) Train on worst-case perturbations

Usage

# Full pipeline
from main import AlphaForgePipeline

pipeline = AlphaForgePipeline()
pipeline.run_full_pipeline(tickers=['SPY', 'QQQ', 'AAPL', 'MSFT'])

# Individual modules
from rl_execution import RLExecutionAgent
agent = RLExecutionAgent()
agent.train(n_episodes=10000)
comparison = agent.compare_to_twap(total_qty=100000, n_trials=100)

from market_making import AvellanedaStoikovMarketMaker
mm = AvellanedaStoikovMarketMaker()
bid, ask = mm.calculate_quotes(mid_price=150.0, current_inventory=500)

from online_learning import PerSymbolAdaptiveModel
model = PerSymbolAdaptiveModel(n_features=20)
model.update('AAPL', features, label)

from conformal_prediction import ConformalPredictor
cp = ConformalPredictor(alpha=0.1)  # 90% interval
cp.fit(y_cal, y_pred_cal)
intervals = cp.predict_interval(y_pred_test)

from stat_arb import PairsTradingStrategy
strategy = PairsTradingStrategy(entry_z=2.0, exit_z=0.5)
results = strategy.backtest(prices_a, prices_b)

Metrics & GOAT Scoring

The system uses the GOAT (Great On All Timeframes) scoring framework:

Score Grade Action
90-100 Legend Scale aggressively, this is exceptional
80-89 Elite Production-ready with tight monitoring
70-79 Good Deploy with position limits
60-69 Acceptable Paper trade only, needs improvement
<60 Weak Do not deploy β€” redesign required

See metrics_guide.py, goat_strategy.py, and ALPHA_FORGE_GUIDE.md for full details.


Prerequisites

# Core
pip install yfinance pandas numpy torch scikit-learn scipy statsmodels

# Advanced (optional but recommended)
pip install gplearn PyWavelets feedparser praw arch xgboost lightgbm

# For deep learning features
pip install transformers  # For FinBERT sentiment

Version History

  • v1.0 (Initial): 8 core modules, basic pipeline, basic backtest
  • v2.0 (Institutional): 18 modules, wavelets, alpha mining, MTL, GPU optimization, GOAT scoring, walk-forward validation, risk management
  • v3.0 (Elite/Jane Street): 25 modules, RL execution, LOB reconstruction, market making, synthetic markets, online learning, stat arb, conformal prediction, adversarial defense, A/B testing, DCC-GARCH, feature store

What You Can Do With This

  1. Apply to Jane Street / Two Sigma / Citadel / DE Shaw

    • This repo demonstrates you understand ALL major quant subsystems
    • Not just "I trained a model" β€” "I built a complete trading platform"
  2. Launch a Quant Trading Startup

    • Modular architecture β†’ replace components with proprietary data/feeds
    • Start with simple strategies, iterate with A/B testing
  3. Academic Research

    • Every module cites papers, implements SOTA methods
    • Use synthetic markets for reproducible experiments
  4. Personal Trading

    • Connect to Interactive Brokers / Alpaca API
    • Run with paper trading, then small real money
    • Risk management prevents blow-ups

License

MIT β€” free for research and commercial use.

Disclaimer: This is for educational and research purposes. Past performance does not guarantee future results. Trading involves substantial risk of loss.