Oguzz07/causal-discovery-algorithm-selection


Oguzz07 committed 9 days ago

Commit 70796a3 (verified) · 1 parent: 845e234

Update README with final results (71.3% hit rate, 178 configs, pairwise ranking)


Files changed (1)

README.md +62 -25


@@ -6,33 +6,50 @@ A meta-learning system that predicts the **top-3 best causal discovery algorithm
 Given a new discrete dataset (pandas DataFrame), the system:
 1. **Extracts 34 meta-features** (entropy, mutual information, chi² statistics, CI test probes, etc.)
-2. **Predicts normalized SHD** for each of 9 algorithms via a trained Random Forest
 3. **Ranks and returns the top-3** algorithms expected to produce the most accurate CPDAG
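The ranking step in the list above can be sketched as follows. This is a minimal illustration, not the repository's `predictor.py`: the function name `top_k_algorithms` and the example predictions are hypothetical, and the only assumption taken from the README is that a lower predicted normalized SHD means a better algorithm.

```python
import numpy as np

# The nine algorithms named in the README's algorithm pool.
ALGORITHMS = ["PC", "FCI", "GES", "BOSS", "GRaSP", "HC", "Tabu", "MMHC", "K2"]


def top_k_algorithms(predicted_shd, names=ALGORITHMS, k=3):
    """Return the k algorithm names with the lowest predicted normalized SHD."""
    predicted_shd = np.asarray(predicted_shd, dtype=float)
    if predicted_shd.shape != (len(names),):
        raise ValueError("expected exactly one prediction per algorithm")
    # Ascending sort: a lower SHD means a graph closer to the ground truth.
    order = np.argsort(predicted_shd, kind="stable")
    return [names[i] for i in order[:k]]


# Hypothetical model output for one dataset (made-up numbers).
example_pred = [0.30, 0.45, 0.22, 0.50, 0.48, 0.35, 0.36, 0.60, 0.41]
top3 = top_k_algorithms(example_pred)
print(top3)
```

The stable sort keeps the pool order for tied predictions, so the result is deterministic.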
 ## 📊 Performance (Leave-One-Network-Out Cross-Validation)
 | Metric | Value |
 |--------|-------|
-| **Top-3 Hit Rate** | **67.2%** (true best algorithm is in predicted top-3) |
-| **NDCG@3** | **0.947** (ranking quality) |
-| **Mean Regret** | **0.012** (tiny SHD gap vs oracle selection) |
 | **Median Regret** | **0.000** (majority of predictions are perfect) |
-Evaluated on 116 benchmark configs across 13 bnlearn networks (5–70 nodes).
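For readers who want to reproduce the table's metrics, the sketch below shows one plausible way to compute the top-k hit rate and regret from per-config predictions. The README does not spell out the regret definition, so this assumes regret is the gap between the best true SHD among the predicted top-k and the oracle's best SHD. The function name and the toy arrays are hypothetical.

```python
import numpy as np


def top_k_hit_and_regret(pred_shd, true_shd, k=3):
    """Compute hit rate and regret over configs (rows) and algorithms (columns)."""
    pred_shd = np.asarray(pred_shd, dtype=float)
    true_shd = np.asarray(true_shd, dtype=float)
    # Indices of the k algorithms with the lowest predicted SHD, per config.
    top_k = np.argsort(pred_shd, axis=1, kind="stable")[:, :k]
    # Oracle choice: the algorithm with the lowest true SHD (first one on ties).
    oracle_idx = true_shd.argmin(axis=1)
    hits = (top_k == oracle_idx[:, None]).any(axis=1)
    # Assumed regret: best true SHD inside the predicted top-k minus the oracle SHD.
    best_selected = np.take_along_axis(true_shd, top_k, axis=1).min(axis=1)
    regret = best_selected - true_shd.min(axis=1)
    return {
        "hit_rate": float(hits.mean()),
        "mean_regret": float(regret.mean()),
        "median_regret": float(np.median(regret)),
    }


# Toy data: 3 configs x 4 algorithms (made-up numbers), evaluated with k=2.
true_shd = np.array([[0.1, 0.2, 0.3, 0.4],
                     [0.5, 0.1, 0.3, 0.2],
                     [0.2, 0.4, 0.1, 0.3]])
pred_shd = np.array([[0.2, 0.1, 0.5, 0.6],
                     [0.1, 0.6, 0.2, 0.5],
                     [0.3, 0.5, 0.1, 0.2]])
metrics = top_k_hit_and_regret(pred_shd, true_shd, k=2)
print(metrics)
```

Under this definition a median regret of zero only requires that at least half of the configs have an oracle-quality algorithm in the predicted top-k.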
 ## 🧪 Algorithm Pool (9 algorithms)
-| Algorithm | Family | Library | Output |
-|-----------|--------|---------|--------|
-| **PC** | Constraint-based | causal-learn | CPDAG |
-| **FCI** | Constraint-based | causal-learn | PAG |
-| **GES** | Score-based | causal-learn | CPDAG |
-| **BOSS** | Permutation-based | causal-learn | CPDAG |
-| **GRaSP** | Permutation-based | causal-learn | CPDAG |
-| **HC** | Score-based (greedy) | pgmpy | DAG |
-| **Tabu** | Score-based (meta-heuristic) | pgmpy | DAG |
-| **MMHC** | Hybrid | pgmpy | DAG |
-| **K2** | Score-based (ordering) | pgmpy | DAG |
 ## 🔬 Key Insight: Dependency Parsing Connection
@@ -43,11 +60,16 @@ This project was inspired by a structural parallel between **NLP dependency pars
 The biaffine pairwise scoring mechanism from Dozat & Manning (2017) was independently reinvented by AVICI and CauScale for causal structure learning — validating this connection.
-**Top predictive meta-features** (confirming the parsing analogy):
-1. `max_pairwise_MI` (24.6%) — strongest pairwise dependency (≈ biaffine arc score)
-2. `n_variables` (14.8%) — network size
-3. `max_entropy` (9.5%) — variable complexity
-4. `max_cramers_v` (6.7%) — strongest association strength
 ## 🚀 Quick Start
@@ -80,17 +102,24 @@ causal_selection/
 │   ├── trainer.py            # Multi-Output RF/GBM + LONO-CV evaluation
 │   └── predictor.py          # Inference: dataset → top-3 prediction
 ├── models/
-│   ├── meta_learner.pkl      # Trained Random Forest
 │   └── scaler.pkl            # Feature scaler
 ├── benchmark.py              # Full benchmark orchestration
-└── run_benchmark.py          # Resumable benchmark runner
 ```
 ## 📈 Benchmark Data
 - **14 bnlearn networks**: asia, cancer, earthquake, sachs, survey, alarm, barley, child, insurance, mildew, water, hailfinder, hepar2, win95pts
-- **116+ dataset configs**: varying sample sizes (250–10,000) × multiple seeds
-- **1,000+ algorithm runs**: 9 algorithms × 116 configs with per-algorithm timeout
 ## 🔧 Dependencies
@@ -108,10 +137,18 @@ joblib
 - **Causal-Copilot** (arxiv:2504.13263) — Closest existing algorithm selection system
 - **AVICI** (arxiv:2205.12934) — Amortized causal structure learning (biaffine architecture)
 - **Dozat & Manning** (arxiv:1611.01734) — Deep Biaffine Attention for dependency parsing
 - **SATzilla** (arxiv:1401.2474) — Algorithm selection via meta-learning
 - **bnlearn** (bnlearn.com) — Bayesian network benchmark repository
 ## License
 MIT

 Given a new discrete dataset (pandas DataFrame), the system:
 1. **Extracts 34 meta-features** (entropy, mutual information, chi² statistics, CI test probes, etc.)
+2. **Predicts normalized SHD** for each of 9 algorithms via trained models
 3. **Ranks and returns the top-3** algorithms expected to produce the most accurate CPDAG
 ## 📊 Performance (Leave-One-Network-Out Cross-Validation)
+### Best Model: Pairwise-GBM Ranking
 | Metric | Value |
 |--------|-------|
+| **Top-3 Hit Rate** | **71.3%** (true best algorithm is in predicted top-3) |
+| **Mean Regret** | **0.011** (tiny SHD gap vs oracle selection) |
 | **Median Regret** | **0.000** (majority of predictions are perfect) |
+### Model Comparison (178 configs, 14 networks + augmented)
+| Model | Top-3 Hit Rate | NDCG@3 | Mean Regret |
+|-------|---------------|--------|-------------|
+| **Pairwise-GBM** | **71.3%** | — | 0.011 |
+| GBM-300-lr01 | 67.4% | 0.957 | 0.011 |
+| RF-200 | 66.9% | 0.961 | 0.007 |
+| RF-500 | 66.3% | 0.962 | 0.007 |
+| GBM-500-lr05 | 65.2% | 0.948 | 0.013 |
+### Progression
+| Stage | Configs | Networks | Top-3 Hit Rate |
+|-------|---------|----------|---------------|
+| Initial (small nets) | 65 | 4 | 68.2% |
+| All 14 networks | 122 | 14 | 70.5% |
+| + Data augmentation | 178 | 14+aug | **71.3%** |
 ## 🧪 Algorithm Pool (9 algorithms)
+| Algorithm | Family | Library | Output | Wins |
+|-----------|--------|---------|--------|------|
+| **GES** | Score-based | causal-learn | CPDAG | 47% |
+| **PC** | Constraint-based | causal-learn | CPDAG | 32% |
+| **FCI** | Constraint-based | causal-learn | PAG | 8% |
+| **K2** | Score-based | pgmpy | DAG | 6% |
+| **HC** | Score-based (greedy) | pgmpy | DAG | 3% |
+| **Tabu** | Score-based (meta) | pgmpy | DAG | 2% |
+| **GRaSP** | Permutation-based | causal-learn | CPDAG | 1% |
+| **BOSS** | Permutation-based | causal-learn | CPDAG | 1% |
+| **MMHC** | Hybrid | pgmpy | DAG | <1% |
 ## 🔬 Key Insight: Dependency Parsing Connection
 The biaffine pairwise scoring mechanism from Dozat & Manning (2017) was independently reinvented by AVICI and CauScale for causal structure learning — validating this connection.
+### Top Predictive Meta-Features
+1. `n_variables` (30%) — network size (how many nodes in the graph)
+2. `max_pairwise_MI` (24%) — strongest pairwise dependency (≈ biaffine arc score)
+3. `max_cramers_v` (8%) — strongest association strength
+4. `max_entropy` (7%) — variable complexity
+### Three Ideas Borrowed from Parsing
+1. **Biaffine-style pairwise features**: MI and Cramér's V between all variable pairs = parsing's arc scores
+2. **Pairwise ranking** (our best model): For each algorithm pair (A,B), predict which wins → count wins to rank. Inspired by pairwise tournament-style parser selection
+3. **Cross-domain transfer**: Train on well-characterized bnlearn networks → predict on new unseen datasets (= cross-lingual parser transfer)
 ## 🚀 Quick Start
 │   ├── trainer.py            # Multi-Output RF/GBM + LONO-CV evaluation
 │   └── predictor.py          # Inference: dataset → top-3 prediction
 ├── models/
+│   ├── meta_learner.pkl      # Trained GBM (multi-output fallback)
+│   ├── pairwise_model.pkl    # Pairwise ranking GBM (best model)
 │   └── scaler.pkl            # Feature scaler
 ├── benchmark.py              # Full benchmark orchestration
+├── run_benchmark.py          # Resumable benchmark runner
+└── augment_and_improve.py    # Data augmentation + model improvement
 ```
 ## 📈 Benchmark Data
 - **14 bnlearn networks**: asia, cancer, earthquake, sachs, survey, alarm, barley, child, insurance, mildew, water, hailfinder, hepar2, win95pts
+- **178 dataset configs**: 122 original + 56 augmented (variable subsampling, sample-size variation, noise injection)
+- **1,600+ algorithm runs**: 9 algorithms × 178 configs with per-algorithm timeout
+### Data Augmentation Strategies
+- **Variable subsampling**: Drop 20-40% of variables to create virtual sub-networks
+- **Sample-size variation**: Generate N=300, 750, 1500, 3000 for each network
+- **Noise injection**: Randomly flip 5-10% of categorical values
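The three strategies above could look roughly like this on a discrete pandas DataFrame. This is a sketch under stated assumptions, not the contents of `augment_and_improve.py`: all function names are hypothetical, the sample-size step uses bootstrap resampling of existing rows as a stand-in (the README's wording suggests fresh samples per network), and the sketch does not show how the ground-truth graph would be adjusted after variables are dropped.

```python
import numpy as np
import pandas as pd


def subsample_variables(df, drop_frac, rng):
    """Drop a fraction of columns, keeping the original column order."""
    n_drop = int(round(drop_frac * df.shape[1]))
    keep = rng.choice(df.shape[1], size=df.shape[1] - n_drop, replace=False)
    return df.iloc[:, np.sort(keep)]


def resample_rows(df, n, rng):
    """Bootstrap n rows (assumption: stands in for drawing a fresh sample of size n)."""
    idx = rng.choice(len(df), size=n, replace=True)
    return df.iloc[idx].reset_index(drop=True)


def flip_categories(df, flip_frac, rng):
    """Replace roughly flip_frac of the cells with a random level of the same column."""
    out = df.copy()
    for col in out.columns:
        levels = np.asarray(df[col].unique())
        mask = rng.random(len(out)) < flip_frac
        # The random draw may equal the old value, so the effective
        # change rate is somewhat below flip_frac.
        draws = rng.choice(levels, size=len(out))
        out[col] = np.where(mask, draws, out[col].to_numpy())
    return out


rng = np.random.default_rng(0)
# Toy discrete dataset: 1000 rows, 10 ternary variables (made-up data).
df = pd.DataFrame(rng.integers(0, 3, size=(1000, 10)),
                  columns=[f"X{i}" for i in range(10)])

sub = subsample_variables(df, drop_frac=0.3, rng=rng)
small = resample_rows(df, n=300, rng=rng)
noisy = flip_categories(df, flip_frac=0.10, rng=rng)
print(sub.shape, small.shape, noisy.shape)
```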
 ## 🔧 Dependencies
 - **Causal-Copilot** (arxiv:2504.13263) — Closest existing algorithm selection system
 - **AVICI** (arxiv:2205.12934) — Amortized causal structure learning (biaffine architecture)
+- **CauScale** (arxiv:2602.08629) — Scalable neural causal discovery
 - **Dozat & Manning** (arxiv:1611.01734) — Deep Biaffine Attention for dependency parsing
+- **TreeCRF** (arxiv:2005.00975) — Global structural training loss for parsing
 - **SATzilla** (arxiv:1401.2474) — Algorithm selection via meta-learning
 - **bnlearn** (bnlearn.com) — Bayesian network benchmark repository
+## 🔮 Future Work (Phase 2)
+1. **Biaffine neural encoder**: Pre-train a neural feature extractor that learns variable-pair "arc scores"
+2. **Portfolio regret loss** (TreeCRF-inspired): Global ranking optimization instead of per-algorithm MSE
+3. **Hyperparameter co-selection**: Predict not just which algorithm but optimal hyperparameters (CASH)
+4. **Ensemble prediction**: Run top-3 and vote on edges across their CPDAGs
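As an illustration of item 4, the sketch below takes a majority vote over the skeletons (undirected adjacencies) of several learned graphs. It is a simplified assumption about what edge voting could mean: graphs are represented as plain NumPy adjacency matrices, orientations are ignored, and `vote_on_skeleton` is a hypothetical helper rather than part of the repository or of causal-learn or pgmpy.

```python
import numpy as np


def vote_on_skeleton(adjacencies, min_votes=2):
    """Keep an undirected edge if at least min_votes graphs contain it."""
    stacked = np.stack([np.asarray(a, dtype=bool) for a in adjacencies])
    # Symmetrize each graph: i-j is adjacent if either direction is marked.
    skeletons = stacked | stacked.transpose(0, 2, 1)
    votes = skeletons.sum(axis=0)
    consensus = votes >= min_votes
    np.fill_diagonal(consensus, False)  # no self-loops
    return consensus


# Three hypothetical graphs over variables 0, 1, 2 (entry [i, j] = 1 means i -> j).
g1 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 0 -> 1, 1 -> 2
g2 = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]  # 0 -> 1, 0 -> 2
g3 = [[0, 0, 0], [1, 0, 1], [0, 0, 0]]  # 1 -> 0, 1 -> 2
consensus = vote_on_skeleton([g1, g2, g3], min_votes=2)
print(consensus.astype(int))
```

Voting on orientations as well would need a shared edge-mark encoding across CPDAG, PAG, and DAG outputs, which this sketch leaves out.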
 ## License
 MIT