hamverbot
/

bidding_algorithms_benchmark

ml-intern

Model card Files Files and versions

xet

Community

hamverbot commited on 3 days ago

Commit

a40d07a

verified ·

1 Parent(s): 0b15a4d

Upload RESEARCH_RESOURCES.md

Browse files

Files changed (1) hide show

RESEARCH_RESOURCES.md +239 -166

RESEARCH_RESOURCES.md CHANGED Viewed

@@ -1,254 +1,327 @@
-# RTB Bidding Algorithm Comparison — Complete Research Resource List
-> Generated: 2026-05-05 | Repository: https://huggingface.co/hamverbot/bidding_algorithms_benchmark
 ---
-## Table of Contents
-1. [Bidding Algorithms](#1-bidding-algorithms)
-2. [CTR Prediction Models](#2-ctr-prediction-models)
-3. [Clearing Price / Market Price Prediction](#3-clearing-price--market-price-prediction)
-4. [Datasets](#4-datasets)
-5. [Codebases & Implementations](#5-codebases--implementations)
-6. [Benchmark Leaderboards](#6-benchmark-leaderboards)
-7. [Recommended Architecture](#7-recommended-architecture)
----
-## 1. Bidding Algorithms
-### 1.1 Lagrangian Dual + Online Gradient Descent (BEST MATCH)
 | Property | Detail |
 |----------|--------|
 | **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
-| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong (2023) |
-| **Venue** | NeurIPS 2023 |
 | **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
 | **HF Papers** | https://huggingface.co/papers/2304.13477 |
 | **Algorithm** | DualOGD — Lagrangian dual multiplier updated by online error gradient descent |
-| **Auction Type** | First-price (also handles second-price) |
 | **Constraints** | Budget cap: total spend ≤ ρT |
 | **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
 | **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
 | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
-| **Prediction Models Needed** | CTR predictor (for v_t), empirical CDF of competing bids (G̃) |
-### 1.2 Dual Mirror Descent (Second-Price)
 | Property | Detail |
 |----------|--------|
-| **Paper** | "The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems" |
-| **Authors** | Santiago Balseiro, Haihao Lu, Vahab Mirrokni (2020) |
-| **Venue** | Operations Research (2023) |
-| **arXiv** | [2011.10124](https://arxiv.org/abs/2011.10124) |
-| **Citations** | 135+ |
-| **Algorithm** | Dual mirror descent — generalizes OGD with Bregman divergences |
-| **Auction Type** | Second-price (truthful) |
-| **Bid Rule** | b_t = v_t / (1 + μ_t) |
-| **Dual Update** | μ_{t+1} = Proj(μ_t − η·(ρ − payment_t)) |
-| **Key Insight** | No market price model needed for second-price auctions |
-| **Prediction Models** | CTR predictor only |
-### 1.3 Dual Descent with RoS + Budget (Multi-Constraint)
 | Property | Detail |
 |----------|--------|
-| **Paper** | "Online Bidding Algorithms for Return-on-Spend Constrained Advertisers" |
-| **Authors** | Zhe Feng, Swati Padmanabhan, Di Wang (2022) |
-| **Venue** | ICML 2022 |
-| **arXiv** | [2208.13713](https://arxiv.org/abs/2208.13713) |
-| **Algorithm** | Two dual variables: λ for RoS, μ for budget |
-| **Bid Rule** | b_t = ((1+λ_t)/(μ_t+λ_t)) · v_t |
-| **Key Insight** | Adaptable for k% spend floor — second dual variable enforces minimum spend |
-### 1.4 RLB — Reinforcement Learning Bidding
 | Property | Detail |
 |----------|--------|
 | **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
-| **Authors** | Han Cai et al. (2017) |
-| **Venue** | WSDM 2017 |
 | **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
 | **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
 | **Algorithm** | MDP + Dynamic Programming + Neural value function |
-| **Results** | +22% clicks over linear bidding on iPinYou |
 | **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |
-### 1.5 HiBid — Industrial Hierarchical Dual-RL
-| Property | Detail |
-|----------|--------|
-| **Paper** | "HiBid: A Cross-Channel Constrained Bidding System" |
-| **arXiv** | [2312.17503](https://arxiv.org/abs/2312.17503) |
-| **Scale** | 64K advertisers, 70M requests/day, 4 channels, Meituan |
-| **Algorithm** | High-level RL budget allocation + Low-level λ-parameterized bidding |
-### Unified Dual Multiplier Template
-```
-For each auction t:
-1. Observe value v_t (from CTR prediction × click value)
-2. Compute bid: b_t = f(v_t, dual_multiplier_t)
-3. Observe outcome: payment c_t (if won) or 0 (if lost)
-4. Compute gradient: g_t = ρ − c_t
-5. Update multiplier: λ_{t+1} = Proj_{λ≥0}(λ_t − η·g_t)
-```
-| Method | Auction | Bid Function f(v, λ) |
-|--------|---------|----------------------|
-| Wang 2023 | First-price | argmax_b (r̃(v,b) − λ·c̃(b)) |
-| Balseiro 2020 | Second-price | v / (1+λ) |
-| Feng 2022 | Second-price | ((1+λ_RoS)/(λ_RoS+λ_budget)) · v |
 ---
 ## 2. CTR Prediction Models
-### 2.1 FinalMLP (RECOMMENDED)
 | Property | Detail |
 |----------|--------|
 | **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
 | **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
 | **Criteo AUC** | **0.8149** |
-| **Avazu AUC** | **0.7666** |
-| **Architecture** | Two-stream MLP + feature gating + bilinear fusion |
-| **Inference** | <1ms — best for RTB latency constraints |
-### 2.2 Other Top Models
-| Model | Criteo AUC | Architecture | RTB-Suitable |
-|-------|-----------|-------------|--------------|
-| **FinalMLP** | 0.8149 | Two-stream MLP | ✅ Best |
-| **DCNv2** | 0.8142-0.8144 | CrossNetV2 + DNN | ✅ |
-| **GDCN** | 0.8161* | Gated Cross + DNN | ✅ |
-| **DeepFM** | 0.8138 | FM + DNN | ✅ |
-| **FCN** | New | LCN + ECN (no DNN) | ✅ |
-| DIN | — | Attention (user history) | ❌ Slow |
-| DIEN | — | GRU + attention | ❌ Slow |
-*GDCN uses own data split — not directly comparable.
-**BARS Meta-Finding (2009.05794):** After 7,000+ experiments, SOTA deep CTR models differ by only 0.1-0.3% AUC. Architecture matters less than data preprocessing, hyperparameter tuning, and feature engineering.
 ---
-## 3. Clearing Price / Market Price Prediction
-### 3.1 Non-Parametric Empirical CDF (BASELINE)
 | Property | Detail |
 |----------|--------|
-| **Source** | Wang et al. (2023), Algorithm 1 |
-| **Method** | G̃_t(b) = (1/(t-1))∑𝟙{b ≥ d_s} |
-| **Pros** | No training, theoretically sound, handles distribution shift |
-| **Cons** | No context, cold-start |
-### 3.2 Deep Censored Learning / Survival Analysis
 | Property | Detail |
 |----------|--------|
-| **Library** | **TorchSurv** (Novartis, 200★) [2404.10761] |
-| **URL** | https://github.com/Novartis/torchsurv |
-| **Method** | Neural net with censored survival loss |
-| **Loss** | Win: -log f(price\|x); Loss: -log S(bid\|x) |
-| **Key Insight** | Proper survival framework handles censoring |
-### 3.3 Censored Linear Regression (Wu et al. 2015, KDD)
 | Property | Detail |
 |----------|--------|
-| **Method** | Tobit-like: log(market_price) = β·x + ε, ε ~ N(0, σ²) |
-| **Pros** | Contextual, simple |
-| **Cons** | Linear — limited capacity |
-### Comparison
-| Method | Contextual? | Handles Censoring? | Training? | Complexity |
-|--------|-------------|-------------------|-----------|------------|
-| Empirical CDF | ❌ | N/A | None | Minimal |
-| Censored Linear | ✅ | ✅ | Light | Low |
-| Deep Survival | ✅ | ✅ | Neural net | Medium |
-| Win Prob NN | ✅ | ❌ | Neural net | Low |
 ---
 ## 4. Datasets
-### CTR Prediction (Verified on HF Hub)
-| Dataset | HF Path | Size | Verified |
-|---------|---------|------|----------|
-| Criteo_x4 | reczoo/Criteo_x4 | 45.8M rows, 5.6GB | ✅ |
-| Avazu_x4 | reczoo/Avazu_x4 | 40.4M rows, 1.8GB | ✅ |
-### RTB Bidding (External Only)
-| Dataset | Source | Availability |
-|---------|--------|-------------|
-| iPinYou | data.computational-advertising.org | External download |
-| YOYI | Various mirrors | External download |
----
-## 5. Codebases
-| Library | URL | Purpose |
-|---------|-----|---------|
-| **FuxiCTR** | https://github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
-| **DeepCTR-Torch** | https://github.com/shenweichen/DeepCTR-Torch | 20+ CTR models, simple API |
-| **TorchSurv** | https://github.com/Novartis/torchsurv | Deep survival for clearing price |
-| **BARS** | https://github.com/openbenchmark/BARS | Standardized CTR benchmark |
-| **rlb-dp** | https://github.com/han-cai/rlb-dp | RL for RTB |
-| **budget_constrained_bidding** | https://github.com/dingmu365/budget_constrained_bidding | Budget-constrained algorithms |
 ---
-## 6. Benchmark Leaderboards
-| Leaderboard | URL |
-|-------------|-----|
-| BARS CTR Criteo_x4 | https://openbenchmark.github.io/BARS/CTR/leaderboard/criteo_x4.html |
-| BARS CTR Avazu | https://openbenchmark.github.io/BARS/CTR/leaderboard/avazu_x4.html |
 ---
-## 7. Recommended Architecture
 ```
-┌─────────────────────────────────────────────┐
-│            BIDDING ALGORITHM                  │
-│  Dual OGD: λ_{t+1} = Proj(λ_t - ε·(ρ - c̃)) │
-│  Two-sided: μ (cap) + ν (floor)              │
-├─────────────────────────────────────────────┤
-│          PREDICTION MODELS                    │
-│  ┌──────────────┐  ┌────────────────────┐    │
-│  │ FinalMLP     │  │ Empirical CDF /     │    │
-│  │ v_t=pCTR×V   │  │ TorchSurv           │    │
-│  └──────────────┘  └────────────────────┘    │
-├─────────────────────────────────────────────┤
-│              DATASETS                         │
-│  Criteo_x4 + synthetic auction simulation    │
-└─────────────────────────────────────────────┘
 ```
 ## Paper Index
-| # | Paper | arXiv | Year | Citations |
-|---|-------|-------|------|-----------|
-| 1 | Wang et al. — First-Price Auctions with Budgets | 2304.13477 | 2023 | Growing |
-| 2 | Balseiro et al. — Dual Mirror Descent | 2011.10124 | 2020 | 135+ |
-| 3 | Feng et al. — RoS Constrained Bidding | 2208.13713 | 2022 | 38+ |
-| 4 | Cai et al. — RLB | 1701.02490 | 2017 | 300+ |
-| 5 | Wang et al. — HiBid | 2312.17503 | 2023 | New |
-| 6 | — Contextual First-Price (Quantile) | 2603.07207 | 2026 | New |
-| 7 | Mao et al. — FinalMLP | 2304.00902 | 2023 | Growing |
-| 8 | Wang et al. — GDCN | 2311.04635 | 2023 | Growing |
-| 9 | Wang et al. — DCN V2 | 2008.13535 | 2021 | 500+ |
-| 10 | Guo et al. — DeepFM | — | 2017 | 3000+ |
-| 11 | Zhu et al. — BARS-CTR | 2009.05794 | 2021 | 100+ |
-| 12 | Wu et al. — Censored Price Prediction | — | 2015 | 101 |
-| 13 | — TorchSurv | 2404.10761 | 2024 | New |
-| 14 | — Robust Budget Pacing | 2302.02006 | 2023 | Growing |

+# Bidding Algorithms Benchmark — Research Resources
+> First-Price Auction Focus | Generated: 2026-05-05
+> Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark
 ---
+## 1. Bidding Algorithms for First-Price Auctions
+### 1.1 DualOGD — Lagrangian Dual + Online Gradient Descent ⭐ PRIMARY
 | Property | Detail |
 |----------|--------|
 | **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
+| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
+| **Venue** | 2023 |
 | **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
 | **HF Papers** | https://huggingface.co/papers/2304.13477 |
 | **Algorithm** | DualOGD — Lagrangian dual multiplier updated by online error gradient descent |
+| **Auction Type** | **First-price** |
 | **Constraints** | Budget cap: total spend ≤ ρT |
 | **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
 | **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
 | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
+| **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (G̃_t) |
+| **Code Pattern (Wang 2023, Algorithm 1)** |
+```python
+# Initialization
+λ = 0.0; ε = 1.0 / sqrt(T); ρ = B / T
+for t in range(T):
+    v_t = pCTR(features_t) * click_value  # from CTR model
+    # Estimate reward and cost from historical competing bids
+    r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d])
+    c_tilde = lambda b: mean([b for d in d_history if b >= d])
+    # Bid: maximize cost-adjusted reward
+    b_t = argmax_b (r_tilde(v_t, b) - λ * c_tilde(b))
+    # Observe maximum competing bid d_t (full feedback)
+    won = (b_t >= d_t); cost = b_t if won else 0
+    # Online gradient descent on dual multiplier
+    λ = max(0, λ - ε * (ρ - c_tilde(b_t)))
+```
+### 1.2 TwoSidedDual — Budget Cap + Spend Floor
+| Property | Detail |
+|----------|--------|
+| **Base** | Extension of Wang et al. (2023) |
+| **Constraints** | Total spend ≤ B (cap) AND spend ≥ k·B (floor) |
+| **Dual Variables** | μ for cap, ν for floor |
+| **Updates** |
+| μ_{t+1} = Proj_{μ≥0}(μ_t − η₁·(ρ − c̃_t(b_t))) | cap penalty |
+| ν_{t+1} = Proj_{ν≥0}(ν_t − η₂·(c̃_t(b_t) − kρ)) | floor incentive |
+| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − (μ_t − ν_t)·c̃_t(b)) |
+| **Key Insight** | When μ > ν: bidding is restrained (ahead on spend). When ν > μ: bidding is encouraged (behind on spend floor). |
+### 1.3 Adversarial Bidding — Non-Stationary Environments
+| Property | Detail |
+|----------|--------|
+| **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
+| **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) |
+| **Algorithm** | Adaptive dual OGD with change-point detection |
+| **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning |
+### 1.4 Contextual First-Price (2026)
 | Property | Detail |
 |----------|--------|
+| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
+| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
+| **Algorithm** | Dual OGD + quantile-based contextual censored regression |
+| **Key Innovation** | Extends Wang to contextual (feature-based) auctions |
+### 1.5 Joint Value Estimation + Bidding
 | Property | Detail |
 |----------|--------|
+| **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
+| **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) |
+| **Key Insight** | Simultaneously learn CTR and bidding strategy — no separate CTR model training phase |
+### 1.6 RLB — Reinforcement Learning (Baseline)
 | Property | Detail |
 |----------|--------|
 | **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
+| **Authors** | Han Cai et al., WSDM 2017 |
 | **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
 | **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
 | **Algorithm** | MDP + Dynamic Programming + Neural value function |
+| **State** | (remaining auctions, remaining budget, features) |
+| **Action** | bid price a ∈ [0, budget] |
 | **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |
+### 1.7 Static Baselines
+| Algorithm | Bid Rule | Notes |
+|-----------|----------|-------|
+| **Linear** | b = base_bid × (pCTR / avg_pCTR) | Simple proportional |
+| **Threshold** | b = fixed_bid if pCTR > τ else 0 | Binary decision |
+| **ValueShading** | b = v_t / (1 + λ) | From second-price literature, adapted |
+### Algorithm Comparison Matrix
+| Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity |
+|-----------|-----------|-------------------|------------|-----------------|------------|
+| **DualOGD** | ✅ Online | Empirical CDF | ❌ (cap only) | ✅ Õ(√T) | Medium |
+| **TwoSidedDual** | ✅ Online | Empirical CDF | ✅ (cap+floor) | ❌ (heuristic) | Medium |
+| **RLB** | ✅ DP | Neural dist. model | ❌ | ❌ | High |
+| **Linear** | ❌ | None | ❌ | ❌ | Minimal |
+| **Threshold** | ❌ | None | ❌ | ❌ | Minimal |
 ---
 ## 2. CTR Prediction Models
+### 2.1 FinalMLP ⭐ RECOMMENDED
 | Property | Detail |
 |----------|--------|
 | **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
+| **Authors** | Kelong Mao et al., AAAI 2023 |
 | **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
 | **Criteo AUC** | **0.8149** |
+| **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
+| **Why Best for RTB** | Pure feed-forward MLP — <1ms inference, no attention/RNN overhead |
+| **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch |
+```python
+# FinalMLP architecture
+# Stream 1: MLP(features * gate_weights) → feature selection
+# Stream 2: MLP(features * (1-gate_weights)) → complementary view
+# Fusion: Bilinear(stream1_output, stream2_output) → sigmoid
+```
+### 2.2 DeepFM — Simple Baseline
+| Property | Detail |
+|----------|--------|
+| **Paper** | "DeepFM: A Factorization-Machine based Neural Network" |
+| **Criteo AUC** | 0.8138 |
+| **Architecture** | Shared embedding → FM (2nd-order) + DNN → sum → sigmoid |
+### 2.3 DCNv2 — Industry Standard
+| Property | Detail |
+|----------|--------|
+| **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) |
+| **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
+| **Criteo AUC** | 0.8142-0.8144 |
+| **Architecture** | CrossNetV2 (low-rank) + DNN in parallel |
+### 2.4 BARS Meta-Finding ⚠️ IMPORTANT
+| Property | Detail |
+|----------|--------|
+| **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] |
+| **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are ≤0.1-0.3% AUC** |
+| **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |
+### CTR Model Comparison for RTB
+| Model | AUC (Criteo) | Inference Speed | RTB Latency OK? |
+|-------|-------------|-----------------|-----------------|
+| **FinalMLP** | 0.8149 | ⭐⭐⭐⭐⭐ | ✅ Yes |
+| DCNv2 | 0.8142 | ⭐⭐⭐⭐ | ✅ Yes |
+| DeepFM | 0.8138 | ⭐⭐⭐⭐ | ✅ Yes |
+| GDCN | 0.8161* | ⭐⭐⭐⭐ | ✅ Yes |
+| DIN | — | ⭐⭐ | ❌ No |
+| DIEN | — | ⭐ | ❌ No |
+*Own data split, not directly comparable.
 ---
+## 3. Clearing Price / Win Probability Prediction
+### 3.1 Empirical CDF (Non-Parametric) ⭐ BASELINE
 | Property | Detail |
 |----------|--------|
+| **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
+| **Method** | Maintain array of observed competing bids d_s |
+| **Win Probability** | P(win|b) = G̃_t(b) = (1/(t-1))∑_{s=1}^{t-1} 𝟙{b ≥ d_s} |
+| **Expected Cost** | E[cost|win,b] = (1/G̃_t(b)) · mean({d_s : d_s ≤ b}) |
+| **Expected Reward** | r̃_t(v,b) = (v - b) · G̃_t(b) |
+| **Expected Cost (dual)** | c̃_t(b) = b · G̃_t(b) |
+| **Pros** | No training, theoretically sound, adapts online |
+| **Cons** | No context, cold-start issue, requires full feedback |
+```python
+class EmpiricalCDF:
+    def __init__(self):
+        self.competing_bids = []
+    def update(self, d_t):
+        """d_t = maximum competing bid (observed under full feedback)"""
+        self.competing_bids.append(d_t)
+    def win_prob(self, b):
+        if not self.competing_bids:
+            return 0.5
+        return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids])
+    def expected_cost(self, b):
+        wins = [d for d in self.competing_bids if b >= d]
+        if not wins:
+            return b
+        return np.mean(wins)
+```
+### 3.2 TorchSurv — Deep Censored Learning
 | Property | Detail |
 |----------|--------|
+| **Library** | TorchSurv (Novartis, 200★) |
+| **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) |
+| **GitHub** | https://github.com/Novartis/torchsurv |
+| **Install** | `pip install torchsurv` |
+| **Method** | Neural network with Cox PH or Weibull AFT loss |
+| **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) |
+| **Output** | Survival function S(b|x) = P(market_price > b | features) |
+| **Win Prob** | P(win|b,x) = 1 − S(b|x) |
+```python
+from torchsurv.loss import cox
+# log_hazard = model(features)  # shape (batch,)
+# event = 1 if won (uncensored), 0 if lost (censored)
+# time = market_price if won, bid if lost
+loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
+```
+### 3.3 Win Probability NN (Simplest ML)
 | Property | Detail |
 |----------|--------|
+| **Method** | Binary classifier: P(win | bid_price, features) |
+| **Pros** | Dead simple, BCELoss |
+| **Cons** | Ignores price magnitude info when winning |
+| **Architecture** | features ⊕ bid_price → MLP → sigmoid |
 ---
 ## 4. Datasets
+### Verified on HuggingFace Hub
+| Dataset | HF Path | Rows | Features | Label | Status |
+|---------|---------|------|----------|-------|--------|
+| **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | ✅ Ready |
+| Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | ✅ Ready |
+### RTB-Specific (External Only)
+| Dataset | Description | Availability |
+|---------|-------------|-------------|
+| iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org |
+| YOYI | ~400M bid log records | Various mirrors |
+**Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation.
 ---
+## 5. Codebases & Libraries
+| Library | URL | Use |
+|---------|-----|-----|
+| FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
+| DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models |
+| TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price |
+| BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark |
+| rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) |
 ---
+## 6. Recommended Architecture
 ```
+┌──────────────────────────────────────────────────────┐
+│              FIRST-PRICE BIDDING ENGINE               │
+│                                                       │
+│  Dual OGD: λ_{t+1} = max(0, λ_t - ε·(ρ - c̃_t(b_t))) │
+│  Two-Sided: μ (cap) + ν (floor) dual variables       │
+│                                                       │
+├──────────────────────────────────────────────────────┤
+│                PREDICTION MODELS                       │
+│                                                       │
+│  ┌─────────────────┐  ┌──────────────────────────┐   │
+│  │ CTR: FinalMLP   │  │ Win Prob: Empirical CDF   │   │
+│  │ v_t = pCTR × V  │  │ G̃_t(b) = frac of d_s ≤ b │   │
+│  └─────────────────┘  └──────────────────────────┘   │
+│                                                       │
+│  ┌──────────────────────────────────────────────┐    │
+│  │ Optional: TorchSurv for contextual win prob  │    │
+│  │ P(win|b,x) = 1 - S(b|x)                      │    │
+│  └──────────────────────────────────────────────┘    │
+├──────────────────────────────────────────────────────┤
+│                   DATASETS                            │
+│  Criteo_x4 (CTR training) + synthetic market prices  │
+└──────────────────────────────────────────────────────┘
 ```
+---
 ## Paper Index
+| # | Paper | arXiv | Focus |
+|---|-------|-------|-------|
+| 1 | Wang et al. — Learning to Bid in Repeated FPA with Budgets | 2304.13477 | ⭐ Primary algorithm |
+| 2 | — Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift |
+| 3 | — Contextual First-Price (Quantile) | 2603.07207 | Contextual extension |
+| 4 | — Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding |
+| 5 | — No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework |
+| 6 | Cai et al. — RLB | 1701.02490 | RL baseline |
+| 7 | — Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts |
+| 8 | Mao et al. — FinalMLP | 2304.00902 | CTR model |
+| 9 | Wang et al. — DCN V2 | 2008.13535 | CTR model |
+| 10 | Guo et al. — DeepFM | — | CTR model |
+| 11 | Zhu et al. — BARS-CTR | 2009.05794 | CTR benchmark |
+| 12 | Wu et al. — Censored Price Prediction | — | Clearing price |
+| 13 | — TorchSurv | 2404.10761 | Survival analysis library |