hamverbot committed on
Commit a40d07a · verified · 1 Parent(s): 0b15a4d

Upload RESEARCH_RESOURCES.md

  1. RESEARCH_RESOURCES.md +239 -166
RESEARCH_RESOURCES.md CHANGED
@@ -1,254 +1,327 @@
- # RTB Bidding Algorithm Comparison: Complete Research Resource List

- > Generated: 2026-05-05 | Repository: https://huggingface.co/hamverbot/bidding_algorithms_benchmark

  ---

- ## Table of Contents

- 1. [Bidding Algorithms](#1-bidding-algorithms)
- 2. [CTR Prediction Models](#2-ctr-prediction-models)
- 3. [Clearing Price / Market Price Prediction](#3-clearing-price--market-price-prediction)
- 4. [Datasets](#4-datasets)
- 5. [Codebases & Implementations](#5-codebases--implementations)
- 6. [Benchmark Leaderboards](#6-benchmark-leaderboards)
- 7. [Recommended Architecture](#7-recommended-architecture)
-
- ---
-
- ## 1. Bidding Algorithms
-
- ### 1.1 Lagrangian Dual + Online Gradient Descent (BEST MATCH)

  | Property | Detail |
  |----------|--------|
  | **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
- | **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong (2023) |
- | **Venue** | NeurIPS 2023 |
  | **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
  | **HF Papers** | https://huggingface.co/papers/2304.13477 |
  | **Algorithm** | DualOGD: Lagrangian dual multiplier updated by online error gradient descent |
- | **Auction Type** | First-price (also handles second-price) |
  | **Constraints** | Budget cap: total spend ≤ ρT |
  | **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
  | **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
  | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
- | **Prediction Models Needed** | CTR predictor (for v_t), empirical CDF of competing bids (G̃) |

- ### 1.2 Dual Mirror Descent (Second-Price)

  | Property | Detail |
  |----------|--------|
- | **Paper** | "The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems" |
- | **Authors** | Santiago Balseiro, Haihao Lu, Vahab Mirrokni (2020) |
- | **Venue** | Operations Research (2023) |
- | **arXiv** | [2011.10124](https://arxiv.org/abs/2011.10124) |
- | **Citations** | 135+ |
- | **Algorithm** | Dual mirror descent: generalizes OGD with Bregman divergences |
- | **Auction Type** | Second-price (truthful) |
- | **Bid Rule** | b_t = v_t / (1 + μ_t) |
- | **Dual Update** | μ_{t+1} = Proj(μ_t − η·(ρ − payment_t)) |
- | **Key Insight** | No market price model needed for second-price auctions |
- | **Prediction Models** | CTR predictor only |
-
- ### 1.3 Dual Descent with RoS + Budget (Multi-Constraint)

  | Property | Detail |
  |----------|--------|
- | **Paper** | "Online Bidding Algorithms for Return-on-Spend Constrained Advertisers" |
- | **Authors** | Zhe Feng, Swati Padmanabhan, Di Wang (2022) |
- | **Venue** | ICML 2022 |
- | **arXiv** | [2208.13713](https://arxiv.org/abs/2208.13713) |
- | **Algorithm** | Two dual variables: λ for RoS, μ for budget |
- | **Bid Rule** | b_t = ((1+λ_t)/(μ_t+λ_t)) · v_t |
- | **Key Insight** | Adaptable for k% spend floor: second dual variable enforces minimum spend |

- ### 1.4 RLB: Reinforcement Learning Bidding

  | Property | Detail |
  |----------|--------|
  | **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
- | **Authors** | Han Cai et al. (2017) |
- | **Venue** | WSDM 2017 |
  | **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
  | **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
  | **Algorithm** | MDP + Dynamic Programming + Neural value function |
- | **Results** | +22% clicks over linear bidding on iPinYou |
  | **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |

- ### 1.5 HiBid: Industrial Hierarchical Dual-RL
-
- | Property | Detail |
- |----------|--------|
- | **Paper** | "HiBid: A Cross-Channel Constrained Bidding System" |
- | **arXiv** | [2312.17503](https://arxiv.org/abs/2312.17503) |
- | **Scale** | 64K advertisers, 70M requests/day, 4 channels, Meituan |
- | **Algorithm** | High-level RL budget allocation + Low-level λ-parameterized bidding |

- ### Unified Dual Multiplier Template

- ```
- For each auction t:
- 1. Observe value v_t (from CTR prediction × click value)
- 2. Compute bid: b_t = f(v_t, dual_multiplier_t)
- 3. Observe outcome: payment c_t (if won) or 0 (if lost)
- 4. Compute gradient: g_t = ρ − c_t
- 5. Update multiplier: λ_{t+1} = Proj_{λ≥0}(λ_t − η·g_t)
- ```

- | Method | Auction | Bid Function f(v, λ) |
- |--------|---------|----------------------|
- | Wang 2023 | First-price | argmax_b (r̃(v,b) − λ·c̃(b)) |
- | Balseiro 2020 | Second-price | v / (1+λ) |
- | Feng 2022 | Second-price | ((1+λ_RoS)/(λ_RoS+λ_budget)) · v |

  ---

  ## 2. CTR Prediction Models

- ### 2.1 FinalMLP (RECOMMENDED)

  | Property | Detail |
  |----------|--------|
  | **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
  | **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
  | **Criteo AUC** | **0.8149** |
- | **Avazu AUC** | **0.7666** |
- | **Architecture** | Two-stream MLP + feature gating + bilinear fusion |
- | **Inference** | <1ms: best for RTB latency constraints |

- ### 2.2 Other Top Models

- | Model | Criteo AUC | Architecture | RTB-Suitable |
- |-------|-----------|-------------|--------------|
- | **FinalMLP** | 0.8149 | Two-stream MLP | ✅ Best |
- | **DCNv2** | 0.8142-0.8144 | CrossNetV2 + DNN | ✅ |
- | **GDCN** | 0.8161* | Gated Cross + DNN | ✅ |
- | **DeepFM** | 0.8138 | FM + DNN | ✅ |
- | **FCN** | New | LCN + ECN (no DNN) | ✅ |
- | DIN | n/a | Attention (user history) | ❌ Slow |
- | DIEN | n/a | GRU + attention | ❌ Slow |

- *GDCN uses own data split, so it is not directly comparable.

- **BARS Meta-Finding (2009.05794):** After 7,000+ experiments, SOTA deep CTR models differ by only 0.1-0.3% AUC. Architecture matters less than data preprocessing, hyperparameter tuning, and feature engineering.

  ---

- ## 3. Clearing Price / Market Price Prediction

- ### 3.1 Non-Parametric Empirical CDF (BASELINE)

  | Property | Detail |
  |----------|--------|
- | **Source** | Wang et al. (2023), Algorithm 1 |
- | **Method** | G̃_t(b) = (1/(t-1)) ∑ 𝟙{b ≥ d_s} |
- | **Pros** | No training, theoretically sound, handles distribution shift |
- | **Cons** | No context, cold-start |

- ### 3.2 Deep Censored Learning / Survival Analysis

  | Property | Detail |
  |----------|--------|
- | **Library** | **TorchSurv** (Novartis, 200★) [2404.10761] |
- | **URL** | https://github.com/Novartis/torchsurv |
- | **Method** | Neural net with censored survival loss |
- | **Loss** | Win: -log f(price\|x); Loss: -log S(bid\|x) |
- | **Key Insight** | Proper survival framework handles censoring |

- ### 3.3 Censored Linear Regression (Wu et al. 2015, KDD)

  | Property | Detail |
  |----------|--------|
- | **Method** | Tobit-like: log(market_price) = β·x + ε, ε ~ N(0, σ²) |
- | **Pros** | Contextual, simple |
- | **Cons** | Linear, so limited capacity |
-
- ### Comparison
-
- | Method | Contextual? | Handles Censoring? | Training? | Complexity |
- |--------|-------------|-------------------|-----------|------------|
- | Empirical CDF | ❌ | N/A | None | Minimal |
- | Censored Linear | ✅ | ✅ | Light | Low |
- | Deep Survival | ✅ | ✅ | Neural net | Medium |
- | Win Prob NN | ✅ | ❌ | Neural net | Low |

  ---

  ## 4. Datasets

- ### CTR Prediction (Verified on HF Hub)

- | Dataset | HF Path | Size | Verified |
- |---------|---------|------|----------|
- | Criteo_x4 | reczoo/Criteo_x4 | 45.8M rows, 5.6GB | ✅ |
- | Avazu_x4 | reczoo/Avazu_x4 | 40.4M rows, 1.8GB | ✅ |

- ### RTB Bidding (External Only)

- | Dataset | Source | Availability |
- |---------|--------|-------------|
- | iPinYou | data.computational-advertising.org | External download |
- | YOYI | Various mirrors | External download |

- ---
-
- ## 5. Codebases
-
- | Library | URL | Purpose |
- |---------|-----|---------|
- | **FuxiCTR** | https://github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
- | **DeepCTR-Torch** | https://github.com/shenweichen/DeepCTR-Torch | 20+ CTR models, simple API |
- | **TorchSurv** | https://github.com/Novartis/torchsurv | Deep survival for clearing price |
- | **BARS** | https://github.com/openbenchmark/BARS | Standardized CTR benchmark |
- | **rlb-dp** | https://github.com/han-cai/rlb-dp | RL for RTB |
- | **budget_constrained_bidding** | https://github.com/dingmu365/budget_constrained_bidding | Budget-constrained algorithms |

  ---

- ## 6. Benchmark Leaderboards

- | Leaderboard | URL |
- |-------------|-----|
- | BARS CTR Criteo_x4 | https://openbenchmark.github.io/BARS/CTR/leaderboard/criteo_x4.html |
- | BARS CTR Avazu | https://openbenchmark.github.io/BARS/CTR/leaderboard/avazu_x4.html |

  ---

- ## 7. Recommended Architecture

  ```
- ┌─────────────────────────────────────────────┐
- │              BIDDING ALGORITHM              │
- │ Dual OGD: λ_{t+1} = Proj(λ_t - ε·(ρ - c̃))   │
- │ Two-sided: μ (cap) + ν (floor)              │
- ├─────────────────────────────────────────────┤
- │              PREDICTION MODELS              │
- │ ┌──────────────┐ ┌────────────────────┐     │
- │ │ FinalMLP     │ │ Empirical CDF /    │     │
- │ │ v_t=pCTR×V   │ │ TorchSurv          │     │
- │ └──────────────┘ └────────────────────┘     │
- ├─────────────────────────────────────────────┤
- │                  DATASETS                   │
- │ Criteo_x4 + synthetic auction simulation    │
- └─────────────────────────────────────────────┘
  ```

  ## Paper Index

- | # | Paper | arXiv | Year | Citations |
- |---|-------|-------|------|-----------|
- | 1 | Wang et al.: First-Price Auctions with Budgets | 2304.13477 | 2023 | Growing |
- | 2 | Balseiro et al.: Dual Mirror Descent | 2011.10124 | 2020 | 135+ |
- | 3 | Feng et al.: RoS Constrained Bidding | 2208.13713 | 2022 | 38+ |
- | 4 | Cai et al.: RLB | 1701.02490 | 2017 | 300+ |
- | 5 | Wang et al.: HiBid | 2312.17503 | 2023 | New |
- | 6 | Contextual First-Price (Quantile) | 2603.07207 | 2026 | New |
- | 7 | Mao et al.: FinalMLP | 2304.00902 | 2023 | Growing |
- | 8 | Wang et al.: GDCN | 2311.04635 | 2023 | Growing |
- | 9 | Wang et al.: DCN V2 | 2008.13535 | 2021 | 500+ |
- | 10 | Guo et al.: DeepFM | n/a | 2017 | 3000+ |
- | 11 | Zhu et al.: BARS-CTR | 2009.05794 | 2021 | 100+ |
- | 12 | Wu et al.: Censored Price Prediction | n/a | 2015 | 101 |
- | 13 | TorchSurv | 2404.10761 | 2024 | New |
- | 14 | Robust Budget Pacing | 2302.02006 | 2023 | Growing |

+ # Bidding Algorithms Benchmark: Research Resources

+ > First-Price Auction Focus | Generated: 2026-05-05
+ > Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark

  ---

+ ## 1. Bidding Algorithms for First-Price Auctions

+ ### 1.1 DualOGD: Lagrangian Dual + Online Gradient Descent ⭐ PRIMARY

  | Property | Detail |
  |----------|--------|
  | **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
+ | **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
+ | **Venue** | 2023 |
  | **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
  | **HF Papers** | https://huggingface.co/papers/2304.13477 |
  | **Algorithm** | DualOGD: Lagrangian dual multiplier updated by online error gradient descent |
+ | **Auction Type** | **First-price** |
  | **Constraints** | Budget cap: total spend ≤ ρT |
  | **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
  | **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
  | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
+ | **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (G̃_t) |
+
+ **Code Pattern (Wang 2023, Algorithm 1)**
+
+ ```python
+ from math import sqrt
+
+ def dual_ogd(values, competing_bids, B, bid_grid):
+     """values[t] = pCTR(features_t) * click_value; competing_bids[t] = d_t."""
+     # Initialization: multiplier, step size, per-round budget
+     T = len(values)
+     lam, eps, rho = 0.0, 1.0 / sqrt(T), B / T
+     d_history, spent = [], 0.0
+
+     for v_t, d_t in zip(values, competing_bids):
+         # Estimate reward and cost from historical competing bids
+         n = max(len(d_history), 1)
+         G = lambda b: sum(b >= d for d in d_history) / n   # win probability
+         r_tilde = lambda b: (v_t - b) * G(b)
+         c_tilde = lambda b: b * G(b)
+
+         # Bid: maximize cost-adjusted reward over a finite grid within budget
+         feasible = [b for b in bid_grid if b <= B - spent] or [0.0]
+         b_t = max(feasible, key=lambda b: r_tilde(b) - lam * c_tilde(b))
+
+         # Observe maximum competing bid d_t (full feedback)
+         won = b_t >= d_t
+         spent += b_t if won else 0.0
+
+         # Online gradient descent on dual multiplier
+         lam = max(0.0, lam - eps * (rho - c_tilde(b_t)))
+         d_history.append(d_t)
+     return lam, spent
+ ```
+
+ ### 1.2 TwoSidedDual: Budget Cap + Spend Floor
+
+ | Property | Detail |
+ |----------|--------|
+ | **Base** | Extension of Wang et al. (2023) |
+ | **Constraints** | Total spend ≤ B (cap) AND total spend ≥ k·B (floor) |
+ | **Dual Variables** | μ for cap, ν for floor |
+ | **Cap Update** | μ_{t+1} = Proj_{μ≥0}(μ_t − η₁·(ρ − c̃_t(b_t))) (cap penalty) |
+ | **Floor Update** | ν_{t+1} = Proj_{ν≥0}(ν_t − η₂·(c̃_t(b_t) − kρ)) (floor incentive) |
+ | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − (μ_t − ν_t)·c̃_t(b)) |
+ | **Key Insight** | When μ > ν, bidding is restrained (spend is running ahead of the cap rate). When ν > μ, bidding is encouraged (spend is behind the floor rate). |
+
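
The cap and floor updates in the TwoSidedDual table can be sketched as one step function. This is a minimal illustration, assuming the estimated spend c̃_t(b_t) is already available as a scalar; the names `two_sided_step` and `effective_multiplier` and the default step sizes are hypothetical and do not come from Wang et al. (2023).

```python
def two_sided_step(mu, nu, est_spend, rho, k, eta1=0.1, eta2=0.1):
    """One projected-gradient step for the cap (mu) and floor (nu) multipliers.

    est_spend stands in for c̃_t(b_t); rho is the per-round budget B / T;
    k is the spend-floor fraction. Step sizes eta1/eta2 are illustrative.
    """
    # Cap multiplier grows when estimated spend exceeds the target rate rho.
    mu_next = max(0.0, mu - eta1 * (rho - est_spend))
    # Floor multiplier grows when estimated spend falls below k * rho.
    nu_next = max(0.0, nu - eta2 * (est_spend - k * rho))
    return mu_next, nu_next


def effective_multiplier(mu, nu):
    """Net penalty (mu - nu) applied to the estimated cost in the bid rule."""
    return mu - nu


# Overspending round: only the cap multiplier becomes positive.
mu_over, nu_over = two_sided_step(mu=0.0, nu=0.0, est_spend=2.0, rho=1.0, k=0.5)
# Underspending round: only the floor multiplier becomes positive.
mu_under, nu_under = two_sided_step(mu=0.0, nu=0.0, est_spend=0.1, rho=1.0, k=0.5)
```

A negative `effective_multiplier` acts as a subsidy on cost in the bid rule, which is what pushes bids up when spend lags the floor.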
+ ### 1.3 Adversarial Bidding: Non-Stationary Environments

+ | Property | Detail |
+ |----------|--------|
+ | **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
+ | **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) |
+ | **Algorithm** | Adaptive dual OGD with change-point detection |
+ | **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning |
+
+ ### 1.4 Contextual First-Price (2026)

  | Property | Detail |
  |----------|--------|
+ | **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
+ | **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
+ | **Algorithm** | Dual OGD + quantile-based contextual censored regression |
+ | **Key Innovation** | Extends Wang to contextual (feature-based) auctions |
+
+ ### 1.5 Joint Value Estimation + Bidding

  | Property | Detail |
  |----------|--------|
+ | **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
+ | **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) |
+ | **Key Insight** | Simultaneously learn CTR and bidding strategy, with no separate CTR model training phase |

+ ### 1.6 RLB: Reinforcement Learning (Baseline)

  | Property | Detail |
  |----------|--------|
  | **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
+ | **Authors** | Han Cai et al., WSDM 2017 |
  | **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
  | **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
  | **Algorithm** | MDP + Dynamic Programming + Neural value function |
+ | **State** | (remaining auctions, remaining budget, features) |
+ | **Action** | bid price a ∈ [0, budget] |
  | **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |

+ ### 1.7 Static Baselines

+ | Algorithm | Bid Rule | Notes |
+ |-----------|----------|-------|
+ | **Linear** | b = base_bid × (pCTR / avg_pCTR) | Simple proportional |
+ | **Threshold** | b = fixed_bid if pCTR > τ else 0 | Binary decision |
+ | **ValueShading** | b = v_t / (1 + λ) | From second-price literature, adapted |
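
The three static bid rules in the table translate directly into small functions. The sketch below is only an illustration of those formulas; the function and parameter names are hypothetical, and the choice of `base_bid`, `tau`, and `lam` is left to the experimenter.

```python
def linear_bid(pctr, avg_pctr, base_bid):
    """Linear rule: scale a base bid by predicted CTR relative to the average CTR."""
    return base_bid * pctr / avg_pctr


def threshold_bid(pctr, tau, fixed_bid):
    """Threshold rule: bid a fixed amount only when predicted CTR exceeds tau."""
    return fixed_bid if pctr > tau else 0.0


def value_shading_bid(value, lam):
    """Value shading: divide the impression value by (1 + lam), a fixed shading factor."""
    return value / (1.0 + lam)


example_bids = {
    "linear": linear_bid(pctr=0.02, avg_pctr=0.01, base_bid=1.5),
    "threshold_pass": threshold_bid(pctr=0.02, tau=0.01, fixed_bid=2.0),
    "threshold_skip": threshold_bid(pctr=0.005, tau=0.01, fixed_bid=2.0),
    "shaded": value_shading_bid(value=4.0, lam=1.0),
}
```

None of these rules adapts to remaining budget or to observed competing bids, which is why they serve only as reference points against the dual-based methods.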

+ ### Algorithm Comparison Matrix

+ | Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity |
+ |-----------|-----------|-------------------|------------|-----------------|------------|
+ | **DualOGD** | ✅ Online | Empirical CDF | ❌ (cap only) | ✅ Õ(√T) | Medium |
+ | **TwoSidedDual** | ✅ Online | Empirical CDF | ✅ (cap+floor) | ❌ (heuristic) | Medium |
+ | **RLB** | ✅ DP | Neural dist. model | ❌ | ❌ | High |
+ | **Linear** | ❌ | None | ❌ | ❌ | Minimal |
+ | **Threshold** | ❌ | None | ❌ | ❌ | Minimal |

  ---

  ## 2. CTR Prediction Models

+ ### 2.1 FinalMLP ⭐ RECOMMENDED

  | Property | Detail |
  |----------|--------|
  | **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
+ | **Authors** | Kelong Mao et al., AAAI 2023 |
  | **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
  | **Criteo AUC** | **0.8149** |
+ | **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
+ | **Why Best for RTB** | Pure feed-forward MLP: <1ms inference, no attention/RNN overhead |
+ | **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch |
+
+ ```python
+ # FinalMLP architecture
+ # Stream 1: MLP(features * gate_1) → soft feature selection for the first view
+ # Stream 2: MLP(features * gate_2) → separately gated, complementary view
+ # Fusion: Bilinear(stream1_output, stream2_output) → sigmoid
+ ```

+ ### 2.2 DeepFM: Simple Baseline

+ | Property | Detail |
+ |----------|--------|
+ | **Paper** | "DeepFM: A Factorization-Machine based Neural Network" |
+ | **Criteo AUC** | 0.8138 |
+ | **Architecture** | Shared embedding → FM (2nd-order) + DNN → sum → sigmoid |
+
+ ### 2.3 DCNv2: Industry Standard
+
+ | Property | Detail |
+ |----------|--------|
+ | **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) |
+ | **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
+ | **Criteo AUC** | 0.8142-0.8144 |
+ | **Architecture** | CrossNetV2 (low-rank) + DNN in parallel |
+
+ ### 2.4 BARS Meta-Finding ⚠️ IMPORTANT
+
+ | Property | Detail |
+ |----------|--------|
+ | **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] |
+ | **Finding** | After 7,000+ experiments: **SOTA CTR models differ by only 0.1-0.3% AUC** |
+ | **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |

+ ### CTR Model Comparison for RTB

+ | Model | AUC (Criteo) | Inference Speed | RTB Latency OK? |
+ |-------|-------------|-----------------|-----------------|
+ | **FinalMLP** | 0.8149 | ⭐⭐⭐⭐⭐ | ✅ Yes |
+ | DCNv2 | 0.8142 | ⭐⭐⭐⭐ | ✅ Yes |
+ | DeepFM | 0.8138 | ⭐⭐⭐⭐ | ✅ Yes |
+ | GDCN | 0.8161* | ⭐⭐⭐⭐ | ✅ Yes |
+ | DIN | n/a | ⭐⭐ | ❌ No |
+ | DIEN | n/a | ⭐ | ❌ No |
+
+ *GDCN uses its own data split, so its AUC is not directly comparable.

  ---

+ ## 3. Clearing Price / Win Probability Prediction

+ ### 3.1 Empirical CDF (Non-Parametric) ⭐ BASELINE

  | Property | Detail |
  |----------|--------|
+ | **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
+ | **Method** | Maintain array of observed competing bids d_s |
+ | **Win Probability** | P(win\|b) = G̃_t(b) = (1/(t-1)) ∑_{s=1}^{t-1} 𝟙{b ≥ d_s} |
+ | **Cost if Won** | E[cost\|win, b] = b (first-price: the winner pays its own bid) |
+ | **Expected Reward** | r̃_t(v,b) = (v − b) · G̃_t(b) |
+ | **Expected Cost (dual)** | c̃_t(b) = b · G̃_t(b) |
+ | **Pros** | No training, theoretically sound, adapts online |
+ | **Cons** | No context, cold-start issue, requires full feedback |
+
+ ```python
+ import numpy as np
+
+ class EmpiricalCDF:
+     def __init__(self):
+         self.competing_bids = []
+
+     def update(self, d_t):
+         """d_t = maximum competing bid (observed under full feedback)"""
+         self.competing_bids.append(d_t)
+
+     def win_prob(self, b):
+         """G̃_t(b): fraction of observed competing bids at or below b."""
+         if not self.competing_bids:
+             return 0.5  # cold-start placeholder before any bid is observed
+         return float(np.mean(np.asarray(self.competing_bids) <= b))
+
+     def expected_cost(self, b):
+         """c̃_t(b) = b · G̃_t(b): a first-price winner pays its own bid."""
+         return b * self.win_prob(b)
+ ```
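
Scanning the full history on every query costs O(t) per bid evaluated. One possible alternative, sketched below under the same full-feedback assumption, keeps the competing bids sorted so that each win-probability lookup is a binary search. The class name `SortedEmpiricalCDF`, the `best_bid` helper, and the bid grid are hypothetical and not part of the referenced paper; the 0.5 cold-start value simply mirrors the class in section 3.1.

```python
import bisect


class SortedEmpiricalCDF:
    """Empirical CDF of competing bids backed by a sorted list."""

    def __init__(self):
        self._sorted_bids = []

    def update(self, d_t):
        # Insert while keeping the list sorted (O(t) shift, O(log t) search).
        bisect.insort(self._sorted_bids, d_t)

    def win_prob(self, b):
        # Fraction of observed competing bids d_s with d_s <= b.
        if not self._sorted_bids:
            return 0.5  # cold-start placeholder, same convention as above
        return bisect.bisect_right(self._sorted_bids, b) / len(self._sorted_bids)

    def best_bid(self, value, lam, bid_grid):
        # Maximize (value - b) * G(b) - lam * b * G(b) over a finite grid.
        def objective(b):
            g = self.win_prob(b)
            return (value - b) * g - lam * b * g

        return max(bid_grid, key=objective)


cdf = SortedEmpiricalCDF()
for d in [3.0, 1.0, 4.0, 2.0]:
    cdf.update(d)

grid = [1.0, 2.0, 3.0, 4.0]
unconstrained = cdf.best_bid(value=6.0, lam=0.0, bid_grid=grid)
penalized = cdf.best_bid(value=6.0, lam=0.5, bid_grid=grid)
```

With a positive multiplier the chosen bid drops, which is the mechanism by which the dual variable slows spending.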

+ ### 3.2 TorchSurv: Deep Censored Learning

  | Property | Detail |
  |----------|--------|
+ | **Library** | TorchSurv (Novartis, 200★) |
+ | **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) |
+ | **GitHub** | https://github.com/Novartis/torchsurv |
+ | **Install** | `pip install torchsurv` |
+ | **Method** | Neural network with Cox PH or Weibull AFT loss |
+ | **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) |
+ | **Output** | Survival function S(b\|x) = P(market_price > b \| features) |
+ | **Win Prob** | P(win\|b,x) = 1 − S(b\|x) |
+
+ ```python
+ from torchsurv.loss import cox
+
+ # log_hazard = model(features)  # shape (batch,)
+ # event = 1 if won (uncensored), 0 if lost (censored)
+ # time = market_price if won, bid if lost
+ loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
+ ```
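
The `event` and `time` inputs described in the comments above have to be built from auction logs first. The helper below is a small sketch of that labeling step in plain Python; the record keys `bid`, `won`, and `market_price` are assumed field names for illustration, not a schema from any dataset listed here. The resulting lists would still need converting to tensors before being passed to a survival loss.

```python
def to_survival_labels(records):
    """Turn auction log rows into (events, times) lists for censored learning.

    Each record is a dict with hypothetical keys:
      'bid'          - the bid we submitted
      'won'          - True if the auction was won
      'market_price' - observed market price (only meaningful when won)
    """
    events, times = [], []
    for row in records:
        if row["won"]:
            # Uncensored: the exact market price was observed.
            events.append(True)
            times.append(row["market_price"])
        else:
            # Right-censored: we only know the market price exceeded our bid.
            events.append(False)
            times.append(row["bid"])
    return events, times


logs = [
    {"bid": 2.0, "won": True, "market_price": 1.4},
    {"bid": 1.0, "won": False, "market_price": None},
    {"bid": 3.0, "won": True, "market_price": 2.2},
]
events, times = to_survival_labels(logs)
```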

+ ### 3.3 Win Probability NN (Simplest ML)

  | Property | Detail |
  |----------|--------|
+ | **Method** | Binary classifier: P(win \| bid_price, features) |
+ | **Pros** | Dead simple, BCELoss |
+ | **Cons** | Ignores price magnitude info when winning |
+ | **Architecture** | features ⊕ bid_price → MLP → sigmoid |

  ---

  ## 4. Datasets

+ ### Verified on HuggingFace Hub

+ | Dataset | HF Path | Rows | Features | Label | Status |
+ |---------|---------|------|----------|-------|--------|
+ | **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | ✅ Ready |
+ | Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | ✅ Ready |

+ ### RTB-Specific (External Only)

+ | Dataset | Description | Availability |
+ |---------|-------------|-------------|
+ | iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org |
+ | YOYI | ~400M bid log records | Various mirrors |

+ **Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation.
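
The repository's actual price generator is not shown in this file, so the following is only one plausible sketch of synthetic market price generation. It assumes a log-normal price whose median rises with the predicted CTR, using pCTR as a single stand-in for the Criteo features; the function name and the `base_price`, `sensitivity`, and `sigma` parameters are all hypothetical.

```python
import math
import random


def synthetic_market_price(pctr, rng, base_price=1.0, sensitivity=0.5, sigma=0.3):
    """Draw a log-normal market price whose median increases with pCTR.

    Assumption for illustration: more clickable impressions attract higher
    competing bids. The median is base_price * (1 + 100 * pctr) ** sensitivity.
    """
    mu = math.log(base_price) + sensitivity * math.log1p(100.0 * pctr)
    return rng.lognormvariate(mu, sigma)


rng = random.Random(0)  # fixed seed so simulated auctions are reproducible
low_ctr_prices = [synthetic_market_price(0.001, rng) for _ in range(2000)]
high_ctr_prices = [synthetic_market_price(0.05, rng) for _ in range(2000)]

mean_low = sum(low_ctr_prices) / len(low_ctr_prices)
mean_high = sum(high_ctr_prices) / len(high_ctr_prices)
```

Each simulated price can then serve as the maximum competing bid d_t fed to the bidding algorithm and to the win-probability estimator.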
 
 
 
 
 
 
 
 
 
 
 

  ---

+ ## 5. Codebases & Libraries

+ | Library | URL | Use |
+ |---------|-----|-----|
+ | FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
+ | DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models |
+ | TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price |
+ | BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark |
+ | rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) |

  ---

+ ## 6. Recommended Architecture

  ```
+ ┌────────────────────────────────────────────────────────┐
+ │               FIRST-PRICE BIDDING ENGINE               │
+ │                                                        │
+ │ Dual OGD: λ_{t+1} = max(0, λ_t - ε·(ρ - c̃_t(b_t)))     │
+ │ Two-Sided: μ (cap) + ν (floor) dual variables          │
+ │                                                        │
+ ├────────────────────────────────────────────────────────┤
+ │                   PREDICTION MODELS                    │
+ │                                                        │
+ │  ┌─────────────────┐  ┌──────────────────────────┐     │
+ │  │ CTR: FinalMLP   │  │ Win Prob: Empirical CDF  │     │
+ │  │ v_t = pCTR × V  │  │ G̃_t(b) = frac of d_s ≤ b │     │
+ │  └─────────────────┘  └──────────────────────────┘     │
+ │                                                        │
+ │  ┌──────────────────────────────────────────────┐      │
+ │  │ Optional: TorchSurv for contextual win prob  │      │
+ │  │ P(win|b,x) = 1 - S(b|x)                      │      │
+ │  └──────────────────────────────────────────────┘      │
+ ├────────────────────────────────────────────────────────┤
+ │                        DATASETS                        │
+ │ Criteo_x4 (CTR training) + synthetic market prices     │
+ └────────────────────────────────────────────────────────┘
  ```

+ ---
+
  ## Paper Index

+ | # | Paper | arXiv | Focus |
+ |---|-------|-------|-------|
+ | 1 | Wang et al.: Learning to Bid in Repeated FPA with Budgets | 2304.13477 | ⭐ Primary algorithm |
+ | 2 | Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift |
+ | 3 | Contextual First-Price (Quantile) | 2603.07207 | Contextual extension |
+ | 4 | Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding |
+ | 5 | No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework |
+ | 6 | Cai et al.: RLB | 1701.02490 | RL baseline |
+ | 7 | Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts |
+ | 8 | Mao et al.: FinalMLP | 2304.00902 | CTR model |
+ | 9 | Wang et al.: DCN V2 | 2008.13535 | CTR model |
+ | 10 | Guo et al.: DeepFM | n/a | CTR model |
+ | 11 | Zhu et al.: BARS-CTR | 2009.05794 | CTR benchmark |
+ | 12 | Wu et al.: Censored Price Prediction | n/a | Clearing price |
+ | 13 | TorchSurv | 2404.10761 | Survival analysis library |