Upload RESEARCH_RESOURCES.md

RESEARCH_RESOURCES.md (+239 -166)

@@ -1,254 +1,327 @@

#

> Generated: 2026-05-05

---

##

1.
2. [CTR Prediction Models](#2-ctr-prediction-models)
3. [Clearing Price / Market Price Prediction](#3-clearing-price--market-price-prediction)
4. [Datasets](#4-datasets)
5. [Codebases & Implementations](#5-codebases--implementations)
6. [Benchmark Leaderboards](#6-benchmark-leaderboards)
7. [Recommended Architecture](#7-recommended-architecture)

---

## 1. Bidding Algorithms

### 1.1 Lagrangian Dual + Online Gradient Descent (BEST MATCH)

| Property | Detail |
|----------|--------|
| **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong
| **Venue** |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **HF Papers** | https://huggingface.co/papers/2304.13477 |
| **Algorithm** | DualOGD: Lagrangian dual multiplier updated by online gradient descent |
| **Auction Type** | First-price
| **Constraints** | Budget cap: total spend ≤ ρT |
| **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
| **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
| **Prediction Models

### 1.2

| Property | Detail |
|----------|--------|
| **Paper** | "
| **Auction Type** | Second-price (truthful) |
| **Bid Rule** | b_t = v_t / (1 + μ_t) |
| **Dual Update** | μ_{t+1} = Proj(μ_t − η·(ρ − payment_t)) |
| **Key Insight** | No market price model needed for second-price auctions |
| **Prediction Models** | CTR predictor only |

### 1.3 Dual Descent with RoS + Budget (Multi-Constraint)

| Property | Detail |
|----------|--------|
| **Paper** | "
| **arXiv** | [2208.13713](https://arxiv.org/abs/2208.13713) |
| **Algorithm** | Two dual variables: λ for RoS, μ for budget |
| **Bid Rule** | b_t = ((1+λ_t)/(μ_t+λ_t)) · v_t |
| **Key Insight** | Adaptable for a k% spend floor: the second dual variable enforces minimum spend |

### 1.

| Property | Detail |
|----------|--------|
| **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| **Authors** | Han Cai et al.
| **Venue** | WSDM 2017 |
| **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
| **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
| **Algorithm** | MDP + Dynamic Programming + Neural value function |
| **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |

### 1.

| Property | Detail |
|----------|--------|
| **Paper** | "HiBid: A Cross-Channel Constrained Bidding System" |
| **arXiv** | [2312.17503](https://arxiv.org/abs/2312.17503) |
| **Scale** | 64K advertisers, 70M requests/day, 4 channels, Meituan |
| **Algorithm** | High-level RL budget allocation + Low-level λ-parameterized bidding |

```
For each auction t:
  1. Observe value v_t (from CTR prediction × click value)
  2. Compute bid: b_t = f(v_t, dual_multiplier_t)
  3. Observe outcome: payment c_t (if won) or 0 (if lost)
  4. Compute gradient: g_t = ρ − c_t
  5. Update multiplier: λ_{t+1} = Proj_{λ≥0}(λ_t − η·g_t)
```

---

## 2. CTR Prediction Models

### 2.1 FinalMLP

| Property | Detail |
|----------|--------|
| **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
| **Criteo AUC** | **0.8149** |

### 2.2

---

## 3. Clearing Price / Market Price Prediction

### 3.1

| Property | Detail |
|----------|--------|
| **Source** | Wang et al. (2023), Algorithm 1 |
| **Method** |

### 3.2 Deep Censored Learning

| Property | Detail |
|----------|--------|
| **Library** |

### 3.3

| Property | Detail |
|----------|--------|
| **Method** |
| **Pros** |
| **Cons** |

### Comparison

| Method | Contextual? | Handles Censoring? | Training? | Complexity |
|--------|-------------|-------------------|-----------|------------|
| Empirical CDF | ❌ | N/A | None | Minimal |
| Censored Linear | ✅ | ✅ | Light | Low |
| Deep Survival | ✅ | ✅ | Neural net | Medium |
| Win Prob NN | ✅ | ❌ | Neural net | Low |

---

## 4. Datasets

###

| Dataset | HF Path |
|---------|---------|------|----------|
| Criteo_x4 | reczoo/Criteo_x4 | 45.8M
| Avazu_x4 | reczoo/Avazu_x4 | 40.4M

### RTB

| Dataset |
|---------|--------|-------------|
| iPinYou | data.computational-advertising.org |
| YOYI |

---

## 5. Codebases & Implementations

| Library | URL | Purpose |
|---------|-----|---------|
| **FuxiCTR** | https://github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
| **DeepCTR-Torch** | https://github.com/shenweichen/DeepCTR-Torch | 20+ CTR models, simple API |
| **TorchSurv** | https://github.com/Novartis/torchsurv | Deep survival for clearing price |
| **BARS** | https://github.com/openbenchmark/BARS | Standardized CTR benchmark |
| **rlb-dp** | https://github.com/han-cai/rlb-dp | RL for RTB |
| **budget_constrained_bidding** | https://github.com/dingmu365/budget_constrained_bidding | Budget-constrained algorithms |

---

## 6. Benchmark Leaderboards

---

## 7. Recommended Architecture

## Paper Index

| # | Paper | arXiv |
|---|-------|-------|------
| 1 | Wang et al.
| 9 | Wang et al., DCN V2 | 2008.13535 |
| 10 | Guo et al., DeepFM | - |
| 11 | Zhu et al., BARS-CTR | 2009.05794 |
| 12 | Wu et al., Censored Price Prediction | - |
| 13 | TorchSurv | 2404.10761 |
| 14 | Robust Budget Pacing | 2302.02006 | 2023 | Growing |

# Bidding Algorithms Benchmark: Research Resources

> First-Price Auction Focus | Generated: 2026-05-05
> Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark

---

## 1. Bidding Algorithms for First-Price Auctions

### 1.1 DualOGD: Lagrangian Dual + Online Gradient Descent (PRIMARY)

| Property | Detail |
|----------|--------|
| **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
| **Venue** | 2023 |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **HF Papers** | https://huggingface.co/papers/2304.13477 |
| **Algorithm** | DualOGD: Lagrangian dual multiplier updated by online gradient descent |
| **Auction Type** | **First-price** |
| **Constraints** | Budget cap: total spend ≤ ρT |
| **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
| **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
| **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (G̃_t) |

**Code Pattern (Wang 2023, Algorithm 1)**

```python
import numpy as np


def dual_ogd(values, competing_bids, budget, bid_grid):
    """DualOGD sketch, full feedback. values[t] = pCTR(features_t) * click_value."""
    bid_grid = np.asarray(bid_grid, dtype=float)
    T = len(values)
    # Initialization
    lam, eps, rho = 0.0, 1.0 / np.sqrt(T), budget / T
    d_history, spend = [], 0.0

    for v_t, d_t in zip(values, competing_bids):
        # Estimate win probability from historical competing bids
        # (0.5 before any history is observed; an assumption of this sketch)
        if d_history:
            G = (bid_grid[:, None] >= np.asarray(d_history)).mean(axis=1)
        else:
            G = np.full(len(bid_grid), 0.5)
        r_tilde = (v_t - bid_grid) * G   # estimated reward per candidate bid
        c_tilde = bid_grid * G           # estimated cost per candidate bid

        # Bid: maximize cost-adjusted reward over the candidate grid
        i = int(np.argmax(r_tilde - lam * c_tilde))
        b_t = bid_grid[i]
        if spend + b_t > budget:         # stop once the budget cannot cover the bid
            break

        # Observe maximum competing bid d_t (full feedback)
        won = b_t >= d_t
        spend += b_t if won else 0.0
        d_history.append(d_t)

        # Online gradient descent on dual multiplier
        lam = max(0.0, lam - eps * (rho - c_tilde[i]))

    return lam, spend
```
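
The bid rule and the multiplier update in 1.1 can be read as a Lagrangian relaxation of the budget-constrained problem. The following is a sketch of that standard argument using the symbols from the table; the per-round function g_t is notation introduced here for illustration and is not taken from the paper.

```latex
\begin{aligned}
&\max_{b_1,\dots,b_T} \sum_{t=1}^{T} \tilde r_t(v_t, b_t)
 \quad \text{s.t.} \quad \sum_{t=1}^{T} \tilde c_t(b_t) \le \rho T, \\
&L(b, \lambda) = \sum_{t=1}^{T} \Big[ \tilde r_t(v_t, b_t) - \lambda\, \tilde c_t(b_t) + \lambda \rho \Big],
 \qquad \lambda \ge 0, \\
&b_t = \arg\max_{b} \big( \tilde r_t(v_t, b) - \lambda_t\, \tilde c_t(b) \big), \\
&g_t(\lambda) = \tilde r_t(v_t, b_t) - \lambda\, \tilde c_t(b_t) + \lambda \rho,
 \qquad \frac{\partial g_t}{\partial \lambda} = \rho - \tilde c_t(b_t), \\
&\lambda_{t+1} = \max\big(0,\; \lambda_t - \varepsilon\,(\rho - \tilde c_t(b_t))\big).
\end{aligned}
```

When the estimated spend in a round exceeds the per-round budget ρ, the gradient is negative, so λ grows and later bids are shaded more strongly.
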
### 1.2 TwoSidedDual: Budget Cap + Spend Floor

| Property | Detail |
|----------|--------|
| **Base** | Extension of Wang et al. (2023) |
| **Constraints** | Total spend ≤ B (cap) AND spend ≥ k·B (floor) |
| **Dual Variables** | μ for cap, ν for floor |
| **Cap Update** | μ_{t+1} = Proj_{μ≥0}(μ_t − η₁·(ρ − c̃_t(b_t))) (cap penalty) |
| **Floor Update** | ν_{t+1} = Proj_{ν≥0}(ν_t − η₂·(c̃_t(b_t) − kρ)) (floor incentive) |
| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − (μ_t − ν_t)·c̃_t(b)) |
| **Key Insight** | When μ > ν, bidding is restrained (ahead on spend). When ν > μ, bidding is encouraged (behind on the spend floor). |
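
The two updates and the bid rule in the table can be sketched as a single per-round step. This is a minimal illustration, not a reference implementation: the function name `two_sided_dual_step`, the candidate `bid_grid`, and the 0.5 win-probability prior for an empty history are all assumptions introduced here.

```python
import numpy as np


def two_sided_dual_step(mu, nu, v_t, d_history, bid_grid, rho, k, eta1, eta2):
    """One round of the two-sided dual heuristic; returns (bid, new_mu, new_nu)."""
    bid_grid = np.asarray(bid_grid, dtype=float)
    hist = np.asarray(d_history, dtype=float)

    # Empirical win probability for each candidate bid
    # (0.5 when there is no history yet: an assumption of this sketch)
    if hist.size:
        G = (bid_grid[:, None] >= hist[None, :]).mean(axis=1)
    else:
        G = np.full(bid_grid.shape, 0.5)

    r_tilde = (v_t - bid_grid) * G   # estimated reward
    c_tilde = bid_grid * G           # estimated cost

    # Bid rule: the net multiplier (mu - nu) prices expected spend
    i = int(np.argmax(r_tilde - (mu - nu) * c_tilde))
    c_hat = c_tilde[i]

    new_mu = max(0.0, mu - eta1 * (rho - c_hat))       # cap penalty
    new_nu = max(0.0, nu - eta2 * (c_hat - k * rho))   # floor incentive
    return float(bid_grid[i]), new_mu, new_nu
```

With μ = ν = 0 the step reduces to maximizing the estimated reward alone; a positive ν makes expected spend attractive and pushes the chosen bid upward.
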
### 1.3 Adversarial Bidding: Non-Stationary Environments

| Property | Detail |
|----------|--------|
| **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
| **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) |
| **Algorithm** | Adaptive dual OGD with change-point detection |
| **Key Insight** | When the distribution shifts, the algorithm resets the dual multiplier and restarts learning |

### 1.4 Contextual First-Price (2026)

| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
| **Algorithm** | Dual OGD + quantile-based contextual censored regression |
| **Key Innovation** | Extends Wang et al. (2023) to contextual (feature-based) auctions |

### 1.5 Joint Value Estimation + Bidding

| Property | Detail |
|----------|--------|
| **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
| **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) |
| **Key Insight** | Learns CTR and the bidding strategy simultaneously, with no separate CTR model training phase |

### 1.6 RLB: Reinforcement Learning (Baseline)

| Property | Detail |
|----------|--------|
| **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| **Authors** | Han Cai et al., WSDM 2017 |
| **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
| **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
| **Algorithm** | MDP + Dynamic Programming + Neural value function |
| **State** | (remaining auctions, remaining budget, features) |
| **Action** | bid price a ∈ [0, budget] |
| **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |

### 1.7 Static Baselines

| Algorithm | Bid Rule | Notes |
|-----------|----------|-------|
| **Linear** | b = base_bid × (pCTR / avg_pCTR) | Simple proportional |
| **Threshold** | b = fixed_bid if pCTR > τ else 0 | Binary decision |
| **ValueShading** | b = v_t / (1 + λ) | From second-price literature, adapted |
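
The three static rules translate directly into one-line functions. The sketch below assumes scalar inputs; the function and parameter names are illustrative and do not come from any of the cited codebases.

```python
def linear_bid(p_ctr, avg_p_ctr, base_bid):
    """Linear: scale a base bid by the impression's relative pCTR."""
    return base_bid * (p_ctr / avg_p_ctr)


def threshold_bid(p_ctr, tau, fixed_bid):
    """Threshold: bid a fixed amount only when pCTR exceeds tau."""
    return fixed_bid if p_ctr > tau else 0.0


def value_shading_bid(v_t, lam):
    """ValueShading: shade the value by a fixed multiplier lam >= 0."""
    return v_t / (1.0 + lam)
```

None of these rules reacts to spend, so any budget pacing has to come from choosing `base_bid`, `tau`, or `lam` offline.
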
### Algorithm Comparison Matrix

| Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity |
|-----------|-----------|-------------------|------------|-----------------|------------|
| **DualOGD** | ✅ Online | Empirical CDF | ❌ (cap only) | ✅ Õ(√T) | Medium |
| **TwoSidedDual** | ✅ Online | Empirical CDF | ✅ (cap+floor) | ❌ (heuristic) | Medium |
| **RLB** | ✅ DP | Neural dist. model | ❌ | ❌ | High |
| **Linear** | ❌ | None | ❌ | ❌ | Minimal |
| **Threshold** | ❌ | None | ❌ | ❌ | Minimal |

---

## 2. CTR Prediction Models

### 2.1 FinalMLP (RECOMMENDED)

| Property | Detail |
|----------|--------|
| **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| **Authors** | Kelong Mao et al., AAAI 2023 |
| **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
| **Criteo AUC** | **0.8149** |
| **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
| **Why Best for RTB** | Pure feed-forward MLP: <1ms inference, no attention/RNN overhead |
| **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch |

```python
# FinalMLP architecture
# Stream 1: MLP(features * gate_weights)        (feature selection)
# Stream 2: MLP(features * (1 - gate_weights))  (complementary view)
# Fusion:   Bilinear(stream1_output, stream2_output) -> sigmoid
```

### 2.2 DeepFM: Simple Baseline

| Property | Detail |
|----------|--------|
| **Paper** | "DeepFM: A Factorization-Machine based Neural Network for CTR Prediction" |
| **Criteo AUC** | 0.8138 |
| **Architecture** | Shared embedding → FM (2nd-order) + DNN → sum → sigmoid |

### 2.3 DCNv2: Industry Standard

| Property | Detail |
|----------|--------|
| **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) |
| **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
| **Criteo AUC** | 0.8142-0.8144 |
| **Architecture** | CrossNetV2 (low-rank) + DNN in parallel |

### 2.4 BARS Meta-Finding ⚠️ IMPORTANT

| Property | Detail |
|----------|--------|
| **Paper** | "BARS-CTR: Open Benchmarking" ([2009.05794](https://arxiv.org/abs/2009.05794)) |
| **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are ≤0.1-0.3% AUC** |
| **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |

### CTR Model Comparison for RTB

| Model | AUC (Criteo) | Inference Speed | RTB Latency OK? |
|-------|-------------|-----------------|-----------------|
| **FinalMLP** | 0.8149 | ⭐⭐⭐⭐⭐ | ✅ Yes |
| DCNv2 | 0.8142 | ⭐⭐⭐⭐ | ✅ Yes |
| DeepFM | 0.8138 | ⭐⭐⭐⭐ | ✅ Yes |
| GDCN | 0.8161* | ⭐⭐⭐⭐ | ✅ Yes |
| DIN | - | ⭐⭐ | ❌ No |
| DIEN | - | ⭐ | ❌ No |

*Own data split, not directly comparable.

---

## 3. Clearing Price / Win Probability Prediction

### 3.1 Empirical CDF (Non-Parametric): BASELINE

| Property | Detail |
|----------|--------|
| **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
| **Method** | Maintain array of observed competing bids d_s |
| **Win Probability** | P(win \| b) = G̃_t(b) = (1/(t−1)) ∑_{s=1}^{t−1} 𝟙{b ≥ d_s} |
| **Expected Competing Bid (given win)** | E[d \| d ≤ b] = mean({d_s : d_s ≤ b}); this is what a second-price winner would pay, whereas a first-price winner pays b |
| **Expected Reward** | r̃_t(v, b) = (v − b) · G̃_t(b) |
| **Expected Cost (dual)** | c̃_t(b) = b · G̃_t(b) |
| **Pros** | No training, theoretically sound, adapts online |
| **Cons** | No context, cold-start issue, requires full feedback |

```python
import numpy as np


class EmpiricalCDF:
    def __init__(self):
        self.competing_bids = []

    def update(self, d_t):
        """d_t = maximum competing bid (observed under full feedback)"""
        self.competing_bids.append(d_t)

    def win_prob(self, b):
        """Fraction of past competing bids that bid b would have beaten."""
        if not self.competing_bids:
            return 0.5  # uninformative default before any observation
        return float(np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids]))

    def expected_cost(self, b):
        """Mean competing bid among past auctions that bid b would have won."""
        wins = [d for d in self.competing_bids if b >= d]
        if not wins:
            return b
        return float(np.mean(wins))
```
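
The reward and cost estimates r̃_t and c̃_t from the table follow from the empirical win probability alone. A self-contained sketch, assuming a finite grid of candidate bids; the helper names `estimate_reward_and_cost` and `best_bid` are hypothetical and introduced only for this example.

```python
import numpy as np


def estimate_reward_and_cost(competing_bids, v, b):
    """Return (r_tilde, c_tilde) for bid b and value v from past competing bids."""
    d = np.asarray(competing_bids, dtype=float)
    # Empirical win probability (0.5 with no history: an assumption of this sketch)
    g = float(np.mean(b >= d)) if d.size else 0.5
    return (v - b) * g, b * g


def best_bid(competing_bids, v, lam, bid_grid):
    """Grid search for argmax_b r_tilde(v, b) - lam * c_tilde(b)."""
    scores = []
    for b in bid_grid:
        r, c = estimate_reward_and_cost(competing_bids, v, b)
        scores.append(r - lam * c)
    return float(bid_grid[int(np.argmax(scores))])
```

Raising `lam` makes expected spend more expensive, so the selected bid moves toward lower values on the grid.
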
### 3.2 TorchSurv: Deep Censored Learning

| Property | Detail |
|----------|--------|
| **Library** | TorchSurv (Novartis, 200★) |
| **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) |
| **GitHub** | https://github.com/Novartis/torchsurv |
| **Install** | `pip install torchsurv` |
| **Method** | Neural network with Cox PH or Weibull AFT loss |
| **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) |
| **Output** | Survival function S(b \| x) = P(market_price > b \| features) |
| **Win Prob** | P(win \| b, x) = 1 − S(b \| x) |

```python
import torch
from torchsurv.loss import cox

# Illustrative batch; in practice log_hazard = model(features), shape (batch,)
log_hazard = torch.randn(4)
# event = True if won (uncensored), False if lost (censored)
event = torch.tensor([True, False, True, True])
# time = market_price if won, bid if lost
time = torch.tensor([1.2, 0.8, 2.5, 1.9])
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
```

### 3.3 Win Probability NN (Simplest ML)

| Property | Detail |
|----------|--------|
| **Method** | Binary classifier: P(win \| bid_price, features) |
| **Pros** | Very simple; trains with BCELoss |
| **Cons** | Ignores price magnitude info when winning |
| **Architecture** | features ⊕ bid_price → MLP → sigmoid |

---

## 4. Datasets

### Verified on HuggingFace Hub

| Dataset | HF Path | Rows | Features | Label | Status |
|---------|---------|------|----------|-------|--------|
| **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | ✅ Ready |
| Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | ✅ Ready |

### RTB-Specific (External Only)

| Dataset | Description | Availability |
|---------|-------------|-------------|
| iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org |
| YOYI | ~400M bid log records | Various mirrors |

**Key Gap**: No first-price auction bid logs are available on the HF Hub. Criteo and Avazu provide click labels but no bid or market price columns, so evaluation uses **synthetic market price generation** conditioned on Criteo features.
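
The notes do not specify how the synthetic market prices are generated. One possible construction, shown purely as an assumption, draws log-normal prices whose log-mean depends on a linear score of the dense features; the function name, the `weights` vector, and all default parameters below are hypothetical.

```python
import numpy as np


def synthetic_market_prices(features, weights, base_price=1.0,
                            score_scale=0.5, noise_sigma=0.3, seed=0):
    """Draw one positive market price per row of `features` (assumed generator)."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    score = features @ np.asarray(weights, dtype=float)
    # Standardize the score so score_scale controls how strongly context matters
    score = (score - score.mean()) / (score.std() + 1e-8)
    noise = rng.normal(0.0, noise_sigma, size=score.shape[0])
    log_price = np.log(base_price) + score_scale * score + noise
    return np.exp(log_price)
```

Fixing `seed` keeps the simulated competing bids identical across algorithms, which makes comparisons between bidders on the same impressions fair.
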
---

## 5. Codebases & Libraries

| Library | URL | Use |
|---------|-----|-----|
| FuxiCTR | https://github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
| DeepCTR-Torch | https://github.com/shenweichen/DeepCTR-Torch | 20+ CTR models |
| TorchSurv | https://github.com/Novartis/torchsurv | Survival analysis for clearing price |
| BARS | https://github.com/openbenchmark/BARS | Standardized CTR benchmark |
| rlb-dp | https://github.com/han-cai/rlb-dp | RL for RTB (baseline reference) |

---

## 6. Recommended Architecture

```
┌──────────────────────────────────────────────────────┐
│              FIRST-PRICE BIDDING ENGINE              │
│                                                      │
│  Dual OGD: λ_{t+1} = max(0, λ_t - ε·(ρ - c̃_t(b_t)))  │
│  Two-Sided: μ (cap) + ν (floor) dual variables       │
│                                                      │
├──────────────────────────────────────────────────────┤
│                  PREDICTION MODELS                   │
│                                                      │
│  ┌─────────────────┐  ┌──────────────────────────┐   │
│  │ CTR: FinalMLP   │  │ Win Prob: Empirical CDF  │   │
│  │ v_t = pCTR × V  │  │ G̃_t(b) = frac of d_s ≤ b │   │
│  └─────────────────┘  └──────────────────────────┘   │
│                                                      │
│  ┌──────────────────────────────────────────────┐    │
│  │ Optional: TorchSurv for contextual win prob  │    │
│  │ P(win|b,x) = 1 - S(b|x)                      │    │
│  └──────────────────────────────────────────────┘    │
├──────────────────────────────────────────────────────┤
│                       DATASETS                       │
│  Criteo_x4 (CTR training) + synthetic market prices  │
└──────────────────────────────────────────────────────┘
```

---

## Paper Index

| # | Paper | arXiv | Focus |
|---|-------|-------|-------|
| 1 | Wang et al., Learning to Bid in Repeated FPA with Budgets | 2304.13477 | Primary algorithm |
| 2 | Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift |
| 3 | Contextual First-Price (Quantile) | 2603.07207 | Contextual extension |
| 4 | Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding |
| 5 | No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework |
| 6 | Cai et al., RLB | 1701.02490 | RL baseline |
| 7 | Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts |
| 8 | Mao et al., FinalMLP | 2304.00902 | CTR model |
| 9 | Wang et al., DCN V2 | 2008.13535 | CTR model |
| 10 | Guo et al., DeepFM | - | CTR model |
| 11 | Zhu et al., BARS-CTR | 2009.05794 | CTR benchmark |
| 12 | Wu et al., Censored Price Prediction | - | Clearing price |
| 13 | TorchSurv | 2404.10761 | Survival analysis library |