Compound Poisson–Lognormal Monte Carlo Risk Model

Author: Prof. Hernan Huwyler, CIAO MBA CPA
Affiliation: IE Law School Center for Risk and Compliance | Capgemini Applied AI Lab
License: CC BY 4.0
Version: 1.0.0
Last Updated: 2025


Model Overview

This dataset and accompanying Python model implement a compound frequency–severity Monte Carlo simulation engine for operational and AI risk quantification.

The model answers the core question facing Chief Risk Officers, CFOs, and AI Governance practitioners:

Given uncertainty in how often loss events occur and how severe each event is, what is the probability distribution of total annual loss — and what capital reserve does that distribution require?

Mathematical Foundation

Frequency model: N ~ Poisson(λ)

Each simulation period draws the number of loss events from a Poisson distribution with rate parameter λ (expected events per year). The Poisson distribution is the standard actuarial and operational risk choice for independent, random event counts.

Severity model: Lᵢ ~ Lognormal(μ, σ)

Each individual loss is drawn from a lognormal distribution, which is standard for financial loss severity because it enforces non-negativity, captures heavy right tails, and is consistent with regulatory frameworks including Basel III operational risk, Solvency II internal models, and NIST SP 800-30 risk quantification guidance.

Calibration method: μ and σ are calibrated analytically from business-facing inputs: a confidence interval [lower, upper] that the practitioner believes contains the central mass of individual loss severity at a specified probability level. This eliminates the need to estimate lognormal parameters directly from sparse loss data.
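One common way to carry out such a calibration is sketched below. It assumes that [lower, upper] is a symmetric central interval of the lognormal, with equal probability left in each tail, so that ln(lower) and ln(upper) sit z standard deviations either side of μ, where z is the standard normal quantile at (1 + confidence level) / 2. The function name `calibrate_lognormal` is hypothetical, and the packaged model may calibrate differently.

```python
import math
from statistics import NormalDist


def calibrate_lognormal(lower: float, upper: float, confidence_level: float) -> tuple[float, float]:
    """Return (mu, sigma) of a lognormal whose central interval is [lower, upper].

    Assumption: the interval is symmetric in probability, so each tail
    holds (1 - confidence_level) / 2 of the mass.
    """
    if not 0.0 < lower < upper:
        raise ValueError("require 0 < lower < upper")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")

    # Standard normal quantile that bounds the central interval.
    z = NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)

    log_lower, log_upper = math.log(lower), math.log(upper)
    mu = (log_lower + log_upper) / 2.0             # midpoint in log space
    sigma = (log_upper - log_lower) / (2.0 * z)    # half-width divided by z
    return mu, sigma


# Example: 80% of individual losses are believed to fall in [1,000, 2,000].
mu, sigma = calibrate_lognormal(1_000.0, 2_000.0, 0.80)
print(f"mu = {mu:.4f}, sigma = {sigma:.4f}")
```

Under this assumption the median severity exp(μ) is the geometric mean of the two bounds, about 1,414 monetary units in the example.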

Aggregate loss: S = Σᵢ₌₁ᴺ Lᵢ

The total annual loss is the sum of all individual losses in the period. When N = 0, S = 0.
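If the severities are assumed to be independent and identically distributed, and independent of N, the standard compound Poisson results give closed-form moments for S. These can serve as a sanity check on the simulated mean and standard deviation:

```latex
\begin{aligned}
\mathbb{E}[L_i]     &= \exp\!\left(\mu + \tfrac{1}{2}\sigma^{2}\right), \\
\mathbb{E}[L_i^{2}] &= \exp\!\left(2\mu + 2\sigma^{2}\right), \\
\mathbb{E}[S]       &= \lambda\,\mathbb{E}[L_i], \\
\operatorname{Var}(S) &= \lambda\,\mathbb{E}[L_i^{2}].
\end{aligned}
```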

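Simulation: 100,000 independent trials by default, vectorized using NumPy for performance. By the law of large numbers, estimates converge as the number of trials grows. At 100,000 trials roughly 5,000 fall beyond the 95th percentile, which keeps VaR and CVaR at 95% reasonably stable, while more extreme percentiles rest on fewer observations and are noisier.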
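A minimal sketch of a vectorized compound simulation follows. It is one possible approach, not necessarily how the packaged model is written: it draws every event count at once, draws all severities in a single call, and sums them back into trials with `np.bincount`. The function name `simulate_aggregate_losses` and the example parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_aggregate_losses(lam, mu, sigma, simulations=100_000, seed=123):
    """Simulate S = L_1 + ... + L_N with N ~ Poisson(lam), L_i ~ Lognormal(mu, sigma)."""
    rng = np.random.default_rng(seed)

    # One event count per trial.
    counts = rng.poisson(lam, size=simulations)

    # Draw every individual loss across all trials in one call.
    severities = rng.lognormal(mean=mu, sigma=sigma, size=int(counts.sum()))

    # Label each loss with the trial it belongs to, then sum per trial.
    # Trials with zero events receive no labels and keep a total of 0.
    trial_index = np.repeat(np.arange(simulations), counts)
    losses = np.bincount(trial_index, weights=severities, minlength=simulations)
    return losses, counts


# Illustrative parameters (assumed, not taken from the packaged model).
losses, counts = simulate_aggregate_losses(lam=4.0, mu=7.25, sigma=0.27)
print(f"mean annual loss: {losses.mean():,.0f}")
print(f"share of loss-free years: {(counts == 0).mean():.3%}")
```

Avoiding a Python loop over trials is what keeps 100,000 trials fast; memory use grows with the total number of simulated events, about λ times the number of trials.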


Risk Metrics Produced

Metric                Formula                  Regulatory Reference
Mean Loss             E[S]                     Baseline planning
Median Loss           P50(S)                   Central tendency
Standard Deviation    Std(S)                   Volatility measure
Value at Risk 95%     P95(S)                   Basel III, Solvency II
Conditional VaR 95%   E[S | S > VaR95]         Expected Shortfall
Reserve Percentile    Pₖ(S), user-defined k    ICAAP capital buffer
Exceedance Curve      P(S ≥ x)                 Catastrophe modeling
Return Period         1 / P(S ≥ x)             Infrastructure risk
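The formulas in this table can be computed from an array of simulated aggregate losses roughly as follows. This is a sketch with hypothetical function names, run on a small deterministic sample (the values 1 to 100) so the results are easy to verify by hand. It uses NumPy's default linear interpolation for percentiles, which may differ from the packaged model's convention.

```python
import numpy as np


def risk_metrics(losses, reserve=0.75):
    """Summary metrics for an array of simulated aggregate losses."""
    losses = np.asarray(losses, dtype=float)
    var95 = float(np.quantile(losses, 0.95))
    tail = losses[losses > var95]                 # losses strictly beyond VaR 95%
    cvar95 = float(tail.mean()) if tail.size else var95
    return {
        "mean": float(losses.mean()),
        "median": float(np.median(losses)),
        "std": float(losses.std(ddof=1)),
        "var95": var95,
        "cvar95": cvar95,
        "reserve": float(np.quantile(losses, reserve)),
    }


def exceedance_probability(losses, x):
    """Empirical P(S >= x)."""
    return float(np.mean(np.asarray(losses, dtype=float) >= x))


def return_period(losses, x):
    """Expected number of periods between losses of at least x."""
    p = exceedance_probability(losses, x)
    return float("inf") if p == 0.0 else 1.0 / p


sample = np.arange(1.0, 101.0)                    # 1, 2, ..., 100
metrics = risk_metrics(sample)
print(metrics["cvar95"])                          # mean of 96..100 -> 98.0
print(exceedance_probability(sample, 91.0))       # 10 of 100 values -> 0.1
print(return_period(sample, 91.0))                # -> 10.0
```

Plotting `exceedance_probability` over a grid of x values gives the exceedance curve.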

Regulatory and Framework Mapping

ISO/IEC 42001 — AI Management System

Percentile Range   ISO 42001 Control Theme
≥ 95th             OC-8: Incident Response and Recovery
75th–94th          OC-4: Risk Treatment and Controls
< 75th             OC-2: Risk Assessment and Identification

NIST AI Risk Management Framework (AI RMF)

Loss Range              NIST AI RMF Function
Extreme tail (≥ 99th)   RESPOND + RECOVER
High risk (75th–98th)   DETECT + RESPOND
Baseline (< 75th)       GOVERN + MAP
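The percentile bands in this table can be applied to a simulated loss with a simple threshold lookup. The sketch below copies the band labels from the table as they appear there. The function name is hypothetical, and it assumes that fractional ranks between the 98th and 99th percentile fall into the middle band.

```python
from bisect import bisect_right

# Lower edges of the second and third bands, as percentile ranks.
BAND_EDGES = [75.0, 99.0]
BAND_LABELS = [
    "GOVERN + MAP",         # baseline, below the 75th percentile
    "DETECT + RESPOND",     # high risk, 75th up to (not including) the 99th
    "RESPOND + RECOVER",    # extreme tail, 99th percentile and above
]


def band_for_percentile(percentile_rank: float) -> str:
    """Map a percentile rank in [0, 100] to the band label from the table."""
    if not 0.0 <= percentile_rank <= 100.0:
        raise ValueError("percentile_rank must be between 0 and 100")
    return BAND_LABELS[bisect_right(BAND_EDGES, percentile_rank)]


for rank in (50.0, 75.0, 98.5, 99.0):
    print(rank, "->", band_for_percentile(rank))
```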

EU AI Act Risk Tiers

  • Unacceptable risk: Scenarios with CVaR exceeding regulatory capital thresholds
  • High risk: Loss scenarios at or above VaR(95%) with systemic or rights-impacting AI
  • Limited risk: Base-case scenarios with adequate reserve coverage
  • Minimal risk: Best-case scenarios below reserve threshold

Basel III and Solvency II

  • P99: Operational risk capital reference point in this model; the Advanced Measurement Approach itself, introduced under Basel II, was calibrated to a 99.9% one-year confidence level
  • P99.5: Solvency II Solvency Capital Requirement standard
  • P99.9: ICAAP extreme stress scenario buffer
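When reading these percentiles from a Monte Carlo run, it helps to know how many simulated trials lie beyond each one, since that count drives the sampling error of the estimate. The short calculation below uses the default of 100,000 trials; the helper name is hypothetical.

```python
def trials_beyond(percentile: float, simulations: int = 100_000) -> int:
    """Expected number of trials above the given percentile (0 to 100)."""
    return round(simulations * (100.0 - percentile) / 100.0)


tail_counts = {p: trials_beyond(p) for p in (99.0, 99.5, 99.9)}
print(tail_counts)  # {99.0: 1000, 99.5: 500, 99.9: 100}
```

With only about 100 trials beyond P99.9, that estimate is worth re-checking with a larger number of simulations or several seeds before it feeds a capital figure.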

Dataset Contents

data/train.csv — 100,000 simulation trials

Primary dataset. One row per Monte Carlo trial. Contains aggregate loss outcome, event frequency, average severity, scenario classification, exceedance probability, VaR and CVaR flags, and all calibration parameters for full reproducibility.

data/test.csv — 288 stress scenarios

Validation dataset. One row per stress scenario combining six lambda values, four severity ranges, four confidence levels, and three random seeds. Contains full percentile distribution per scenario for sensitivity analysis.
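The scenario count follows from a full factorial design: 6 × 4 × 4 × 3 = 288. The sketch below builds such a grid with `itertools.product`. The specific lambda values, severity ranges, confidence levels, and seeds are placeholders chosen for illustration, not the values used to generate data/test.csv.

```python
from itertools import product

# Placeholder grid values (assumptions, not the dataset's actual settings).
lambdas = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
severity_ranges = [
    (1_000.0, 2_000.0),
    (5_000.0, 20_000.0),
    (10_000.0, 100_000.0),
    (50_000.0, 1_000_000.0),
]
confidence_levels = [0.80, 0.90, 0.95, 0.99]
seeds = [1, 2, 3]

# One scenario per combination of the four factors.
scenarios = [
    {"events": lam, "lower": lo, "upper": hi, "confidence_level": cl, "seed": seed}
    for lam, (lo, hi), cl, seed in product(lambdas, severity_ranges, confidence_levels, seeds)
]

print(len(scenarios))  # 288
```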

data/percentile_table.csv — Percentile distribution table

Structured percentile summary with regulatory mapping. One row per percentile point from P1 to P99.9. Directly usable in risk reports, board presentations, and regulatory submissions.


Python Model — Key Features

from compound_risk_model import RiskModel, RiskModelConfig

cfg = RiskModelConfig(
    simulations    = 100_000,
    lower          = 1_000.0,    # Lower severity bound (monetary units)
    upper          = 2_000.0,    # Upper severity bound (monetary units)
    confidence_level = 0.80,     # Probability mass in [lower, upper]
    events         = 4.0,        # Expected loss events per year (Poisson λ)
    reserve        = 0.75,       # Reserve percentile for capital planning
    seed           = 123         # Reproducibility seed
)

model = RiskModel.from_interval(cfg)
model.summary()
model.plot_loss_exceedance_curve()
model.plot_loss_distribution()
model.plot_scatter()
model.plot_heatmap()