cloud450 committed on
Commit
c3045ee
·
verified ·
1 Parent(s): 4afcb3a

Delete deepfake_audio_detection.ipynb

Files changed (1)
  1. deepfake_audio_detection.ipynb +0 -1624
deepfake_audio_detection.ipynb DELETED
@@ -1,1624 +0,0 @@
- {
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# 🎙️ Deepfake Audio Detection System\n",
- "\n",
- "**Pipeline Overview:**\n",
- "```\n",
- "Audio → Noise Removal → Feature Extraction (Log-Mel + TEO)\n",
- "      → ECAPA-TDNN Embeddings (192-dim) → XGBoost → REAL / FAKE\n",
- "```\n",
- "\n",
- "**Architecture Highlights:**\n",
- "- Spectral gating denoising\n",
- "- 40-band log-mel spectrogram + Teager Energy Operator\n",
- "- Simplified ECAPA-TDNN for speaker/spoof-aware embeddings\n",
- "- XGBoost classifier on top of embeddings\n",
- "\n",
- "**Dataset:** Balanced real-vs-fake subset (ASVspoof 2019 LA, prepared in Cell 4) \n",
- "Compatible with ASVspoof / WaveFake / FakeAVCeleb folder structure.\n",
- "\n",
- "---"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 📦 Cell 1 — Install Dependencies"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# ── Cell 1: Install Dependencies (Google Colab) ──────────────────────────────\n",
- "# Colab pre-installs torch, numpy, etc. — we only upgrade what needs changing.\n",
- "# No manual runtime restart is needed for these packages.\n",
- "\n",
- "import subprocess, sys\n",
- "import importlib.metadata\n",
- "\n",
- "def get_version(pkg):\n",
- "    try:\n",
- "        return importlib.metadata.version(pkg)\n",
- "    except importlib.metadata.PackageNotFoundError:\n",
- "        return None\n",
- "\n",
- "# ── Packages to install ───────────────────────────────────────────────────────\n",
- "# Colab already has torch ~2.3+, numpy ~1.26+, pandas, sklearn, matplotlib.\n",
- "# We only pin the ones Colab doesn't ship or ships at wrong versions.\n",
- "PACKAGES = [\n",
- "    \"librosa==0.10.1\",\n",
- "    \"soundfile>=0.12.1\",\n",
- "    \"xgboost==2.0.3\",\n",
- "    \"tqdm==4.66.1\",\n",
- "    \"seaborn>=0.12.0\",\n",
- "    # torch and torchaudio are pre-installed on Colab — skip to save time\n",
- "    # numpy, pandas, sklearn, matplotlib are also pre-installed\n",
- "]\n",
- "\n",
- "print(\"📦 Installing packages for Google Colab...\\n\")\n",
- "\n",
- "try:\n",
- "    result = subprocess.run(\n",
- "        [sys.executable, \"-m\", \"pip\", \"install\", \"--quiet\"] + PACKAGES,\n",
- "        check=True,\n",
- "        capture_output=True,\n",
- "        text=True,\n",
- "    )\n",
- "    print(result.stdout or \"\")\n",
- "    if result.stderr:\n",
- "        print(\"[pip warnings]:\", result.stderr[:500])\n",
- "    print(\"✅ Installation complete.\\n\")\n",
- "\n",
- "except subprocess.CalledProcessError as e:\n",
- "    print(f\"❌ pip failed (exit code {e.returncode})\")\n",
- "    print(\"STDOUT:\", e.stdout[-2000:])\n",
- "    print(\"STDERR:\", e.stderr[-2000:])\n",
- "    raise\n",
- "\n",
- "# ── Version report ────────────────────────────────────────────────────────────\n",
- "import torch, torchaudio, librosa, numpy, pandas, sklearn, xgboost, tqdm\n",
- "\n",
- "print(\"🖥️ Environment report:\")\n",
- "print(f\"   Python     : {sys.version.split()[0]}\")\n",
- "print(f\"   torch      : {torch.__version__}\")\n",
- "print(f\"   torchaudio : {torchaudio.__version__}\")\n",
- "print(f\"   librosa    : {librosa.__version__}\")\n",
- "print(f\"   numpy      : {numpy.__version__}\")\n",
- "print(f\"   pandas     : {pandas.__version__}\")\n",
- "print(f\"   sklearn    : {sklearn.__version__}\")\n",
- "print(f\"   xgboost    : {xgboost.__version__}\")\n",
- "print(f\"   tqdm       : {tqdm.__version__}\")\n",
- "print(f\"\\n🖥️ GPU available : {torch.cuda.is_available()}\")\n",
- "if torch.cuda.is_available():\n",
- "    print(f\"   GPU name : {torch.cuda.get_device_name(0)}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 📚 Cell 2 — All Imports (Single Setup Cell)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "256a6f57",
- "metadata": {},
- "outputs": [],
- "source": [
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "# Cell 2+3 — All Imports + Global Configuration (Google Colab)\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "\n",
- "# ── Standard library ──────────────────────────────────────────────────────────\n",
- "import os\n",
- "import random\n",
- "import warnings\n",
- "import time\n",
- "from pathlib import Path\n",
- "from typing import Tuple, List, Dict, Optional\n",
- "\n",
- "# ── Numerical & data ──────────────────────────────────────────────────────────\n",
- "import numpy as np\n",
- "import pandas as pd\n",
- "\n",
- "# ── Audio processing ──────────────────────────────────────────────────────────\n",
- "import librosa\n",
- "import librosa.display\n",
- "import soundfile as sf\n",
- "\n",
- "# ── Deep learning ─────────────────────────────────────────────────────────────\n",
- "import torch\n",
- "import torch.nn as nn\n",
- "import torch.nn.functional as F\n",
- "from torch.utils.data import Dataset, DataLoader\n",
- "import torchaudio\n",
- "\n",
- "# ── Machine learning ──────────────────────────────────────────────────────────\n",
- "from sklearn.model_selection import train_test_split\n",
- "from sklearn.preprocessing import StandardScaler\n",
- "from sklearn.metrics import (\n",
- "    accuracy_score, f1_score, roc_auc_score,\n",
- "    confusion_matrix, roc_curve, ConfusionMatrixDisplay\n",
- ")\n",
- "import xgboost as xgb\n",
- "\n",
- "# ── Visualization ─────────────────────────────────────────────────────────────\n",
- "import matplotlib.pyplot as plt\n",
- "import matplotlib.gridspec as gridspec\n",
- "import seaborn as sns\n",
- "\n",
- "# ── Progress bar ──────────────────────────────────────────────────────────────\n",
- "from tqdm import tqdm\n",
- "\n",
- "# ── Suppress non-critical warnings ────────────────────────────────────────────\n",
- "warnings.filterwarnings(\"ignore\")\n",
- "\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "# Reproducibility ← MUST come before anything that uses SEED\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "SEED = 42\n",
- "random.seed(SEED)\n",
- "np.random.seed(SEED)\n",
- "torch.manual_seed(SEED)\n",
- "if torch.cuda.is_available():\n",
- "    torch.cuda.manual_seed_all(SEED)\n",
- "\n",
- "# ── Device ← MUST come before XGB_PARAMS, which references torch ─────────────\n",
- "DEVICE = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
- "\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "# Audio signal parameters\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "SAMPLE_RATE = 16000\n",
- "DURATION = 3.0\n",
- "N_SAMPLES = int(SAMPLE_RATE * DURATION)  # 48 000\n",
- "\n",
- "# ── Log-mel parameters ────────────────────────────────────────────────────────\n",
- "N_MELS = 40\n",
- "N_FFT = int(0.025 * SAMPLE_RATE)       # 400 (25 ms window)\n",
- "HOP_LENGTH = int(0.010 * SAMPLE_RATE)  # 160 (10 ms hop)\n",
- "FMIN = 20\n",
- "FMAX = 8000\n",
- "\n",
- "# ── ECAPA-TDNN parameters ─────────────────────────────────────────────────────\n",
- "EMBEDDING_DIM = 192\n",
- "CHANNELS = 512\n",
- "ECAPA_EPOCHS = 15\n",
- "ECAPA_BATCH = 32\n",
- "ECAPA_LR = 1e-3\n",
- "\n",
- "# ── Dataset parameters ────────────────────────────────────────────────────────\n",
- "MAX_SAMPLES = 1000  # per class → 2 000 total\n",
- "DATASET_ROOT = Path(\"dataset\")\n",
- "\n",
- "# ── XGBoost parameters ← SEED and DEVICE are now defined above ───────────────\n",
- "XGB_PARAMS = dict(\n",
- "    objective        = \"binary:logistic\",\n",
- "    max_depth        = 6,\n",
- "    learning_rate    = 0.1,\n",
- "    n_estimators     = 200,\n",
- "    subsample        = 0.8,\n",
- "    colsample_bytree = 0.8,\n",
- "    eval_metric      = \"logloss\",\n",
- "    random_state     = SEED,  # ✅ SEED defined above\n",
- "    n_jobs           = -1,\n",
- "    device           = \"cuda\" if torch.cuda.is_available() else \"cpu\",  # ✅ torch imported\n",
- ")\n",
- "\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "# Environment report\n",
- "# ══════════════════════════════════════════════════════════════════════════════\n",
- "print(\"✅ Imports + config complete.\")\n",
- "print(f\"🖥️ Device        : {DEVICE}\")\n",
- "print(f\"🔒 PyTorch       : {torch.__version__}\")\n",
- "print(f\"🔒 Torchaudio    : {torchaudio.__version__}\")\n",
- "print(f\"🔒 Librosa       : {librosa.__version__}\")\n",
- "print(f\"🔒 XGBoost       : {xgb.__version__}\")\n",
- "print(f\"🔒 NumPy         : {np.__version__}\")\n",
- "print(f\"🔒 Pandas        : {pd.__version__}\")\n",
- "print(f\"\\n⚙️ Sample rate   : {SAMPLE_RATE} Hz\")\n",
- "print(f\"⚙️ Clip duration : {DURATION} s ({N_SAMPLES} samples)\")\n",
- "print(f\"⚙️ Mel bands     : {N_MELS}\")\n",
- "print(f\"⚙️ Embedding dim : {EMBEDDING_DIM}\")\n",
- "print(f\"⚙️ Max per class : {MAX_SAMPLES}\")"
- ]
- },
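The window/hop arithmetic in the configuration above determines how many STFT frames each fixed-length clip yields. A quick standalone check (plain Python, values copied from the config cell; librosa's default `center=True` padding, which produces one frame per hop plus one, is assumed):

```python
# Worked frame-count check for the configuration above.
# Assumes librosa's default center=True STFT padding (one frame per hop, plus one).
SAMPLE_RATE = 16000
DURATION = 3.0
N_SAMPLES = int(SAMPLE_RATE * DURATION)  # samples per fixed-length clip
N_FFT = int(0.025 * SAMPLE_RATE)         # 25 ms analysis window
HOP_LENGTH = int(0.010 * SAMPLE_RATE)    # 10 ms hop between frames
n_frames = 1 + N_SAMPLES // HOP_LENGTH   # centered STFT frame count

print(N_SAMPLES, N_FFT, HOP_LENGTH, n_frames)  # 48000 400 160 301
```

So every 3 s clip maps to a 40 × 301 log-mel matrix before the TEO row is appended.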
- {
- "cell_type": "markdown",
- "id": "d8c67257",
- "metadata": {},
- "source": [
- "## ⚙️ Cell 3 — Global Configuration"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "b518441d",
- "metadata": {},
- "outputs": [],
- "source": [
- "# ─── Audio signal parameters ──────────────────────────────────────────────\n",
- "SAMPLE_RATE = 16000  # Target sample rate in Hz\n",
- "DURATION = 3.0       # Fixed clip duration in seconds\n",
- "N_SAMPLES = int(SAMPLE_RATE * DURATION)  # 48 000 samples per clip\n",
- "\n",
- "# ─── Log-mel spectrogram parameters ───────────────────────────────────────\n",
- "N_MELS = 40                            # Number of mel filterbanks\n",
- "N_FFT = int(0.025 * SAMPLE_RATE)       # 25 ms window → 400 samples\n",
- "HOP_LENGTH = int(0.010 * SAMPLE_RATE)  # 10 ms hop → 160 samples\n",
- "FMIN = 20                              # Min frequency for mel filters\n",
- "FMAX = 8000                            # Max frequency for mel filters\n",
- "\n",
- "# ─── ECAPA-TDNN model parameters ──────────────────────────────────────────\n",
- "EMBEDDING_DIM = 192  # Output embedding size\n",
- "CHANNELS = 512       # Internal channel width\n",
- "ECAPA_EPOCHS = 15    # Training epochs for the neural model\n",
- "ECAPA_BATCH = 32     # Batch size\n",
- "ECAPA_LR = 1e-3      # Learning rate\n",
- "\n",
- "# ─── Dataset parameters ───────────────────────────────────────────────────\n",
- "MAX_SAMPLES = 1000              # Samples PER CLASS (1000 real + 1000 fake = 2000 total)\n",
- "DATASET_ROOT = Path(\"dataset\")  # Root folder containing real/ and fake/\n",
- "\n",
- "# ─── XGBoost parameters ───────────────────────────────────────────────────\n",
- "# NOTE: use_label_encoder was removed in xgboost 2.x, so it is not set here.\n",
- "XGB_PARAMS = dict(\n",
- "    objective        = \"binary:logistic\",\n",
- "    max_depth        = 6,\n",
- "    learning_rate    = 0.1,\n",
- "    n_estimators     = 200,\n",
- "    subsample        = 0.8,\n",
- "    colsample_bytree = 0.8,\n",
- "    eval_metric      = \"logloss\",\n",
- "    random_state     = SEED,\n",
- "    n_jobs           = -1,\n",
- ")\n",
- "\n",
- "print(\"✅ Configuration loaded.\")\n",
- "print(f\"   Sample rate   : {SAMPLE_RATE} Hz\")\n",
- "print(f\"   Clip duration : {DURATION} s ({N_SAMPLES} samples)\")\n",
- "print(f\"   Mel bands     : {N_MELS}\")\n",
- "print(f\"   Embedding dim : {EMBEDDING_DIM}\")\n",
- "print(f\"   Max per class : {MAX_SAMPLES}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f1cd5010",
- "metadata": {},
- "source": [
- "## 🗄️ Cell 4 — Download ASVspoof 2019 LA Dataset\n",
- "\n",
- "> **ASVspoof 2019 LA** is the official benchmark for logical-access spoofed/deepfake speech detection. \n",
- "> It contains **bonafide** (real human speech) and **spoof** (TTS / voice-conversion generated) utterances. \n",
- "> We download the training partition, parse the official protocol file, and copy files into `dataset/real/` and `dataset/fake/`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ae82ace4",
- "metadata": {},
- "outputs": [],
- "source": [
- "# ── CELL 4: Download ASVspoof 2019 LA subset ────────────────────────────────\n",
- "# Official benchmark for spoofed/deepfake speech detection\n",
- "# Free, no login needed via Zenodo\n",
- "\n",
- "!pip install -q zenodo_get\n",
- "\n",
- "import zipfile, shutil\n",
- "from pathlib import Path\n",
- "\n",
- "# ── Download LA (Logical Access) partition ─────────────────────────────────\n",
- "# Contains TTS/VC deepfakes + bonafide speech\n",
- "RAW_DIR = Path(\"asvspoof_raw\")\n",
- "if not RAW_DIR.exists():\n",
- "    print(\"📥 Downloading ASVspoof 2019 LA from Zenodo (this may take a few minutes)...\")\n",
- "    !zenodo_get 10.5281/zenodo.10509676 -o {RAW_DIR}\n",
- "else:\n",
- "    print(f\"✅ Raw data directory '{RAW_DIR}' already exists, skipping download.\")\n",
- "\n",
- "# ── Extract the ZIP ────────────────────────────────────────────────────────\n",
- "zip_path = RAW_DIR / \"LA.zip\"\n",
- "extracted_marker = RAW_DIR / \"LA\"\n",
- "\n",
- "if zip_path.exists() and not extracted_marker.exists():\n",
- "    print(\"📦 Extracting LA.zip...\")\n",
- "    with zipfile.ZipFile(str(zip_path), \"r\") as z:\n",
- "        z.extractall(str(RAW_DIR))\n",
- "    print(\"✅ Extraction complete.\")\n",
- "elif extracted_marker.exists():\n",
- "    print(\"✅ Already extracted.\")\n",
- "else:\n",
- "    print(\"⚠️ LA.zip not found — check the download step above.\")\n",
- "\n",
- "# ── Create dataset/real and dataset/fake from official labels ──────────────\n",
- "Path(\"dataset/real\").mkdir(parents=True, exist_ok=True)\n",
- "Path(\"dataset/fake\").mkdir(parents=True, exist_ok=True)\n",
- "\n",
- "# Format of each protocol line:\n",
- "#   SPEAKER_ID FILENAME ENV ATTACK_TYPE LABEL\n",
- "# LABEL is either \"bonafide\" (real) or \"spoof\" (fake)\n",
- "label_file = RAW_DIR / \"LA\" / \"ASVspoof2019_LA_cm_protocols\" / \"ASVspoof2019.LA.cm.train.trn.txt\"\n",
- "audio_dir = RAW_DIR / \"LA\" / \"ASVspoof2019_LA_train\" / \"flac\"\n",
- "\n",
- "if not label_file.exists():\n",
- "    raise FileNotFoundError(\n",
- "        f\"Protocol file not found at {label_file}. \"\n",
- "        f\"Check that the Zenodo download and extraction succeeded.\"\n",
- "    )\n",
- "\n",
- "real_count = 0\n",
- "fake_count = 0\n",
- "MAX_PER_CLASS = 1000  # cap at 1000 each for Colab speed\n",
- "\n",
- "# Only copy if dataset dirs are empty (skip if already done)\n",
- "existing_real = len(list(Path(\"dataset/real\").glob(\"*.flac\")))\n",
- "existing_fake = len(list(Path(\"dataset/fake\").glob(\"*.flac\")))\n",
- "\n",
- "if existing_real >= MAX_PER_CLASS and existing_fake >= MAX_PER_CLASS:\n",
- "    real_count = existing_real\n",
- "    fake_count = existing_fake\n",
- "    print(f\"✅ Dataset already prepared ({existing_real} real, {existing_fake} fake). Skipping copy.\")\n",
- "else:\n",
- "    print(\"🔄 Copying audio files into dataset/real/ and dataset/fake/...\")\n",
- "    with open(label_file) as f:\n",
- "        for line in f:\n",
- "            parts = line.strip().split()\n",
- "            utt_id = parts[1]\n",
- "            label = parts[4]  # \"bonafide\" or \"spoof\"\n",
- "\n",
- "            src = audio_dir / f\"{utt_id}.flac\"\n",
- "            if not src.exists():\n",
- "                continue\n",
- "\n",
- "            if label == \"bonafide\" and real_count < MAX_PER_CLASS:\n",
- "                shutil.copy(str(src), f\"dataset/real/{utt_id}.flac\")\n",
- "                real_count += 1\n",
- "            elif label == \"spoof\" and fake_count < MAX_PER_CLASS:\n",
- "                shutil.copy(str(src), f\"dataset/fake/{utt_id}.flac\")\n",
- "                fake_count += 1\n",
- "\n",
- "            if real_count >= MAX_PER_CLASS and fake_count >= MAX_PER_CLASS:\n",
- "                break\n",
- "\n",
- "print(\"\\n✅ ASVspoof 2019 LA dataset ready.\")\n",
- "print(f\"   Real (bonafide) : {real_count}\")\n",
- "print(f\"   Fake (spoof)    : {fake_count}\")\n",
- "\n",
- "\n",
- "# ── load_file_list — supports .wav AND .flac ──────────────────────────────\n",
- "def load_file_list(\n",
- "    root: Path,\n",
- "    max_per_class: int = MAX_SAMPLES,\n",
- ") -> pd.DataFrame:\n",
- "    \"\"\"\n",
- "    Build a balanced DataFrame of audio file paths and labels.\n",
- "    Supports .wav, .flac, and .ogg files.\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    DataFrame with columns: [path, label] where label ∈ {0=real, 1=fake}\n",
- "    \"\"\"\n",
- "    rows: List[Dict] = []\n",
- "\n",
- "    for label_name, label_int in [(\"real\", 0), (\"fake\", 1)]:\n",
- "        folder = root / label_name\n",
- "        if not folder.exists():\n",
- "            raise FileNotFoundError(f\"Expected folder not found: {folder}\")\n",
- "\n",
- "        # Collect all common audio formats\n",
- "        files = []\n",
- "        for ext in [\"*.wav\", \"*.flac\", \"*.ogg\"]:\n",
- "            files.extend(folder.glob(ext))\n",
- "        files = sorted(files)\n",
- "\n",
- "        if len(files) == 0:\n",
- "            raise FileNotFoundError(\n",
- "                f\"No audio files (.wav/.flac/.ogg) found in {folder}\"\n",
- "            )\n",
- "\n",
- "        # Shuffle to avoid ordering bias, then cap\n",
- "        random.shuffle(files)\n",
- "        files = files[:max_per_class]\n",
- "\n",
- "        for fp in files:\n",
- "            rows.append({\"path\": str(fp), \"label\": label_int})\n",
- "\n",
- "    df = pd.DataFrame(rows).sample(frac=1, random_state=SEED).reset_index(drop=True)\n",
- "    return df\n",
- "\n",
- "\n",
- "# ── Load the file list ─────────────────────────────────────────────────────\n",
- "df = load_file_list(DATASET_ROOT)\n",
- "\n",
- "print(\"\\n📊 Dataset summary:\")\n",
- "print(df[\"label\"].value_counts().rename({0: \"real\", 1: \"fake\"}).to_string())\n",
- "print(f\"   Total files : {len(df)}\")\n",
- "df.head()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🔊 Cell 5 — Audio Preprocessing"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "def load_and_normalize(\n",
- "    path: str,\n",
- "    target_sr: int = SAMPLE_RATE,\n",
- "    target_len: int = N_SAMPLES,\n",
- ") -> np.ndarray:\n",
- "    \"\"\"\n",
- "    Load an audio file, resample, pad/trim to a fixed length, and normalise.\n",
- "\n",
- "    Parameters\n",
- "    ----------\n",
- "    path       : path to an audio file (.wav/.flac/.ogg)\n",
- "    target_sr  : desired sample rate (default 16 kHz)\n",
- "    target_len : desired number of samples (sr × duration)\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    y : float32 array of shape (target_len,), amplitude in [-1, 1]\n",
- "    \"\"\"\n",
- "    # librosa.load resamples and returns mono float32\n",
- "    y, _ = librosa.load(path, sr=target_sr, mono=True)\n",
- "\n",
- "    # ── Trim or zero-pad to exactly target_len samples ────────────────────\n",
- "    if len(y) >= target_len:\n",
- "        y = y[:target_len]\n",
- "    else:\n",
- "        pad = target_len - len(y)\n",
- "        y = np.pad(y, (0, pad), mode=\"constant\")\n",
- "\n",
- "    # ── Peak normalisation ────────────────────────────────────────────────\n",
- "    peak = np.abs(y).max()\n",
- "    if peak > 1e-9:\n",
- "        y = y / peak\n",
- "\n",
- "    return y.astype(np.float32)\n",
- "\n",
- "\n",
- "def spectral_gate_denoise(\n",
- "    y: np.ndarray,\n",
- "    sr: int = SAMPLE_RATE,\n",
- "    noise_percentile: float = 15.0,\n",
- "    threshold_scale: float = 1.5,\n",
- ") -> np.ndarray:\n",
- "    \"\"\"\n",
- "    Simple spectral-gating denoiser.\n",
- "\n",
- "    Algorithm\n",
- "    ---------\n",
- "    1. Compute the STFT of the signal.\n",
- "    2. Estimate the noise floor per frequency bin as the\n",
- "       `noise_percentile`-th percentile of its magnitudes over time.\n",
- "    3. Build a soft mask: bins above threshold_scale × noise_floor\n",
- "       are kept; bins below are attenuated.\n",
- "    4. Apply the mask and reconstruct via inverse STFT.\n",
- "\n",
- "    Parameters\n",
- "    ----------\n",
- "    y                : input waveform (float32, mono)\n",
- "    sr               : sample rate (kept for API symmetry; unused here)\n",
- "    noise_percentile : percentile used to estimate the noise floor\n",
- "    threshold_scale  : multiplier on the noise floor threshold\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    Denoised waveform (float32), same length as input.\n",
- "    \"\"\"\n",
- "    n_fft = 512\n",
- "    hop = 128\n",
- "\n",
- "    # Forward STFT: shape (n_fft//2+1, n_frames)\n",
- "    stft = librosa.stft(y, n_fft=n_fft, hop_length=hop)\n",
- "    magnitude, phase = np.abs(stft), np.angle(stft)\n",
- "\n",
- "    # Estimate noise profile (per-frequency low percentile over time)\n",
- "    noise_profile = np.percentile(magnitude, noise_percentile, axis=1, keepdims=True)\n",
- "\n",
- "    # Compute soft mask (linear attenuation below the threshold)\n",
- "    threshold = threshold_scale * noise_profile\n",
- "    mask = np.where(magnitude >= threshold, 1.0, magnitude / (threshold + 1e-9))\n",
- "\n",
- "    # Apply mask and reconstruct\n",
- "    denoised_stft = mask * magnitude * np.exp(1j * phase)\n",
- "    y_denoised = librosa.istft(denoised_stft, hop_length=hop, length=len(y))\n",
- "\n",
- "    return y_denoised.astype(np.float32)\n",
- "\n",
- "\n",
- "def preprocess_audio(path: str) -> np.ndarray:\n",
- "    \"\"\"Full preprocessing pipeline: load → normalise → denoise.\"\"\"\n",
- "    y = load_and_normalize(path)\n",
- "    y = spectral_gate_denoise(y)\n",
- "    return y\n",
- "\n",
- "\n",
- "# ── Quick sanity check ────────────────────────────────────────────────────\n",
- "sample_path = df[\"path\"].iloc[0]\n",
- "sample_wave = preprocess_audio(sample_path)\n",
- "\n",
- "print(\"✅ Preprocessing OK.\")\n",
- "print(f\"   Waveform shape : {sample_wave.shape}\")\n",
- "print(f\"   Duration       : {len(sample_wave) / SAMPLE_RATE:.2f} s\")\n",
- "print(f\"   Peak amplitude : {np.abs(sample_wave).max():.4f}\")\n",
- "\n",
- "# Plot preprocessed waveform\n",
- "fig, ax = plt.subplots(figsize=(10, 2))\n",
- "librosa.display.waveshow(sample_wave, sr=SAMPLE_RATE, ax=ax, color=\"steelblue\")\n",
- "ax.set_title(f\"Preprocessed waveform — label={df['label'].iloc[0]} (0=real, 1=fake)\")\n",
- "ax.set_xlabel(\"Time (s)\")\n",
- "plt.tight_layout()\n",
- "plt.show()"
- ]
- },
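The soft mask at the heart of `spectral_gate_denoise` can be exercised in isolation on a synthetic magnitude matrix. A minimal sketch (numpy only; the function name and test values are illustrative, not part of the notebook):

```python
import numpy as np

def spectral_gate_mask(magnitude: np.ndarray,
                       noise_percentile: float = 15.0,
                       threshold_scale: float = 1.5) -> np.ndarray:
    """Soft gate over an STFT magnitude matrix of shape (freq, time).

    Bins at or above threshold_scale * noise_floor pass unchanged;
    quieter bins are attenuated in proportion to how far they fall
    below the threshold, avoiding hard on/off "musical noise".
    """
    noise_floor = np.percentile(magnitude, noise_percentile, axis=1, keepdims=True)
    threshold = threshold_scale * noise_floor
    return np.where(magnitude >= threshold, 1.0, magnitude / (threshold + 1e-9))

rng = np.random.default_rng(0)
mag = rng.uniform(0.0, 1.0, size=(257, 100))  # fake |STFT|: 257 bins x 100 frames
gated = spectral_gate_mask(mag) * mag

assert gated.shape == mag.shape
assert np.all(gated <= mag + 1e-12)  # gating only ever attenuates
```

Because the mask is per-bin and multiplicative, loud bins survive bit-exact while near-floor bins are squashed quadratically (`mag**2 / threshold`).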
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🔬 Cell 6 — Feature Extraction (Log-Mel + Teager Energy Operator)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "def compute_log_mel(\n",
- "    y: np.ndarray,\n",
- "    sr: int = SAMPLE_RATE,\n",
- "    n_mels: int = N_MELS,\n",
- "    n_fft: int = N_FFT,\n",
- "    hop_length: int = HOP_LENGTH,\n",
- "    fmin: float = FMIN,\n",
- "    fmax: float = FMAX,\n",
- ") -> np.ndarray:\n",
- "    \"\"\"\n",
- "    Compute a log-mel spectrogram.\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    log_mel : shape (n_mels, T) — float32\n",
- "    \"\"\"\n",
- "    mel_spec = librosa.feature.melspectrogram(\n",
- "        y          = y,\n",
- "        sr         = sr,\n",
- "        n_mels     = n_mels,\n",
- "        n_fft      = n_fft,\n",
- "        hop_length = hop_length,\n",
- "        fmin       = fmin,\n",
- "        fmax       = fmax,\n",
- "    )  # shape: (n_mels, T) — power spectrogram\n",
- "\n",
- "    # Convert to log scale (decibels); the default top_db=80 floors the range at -80 dB\n",
- "    log_mel = librosa.power_to_db(mel_spec, ref=np.max)\n",
- "    return log_mel.astype(np.float32)\n",
- "\n",
- "\n",
- "def compute_teager_energy(\n",
- "    y: np.ndarray,\n",
- "    sr: int = SAMPLE_RATE,\n",
- "    hop_length: int = HOP_LENGTH,\n",
- "    n_fft: int = N_FFT,\n",
- ") -> np.ndarray:\n",
- "    \"\"\"\n",
- "    Compute the frame-level Teager Energy Operator (TEO).\n",
- "\n",
- "    The discrete TEO is defined as:\n",
- "        Ψ[x(n)] = x(n)^2 − x(n−1) · x(n+1)\n",
- "\n",
- "    This captures instantaneous energy and is sensitive to\n",
- "    unnatural modulation artefacts introduced by vocoders.\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    teo_frames : shape (1, T) — frame-level mean TEO — float32\n",
- "    \"\"\"\n",
- "    # Compute per-sample TEO (boundary samples use edge padding)\n",
- "    y_pad = np.pad(y, 1, mode=\"edge\")                  # length N+2\n",
- "    teo_raw = y_pad[1:-1]**2 - y_pad[:-2] * y_pad[2:]  # length N\n",
- "    teo_raw = np.abs(teo_raw)                          # take absolute value\n",
- "\n",
- "    # Frame the TEO signal to (roughly) match the mel spectrogram time axis\n",
- "    # Using librosa.util.frame for consistent framing\n",
- "    frames = librosa.util.frame(\n",
- "        teo_raw,\n",
- "        frame_length = n_fft,\n",
- "        hop_length   = hop_length,\n",
- "    )  # shape: (n_fft, T)\n",
- "\n",
- "    # Collapse to a single row per frame: mean TEO energy\n",
- "    teo_frames = frames.mean(axis=0, keepdims=True)  # shape: (1, T)\n",
- "    return np.log1p(teo_frames).astype(np.float32)   # log-compress\n",
- "\n",
- "\n",
- "def extract_features(y: np.ndarray) -> np.ndarray:\n",
- "    \"\"\"\n",
- "    Combined feature extraction: log-mel + TEO.\n",
- "\n",
- "    Steps\n",
- "    -----\n",
- "    1. Compute 40-band log-mel spectrogram → shape (40, T)\n",
- "    2. Compute frame-level TEO → shape (1, T)\n",
- "    3. Align T across both via min-trimming.\n",
- "    4. Concatenate along the feature axis → shape (41, T)\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    feature_matrix : np.ndarray, shape (41, T) — float32\n",
- "    \"\"\"\n",
- "    log_mel = compute_log_mel(y)        # (40, T_mel)\n",
- "    teo     = compute_teager_energy(y)  # (1, T_teo)\n",
- "\n",
- "    # Align time dimensions (melspectrogram center-pads, so T_mel runs a few frames longer)\n",
- "    T = min(log_mel.shape[1], teo.shape[1])\n",
- "    log_mel = log_mel[:, :T]\n",
- "    teo     = teo[:, :T]\n",
- "\n",
- "    return np.concatenate([log_mel, teo], axis=0)  # (41, T)\n",
- "\n",
- "\n",
- "# ── Verify feature extraction on the sample ────────────────────────────────\n",
- "feat = extract_features(sample_wave)\n",
- "print(f\"✅ Feature matrix shape: {feat.shape} (features × time_frames)\")\n",
- "\n",
- "# Visualise features\n",
- "fig, axes = plt.subplots(1, 2, figsize=(14, 4))\n",
- "\n",
- "# Log-mel panel\n",
- "img = librosa.display.specshow(\n",
- "    feat[:40],\n",
- "    sr=SAMPLE_RATE,\n",
- "    hop_length=HOP_LENGTH,\n",
- "    x_axis=\"time\",\n",
- "    y_axis=\"mel\",\n",
- "    ax=axes[0],\n",
- "    cmap=\"magma\",\n",
- ")\n",
- "axes[0].set_title(\"40-band Log-Mel Spectrogram\")\n",
- "fig.colorbar(img, ax=axes[0], format=\"%+2.0f dB\")\n",
- "\n",
- "# TEO panel\n",
- "axes[1].plot(feat[40], color=\"darkorange\", lw=0.8)\n",
- "axes[1].set_title(\"Teager Energy Operator (frame-level)\")\n",
- "axes[1].set_xlabel(\"Frame index\")\n",
- "axes[1].set_ylabel(\"log(1 + TEO)\")\n",
- "axes[1].grid(True, alpha=0.3)\n",
- "\n",
- "plt.tight_layout()\n",
- "plt.show()"
- ]
- },
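One way to see why the operator in `compute_teager_energy` is a useful energy measure: for a pure tone A·sin(ωn), Ψ[x](n) = A²·sin²(ω) at every interior sample, so any deviation from constancy signals amplitude or frequency modulation. A small self-contained check (numpy only; names here are illustrative):

```python
import numpy as np

def teager_energy(x: np.ndarray) -> np.ndarray:
    """Discrete Teager Energy Operator: Psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    xp = np.pad(x, 1, mode="edge")  # edge-pad so output length matches input
    return np.abs(xp[1:-1] ** 2 - xp[:-2] * xp[2:])

A, w = 0.5, 0.3
x = A * np.sin(w * np.arange(1000))
teo = teager_energy(x)[1:-1]  # drop the two edge-padded boundary samples

# For a pure sinusoid, sin^2(wn) - sin(w(n-1)) * sin(w(n+1)) = sin^2(w),
# so the TEO is the constant A^2 * sin^2(w) at every interior sample.
expected = (A ** 2) * np.sin(w) ** 2
assert np.allclose(teo, expected)
```

The identity sin(a−b)·sin(a+b) = sin²(a) − sin²(b) gives the constant value directly, which is why the TEO row in the 41 × T feature matrix is flat for clean tones and fluctuates under vocoder-style modulation.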
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🧠 Cell 7 — ECAPA-TDNN Architecture"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "class SEBlock(nn.Module):\n",
- "    \"\"\"\n",
- "    Squeeze-and-Excitation (SE) channel attention block.\n",
- "\n",
- "    Adaptively re-weights each channel by learning global statistics.\n",
- "    Introduced in 'Squeeze-and-Excitation Networks' (Hu et al., 2018).\n",
- "    \"\"\"\n",
- "\n",
- "    def __init__(self, channels: int, bottleneck: int = 128):\n",
- "        super().__init__()\n",
- "        self.squeeze = nn.AdaptiveAvgPool1d(1)  # global average pool\n",
- "        self.excite = nn.Sequential(\n",
- "            nn.Linear(channels, bottleneck),\n",
- "            nn.ReLU(inplace=True),\n",
- "            nn.Linear(bottleneck, channels),\n",
- "            nn.Sigmoid(),\n",
- "        )\n",
- "\n",
- "    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
- "        # x: (B, C, T)\n",
- "        s = self.squeeze(x).squeeze(-1)   # (B, C)\n",
- "        e = self.excite(s).unsqueeze(-1)  # (B, C, 1)\n",
- "        return x * e                      # channel-wise scaling\n",
- "\n",
- "\n",
- "class TDNNBlock(nn.Module):\n",
- "    \"\"\"\n",
- "    TDNN block with a dilated 1-D convolution + SE attention.\n",
- "\n",
- "    Each TDNN block:\n",
- "    1. Applies a dilated 1-D conv (captures long-range context).\n",
- "    2. Applies BatchNorm + ReLU.\n",
- "    3. Applies channel attention via the SE block.\n",
- "    4. Adds a residual connection (1×1 projection if widths differ).\n",
- "    \"\"\"\n",
- "\n",
- "    def __init__(\n",
- "        self,\n",
- "        in_channels: int,\n",
- "        out_channels: int,\n",
- "        kernel_size: int = 3,\n",
- "        dilation: int = 1,\n",
- "    ):\n",
- "        super().__init__()\n",
- "        self.conv = nn.Conv1d(\n",
- "            in_channels,\n",
- "            out_channels,\n",
- "            kernel_size = kernel_size,\n",
- "            dilation    = dilation,\n",
- "            padding     = (kernel_size - 1) * dilation // 2,  # same padding (odd kernels)\n",
- "        )\n",
- "        self.bn = nn.BatchNorm1d(out_channels)\n",
- "        self.act = nn.ReLU(inplace=True)\n",
- "        self.se = SEBlock(out_channels)\n",
- "\n",
- "        # Residual projection if channel dims differ\n",
- "        self.res_proj = (\n",
- "            nn.Conv1d(in_channels, out_channels, kernel_size=1)\n",
- "            if in_channels != out_channels\n",
- "            else nn.Identity()\n",
- "        )\n",
- "\n",
- "    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
- "        residual = self.res_proj(x)\n",
- "        out = self.act(self.bn(self.conv(x)))\n",
- "        out = self.se(out)\n",
- "        return out + residual\n",
- "\n",
- "\n",
- "class AttentiveStatPooling(nn.Module):\n",
- "    \"\"\"\n",
- "    Attentive statistics pooling (temporal aggregation).\n",
- "\n",
- "    Learns a soft alignment over time frames and computes\n",
- "    the weighted mean and standard deviation, producing a\n",
- "    fixed-length utterance-level representation.\n",
- "    \"\"\"\n",
- "\n",
- "    def __init__(self, in_channels: int, attention_hidden: int = 128):\n",
- "        super().__init__()\n",
- "        self.attention = nn.Sequential(\n",
- "            nn.Conv1d(in_channels, attention_hidden, kernel_size=1),\n",
- "            nn.Tanh(),\n",
- "            nn.Conv1d(attention_hidden, in_channels, kernel_size=1),\n",
- "            nn.Softmax(dim=-1),  # softmax over the time axis\n",
- "        )\n",
- "\n",
- "    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
- "        # x: (B, C, T)\n",
- "        w = self.attention(x)                                # (B, C, T) — attention weights\n",
- "        mean = (w * x).sum(dim=-1)                           # (B, C) — weighted mean\n",
- "        var = (w * (x - mean.unsqueeze(-1))**2).sum(dim=-1)  # (B, C)\n",
- "        std = torch.sqrt(var + 1e-8)                         # (B, C)\n",
- "        return torch.cat([mean, std], dim=1)                 # (B, 2C)\n",
- "\n",
- "\n",
- "class ECAPATDNN(nn.Module):\n",
- "    \"\"\"\n",
- "    Simplified ECAPA-TDNN speaker/spoof embedding model.\n",
- "\n",
- "    Input  : feature matrix of shape (B, n_features, T)\n",
- "             where n_features = 41 (40 log-mel + 1 TEO)\n",
- "    Output : (B, 2) logits for binary classification.\n",
- "             Embeddings can be extracted from the penultimate FC layer.\n",
- "\n",
- "    Architecture\n",
- "    ------------\n",
- "    Input conv → TDNN × 3 (dilations 1, 2, 3)\n",
- "    → concatenation of multi-scale features\n",
- "    → 1×1 aggregation conv\n",
- "    → attentive statistics pooling\n",
- "    → FC → BN → ReLU (embedding layer, 192-dim)\n",
- "    → linear classifier (2 classes)\n",
- "    \"\"\"\n",
- "\n",
- "    def __init__(\n",
- "        self,\n",
- "        in_channels: int = 41,\n",
- "        channels: int = CHANNELS,\n",
- "        emb_dim: int = EMBEDDING_DIM,\n",
- "    ):\n",
- "        super().__init__()\n",
- "\n",
- "        # ── Entry convolution ───────────────────────────────────────────\n",
- "        self.input_conv = nn.Sequential(\n",
- "            nn.Conv1d(in_channels, channels, kernel_size=5, padding=2),\n",
854
- " nn.BatchNorm1d(channels),\n",
855
- " nn.ReLU(inplace=True),\n",
856
- " )\n",
857
- "\n",
858
- " # ── Multi-scale TDNN blocks ─────────────────────────────────────\n",
859
- " # Three blocks with increasing dilation to model different\n",
860
- " # temporal receptive fields simultaneously.\n",
861
- " self.tdnn1 = TDNNBlock(channels, channels, kernel_size=3, dilation=1)\n",
862
- " self.tdnn2 = TDNNBlock(channels, channels, kernel_size=3, dilation=2)\n",
863
- " self.tdnn3 = TDNNBlock(channels, channels, kernel_size=3, dilation=3)\n",
864
- "\n",
865
- " # ── Multi-scale aggregation ─────────────────────────────────────\n",
866
- " # Concatenate outputs from all three TDNN blocks β†’ 3Γ—channels,\n",
867
- " # then compress back to `channels` with a 1Γ—1 conv.\n",
868
- " self.agg_conv = nn.Sequential(\n",
869
- " nn.Conv1d(channels * 3, channels, kernel_size=1),\n",
870
- " nn.BatchNorm1d(channels),\n",
871
- " nn.ReLU(inplace=True),\n",
872
- " )\n",
873
- "\n",
874
- " # ── Temporal pooling ────────────────────────────────────────────\n",
875
- " self.pool = AttentiveStatPooling(channels)\n",
876
- " # After pooling: mean + std concatenated β†’ 2 Γ— channels\n",
877
- "\n",
878
- " # ── Embedding FC ────────────────────────────────────────────────\n",
879
- " self.emb_fc = nn.Sequential(\n",
880
- " nn.Linear(channels * 2, emb_dim),\n",
881
- " nn.BatchNorm1d(emb_dim),\n",
882
- " nn.ReLU(inplace=True),\n",
883
- " )\n",
884
- "\n",
885
- " # ── Binary classifier ───────────────────────────────────────────\n",
886
- " self.classifier = nn.Linear(emb_dim, 2)\n",
887
- "\n",
888
- " self._init_weights()\n",
889
- "\n",
890
- " def _init_weights(self):\n",
891
- " \"\"\"Xavier initialisation for all Conv1d and Linear layers.\"\"\"\n",
892
- " for m in self.modules():\n",
893
- " if isinstance(m, (nn.Conv1d, nn.Linear)):\n",
894
- " nn.init.xavier_uniform_(m.weight)\n",
895
- " if m.bias is not None:\n",
896
- " nn.init.zeros_(m.bias)\n",
897
- "\n",
898
- " def embed(self, x: torch.Tensor) -> torch.Tensor:\n",
899
- " \"\"\"\n",
900
- " Extract 192-dim embedding (used post-training for XGBoost input).\n",
901
- "\n",
902
- " Parameters\n",
903
- " ----------\n",
904
- " x : (B, in_channels, T)\n",
905
- "\n",
906
- " Returns\n",
907
- " -------\n",
908
- " emb : (B, emb_dim)\n",
909
- " \"\"\"\n",
910
- " x = self.input_conv(x)\n",
911
- " t1 = self.tdnn1(x)\n",
912
- " t2 = self.tdnn2(x)\n",
913
- " t3 = self.tdnn3(x)\n",
914
- " x = self.agg_conv(torch.cat([t1, t2, t3], dim=1))\n",
915
- " x = self.pool(x)\n",
916
- " return self.emb_fc(x)\n",
917
- "\n",
918
- " def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
919
- " \"\"\"Full forward pass returning classification logits.\"\"\"\n",
920
- " return self.classifier(self.embed(x))\n",
921
- "\n",
922
- "\n",
923
- "# ── Instantiate and profile the model ────────────────────────────────────\n",
924
- "model = ECAPATDNN().to(DEVICE)\n",
925
- "\n",
926
- "# Count trainable parameters\n",
927
- "n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
928
- "print(f\"βœ… ECAPA-TDNN instantiated.\")\n",
929
- "print(f\" Trainable parameters : {n_params:,}\")\n",
930
- "\n",
931
- "# Sanity-check a forward pass\n",
932
- "T_test = feat.shape[1]\n",
933
- "dummy = torch.randn(4, 41, T_test).to(DEVICE)\n",
934
- "logits = model(dummy)\n",
935
- "emb = model.embed(dummy)\n",
936
- "print(f\" Logit shape : {logits.shape} (expected [4, 2])\")\n",
937
- "print(f\" Embedding shape : {emb.shape} (expected [4, {EMBEDDING_DIM}])\")"
938
- ]
939
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 📦 Cell 8 — PyTorch Dataset & DataLoader"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "class AudioDataset(Dataset):\n",
- "    \"\"\"\n",
- "    PyTorch Dataset for audio deepfake detection.\n",
- "\n",
- "    Each __getitem__ call:\n",
- "      1. Loads and preprocesses the WAV file (load → normalise → denoise).\n",
- "      2. Extracts the feature matrix (log-mel + TEO).\n",
- "      3. Returns (feature_tensor, label).\n",
- "\n",
- "    Parameters\n",
- "    ----------\n",
- "    df      : DataFrame with columns [path, label]\n",
- "    fixed_T : fixed number of time frames (pad/trim feature matrix)\n",
- "    \"\"\"\n",
- "\n",
- "    def __init__(self, df: pd.DataFrame, fixed_T: Optional[int] = None):\n",
- "        self.paths = df[\"path\"].tolist()\n",
- "        self.labels = df[\"label\"].tolist()\n",
- "        self.fixed_T = fixed_T\n",
- "\n",
- "    def __len__(self) -> int:\n",
- "        return len(self.paths)\n",
- "\n",
- "    def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.Tensor]:\n",
- "        y = preprocess_audio(self.paths[idx])\n",
- "        feat = extract_features(y)  # (41, T)\n",
- "\n",
- "        # Align time dimension across all samples in the batch\n",
- "        if self.fixed_T is not None:\n",
- "            T = feat.shape[1]\n",
- "            if T >= self.fixed_T:\n",
- "                feat = feat[:, :self.fixed_T]\n",
- "            else:\n",
- "                feat = np.pad(feat, ((0, 0), (0, self.fixed_T - T)), mode=\"constant\")\n",
- "\n",
- "        x = torch.tensor(feat, dtype=torch.float32)           # (41, T)\n",
- "        y = torch.tensor(self.labels[idx], dtype=torch.long)  # scalar\n",
- "        return x, y\n",
- "\n",
- "\n",
- "# ── Determine fixed T from the first sample ─────────────────────────────\n",
- "sample_feat = extract_features(preprocess_audio(df[\"path\"].iloc[0]))\n",
- "FIXED_T = sample_feat.shape[1]\n",
- "print(f\"✅ Fixed time frames per sample: {FIXED_T}\")\n",
- "\n",
- "# ── Train / validation split (80 / 20) ──────────────────────────────────\n",
- "df_train, df_val = train_test_split(\n",
- "    df,\n",
- "    test_size = 0.20,\n",
- "    stratify = df[\"label\"],\n",
- "    random_state = SEED,\n",
- ")\n",
- "\n",
- "print(f\"   Train samples : {len(df_train)}\")\n",
- "print(f\"   Val samples   : {len(df_val)}\")\n",
- "\n",
- "# ── Build datasets and loaders ──────────────────────────────────────────\n",
- "train_ds = AudioDataset(df_train, fixed_T=FIXED_T)\n",
- "val_ds = AudioDataset(df_val, fixed_T=FIXED_T)\n",
- "\n",
- "train_loader = DataLoader(\n",
- "    train_ds,\n",
- "    batch_size = ECAPA_BATCH,\n",
- "    shuffle = True,\n",
- "    num_workers = 0,  # 0 avoids multiprocessing issues in hosted notebooks (Colab/Kaggle)\n",
- "    pin_memory = DEVICE.type == \"cuda\",\n",
- ")\n",
- "val_loader = DataLoader(\n",
- "    val_ds,\n",
- "    batch_size = ECAPA_BATCH,\n",
- "    shuffle = False,\n",
- "    num_workers = 0,\n",
- "    pin_memory = DEVICE.type == \"cuda\",\n",
- ")\n",
- "\n",
- "print(f\"\\n   Train batches : {len(train_loader)}\")\n",
- "print(f\"   Val batches   : {len(val_loader)}\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🏋️ Cell 9 — Train ECAPA-TDNN"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "def train_one_epoch(\n",
- "    model: nn.Module,\n",
- "    loader: DataLoader,\n",
- "    optimizer: torch.optim.Optimizer,\n",
- "    criterion: nn.Module,\n",
- ") -> float:\n",
- "    \"\"\"\n",
- "    Run one training epoch.\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    avg_loss : mean cross-entropy loss over all batches\n",
- "    \"\"\"\n",
- "    model.train()\n",
- "    total_loss = 0.0\n",
- "\n",
- "    for x, y in loader:\n",
- "        x, y = x.to(DEVICE), y.to(DEVICE)\n",
- "\n",
- "        optimizer.zero_grad()\n",
- "        logits = model(x)  # (B, 2)\n",
- "        loss = criterion(logits, y)\n",
- "        loss.backward()\n",
- "        optimizer.step()\n",
- "\n",
- "        total_loss += loss.item() * len(y)\n",
- "\n",
- "    return total_loss / len(loader.dataset)\n",
- "\n",
- "\n",
- "@torch.no_grad()\n",
- "def evaluate(\n",
- "    model: nn.Module,\n",
- "    loader: DataLoader,\n",
- "    criterion: nn.Module,\n",
- ") -> Tuple[float, float]:\n",
- "    \"\"\"\n",
- "    Evaluate model on a DataLoader.\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    avg_loss : float\n",
- "    accuracy : float (fraction correct)\n",
- "    \"\"\"\n",
- "    model.eval()\n",
- "    total_loss = 0.0\n",
- "    correct = 0\n",
- "\n",
- "    for x, y in loader:\n",
- "        x, y = x.to(DEVICE), y.to(DEVICE)\n",
- "        logits = model(x)\n",
- "        loss = criterion(logits, y)\n",
- "\n",
- "        total_loss += loss.item() * len(y)\n",
- "        preds = logits.argmax(dim=1)\n",
- "        correct += (preds == y).sum().item()\n",
- "\n",
- "    avg_loss = total_loss / len(loader.dataset)\n",
- "    accuracy = correct / len(loader.dataset)\n",
- "    return avg_loss, accuracy\n",
- "\n",
- "\n",
- "# ── Optimiser, scheduler, loss ───────────────────────────────────────────\n",
- "optimizer = torch.optim.AdamW(\n",
- "    model.parameters(),\n",
- "    lr = ECAPA_LR,\n",
- "    weight_decay = 1e-4,\n",
- ")\n",
- "scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(\n",
- "    optimizer, T_max=ECAPA_EPOCHS, eta_min=1e-5\n",
- ")\n",
- "criterion = nn.CrossEntropyLoss()  # binary CE via 2-class softmax\n",
- "\n",
- "# ── Training loop ────────────────────────────────────────────────────────\n",
- "history = {\"train_loss\": [], \"val_loss\": [], \"val_acc\": []}\n",
- "\n",
- "best_val_loss = float(\"inf\")\n",
- "best_weights = None\n",
- "\n",
- "print(f\"🚀 Training ECAPA-TDNN for {ECAPA_EPOCHS} epochs on {DEVICE}...\\n\")\n",
- "start_time = time.time()\n",
- "\n",
- "for epoch in range(1, ECAPA_EPOCHS + 1):\n",
- "    t_loss = train_one_epoch(model, train_loader, optimizer, criterion)\n",
- "    v_loss, v_acc = evaluate(model, val_loader, criterion)\n",
- "    scheduler.step()\n",
- "\n",
- "    history[\"train_loss\"].append(t_loss)\n",
- "    history[\"val_loss\"].append(v_loss)\n",
- "    history[\"val_acc\"].append(v_acc)\n",
- "\n",
- "    # Save best checkpoint (by validation loss)\n",
- "    if v_loss < best_val_loss:\n",
- "        best_val_loss = v_loss\n",
- "        best_weights = {k: v.cpu().clone() for k, v in model.state_dict().items()}\n",
- "\n",
- "    print(\n",
- "        f\"   Epoch {epoch:03d}/{ECAPA_EPOCHS:03d}  \"\n",
- "        f\"train_loss={t_loss:.4f}  \"\n",
- "        f\"val_loss={v_loss:.4f}  \"\n",
- "        f\"val_acc={v_acc*100:.2f}%\"\n",
- "    )\n",
- "\n",
- "elapsed = time.time() - start_time\n",
- "print(f\"\\n✅ Training complete in {elapsed:.1f}s. Best val loss: {best_val_loss:.4f}\")\n",
- "\n",
- "# Restore best weights\n",
- "model.load_state_dict(best_weights)\n",
- "\n",
- "# ── Plot training curves ─────────────────────────────────────────────────\n",
- "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(13, 4))\n",
- "\n",
- "ax1.plot(history[\"train_loss\"], label=\"Train\", color=\"steelblue\")\n",
- "ax1.plot(history[\"val_loss\"], label=\"Val\", color=\"tomato\")\n",
- "ax1.set_title(\"Cross-Entropy Loss\")\n",
- "ax1.set_xlabel(\"Epoch\")\n",
- "ax1.set_ylabel(\"Loss\")\n",
- "ax1.legend()\n",
- "ax1.grid(True, alpha=0.3)\n",
- "\n",
- "ax2.plot(np.array(history[\"val_acc\"]) * 100, color=\"seagreen\", label=\"Val Accuracy\")\n",
- "ax2.set_title(\"Validation Accuracy\")\n",
- "ax2.set_xlabel(\"Epoch\")\n",
- "ax2.set_ylabel(\"Accuracy (%)\")\n",
- "ax2.legend()\n",
- "ax2.grid(True, alpha=0.3)\n",
- "\n",
- "plt.suptitle(\"ECAPA-TDNN Training Curves\", fontsize=13, fontweight=\"bold\")\n",
- "plt.tight_layout()\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🔒 Cell 10 — Extract 192-dim Embeddings"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "@torch.no_grad()\n",
- "def extract_embeddings(\n",
- "    model: nn.Module,\n",
- "    loader: DataLoader,\n",
- ") -> Tuple[np.ndarray, np.ndarray]:\n",
- "    \"\"\"\n",
- "    Pass all samples through the trained ECAPA-TDNN to obtain\n",
- "    192-dimensional embeddings.\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    embeddings : np.ndarray, shape (N, 192)\n",
- "    labels     : np.ndarray, shape (N,)\n",
- "    \"\"\"\n",
- "    model.eval()\n",
- "    all_embs = []\n",
- "    all_labels = []\n",
- "\n",
- "    for x, y in tqdm(loader, desc=\"Extracting embeddings\", leave=False):\n",
- "        x = x.to(DEVICE)\n",
- "        emb = model.embed(x)  # (B, 192)\n",
- "        all_embs.append(emb.cpu().numpy())\n",
- "        all_labels.append(y.numpy())\n",
- "\n",
- "    embeddings = np.vstack(all_embs)        # (N, 192)\n",
- "    labels = np.concatenate(all_labels)     # (N,)\n",
- "    return embeddings, labels\n",
- "\n",
- "\n",
- "# Build a single DataLoader covering the full dataset (no shuffling).\n",
- "# We will split embeddings later into train/test for XGBoost.\n",
- "# Note: that later split is independent of the ECAPA train/val split, so\n",
- "# some XGBoost test samples were already seen during ECAPA training; the\n",
- "# reported metrics are therefore optimistic. A stricter protocol reuses\n",
- "# the df_train/df_val partition for the classifier as well.\n",
- "full_ds = AudioDataset(df, fixed_T=FIXED_T)\n",
- "full_loader = DataLoader(\n",
- "    full_ds,\n",
- "    batch_size = ECAPA_BATCH,\n",
- "    shuffle = False,\n",
- "    num_workers = 0,\n",
- ")\n",
- "\n",
- "print(\"🔄 Extracting embeddings for all samples...\")\n",
- "embeddings, labels = extract_embeddings(model, full_loader)\n",
- "\n",
- "print(f\"✅ Embedding matrix shape : {embeddings.shape}\")\n",
- "print(f\"   Label array shape     : {labels.shape}\")\n",
- "print(f\"   Class balance — real  : {(labels==0).sum()}\")\n",
- "print(f\"   Class balance — fake  : {(labels==1).sum()}\")\n",
- "\n",
- "# ── t-SNE visualisation of embeddings ────────────────────────────────────\n",
- "from sklearn.manifold import TSNE\n",
- "\n",
- "print(\"\\n🔄 Running t-SNE (may take ~30 s)...\")\n",
- "# `n_iter` was renamed `max_iter` in scikit-learn 1.5; use `n_iter=500` on older versions\n",
- "tsne = TSNE(n_components=2, random_state=SEED, perplexity=30, max_iter=500)\n",
- "emb_2d = tsne.fit_transform(embeddings)\n",
- "\n",
- "fig, ax = plt.subplots(figsize=(8, 6))\n",
- "colours = [\"steelblue\", \"tomato\"]\n",
- "for c, label_name in enumerate([\"Real\", \"Fake\"]):\n",
- "    mask = labels == c\n",
- "    ax.scatter(\n",
- "        emb_2d[mask, 0], emb_2d[mask, 1],\n",
- "        c=colours[c], label=label_name, alpha=0.55, s=18,\n",
- "    )\n",
- "ax.set_title(\"t-SNE of 192-dim ECAPA-TDNN Embeddings\")\n",
- "ax.set_xlabel(\"t-SNE dim 1\")\n",
- "ax.set_ylabel(\"t-SNE dim 2\")\n",
- "ax.legend()\n",
- "ax.grid(True, alpha=0.3)\n",
- "plt.tight_layout()\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🌲 Cell 11 — XGBoost Classifier"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# ── Train / test split on embeddings ─────────────────────────────────────\n",
- "X_train, X_test, y_train, y_test = train_test_split(\n",
- "    embeddings,\n",
- "    labels,\n",
- "    test_size = 0.20,\n",
- "    stratify = labels,\n",
- "    random_state = SEED,\n",
- ")\n",
- "\n",
- "# ── Standardise embeddings (mean=0, std=1) ────────────────────────────────\n",
- "# XGBoost is tree-based and therefore scale-invariant, so scaling is not\n",
- "# strictly required; we keep it so the same fitted scaler can be applied\n",
- "# consistently inside the inference function.\n",
- "scaler = StandardScaler()\n",
- "X_train = scaler.fit_transform(X_train)\n",
- "X_test = scaler.transform(X_test)\n",
- "\n",
- "print(f\"   X_train shape : {X_train.shape}\")\n",
- "print(f\"   X_test shape  : {X_test.shape}\")\n",
- "\n",
- "# ── Train XGBoost ─────────────────────────────────────────────────────────\n",
- "xgb_clf = xgb.XGBClassifier(**XGB_PARAMS)\n",
- "\n",
- "print(\"\\n🚀 Training XGBoost...\")\n",
- "xgb_clf.fit(\n",
- "    X_train, y_train,\n",
- "    eval_set = [(X_test, y_test)],\n",
- "    verbose = 50,  # print the eval metric every 50 boosting rounds\n",
- ")\n",
- "\n",
- "print(\"\\n✅ XGBoost training complete.\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 📊 Cell 12 — Evaluation Metrics"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# ── Predictions ───────────────────────────────────────────────────────────\n",
- "y_pred = xgb_clf.predict(X_test)\n",
- "y_prob = xgb_clf.predict_proba(X_test)[:, 1]  # probability of FAKE\n",
- "\n",
- "# ── Core metrics ──────────────────────────────────────────────────────────\n",
- "acc = accuracy_score(y_test, y_pred)\n",
- "f1 = f1_score(y_test, y_pred)\n",
- "roc_auc = roc_auc_score(y_test, y_prob)\n",
- "cm = confusion_matrix(y_test, y_pred)\n",
- "\n",
- "print(\"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\")\n",
- "print(\"📈 Evaluation Results\")\n",
- "print(\"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\")\n",
- "print(f\"   Accuracy : {acc*100:.2f}%\")\n",
- "print(f\"   F1 Score : {f1:.4f}\")\n",
- "print(f\"   ROC-AUC  : {roc_auc:.4f}\")\n",
- "print(\"━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\")\n",
- "\n",
- "# ── Figure layout: confusion matrix + ROC + feature importance ────────────\n",
- "fig = plt.figure(figsize=(17, 5))\n",
- "gs = gridspec.GridSpec(1, 3, figure=fig)\n",
- "\n",
- "# --- Panel 1: Confusion Matrix -------------------------------------------\n",
- "ax1 = fig.add_subplot(gs[0])\n",
- "disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[\"Real\", \"Fake\"])\n",
- "disp.plot(ax=ax1, colorbar=False, cmap=\"Blues\")\n",
- "ax1.set_title(\"Confusion Matrix\", fontweight=\"bold\")\n",
- "\n",
- "# --- Panel 2: ROC Curve --------------------------------------------------\n",
- "ax2 = fig.add_subplot(gs[1])\n",
- "fpr, tpr, _ = roc_curve(y_test, y_prob)\n",
- "ax2.plot(fpr, tpr, color=\"tomato\", lw=2, label=f\"AUC = {roc_auc:.3f}\")\n",
- "ax2.plot([0, 1], [0, 1], \"k--\", lw=1, alpha=0.5)\n",
- "ax2.set_title(\"ROC Curve\", fontweight=\"bold\")\n",
- "ax2.set_xlabel(\"False Positive Rate\")\n",
- "ax2.set_ylabel(\"True Positive Rate\")\n",
- "ax2.legend(loc=\"lower right\")\n",
- "ax2.grid(True, alpha=0.3)\n",
- "\n",
- "# --- Panel 3: Top-20 XGBoost Feature Importances -------------------------\n",
- "ax3 = fig.add_subplot(gs[2])\n",
- "importances = xgb_clf.feature_importances_      # shape: (192,)\n",
- "top20_idx = np.argsort(importances)[::-1][:20]  # top-20 by importance\n",
- "top20_imp = importances[top20_idx]\n",
- "\n",
- "colors = plt.cm.viridis(np.linspace(0.2, 0.85, 20))\n",
- "ax3.barh(\n",
- "    [f\"dim {i}\" for i in top20_idx],\n",
- "    top20_imp,\n",
- "    color=colors,\n",
- ")\n",
- "ax3.invert_yaxis()\n",
- "ax3.set_title(\"Top-20 XGBoost Feature Importances\", fontweight=\"bold\")\n",
- "ax3.set_xlabel(\"Importance (gain)\")\n",
- "ax3.grid(True, axis=\"x\", alpha=0.3)\n",
- "\n",
- "plt.suptitle(\n",
- "    f\"Deepfake Audio Detection — Acc={acc*100:.1f}%  F1={f1:.3f}  AUC={roc_auc:.3f}\",\n",
- "    fontsize=13,\n",
- "    fontweight=\"bold\",\n",
- ")\n",
- "plt.tight_layout()\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 🔍 Cell 13 — Inference Function"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "@torch.no_grad()\n",
- "def detect_deepfake(\n",
- "    audio_path: str,\n",
- "    ecapa_model: nn.Module = model,\n",
- "    xgb_model: xgb.XGBClassifier = xgb_clf,\n",
- "    feat_scaler: StandardScaler = scaler,\n",
- "    fixed_T: int = FIXED_T,\n",
- "    device: torch.device = DEVICE,\n",
- ") -> Dict[str, object]:\n",
- "    \"\"\"\n",
- "    End-to-end deepfake audio detection for a single WAV file.\n",
- "\n",
- "    Pipeline\n",
- "    --------\n",
- "    WAV → preprocess → log-mel+TEO features → ECAPA-TDNN embedding\n",
- "        → StandardScaler → XGBoost → REAL / FAKE\n",
- "\n",
- "    Parameters\n",
- "    ----------\n",
- "    audio_path  : path to input WAV file\n",
- "    ecapa_model : trained ECAPA-TDNN (default: module-level `model`)\n",
- "    xgb_model   : trained XGBoost (default: module-level `xgb_clf`)\n",
- "    feat_scaler : fitted StandardScaler (default: module-level `scaler`)\n",
- "    fixed_T     : fixed frame count used during training\n",
- "    device      : torch device\n",
- "\n",
- "    Returns\n",
- "    -------\n",
- "    dict with keys:\n",
- "      label      : 'REAL' or 'FAKE'\n",
- "      confidence : float in [0, 1] — probability of the predicted class\n",
- "      fake_prob  : float in [0, 1] — raw probability of being FAKE\n",
- "    \"\"\"\n",
- "    # ── Step 1: Preprocess ───────────────────────────────────────────────\n",
- "    y = preprocess_audio(audio_path)\n",
- "\n",
- "    # ── Step 2: Feature extraction ───────────────────────────────────────\n",
- "    feat = extract_features(y)  # (41, T_raw)\n",
- "\n",
- "    # Align to fixed_T (pad or trim)\n",
- "    T = feat.shape[1]\n",
- "    if T >= fixed_T:\n",
- "        feat = feat[:, :fixed_T]\n",
- "    else:\n",
- "        feat = np.pad(feat, ((0, 0), (0, fixed_T - T)), mode=\"constant\")\n",
- "\n",
- "    # ── Step 3: ECAPA-TDNN embedding ─────────────────────────────────────\n",
- "    x_tensor = torch.tensor(feat, dtype=torch.float32).unsqueeze(0).to(device)\n",
- "    ecapa_model.eval()\n",
- "    emb = ecapa_model.embed(x_tensor).cpu().numpy()  # (1, 192)\n",
- "\n",
- "    # ── Step 4: Normalise embedding ──────────────────────────────────────\n",
- "    emb_scaled = feat_scaler.transform(emb)  # (1, 192)\n",
- "\n",
- "    # ── Step 5: XGBoost prediction ───────────────────────────────────────\n",
- "    pred_class = int(xgb_model.predict(emb_scaled)[0])\n",
- "    probs = xgb_model.predict_proba(emb_scaled)[0]  # [p_real, p_fake]\n",
- "    fake_prob = float(probs[1])\n",
- "    confidence = float(probs[pred_class])\n",
- "\n",
- "    label = \"FAKE\" if pred_class == 1 else \"REAL\"\n",
- "\n",
- "    return {\n",
- "        \"label\": label,\n",
- "        \"confidence\": round(confidence, 4),\n",
- "        \"fake_prob\": round(fake_prob, 4),\n",
- "    }\n",
- "\n",
- "\n",
- "# ── Demo inference on a few test samples ─────────────────────────────────\n",
- "print(\"🔎 Running detect_deepfake() on 6 random samples:\\n\")\n",
- "print(f\"{'File':<50} {'True':>6} {'Predicted':>10} {'Confidence':>12} {'Fake Prob':>10}\")\n",
- "print(\"-\" * 95)\n",
- "\n",
- "for _, row in df.sample(6, random_state=SEED).iterrows():\n",
- "    result = detect_deepfake(row[\"path\"])\n",
- "    true_lbl = \"REAL\" if row[\"label\"] == 0 else \"FAKE\"\n",
- "    match_sym = \"✅\" if result[\"label\"] == true_lbl else \"❌\"\n",
- "    fname = Path(row[\"path\"]).name\n",
- "\n",
- "    print(\n",
- "        f\"{fname:<50} \"\n",
- "        f\"{true_lbl:>6} \"\n",
- "        f\"{result['label']:>9} {match_sym} \"\n",
- "        f\"{result['confidence']:>10.4f} \"\n",
- "        f\"{result['fake_prob']:>10.4f}\"\n",
- "    )"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 💾 Cell 14 — Save / Load Artefacts"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pickle\n",
- "from pathlib import Path\n",
- "\n",
- "SAVE_DIR = Path(\"saved_models\")\n",
- "SAVE_DIR.mkdir(exist_ok=True)\n",
- "\n",
- "# ── Save ECAPA-TDNN weights ───────────────────────────────────────────────\n",
- "torch.save(model.state_dict(), SAVE_DIR / \"ecapa_tdnn.pt\")\n",
- "print(\"✅ ECAPA-TDNN weights saved.\")\n",
- "\n",
- "# ── Save XGBoost model ────────────────────────────────────────────────────\n",
- "xgb_clf.save_model(str(SAVE_DIR / \"xgboost.json\"))\n",
- "print(\"✅ XGBoost model saved.\")\n",
- "\n",
- "# ── Save StandardScaler ───────────────────────────────────────────────────\n",
- "with open(SAVE_DIR / \"scaler.pkl\", \"wb\") as f:\n",
- "    pickle.dump(scaler, f)\n",
- "print(\"✅ StandardScaler saved.\")\n",
- "\n",
- "# ── Save FIXED_T (needed for exact inference alignment) ───────────────────\n",
- "with open(SAVE_DIR / \"config.pkl\", \"wb\") as f:\n",
- "    pickle.dump({\"fixed_T\": FIXED_T, \"embedding_dim\": EMBEDDING_DIM}, f)\n",
- "print(\"✅ Config saved.\")\n",
- "\n",
- "print(f\"\\nAll artefacts saved to '{SAVE_DIR.resolve()}'\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## 📋 Cell 15 — Results Summary Dashboard"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# ── Final consolidated summary ─────────────────────────────────────────────\n",
- "print(\"=\"*60)\n",
- "print(\"   DEEPFAKE AUDIO DETECTION — FINAL RESULTS\")\n",
- "print(\"=\"*60)\n",
- "\n",
- "# Pipeline parameters\n",
- "print(\"\\n📐 Pipeline configuration:\")\n",
- "print(f\"   Sample rate         : {SAMPLE_RATE} Hz\")\n",
- "print(f\"   Clip duration       : {DURATION} s\")\n",
- "print(f\"   Features            : {N_MELS} log-mel + 1 TEO = 41 channels\")\n",
- "print(f\"   ECAPA-TDNN params   : {n_params:,}\")\n",
- "print(f\"   Embedding dim       : {EMBEDDING_DIM}\")\n",
- "print(f\"   XGBoost estimators  : {XGB_PARAMS['n_estimators']}\")\n",
- "\n",
- "# Dataset stats\n",
- "print(\"\\n📊 Dataset:\")\n",
- "vc = pd.Series(labels).value_counts()\n",
- "print(f\"   Real samples   : {vc.get(0, 0)}\")\n",
- "print(f\"   Fake samples   : {vc.get(1, 0)}\")\n",
- "print(f\"   Test set size  : {len(y_test)}\")\n",
- "\n",
- "# Performance\n",
- "print(\"\\n🏆 Test-set performance:\")\n",
- "print(f\"   Accuracy : {acc*100:.2f}%\")\n",
- "print(f\"   F1 Score : {f1:.4f}\")\n",
- "print(f\"   ROC-AUC  : {roc_auc:.4f}\")\n",
- "\n",
- "# With labels [0, 1], cm.ravel() returns (tn, fp, fn, tp)\n",
- "tn, fp, fn, tp = cm.ravel()\n",
- "print(\"\\n   Confusion matrix:\")\n",
- "print(f\"      TN={tn}  FP={fp}\")\n",
- "print(f\"      FN={fn}  TP={tp}\")\n",
- "\n",
- "precision = tp / (tp + fp + 1e-9)\n",
- "recall = tp / (tp + fn + 1e-9)\n",
- "print(f\"\\n   Precision (fake) : {precision:.4f}\")\n",
- "print(f\"   Recall (fake)    : {recall:.4f}\")\n",
- "\n",
- "print(\"\\n\" + \"=\"*60)\n",
- "print(\"   detect_deepfake(audio_path) → {label, confidence, fake_prob}\")\n",
- "print(\"=\"*60)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "---\n",
- "\n",
- "## 📝 Notes & Extension Ideas\n",
- "\n",
- "| Area | What to try |\n",
- "|---|---|\n",
- "| **Data** | Replace synthetic data with ASVspoof 2019 LA / WaveFake (see links below) |\n",
- "| **Features** | Add MFCC delta/delta-delta, CQT, or group delay features |\n",
- "| **Denoising** | Replace spectral gating with RNNoise or DeepFilterNet |\n",
- "| **Model** | Use the full Res2Net-based ECAPA-TDNN (SpeechBrain implementation) |\n",
- "| **Classifier** | Compare with LightGBM, SVM, or a shallow MLP |\n",
- "| **Augmentation** | Add RIR simulation, speed perturbation, codec compression |\n",
- "| **Deployment** | Wrap `detect_deepfake` in a FastAPI endpoint |\n",
- "\n",
- "### Recommended Datasets\n",
- "- **ASVspoof 2019 LA**: https://www.asvspoof.org/\n",
- "- **WaveFake**: https://github.com/RUB-SysSec/WaveFake\n",
- "- **FakeAVCeleb**: https://github.com/DASH-Lab/FakeAVCeleb\n",
- "\n",
- "### Key References\n",
- "- *ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification* — Desplanques et al., 2020\n",
- "- *WaveFake: A Data Set to Facilitate Audio Deepfake Detection* — Frank & Schönherr, 2021\n",
- "- *ASVspoof 2019: A Large-Scale Public Database of Synthesized, Converted and Replayed Speech* — Wang et al., 2020"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "name": "python",
- "version": "3.10.0"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
- }