RawNet2 — In-the-Wild Robustness Specialist (ITW)

This repository contains a fine-tuned RawNet2 checkpoint optimized for robustness against real-world audio degradation.

The model was trained using heavy augmentation pipelines targeting:

  • codec artifacts
  • platform compression
  • noisy environments
  • variable channel conditions
  • social-media-distributed audio

This checkpoint serves as the robustness-specialized parent model within the MeGA-IA genetic weight merging framework.


Model Details

Property               Value
Architecture           RawNet2
Domain Specialization  In-the-wild robustness
Training Dataset       Müller ITW Dataset
Input                  Raw mono waveform
Sample Rate            16 kHz
Framework              PyTorch

Intended Use

This checkpoint is intended for:

  • real-world audio deepfake detection
  • robustness research
  • codec-resilient anti-spoofing
  • noisy environment evaluation
  • weight-merging experiments

Augmentation Pipeline

Training used aggressive augmentation strategies designed to simulate real-world distribution shift.

Augmentations

  1. Random offset crop/pad
  2. Gaussian channel noise
  3. MP3 compression simulation
  4. Telephone-band filtering
  5. Randomized channel degradation

These augmentations were specifically designed to improve robustness against:

  • TikTok compression
  • Instagram re-encoding
  • YouTube Shorts transcoding
  • mobile-recorded speech
  • noisy reposted media
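As an illustration of the first two augmentations listed above (random offset crop/pad and Gaussian noise), here is a minimal NumPy sketch. It is not the training code from this repository: the 64000-sample target length, the zero-padding strategy, and the 20 dB signal-to-noise ratio are assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

TARGET_LEN = 64000  # assumed clip length (4 s at 16 kHz); not stated in the source


def random_crop_or_pad(wave, target_len, rng):
    """Return a fixed-length clip taken (or placed) at a random offset."""
    n = len(wave)
    if n >= target_len:
        # Long enough: crop a window starting at a random position.
        start = rng.integers(0, n - target_len + 1)
        return wave[start:start + target_len].copy()
    # Too short: place the clip at a random offset inside a zero buffer.
    out = np.zeros(target_len, dtype=wave.dtype)
    start = rng.integers(0, target_len - n + 1)
    out[start:start + n] = wave
    return out


def add_gaussian_noise(wave, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio in dB."""
    signal_power = np.mean(wave.astype(np.float64) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return (wave + noise).astype(wave.dtype)


rng = np.random.default_rng(0)
# Synthetic 3-second 440 Hz tone at 16 kHz standing in for a speech clip.
clip = np.sin(2 * np.pi * 440 * np.arange(48000) / 16000).astype(np.float32)
augmented = add_gaussian_noise(
    random_crop_or_pad(clip, TARGET_LEN, rng), snr_db=20.0, rng=rng
)
```

Drawing the offset and the noise from a seeded generator keeps the augmentation reproducible. A training pipeline would typically also sample the SNR at random per clip.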

Validation Performance

Metric                             Value
Validation AUC                     0.9983
Validation EER                     0.0107 (1.07%)
Deepfake-Evals Generalization AUC  0.4826

Validation AUC (0.9983) and Deepfake-Evals AUC (0.4826) differ sharply. An AUC near 0.5 is chance level, so the detector does not transfer to that benchmark. This gap highlights the distribution-shift challenge in real-world deepfake detection.
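For readers unfamiliar with the two metrics reported above, the following dependency-free sketch shows one way to compute AUC and EER from detector scores. It assumes that label 1 marks the positive class and that higher scores mean "more positive". The source does not say which class is treated as positive, and this is not the evaluation script used for the reported numbers.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def equal_error_rate(scores, labels):
    """EER: error rate at the threshold where false accepts and false rejects are closest."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = None
    for t in sorted(set(scores)):
        far = sum(n >= t for n in neg) / len(neg)  # negatives accepted as positive
        frr = sum(p < t for p in pos) / len(pos)   # positives rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]


# Toy scores: one negative (0.6) outranks one positive (0.4).
scores = [0.9, 0.4, 0.6, 0.1]
labels = [1, 1, 0, 0]
auc = roc_auc(scores, labels)
eer = equal_error_rate(scores, labels)
```

The pairwise AUC is quadratic in the number of samples, which is fine for illustration but slow on large evaluation sets.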


Repository Contents

File          Description
best_auc.pth  Best validation AUC checkpoint
latest.pth    Latest training checkpoint
model.py      RawNet2 architecture definition

Usage

Load Checkpoint

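import torch
from model import RawNet  # architecture definition shipped in this repository

# Build the network, then load the best-validation-AUC weights onto the CPU.
model = RawNet()

checkpoint = torch.load(
    "best_auc.pth",
    map_location="cpu",
)

model.load_state_dict(checkpoint)
model.eval()  # inference mode: disables dropout, uses running batch-norm statistics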
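The snippet above assumes the checkpoint file holds a bare state dict. If `load_state_dict` instead reports missing or unexpected keys, the weights may be nested under a wrapper key or carry the `module.` prefix that `nn.DataParallel` adds when saving. The helper below is a hedged sketch for that case: `extract_state_dict` is a hypothetical name, and the wrapper keys it checks are common conventions, not something this repository documents.

```python
def extract_state_dict(checkpoint):
    """Return a flat parameter dict from a possibly wrapped checkpoint."""
    # Assumed wrapper keys; the checkpoint layout is not documented in the source.
    for key in ("state_dict", "model_state_dict", "model"):
        if isinstance(checkpoint, dict) and isinstance(checkpoint.get(key), dict):
            checkpoint = checkpoint[key]
            break
    # nn.DataParallel saves parameter names with a "module." prefix; strip it.
    prefix = "module."
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in checkpoint.items()
    }


# Toy example with plain values standing in for tensors.
wrapped = {"epoch": 3, "model_state_dict": {"module.fc.weight": 1.0}}
flat = extract_state_dict(wrapped)
```

With a real checkpoint this would be used as `model.load_state_dict(extract_state_dict(checkpoint))`. A bare state dict passes through unchanged.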

Relationship to MeGA-IA

This model serves as the robustness-specialized parent model inside the MeGA-IA genetic weight merging framework.

During genetic weight fusion, its role is to contribute:

  • robustness priors
  • codec-invariant features
  • noise-tolerant representations
  • distribution-shift resilience
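The source does not describe the MeGA-IA merging operator itself. To make the idea of genetic weight fusion concrete, here is a generic sketch of one common form of weight merging: per-parameter linear interpolation between two parents, with the mixing coefficients acting as the genome that a genetic algorithm would mutate and select. All function names, the mutation scheme, and the toy parents are assumptions for illustration only.

```python
import random


def merge_state_dicts(parent_a, parent_b, alphas):
    """Per-parameter linear interpolation: alpha * a + (1 - alpha) * b."""
    if parent_a.keys() != parent_b.keys():
        raise ValueError("parents must share the same parameter names")
    return {
        name: alphas[name] * parent_a[name] + (1.0 - alphas[name]) * parent_b[name]
        for name in parent_a
    }


def random_genome(names, rng):
    """A candidate solution: one mixing coefficient in [0, 1] per parameter."""
    return {name: rng.random() for name in names}


def mutate(genome, rng, sigma=0.1):
    """Gaussian mutation, clipped back into [0, 1]."""
    return {
        name: min(1.0, max(0.0, alpha + rng.gauss(0.0, sigma)))
        for name, alpha in genome.items()
    }


# Toy parents with scalar "weights"; real state dicts hold tensors,
# which support the same arithmetic.
robust_parent = {"conv.weight": 1.0, "fc.weight": -2.0}
other_parent = {"conv.weight": 3.0, "fc.weight": 2.0}

rng = random.Random(0)
genome = mutate(random_genome(robust_parent, rng), rng)
child = merge_state_dicts(robust_parent, other_parent, genome)
```

In a full search loop, each child would be scored on a validation set and the best genomes kept for the next generation. With real PyTorch state dicts, integer buffers such as BatchNorm's `num_batches_tracked` would need to be copied from one parent rather than interpolated.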


Limitations

  • Validation metrics overestimate real-world performance (validation AUC 0.9983 vs. Deepfake-Evals AUC 0.4826)
  • Still vulnerable to unseen synthesis methods
  • Performance remains sensitive to extreme domain shift

Citation

If you use these weights, please cite:

@inproceedings{ahmad2026megaia,
  title     = {MeGA-IA: Genetic Algorithm-Driven Weight Merging for In-the-Wild Deepfake Detection},
  author    = {Ahmad, Awwab Ext},
  booktitle = {Proceedings of the 23rd International Bhurban Conference on Applied Sciences and Technology (IBCAST)},
  year      = {2026},
  note      = {Under Review}
}

Please also cite the ITW dataset and RawNet2 papers:

@inproceedings{muller2022itw,
  title     = {Does Audio Deepfake Detection Generalize?},
  author    = {Müller, Nicolas M. and Czempin, Pavel and Dieckmann, Franziska and Froghyar, Adam and Böttinger, Konstantin},
  booktitle = {Proceedings of Interspeech},
  year      = {2022}
}

@inproceedings{jung2020rawnet2,
  title     = {Improved RawNet with Feature Map Scaling for Text-Independent Speaker Verification Using Raw Waveforms},
  author    = {Jung, Jee-weon and Kim, Seung-bin and Shim, Hye-jin and Kim, Ju-ho and Yu, Ha-Jin},
  booktitle = {Proceedings of Interspeech},
  year      = {2020}
}

License

This repository is released under the :contentReference[oaicite:7]{index=7}.

Weights are provided for research and benchmarking purposes.
