RawNet2 — In-the-Wild Robustness Specialist (ITW)

This repository contains a fine-tuned RawNet2 checkpoint optimized for robustness against real-world audio degradation.

The model was trained using heavy augmentation pipelines targeting:

- codec artifacts
- platform compression
- noisy environments
- variable channel conditions
- social-media-distributed audio

This checkpoint serves as the robustness-specialized parent model within the MeGA-IA genetic weight merging framework.

Model Details

| Property | Value |
|---|---|
| Architecture | RawNet2 |
| Domain Specialization | In-the-wild robustness |
| Training Dataset | Müller ITW Dataset |
| Input | Raw mono waveform |
| Sample Rate | 16 kHz |
| Framework | PyTorch |

Intended Use

This checkpoint is intended for:

- real-world audio deepfake detection
- robustness research
- codec-resilient anti-spoofing
- noisy environment evaluation
- weight-merging experiments

Augmentation Pipeline

Training used aggressive augmentation strategies designed to simulate real-world distribution shift.

Augmentations

- Random offset crop/pad
- Gaussian channel noise
- MP3 compression simulation
- Telephone-band filtering
- Randomized channel degradation

These augmentations were specifically designed to improve robustness against:

- TikTok compression
- Instagram re-encoding
- YouTube Shorts transcoding
- mobile-recorded speech
- noisy reposted media
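Three of the augmentations listed above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the training code used for this checkpoint: the function names, the 4-second target length (`target_len=64_000`), the 5 to 30 dB SNR range, and the 0.5 application probabilities are all hypothetical choices. MP3 compression simulation is left out because it needs an external codec.

```python
import numpy as np

SAMPLE_RATE = 16_000  # matches the 16 kHz input stated in Model Details


def random_crop_or_pad(wave, target_len, rng):
    """Crop at a random offset if too long; zero-pad at a random offset if too short."""
    n = len(wave)
    if n >= target_len:
        start = rng.integers(0, n - target_len + 1)
        return wave[start:start + target_len]
    out = np.zeros(target_len, dtype=wave.dtype)
    start = rng.integers(0, target_len - n + 1)
    out[start:start + n] = wave
    return out


def add_gaussian_noise(wave, snr_db, rng):
    """Add white Gaussian noise scaled to a target signal-to-noise ratio in dB."""
    signal_power = np.mean(wave ** 2) + 1e-12
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return (wave + noise).astype(wave.dtype)


def telephone_band(wave, low_hz=300.0, high_hz=3400.0, sr=SAMPLE_RATE):
    """Crude brick-wall band-pass via FFT, approximating a telephone channel."""
    spectrum = np.fft.rfft(wave)
    freqs = np.fft.rfftfreq(len(wave), d=1.0 / sr)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(wave)).astype(wave.dtype)


def augment(wave, rng, target_len=64_000):
    """Apply the crop/pad step always and the two degradations at random."""
    wave = random_crop_or_pad(wave, target_len, rng)
    if rng.random() < 0.5:  # assumed probability
        wave = add_gaussian_noise(wave, snr_db=rng.uniform(5.0, 30.0), rng=rng)
    if rng.random() < 0.5:  # assumed probability
        wave = telephone_band(wave)
    return wave
```

Calling `augment(wave, np.random.default_rng(0))` on a 1-D float waveform returns a fixed-length array of the same dtype. A real pipeline would likely use a smoother filter than the brick-wall band-pass shown here.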

Validation Performance

| Metric | Value |
|---|---|
| Validation AUC | 0.9983 |
| Validation EER | 0.0107 (1.07%) |
| Deepfake-Evals Generalization AUC | 0.4826 |

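Validation scores are near-perfect (AUC 0.9983, EER 1.07%), but the Deepfake-Evals AUC of 0.4826 is at chance level (0.5), so the checkpoint's scores carry essentially no discriminative signal on that benchmark. This gap shows how severe the distribution-shift problem is in real-world deepfake detection.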
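For readers unfamiliar with the two metrics, the following pure-Python sketch shows how AUC and EER can be computed from per-utterance scores. It is not the evaluation script behind the numbers above. The function names are hypothetical, and the convention that label `1` is the positive class with higher scores meaning "more positive" is an assumption.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos) * len(neg))


def equal_error_rate(scores, labels):
    """EER: error rate at the threshold where false-accept and false-reject rates are closest."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(scores)):
        frr = sum(p < thr for p in pos) / len(pos)   # positives rejected
        far = sum(n >= thr for n in neg) / len(neg)  # negatives accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
print(roc_auc(scores, labels))           # 0.9375
print(equal_error_rate(scores, labels))  # 0.25
```

A scorer that assigns the same score to every utterance gets an AUC of exactly 0.5 under this definition, which is why a value such as 0.4826 reads as chance-level performance. The pairwise loop is quadratic and meant for illustration; a sort-based implementation is preferable on large evaluation sets.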

Repository Contents

| File | Description |
|---|---|
| `best_auc.pth` | Best validation AUC checkpoint |
| `latest.pth` | Latest training checkpoint |
| `model.py` | RawNet2 architecture definition |

Usage

Load Checkpoint

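```python
import torch
from model import RawNet  # architecture defined in model.py

model = RawNet()

# Load the weights on CPU; move the model to a GPU afterwards if needed.
checkpoint = torch.load("best_auc.pth", map_location="cpu")

model.load_state_dict(checkpoint)
model.eval()  # inference mode: disables dropout, uses running batch-norm statistics
```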
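The snippet above assumes the file holds a bare state dict. Whether `latest.pth` follows the same layout is not stated here, so the helper below is a hedged sketch for the case where a checkpoint wraps the weights in a larger dictionary or was saved from a `torch.nn.DataParallel` model. The function name and the candidate keys (`model_state_dict`, `state_dict`, `model`) are assumptions based on common training-script conventions, not on this repository's code.

```python
def extract_state_dict(checkpoint):
    """Return a plain parameter dict from a few common checkpoint layouts."""
    state = checkpoint
    # Some training scripts save {"model_state_dict": ..., "optimizer": ..., "epoch": ...}.
    for key in ("model_state_dict", "state_dict", "model"):
        if isinstance(state, dict) and key in state and isinstance(state[key], dict):
            state = state[key]
            break
    # torch.nn.DataParallel prefixes every parameter name with "module.".
    return {
        (name[len("module."):] if name.startswith("module.") else name): value
        for name, value in state.items()
    }
```

With this helper, the loading step would become `model.load_state_dict(extract_state_dict(checkpoint))`. If the checkpoint is already a bare state dict, the helper returns it unchanged apart from the prefix stripping.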

Relationship to MeGA-IA

This model serves as the robustness-specialized parent model inside the MeGA-IA genetic weight merging framework.

During genetic weight fusion, its role is to contribute:

- robustness priors
- codec-invariant features
- noise-tolerant representations
- distribution-shift resilience
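To make the idea of genetic weight merging concrete, here is a generic toy sketch: two parent state dicts are blended by per-parameter interpolation, and a small genetic loop searches for the mixing coefficients. This is not the MeGA-IA algorithm, whose encoding, operators, and fitness function are not described in this README. The function names, the population settings, and the `fitness` callback are hypothetical. NumPy arrays stand in for tensors; the same arithmetic applies to PyTorch state dicts.

```python
import numpy as np


def merge_state_dicts(parent_a, parent_b, alphas):
    """Per-parameter linear interpolation: alpha * A + (1 - alpha) * B."""
    return {
        name: alphas[name] * parent_a[name] + (1.0 - alphas[name]) * parent_b[name]
        for name in parent_a
    }


def evolve_alphas(parent_a, parent_b, fitness,
                  generations=20, population=8, sigma=0.1, seed=0):
    """Tiny genetic search over per-parameter mixing coefficients in [0, 1]."""
    rng = np.random.default_rng(seed)
    names = list(parent_a)

    def score(alphas):
        return fitness(merge_state_dicts(parent_a, parent_b, alphas))

    pop = [{n: rng.uniform(0.0, 1.0) for n in names} for _ in range(population)]
    for _gen in range(generations):
        # Selection: keep the better half, ranked by fitness of the merged weights.
        pop.sort(key=score, reverse=True)
        survivors = pop[: population // 2]
        children = []
        for _child in range(population - len(survivors)):
            # Crossover: each coefficient is copied from one of two random survivors.
            i, j = rng.choice(len(survivors), size=2, replace=False)
            child = {
                n: (survivors[i] if rng.random() < 0.5 else survivors[j])[n]
                for n in names
            }
            # Mutation: Gaussian jitter, clipped back into [0, 1].
            child = {
                n: float(np.clip(v + rng.normal(0.0, sigma), 0.0, 1.0))
                for n, v in child.items()
            }
            children.append(child)
        pop = survivors + children
    return max(pop, key=score)
```

In a realistic setting, `fitness` would load the merged weights into the model and return a validation metric such as AUC, which makes each evaluation expensive. That cost is the main reason to keep the population and generation counts small.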

Limitations

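- Validation metrics overestimate real-world performance: the Deepfake-Evals AUC (0.4826) is at chance level despite a validation AUC of 0.9983.
- The model is still vulnerable to unseen synthesis methods.
- Performance remains sensitive to extreme domain shift.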

Citation

If you use these weights, please cite:

```bibtex
@inproceedings{ahmad2026megaia,
  title     = {MeGA-IA: Genetic Algorithm-Driven Weight Merging for In-the-Wild Deepfake Detection},
  author    = {Ahmad, Awwab Ext},
  booktitle = {Proceedings of the 23rd International Bhurban Conference on Applied Sciences and Technology (IBCAST)},
  year      = {2026},
  note      = {Under Review}
}
```

Please also cite the ITW dataset and RawNet2 papers:

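```bibtex
@inproceedings{muller2022itw,
  title     = {Does Audio Deepfake Detection Generalize?},
  author    = {Müller, Nicolas M. and Czempin, Pavel and Dieckmann, Franziska and Froghyar, Adam and Böttinger, Konstantin},
  booktitle = {Proceedings of Interspeech},
  year      = {2022}
}
```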

```bibtex
@inproceedings{jung2020rawnet2,
  title     = {Improved RawNet with Feature Map Scaling for Text-Independent Speaker Verification Using Raw Waveforms},
  author    = {Jung, Jee-weon and Kim, Seung-bin and Shim, Hye-jin and Kim, Ju-ho and Yu, Ha-Jin},
  booktitle = {Proceedings of Interspeech},
  year      = {2020}
}
```

License

This repository is released under [license name missing from source].

Weights are provided for research and benchmarking purposes.

