FakeMark: Deepfake Speech Attribution With Watermarked Artifacts
Paper: arXiv:2510.12042
Official pretrained checkpoints for FakeMark, a deepfake speech attribution system. FakeMark injects system-specific watermark artifacts into synthesized speech to attribute waveforms back to their originating Text-to-Speech (TTS) architecture.
FakeMark is a framework for speech provenance. It pairs SEANet-based watermark generators with MMS-300M collaborators, which act as attribution classifiers, to embed high-fidelity audio watermarks that survive common distortions while maintaining high attribution accuracy.
The repository is organized as follows:
checkpoints/
├── FakeMarkA/
│   ├── encoder.ckpt                         # SEANet watermark generator
│   ├── decoder.ckpt                         # SEANet decoder
│   ├── colprocessor.ckpt                    # ColProcessor conditioning module
│   └── collaborator.ckpt                    # MMS-300M collaborator (attribution classifier)
├── FakeMarkT/
│   ├── encoder.ckpt                         # Timbre watermark generator
│   └── collaborator.ckpt                    # MMS-300M collaborator
├── AudioSeal/
│   ├── checkpoint_generator_epoch260.pth    # AudioSeal generator (retrained)
│   └── checkpoint_detector_epoch260.pth     # AudioSeal detector (retrained)
├── Timbre-4bit.pth.tar                      # Timbre generator/detector (retrained)
├── MMS_300M.ckpt                            # Standalone MMS-300M classifier
└── ResNet.ckpt                              # Standalone ResNet34 + LFB + LMCL classifier
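
After downloading, it can help to confirm that the local copy matches the layout listed above. The following is a minimal sketch, not part of the official FakeMark code: the helper name `find_missing_checkpoints` and the local directory `checkpoints` are assumptions made for illustration, and only the relative file paths come from the listing itself.

```python
from pathlib import Path

# Relative paths copied from the repository layout listed above.
EXPECTED_FILES = [
    "FakeMarkA/encoder.ckpt",
    "FakeMarkA/decoder.ckpt",
    "FakeMarkA/colprocessor.ckpt",
    "FakeMarkA/collaborator.ckpt",
    "FakeMarkT/encoder.ckpt",
    "FakeMarkT/collaborator.ckpt",
    "AudioSeal/checkpoint_generator_epoch260.pth",
    "AudioSeal/checkpoint_detector_epoch260.pth",
    "Timbre-4bit.pth.tar",
    "MMS_300M.ckpt",
    "ResNet.ckpt",
]


def find_missing_checkpoints(root):
    """Return the expected relative paths that are not regular files under `root`.

    Hypothetical helper for illustration; it only checks file presence,
    not file contents or integrity.
    """
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # "checkpoints" is an assumed local download location; adjust as needed.
    missing = find_missing_checkpoints("checkpoints")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoint files are present.")
```

The check uses only the standard library, so it runs before any model dependencies are installed. It says nothing about how each checkpoint should be loaded, which depends on the FakeMark code itself.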