reach-vb HF staff committed on
Commit
f82cc86
1 Parent(s): 20f3f5a

Create README.md

Files changed (1)
  1. README.md +89 -0
README.md ADDED
@@ -0,0 +1,89 @@
+ ---
+ tags:
+ - audioseal
+ inference: false
+ ---
+ # AudioSeal
+
+ We introduce AudioSeal, a method for localized speech watermarking with state-of-the-art robustness and detector speed. It jointly trains a generator that embeds a watermark in the audio and a detector that finds the watermarked fragments in longer audio, even after editing.
+ AudioSeal achieves state-of-the-art detection performance on both natural and synthetic speech at the sample level (a resolution of 1/16,000 of a second). It alters signal quality only slightly and is robust to many types of audio editing.
+ AudioSeal is designed with a fast, single-pass detector that significantly surpasses existing models in speed, achieving detection up to two orders of magnitude faster, which makes it well suited to large-scale and real-time applications.
+
+ # :mate: Installation
+
+ AudioSeal requires Python >= 3.8, PyTorch >= 1.13.0, [omegaconf](https://omegaconf.readthedocs.io/), [julius](https://pypi.org/project/julius/), and numpy. To install from PyPI:
+
+ ```
+ pip install audioseal
+ ```
+
+ To install from source, clone this repo and install in editable mode:
+
+ ```
+ git clone https://github.com/facebookresearch/audioseal
+ cd audioseal
+ pip install -e .
+ ```
+
+ # :gear: Models
+
+ We provide checkpoints for the following models:
+
+ - AudioSeal Generator.
+ It takes an audio signal (as a waveform) as input and outputs a watermark of the same size, which can be added to the input to watermark it.
+ Optionally, it can also take a secret 16-bit message that will be encoded in the watermark.
+ - AudioSeal Detector.
+ It takes an audio signal (as a waveform) as input and outputs, for each sample of the audio (every 1/16,000 s), the probability that the input contains a watermark.
+ Optionally, it may also output the secret message encoded in the watermark.
+
+ Note that the message is optional and has no influence on the detection output. It may be used, for instance, to identify a model version (up to $2^{16}=65536$ possible choices).
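
As an illustration of the 16-bit message, here is a minimal sketch of how an integer identifier (such as a model version) could be packed into 16 bits and recovered from per-bit probabilities. The helper names `int_to_bits`, `probs_to_bits`, and `bits_to_int`, the most-significant-bit-first ordering, and the 0.5 threshold are assumptions made for this example; they are not part of the AudioSeal API.

```python
from typing import List, Sequence


def int_to_bits(value: int, nbits: int = 16) -> List[int]:
    """Encode a non-negative integer as nbits bits, most significant bit first (assumed order)."""
    if not 0 <= value < 2 ** nbits:
        raise ValueError(f"value must be in [0, {2 ** nbits - 1}], got {value}")
    return [(value >> shift) & 1 for shift in range(nbits - 1, -1, -1)]


def probs_to_bits(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn per-bit probabilities into hard bits (threshold is an assumption)."""
    return [1 if p > threshold else 0 for p in probs]


def bits_to_int(bits: Sequence[int]) -> int:
    """Decode a most-significant-bit-first bit sequence back into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    return value


version_id = 1234  # hypothetical model version identifier
bits = int_to_bits(version_id)
print(bits)               # sixteen 0/1 values
print(bits_to_int(bits))  # 1234
```

To pass such a message to the generator, the bit list would presumably be wrapped in a tensor of shape batch x 16, for example `torch.tensor([bits])`, matching the shape of the random message in the usage example.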
+
+ **Note**: We are working to release the training code for anyone who wants to build their own watermarker. Stay tuned!
+
+ # :abacus: Usage
+
+ AudioSeal provides a simple API to watermark audio and to detect watermarks in an audio sample. Example usage:
+
+ ```python
+ import torch
+
+ from audioseal import AudioSeal
+
+ # The model name corresponds to the YAML card file name found in audioseal/cards
+ model = AudioSeal.load_generator("audioseal_wm_16bits")
+
+ # Alternatively, load directly from a checkpoint
+ # model = Watermarker.from_pretrained(checkpoint_path, device = wav.device)
+
+ # A torch tensor of shape (batch, channels, samples) and a sample rate.
+ # It is important to resample the audio to the sample rate the model
+ # expects. In our case, we support 16 kHz audio.
+ wav, sr = ..., 16000
+
+ watermark = model.get_watermark(wav, sr)
+
+ # Optional: you can add a 16-bit message to embed in the watermark
+ # msg = torch.randint(0, 2, (wav.shape[0], model.msg_processor.nbits), device=wav.device)
+ # watermark = model.get_watermark(wav, message = msg)
+
+ watermarked_audio = wav + watermark
+
+ detector = AudioSeal.load_detector("audioseal_detector_16bits")
+
+ # High-level detection
+ result, message = detector.detect_watermark(watermarked_audio, sr)
+
+ print(result)  # result is a float indicating the probability that the audio is watermarked
+ print(message)  # message is a binary vector of 16 bits
+
+ # Low-level detection
+ result, message = detector(watermarked_audio, sr)
+
+ # result is a tensor of size batch x 2 x frames, giving the probability (positive and negative) of watermarking for each frame
+ # Watermarked audio should have result[:, 1, :] > 0.5
+ print(result[:, 1, :])
+
+ # message is a tensor of size batch x 16, giving the probability that each bit is 1.
+ # message will be a random tensor if the detector detects no watermark in the audio
+ print(message)
+ ```
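
Because the low-level detector output is a per-sample probability, it can be turned into time ranges that localize the watermark. The following is a minimal sketch under stated assumptions: `watermarked_segments` and `watermarked_fraction` are hypothetical helpers written for this example (not AudioSeal functions), the input is a plain sequence of floats, and the 0.5 threshold follows the comment in the usage example above.

```python
from typing import List, Sequence, Tuple


def watermarked_segments(
    probs: Sequence[float],
    sample_rate: int = 16000,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of contiguous runs where probs > threshold."""
    segments: List[Tuple[float, float]] = []
    start = None  # index where the current above-threshold run began
    for i, p in enumerate(probs):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            segments.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:  # run extends to the end of the audio
        segments.append((start / sample_rate, len(probs) / sample_rate))
    return segments


def watermarked_fraction(probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose watermark probability exceeds the threshold."""
    if len(probs) == 0:
        return 0.0
    return sum(1 for p in probs if p > threshold) / len(probs)


# Synthetic probabilities: 1 s clean, 0.5 s watermarked, 0.5 s clean at 16 kHz
probs = [0.1] * 16000 + [0.9] * 8000 + [0.2] * 8000
print(watermarked_segments(probs))  # [(1.0, 1.5)]
print(watermarked_fraction(probs))  # 0.25
```

With a real detector output, the probabilities for the first item in a batch could presumably be obtained with `result[0, 1, :].tolist()` before calling these helpers.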