VoiceBlock

Privacy through Real-Time Adversarial Attacks with Audio-to-Audio Models

Installation
Reproducing Results
Streaming Implementation
Citation

Installation

Clone the repository:

 git clone https://github.com/voiceboxneurips/voicebox.git

We recommend working from a clean environment, e.g. using conda:

 conda create --name voicebox python=3.9
 conda activate voicebox

Install dependencies:

 cd voicebox
 pip install -r requirements.txt
 pip install -e .

Grant permissions:
```
 chmod -R u+x scripts/
```

Reproducing Results

To reproduce our results, first download the corresponding data. Note that to download the VoxCeleb1 dataset, you must register and obtain a username and password.

| Task | Dataset (Size) | Command |
|---|---|---|
| Objective evaluation | VoxCeleb1 (39G) | `python scripts/downloads/download_voxceleb.py --subset=1 --username=<VGG_USERNAME> --password=<VGG_PASSWORD>` |
| WER / supplemental evaluations | LibriSpeech `train-clean-360` (23G) | `./scripts/downloads/download_librispeech_eval.sh` |
| Train attacks | LibriSpeech `train-clean-100` (11G) | `./scripts/downloads/download_librispeech_train.sh` |
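
The three downloads listed above total roughly 73G (39 + 23 + 11). The following is a minimal sketch for checking free disk space before downloading, using only the Python standard library. The dictionary and function names are illustrative and not part of this repository, and the listed sizes are treated as decimal gigabytes, which is an assumption. Extracted data may need more room than the download itself.

```python
import shutil

# Download sizes in GB, copied from the table above.
DATASET_SIZES_GB = {
    "VoxCeleb1": 39,
    "LibriSpeech train-clean-360": 23,
    "LibriSpeech train-clean-100": 11,
}


def required_gb(names):
    """Sum the listed download sizes for the chosen datasets."""
    return sum(DATASET_SIZES_GB[name] for name in names)


def has_room(names, path="."):
    """Return True if the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb(names)


if __name__ == "__main__":
    wanted = list(DATASET_SIZES_GB)
    print(f"Need about {required_gb(wanted)} GB; enough free space: {has_room(wanted)}")
```

Pass only the dataset names you plan to download, for example `has_room(["VoxCeleb1"])` for the objective evaluation alone.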

We provide scripts to reproduce our experiments and save results, including generated audio, to named and time-stamped subdirectories within runs/. To reproduce our objective evaluation experiments using pre-trained attacks, run:

 python scripts/experiments/evaluate.py

To reproduce our training, run:

 python scripts/experiments/train.py
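
Because results land in named and time-stamped subdirectories of runs/, it can be handy to locate the most recent one after a script finishes. The sketch below is one way to do that. It assumes only that each run has its own subdirectory directly under runs/, and it picks the newest by modification time rather than by parsing directory names, since the exact naming scheme is not described here. `latest_run` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path
from typing import Optional


def latest_run(runs_dir="runs") -> Optional[Path]:
    """Return the most recently modified subdirectory of `runs_dir`, if any."""
    root = Path(runs_dir)
    if not root.is_dir():
        return None
    subdirs = [p for p in root.iterdir() if p.is_dir()]
    # max() falls back to None when no run directory exists yet.
    return max(subdirs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_run())
```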

Streaming Implementation

As a proof of concept, we provide a streaming implementation of VoiceBox capable of modifying user audio in real time. Here, we provide installation instructions for macOS and Ubuntu 20.04.

macOS

See video below:

Ubuntu 20.04

Open a terminal and follow the installation instructions above. Change directory to the root of this repository.
Run the following command:
```
 pacmd load-module module-null-sink sink_name=voicebox sink_properties=device.description=voicebox
```
If you are using PipeWire instead of PulseAudio:
```
 pactl load-module module-null-sink media.class=Audio/Sink sink_name=voicebox sink_properties=device.description=voicebox
```
PulseAudio is the default on Ubuntu 20.04, so if you haven't changed your system defaults, you are probably using PulseAudio. Either command adds "voicebox" as an output device. Select it as the input to your chosen audio software.
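
If you are unsure which sound server is running, `pactl info` prints a "Server Name" line, and PipeWire's PulseAudio-compatible server typically reports a name containing "PipeWire". The sketch below classifies that output. The function names are illustrative, and treating any other server name as PulseAudio is a simplifying assumption.

```python
import os
import subprocess


def audio_server(pactl_info: str) -> str:
    """Classify the sound server from the text printed by `pactl info`."""
    for line in pactl_info.splitlines():
        if line.startswith("Server Name:"):
            name = line.split(":", 1)[1].strip()
            return "pipewire" if "pipewire" in name.lower() else "pulseaudio"
    return "unknown"


def detect() -> str:
    """Run `pactl info` and classify it; return "unknown" if that fails."""
    env = {**os.environ, "LC_ALL": "C"}  # request untranslated field names
    try:
        result = subprocess.run(
            ["pactl", "info"],
            capture_output=True, text=True, check=True, timeout=5, env=env,
        )
    except (OSError, subprocess.SubprocessError):
        return "unknown"
    return audio_server(result.stdout)


if __name__ == "__main__":
    print(detect())
```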

Find which audio device to read and write from. In your conda environment, run:

 python -m sounddevice

You will get output similar to this:

   0 HDA Intel HDMI: 0 (hw:0,3), ALSA (0 in, 8 out)
   1 HDA Intel HDMI: 1 (hw:0,7), ALSA (0 in, 8 out)
   2 HDA Intel HDMI: 2 (hw:0,8), ALSA (0 in, 8 out)
   3 HDA Intel HDMI: 3 (hw:0,9), ALSA (0 in, 8 out)
   4 HDA Intel HDMI: 4 (hw:0,10), ALSA (0 in, 8 out)
   5 hdmi, ALSA (0 in, 8 out)
   6 jack, ALSA (2 in, 2 out)
   7 pipewire, ALSA (64 in, 64 out)
   8 pulse, ALSA (32 in, 32 out)
 * 9 default, ALSA (32 in, 32 out)

In this example, we route the audio through PipeWire (device 7), so both INPUT_NUM and OUTPUT_NUM are 7.
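
The device number can also be picked out of that listing programmatically. The sketch below parses text in the format shown above using only the standard library. `parse_devices` and `find_device` are hypothetical helpers, not part of this repository or of sounddevice, and the regular expression assumes the listing format stays as shown. Inside Python, `sounddevice.query_devices()` returns the same information directly.

```python
import re

# One listing line looks like: " * 9 default, ALSA (32 in, 32 out)".
# The leading mark (space, >, < or *) flags the default input/output device.
DEVICE_LINE = re.compile(
    r"^[\s<>*]*(\d+)\s+(.*),\s+(.+?)\s+\((\d+) in, (\d+) out\)\s*$"
)

# A few lines copied from the example output above.
SAMPLE = """\
   0 HDA Intel HDMI: 0 (hw:0,3), ALSA (0 in, 8 out)
   6 jack, ALSA (2 in, 2 out)
   7 pipewire, ALSA (64 in, 64 out)
   8 pulse, ALSA (32 in, 32 out)
 * 9 default, ALSA (32 in, 32 out)
"""


def parse_devices(listing):
    """Turn the device listing into a list of dictionaries."""
    devices = []
    for line in listing.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            index, name, host_api, n_in, n_out = match.groups()
            devices.append({
                "index": int(index),
                "name": name,
                "host_api": host_api,
                "in": int(n_in),
                "out": int(n_out),
            })
    return devices


def find_device(listing, name):
    """Return the index of the first device called `name` that can both
    record and play back, or None if there is no such device."""
    for device in parse_devices(listing):
        if device["name"] == name and device["in"] > 0 and device["out"] > 0:
            return device["index"]
    return None


if __name__ == "__main__":
    print(find_device(SAMPLE, "pipewire"))
```

To use it on a live system, capture the output of `python -m sounddevice` and pass it as `listing`.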

First, we need to create a conditioning embedding. To do this, run the enrollment script and follow its on-screen instructions:
```
 python scripts/streamer/enroll.py --input INPUT_NUM
```

We can now use the streamer. Run:

 python scripts/stream.py --input INPUT_NUM --output OUTPUT_NUM

Once the streamer is running, open pavucontrol.

a. In pavucontrol, go to the "Playback" tab and find "ALSA plug-in [python3.9]: ALSA Playback on". Set the output to "voicebox".

b. Then, go to the "Recording" tab and find "ALSA plug-in [python3.9]: ALSA Capture from", and set the input to your desired microphone device.

Citation

If you use this in your academic research, please cite the following:

@inproceedings{authors2022voicelock,
title={VoiceBlock: Privacy through Real-Time Adversarial Attacks with Audio-to-Audio Models},
author={Patrick O'Reilly and Andreas Bugler and Keshav Bhandari and Max Morrison and Bryan Pardo},
booktitle={Neural Information Processing Systems},
month={November},
year={2022}
}
