csteinmetz1 committed fa298e8 (parent: 083ef90)

README formatting

Files changed (1): README.md (+29, -15)

README.md CHANGED
@@ -1,23 +1,36 @@
- # General Purpose Audio Effect Removal
- Removing multiple audio effects from multiple sources with compositional audio effect removal using source separation and speech enhancement models.

- This repo contains the code for the paper [General Purpose Audio Effect Removal](https://arxiv.org/abs/2110.00484). (Todo: Paper link broken, Arxiv badge broken, citation, license)

- [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1LoLgL1YHzIQfILEayDmRUZzDZzJpD6rD)
  [![arXiv](https://img.shields.io/badge/arXiv-1234.56789-b31b1b.svg)](https://arxiv.org/abs/1234.56789)

- <img width="700px" src="remfx-headline.jpg">

  Listening examples can be found [here](https://csteinmetz1.github.io/RemFX/).

  ## Abstract
  Although the design and application of audio effects is well understood, the inverse problem of removing these effects is significantly more challenging and far less studied. Recently, deep learning has been applied to audio effect removal; however, existing approaches have focused on narrow formulations considering only one effect or source type at a time. In realistic scenarios, multiple effects are applied with varying source content. This motivates a more general task, which we refer to as general purpose audio effect removal. We developed a dataset for this task using five audio effects across four different sources and used it to train and evaluate a set of existing architectures. We found that no single model performed optimally on all effect types and sources. To address this, we introduced <b>RemFX</b>, an approach designed to mirror the compositionality of applied effects. We first trained a set of the best-performing effect-specific
  removal models and then leveraged an audio effect classification model to dynamically construct a graph of our models at inference. We found our approach to outperform single model baselines, although examples with many effects present remain challenging.

- # Setup
  ```
  git clone https://github.com/mhrice/RemFx.git
  cd RemFx
@@ -29,9 +42,10 @@ Due to incompatabilities with hearbaseline's dependencies (namely numpy/numba) a
  <b>Please run the setup code before running any scripts.</b>
  All scripts should be launched from the top level after installing.

- # Usage
  This repo can be used for many different tasks. Here are some examples. Ensure you have run the setup code before running any scripts.
- ## Run RemFX Detect on a single file
  Here we will attempt to detect, then remove effects that are present in an audio file. For the best results, use a file from our [evaluation dataset](https://zenodo.org/record/8187288). We support detection and removal of the following effects: chorus, delay, distortion, dynamic range compression, and reverb.

  First, we need to download the pytorch checkpoints from [zenodo](https://zenodo.org/record/8218621)
@@ -42,13 +56,13 @@ Then run the detect script. This repo contains an example file `example.wav` fro
  ```
  scripts/remfx_detect.sh example.wav -o dry.wav
  ```
- ## Download the [General Purpose Audio Effect Removal evaluation datasets](https://zenodo.org/record/8187288)
  We provide a script to download and unzip the datasets used in table 4 of the paper.
  ```
  scripts/download_eval_datasets.sh
  ```

- ## Download the starter datasets

  If you'd like to train your own model and/or generate a dataset, you can download the starter datasets using the following command:
@@ -182,7 +196,7 @@ The dataset that is generated contains 8000 train examples, 1000 validation exam
  Note: if training, this process will be done automatically at the start of training. To disable this, set `render_files=False` in the config or command-line, and set `render_root={path/to/dataset}` if it is in a custom location.

- # Experimental parameters
  Some relevant dataset/training parameters descriptions
  - `num_kept_effects={[min, max]}` range of <b> Kept </b> effects to apply to each file. Inclusive.
  - `num_removed_effects={[min, max]}` range of <b> Removed </b> effects to apply to each file. Inclusive.
@@ -195,20 +209,20 @@ Some relevant dataset/training parameters descriptions
  - `datamodule.train_batch_size={batch_size}`. Change batch size (default: varies).
  - `logger=wandb`. Use weights and biases logger (default: csv). Ensure you set the wandb environment variables (see training section).

- ## Effect Removal Models
  - `umx`
  - `demucs`
  - `tcn`
  - `dcunet`
  - `dptnet`

- ## Effect Classification Models
  - `cls_vggish`
  - `cls_panns_pt`
  - `cls_wav2vec2`
  - `cls_wav2clip`

- ## Effects
  - `delay`
  - `distortion`
  - `chorus`
 
+ <div align="center">

+ # RemFx
+ General Purpose Audio Effect Removal

  [![arXiv](https://img.shields.io/badge/arXiv-1234.56789-b31b1b.svg)](https://arxiv.org/abs/1234.56789)
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1LoLgL1YHzIQfILEayDmRUZzDZzJpD6rD)
+ [![Dataset](https://zenodo.org/badge/DOI/10.5281/zenodo.8187288.svg)](https://zenodo.org/record/8187288)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

  Listening examples can be found [here](https://csteinmetz1.github.io/RemFX/).

+
+ <img width="450px" src="remfx-headline.jpg">
+
  ## Abstract
+ </div>

  Although the design and application of audio effects is well understood, the inverse problem of removing these effects is significantly more challenging and far less studied. Recently, deep learning has been applied to audio effect removal; however, existing approaches have focused on narrow formulations considering only one effect or source type at a time. In realistic scenarios, multiple effects are applied with varying source content. This motivates a more general task, which we refer to as general purpose audio effect removal. We developed a dataset for this task using five audio effects across four different sources and used it to train and evaluate a set of existing architectures. We found that no single model performed optimally on all effect types and sources. To address this, we introduced <b>RemFX</b>, an approach designed to mirror the compositionality of applied effects. We first trained a set of the best-performing effect-specific
  removal models and then leveraged an audio effect classification model to dynamically construct a graph of our models at inference. We found our approach to outperform single model baselines, although examples with many effects present remain challenging.
+ ```bibtex
+ @inproceedings{rice2023remfx,
+ title={General Purpose Audio Effect Removal},
+ author={Rice, Matthew and Steinmetz, Christian J. and Fazekas, George and Reiss, Joshua D.},
+ booktitle={IEEE Workshop on Applications of Signal Processing to Audio and Acoustics},
+ year={2023}
+ }
+ ```
+

+ ## Setup
  ```
  git clone https://github.com/mhrice/RemFx.git
  cd RemFx
 
  <b>Please run the setup code before running any scripts.</b>
  All scripts should be launched from the top level after installing.

+ ## Usage
  This repo can be used for many different tasks. Here are some examples. Ensure you have run the setup code before running any scripts.
+
+ ### Run RemFX Detect on a single file
  Here we will attempt to detect, then remove effects that are present in an audio file. For the best results, use a file from our [evaluation dataset](https://zenodo.org/record/8187288). We support detection and removal of the following effects: chorus, delay, distortion, dynamic range compression, and reverb.

  First, we need to download the pytorch checkpoints from [zenodo](https://zenodo.org/record/8218621)
  ```
  scripts/remfx_detect.sh example.wav -o dry.wav
  ```
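The same pattern works on your own recordings. A minimal batch sketch, assuming only the interface shown above (an input path plus `-o` for the output file); the `my_recordings/` and `processed/` directories are placeholders:

```
# Run detection + removal over every wav in a folder.
# Uses only the documented invocation: scripts/remfx_detect.sh <input.wav> -o <output.wav>
mkdir -p processed
for f in my_recordings/*.wav; do
    scripts/remfx_detect.sh "$f" -o "processed/$(basename "$f")"
done
```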
+ ### Download the [General Purpose Audio Effect Removal evaluation datasets](https://zenodo.org/record/8187288)
  We provide a script to download and unzip the datasets used in table 4 of the paper.
  ```
  scripts/download_eval_datasets.sh
  ```

+ ### Download the starter datasets

  If you'd like to train your own model and/or generate a dataset, you can download the starter datasets using the following command:

  Note: if training, this process will be done automatically at the start of training. To disable this, set `render_files=False` in the config or command-line, and set `render_root={path/to/dataset}` if it is in a custom location.
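For instance, pointing training at a dataset that has already been rendered might look like the sketch below; the `scripts/train.py` entry point is an assumption here, while `render_files` and `render_root` are the overrides described above:

```
# Skip dataset rendering and read a pre-rendered dataset from a custom location.
# (scripts/train.py is an assumed entry point; adjust to the repo's actual training script.)
python scripts/train.py render_files=False render_root=/path/to/dataset
```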

+ ## Experimental parameters
  Some relevant dataset/training parameters descriptions
  - `num_kept_effects={[min, max]}` range of <b> Kept </b> effects to apply to each file. Inclusive.
  - `num_removed_effects={[min, max]}` range of <b> Removed </b> effects to apply to each file. Inclusive.
  - `datamodule.train_batch_size={batch_size}`. Change batch size (default: varies).
  - `logger=wandb`. Use weights and biases logger (default: csv). Ensure you set the wandb environment variables (see training section).
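As a rough illustration, several of these overrides could be combined on one command line. Only the option names listed above come from this document; the `scripts/train.py` entry point and the concrete values are assumptions:

```
# Keep 0-1 effects, remove 1-4, shrink the batch size, and log to Weights & Biases.
# Remember to export the wandb environment variables first (see training section).
python scripts/train.py \
    num_kept_effects=[0,1] \
    num_removed_effects=[1,4] \
    datamodule.train_batch_size=16 \
    logger=wandb
```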

+ ### Effect Removal Models
  - `umx`
  - `demucs`
  - `tcn`
  - `dcunet`
  - `dptnet`

+ ### Effect Classification Models
  - `cls_vggish`
  - `cls_panns_pt`
  - `cls_wav2vec2`
  - `cls_wav2clip`
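These identifiers are the values to plug into the corresponding model-selection options when training or evaluating. A guess at what that might look like; the `model=` and `classifier=` key names (and the entry point) are illustrative assumptions, not taken from the repo:

```
# Key names below are guesses for illustration only; check the repo's configs
# for the real option names before running.
python scripts/train.py model=dcunet classifier=cls_panns_pt
```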

+ ### Effects
  - `delay`
  - `distortion`
  - `chorus`