jhtonyKoo committed
Commit 2aef76d
1 Parent(s): 6b95f60

Update README.md

Files changed (1)
  1. README.md +9 -126
README.md CHANGED
@@ -1,126 +1,9 @@
- # Music Mixing Style Transfer
-
- This repository includes source code and pre-trained models of the work *Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects* by [Junghyun Koo](https://linkedin.com/in/junghyun-koo-525a31251), [Marco A. Martínez-Ramírez](https://m-marco.com/about/), [Wei-Hsiang Liao](https://jp.linkedin.com/in/wei-hsiang-liao-66283154), [Stefan Uhlich](https://scholar.google.de/citations?user=hja8ejYAAAAJ&hl=de), [Kyogu Lee](https://linkedin.com/in/kyogu-lee-7a93b611), and [Yuki Mitsufuji](https://www.yukimitsufuji.com/).
-
-
- [![arXiv](https://img.shields.io/badge/arXiv-2211.02247-b31b1b.svg)](https://arxiv.org/abs/2211.02247)
- [![Web](https://img.shields.io/badge/Web-Demo_Page-green.svg)](https://jhtonyKoo.github.io/MixingStyleTransfer/)
- [![Supplementary](https://img.shields.io/badge/Supplementary-Materials-white.svg)](https://tinyurl.com/4math4pm)
-
-
- ## Pre-trained Models
- | Model | Configuration | Training Dataset |
- |-------------|-------------|-------------|
- | [FXencoder (Φ<sub>p.s.</sub>)](https://drive.google.com/file/d/1BFABsJRUVgJS5UE5iuM03dbfBjmI9LT5/view?usp=sharing) | Trained with the *FX normalization* and *probability scheduling* techniques | [MUSDB18](https://sigsep.github.io/datasets/musdb.html) |
- | [MixFXcloner](https://drive.google.com/file/d/1Qu8rD7HpTNA1gJUVp2IuaeU_Nue8-VA3/view?usp=sharing) | Mixing style converter trained with Φ<sub>p.s.</sub> | [MUSDB18](https://sigsep.github.io/datasets/musdb.html) |
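-
- Both checkpoints can also be fetched from the command line. Below is a minimal download sketch, assuming the `gdown` package is available (it is not listed in requirements.txt) and using placeholder output file names; the Google Drive file IDs are taken from the links above.
- ```
- import os
- import gdown  # assumption: installed separately, e.g. via `pip install gdown`
-
- # Google Drive file IDs from the checkpoint links in the table above
- FXENCODER_ID = "1BFABsJRUVgJS5UE5iuM03dbfBjmI9LT5"
- MIXFXCLONER_ID = "1Qu8rD7HpTNA1gJUVp2IuaeU_Nue8-VA3"
-
- os.makedirs("weights", exist_ok=True)  # default folder expected by the inference scripts
- gdown.download(id=FXENCODER_ID, output="weights/FXencoder_ps.pt", quiet=False)      # placeholder file name
- gdown.download(id=MIXFXCLONER_ID, output="weights/MixFXcloner_ps.pt", quiet=False)  # placeholder file name
- ```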
-
-
- ## Installation
- ```
- pip install -r requirements.txt
- ```
-
- # Inference
-
- ## Mixing Style Transfer
-
- To run the inference code for <i>mixing style transfer</i>:
- 1. Download the pre-trained models above and place them under the folder named 'weights' (default)
- 2. Prepare input and reference tracks under the folder named 'samples/style_transfer' (default)
- Target files should be organized as follows:
- ```
- "path_to_data_directory"/"song_name_#1"/"input_file_name".wav
- "path_to_data_directory"/"song_name_#1"/"reference_file_name".wav
- ...
- "path_to_data_directory"/"song_name_#n"/"input_file_name".wav
- "path_to_data_directory"/"song_name_#n"/"reference_file_name".wav
- ```
- 3. Run 'inference/style_transfer.py'
- ```
- python inference/style_transfer.py \
- --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
- --ckpt_path_conv "path_to_checkpoint_of_MixFXcloner" \
- --target_dir "path_to_directory_containing_inference_samples"
- ```
- 4. Outputs will be stored in the same folder as the inference data (default)
-
- *Note: The system accepts stereo WAV files sampled at 44.1 kHz with 16-bit depth. We recommend using audio samples that are not too loud: the system transfers such samples better when the loudness of the mixture-wise inputs is reduced (while maintaining the overall balance of each instrument).*
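-
- The sketch below shows one way to convert an arbitrary audio file into the expected stereo / 44.1 kHz / 16-bit WAV format and attenuate its peak level. It is an illustration only, not part of the released code; the file paths and the 0.5 peak target (roughly -6 dBFS) are assumptions.
- ```
- import numpy as np
- import librosa          # used for loading and resampling
- import soundfile as sf  # used for 16-bit WAV output
-
- audio, sr = librosa.load("my_input.wav", sr=None, mono=False)  # keep original rate/channels
- if audio.ndim == 1:
-     audio = np.stack([audio, audio])  # duplicate mono to stereo
- if sr != 44100:
-     audio = librosa.resample(audio, orig_sr=sr, target_sr=44100)
- peak = np.abs(audio).max()
- if peak > 0.5:  # leave headroom so the input is not too loud (threshold is arbitrary)
-     audio = audio * (0.5 / peak)
- sf.write("samples/style_transfer/song_name_1/input.wav", audio.T, 44100, subtype="PCM_16")
- ```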
-
-
-
- ## Interpolation With 2 Different Reference Tracks
-
- The inference code for <i>interpolating</i> between two reference tracks is almost the same as for <i>mixing style transfer</i>.
- 1. Download the pre-trained models above and place them under the folder named 'weights' (default)
- 2. Prepare an input track and 2 reference tracks under the folder named 'samples/style_transfer' (default)
- Target files should be organized as follows:
- ```
- "path_to_data_directory"/"song_name_#1"/"input_track_name".wav
- "path_to_data_directory"/"song_name_#1"/"reference_file_name".wav
- "path_to_data_directory"/"song_name_#1"/"reference_file_name_2interpolate".wav
- ...
- "path_to_data_directory"/"song_name_#n"/"input_track_name".wav
- "path_to_data_directory"/"song_name_#n"/"reference_file_name".wav
- "path_to_data_directory"/"song_name_#n"/"reference_file_name_2interpolate".wav
- ```
- 3. Run 'inference/style_transfer.py'
- ```
- python inference/style_transfer.py \
- --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
- --ckpt_path_conv "path_to_checkpoint_of_MixFXcloner" \
- --target_dir "path_to_directory_containing_inference_samples" \
- --interpolation True \
- --interpolate_segments "number_of_segments_to_perform_interpolation"
- ```
- 4. Outputs will be stored in the same folder as the inference data (default)
-
- *Note: Interpolating between 2 different reference tracks is not covered in the paper, but it points to the potential for controllable style transfer through the latent space.*
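-
- Conceptually, the interpolation traces a path between the two reference embeddings produced by <i>FXencoder</i>, one blend per segment. The toy sketch below illustrates the idea only; the .npy file names are hypothetical, and the released script handles this internally.
- ```
- import numpy as np
-
- # hypothetical embeddings of the two reference tracks
- z_a = np.load("reference_embedding.npy")
- z_b = np.load("reference_2interpolate_embedding.npy")
-
- n_segments = 4  # corresponds to --interpolate_segments
- for alpha in np.linspace(0.0, 1.0, n_segments):
-     z_mix = (1.0 - alpha) * z_a + alpha * z_b  # linear blend in the latent space
-     # each z_mix would condition the converter for one segment of the output
- ```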
-
-
-
- ## Feature Extraction Using *FXencoder*
-
- This inference code extracts audio-effects-related embeddings using our proposed <i>FXencoder</i>. It processes all the .wav files under the target directory.
-
- 1. Download <i>FXencoder</i>'s pre-trained model above and place it under the folder named 'weights' (default)
- 2. Run 'inference/feature_extraction.py'
- ```
- python inference/feature_extraction.py \
- --ckpt_path_enc "path_to_checkpoint_of_FXencoder" \
- --target_dir "path_to_directory_containing_inference_samples"
- ```
- 3. Outputs will be stored in the same folder as the inference data (default)
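-
- Extracted embeddings can be compared directly; since <i>FXencoder</i> encodes audio-effects information, the cosine similarity between two embeddings indicates how close two mixing styles are. A minimal sketch, assuming the embeddings were saved as .npy files (the paths are placeholders):
- ```
- import numpy as np
-
- emb_a = np.load("samples/song_a_embedding.npy").ravel()  # placeholder path
- emb_b = np.load("samples/song_b_embedding.npy").ravel()  # placeholder path
-
- cos = float(np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b) + 1e-9))
- print(f"mixing-style similarity: {cos:.3f}")  # closer to 1.0 means more similar style
- ```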
-
-
-
-
- # Implementation
-
- All the details of our system implementation are under the folder "mixing_style_transfer".
-
- * <i>FXmanipulator</i> -> mixing_style_transfer/mixing_manipulator/
- * network architectures -> mixing_style_transfer/networks/
- * configuration of each sub-network -> mixing_style_transfer/networks/configs.yaml
- * data loader -> mixing_style_transfer/data_loader/
-
-
- # Citation
-
- Please consider citing this work if you use it.
-
- ```
- @article{koo2022music,
-   title={Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects},
-   author={Koo, Junghyun and Martinez-Ramirez, Marco A and Liao, Wei-Hsiang and Uhlich, Stefan and Lee, Kyogu and Mitsufuji, Yuki},
-   journal={arXiv preprint arXiv:2211.02247},
-   year={2022}
- }
- ```
-
 
+ ---
+ license: mit
+ title: Music Mixing Style Transfer Demo
+ sdk: gradio
+ emoji: 🎶
+ pinned: true
+ colorFrom: black
+ colorTo: white
+ ---