darksakura committed
Commit 5815543
1 Parent(s): b5fe430
Upload 179 files
- diffusion/__pycache__/__init__.cpython-38.pyc +0 -0
- diffusion/__pycache__/diffusion.cpython-38.pyc +0 -0
- diffusion/__pycache__/unit2mel.cpython-38.pyc +0 -0
- diffusion/__pycache__/vocoder.cpython-38.pyc +0 -0
- diffusion/__pycache__/wavenet.cpython-38.pyc +0 -0
- diffusion/logger/__pycache__/__init__.cpython-38.pyc +0 -0
- diffusion/logger/__pycache__/utils.cpython-38.pyc +0 -0
- inference/__pycache__/__init__.cpython-38.pyc +0 -0
- inference/__pycache__/infer_tool_webui.cpython-38.pyc +0 -0
- inference/__pycache__/slicer.cpython-38.pyc +0 -0
- modules/F0Predictor/__pycache__/CrepeF0Predictor.cpython-38.pyc +0 -0
- modules/F0Predictor/__pycache__/F0Predictor.cpython-38.pyc +0 -0
- modules/F0Predictor/__pycache__/FCPEF0Predictor.cpython-38.pyc +0 -0
- modules/F0Predictor/__pycache__/RMVPEF0Predictor.cpython-38.pyc +0 -0
- modules/F0Predictor/__pycache__/__init__.cpython-38.pyc +0 -0
- modules/F0Predictor/__pycache__/crepe.cpython-38.pyc +0 -0
- modules/F0Predictor/fcpe/__pycache__/__init__.cpython-38.pyc +0 -0
- modules/F0Predictor/fcpe/__pycache__/model.cpython-38.pyc +0 -0
- modules/F0Predictor/fcpe/__pycache__/nvSTFT.cpython-38.pyc +0 -0
- modules/F0Predictor/fcpe/__pycache__/pcmer.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/__init__.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/constants.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/deepunet.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/inference.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/model.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/seq.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/spec.cpython-38.pyc +0 -0
- modules/F0Predictor/rmvpe/__pycache__/utils.cpython-38.pyc +0 -0
- modules/__pycache__/DSConv.cpython-38.pyc +0 -0
- modules/__pycache__/__init__.cpython-38.pyc +0 -0
- modules/__pycache__/attentions.cpython-38.pyc +0 -0
- modules/__pycache__/commons.cpython-38.pyc +0 -0
- modules/__pycache__/enhancer.cpython-38.pyc +0 -0
- modules/__pycache__/losses.cpython-38.pyc +0 -0
- modules/__pycache__/mel_processing.cpython-38.pyc +0 -0
- modules/__pycache__/modules.cpython-38.pyc +0 -0
- preprocess_flist_config.py +14 -9
- preprocess_hubert_f0.py +31 -22
- resample.py +2 -2
- vdecoder/__pycache__/__init__.cpython-38.pyc +0 -0
- vdecoder/hifigan/__pycache__/env.cpython-38.pyc +0 -0
- vdecoder/hifigan/__pycache__/models.cpython-38.pyc +0 -0
- vdecoder/hifigan/__pycache__/utils.cpython-38.pyc +0 -0
- vdecoder/nsf_hifigan/__pycache__/env.cpython-38.pyc +0 -0
- vdecoder/nsf_hifigan/__pycache__/models.cpython-38.pyc +0 -0
- vdecoder/nsf_hifigan/__pycache__/nvSTFT.cpython-38.pyc +0 -0
- vdecoder/nsf_hifigan/__pycache__/utils.cpython-38.pyc +0 -0
- vencoder/__pycache__/ContentVec768L12.cpython-38.pyc +0 -0
- vencoder/__pycache__/__init__.cpython-38.pyc +0 -0
- vencoder/__pycache__/encoder.cpython-38.pyc +0 -0
diffusion/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/diffusion/__pycache__/__init__.cpython-38.pyc and b/diffusion/__pycache__/__init__.cpython-38.pyc differ
diffusion/__pycache__/diffusion.cpython-38.pyc
CHANGED
Binary files a/diffusion/__pycache__/diffusion.cpython-38.pyc and b/diffusion/__pycache__/diffusion.cpython-38.pyc differ
diffusion/__pycache__/unit2mel.cpython-38.pyc
CHANGED
Binary files a/diffusion/__pycache__/unit2mel.cpython-38.pyc and b/diffusion/__pycache__/unit2mel.cpython-38.pyc differ
diffusion/__pycache__/vocoder.cpython-38.pyc
CHANGED
Binary files a/diffusion/__pycache__/vocoder.cpython-38.pyc and b/diffusion/__pycache__/vocoder.cpython-38.pyc differ
diffusion/__pycache__/wavenet.cpython-38.pyc
CHANGED
Binary files a/diffusion/__pycache__/wavenet.cpython-38.pyc and b/diffusion/__pycache__/wavenet.cpython-38.pyc differ
diffusion/logger/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/diffusion/logger/__pycache__/__init__.cpython-38.pyc and b/diffusion/logger/__pycache__/__init__.cpython-38.pyc differ
diffusion/logger/__pycache__/utils.cpython-38.pyc
CHANGED
Binary files a/diffusion/logger/__pycache__/utils.cpython-38.pyc and b/diffusion/logger/__pycache__/utils.cpython-38.pyc differ
inference/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/inference/__pycache__/__init__.cpython-38.pyc and b/inference/__pycache__/__init__.cpython-38.pyc differ
inference/__pycache__/infer_tool_webui.cpython-38.pyc
CHANGED
Binary files a/inference/__pycache__/infer_tool_webui.cpython-38.pyc and b/inference/__pycache__/infer_tool_webui.cpython-38.pyc differ
inference/__pycache__/slicer.cpython-38.pyc
CHANGED
Binary files a/inference/__pycache__/slicer.cpython-38.pyc and b/inference/__pycache__/slicer.cpython-38.pyc differ
modules/F0Predictor/__pycache__/CrepeF0Predictor.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/__pycache__/CrepeF0Predictor.cpython-38.pyc and b/modules/F0Predictor/__pycache__/CrepeF0Predictor.cpython-38.pyc differ
modules/F0Predictor/__pycache__/F0Predictor.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/__pycache__/F0Predictor.cpython-38.pyc and b/modules/F0Predictor/__pycache__/F0Predictor.cpython-38.pyc differ
modules/F0Predictor/__pycache__/FCPEF0Predictor.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/__pycache__/FCPEF0Predictor.cpython-38.pyc and b/modules/F0Predictor/__pycache__/FCPEF0Predictor.cpython-38.pyc differ
modules/F0Predictor/__pycache__/RMVPEF0Predictor.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/__pycache__/RMVPEF0Predictor.cpython-38.pyc and b/modules/F0Predictor/__pycache__/RMVPEF0Predictor.cpython-38.pyc differ
modules/F0Predictor/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/__pycache__/__init__.cpython-38.pyc and b/modules/F0Predictor/__pycache__/__init__.cpython-38.pyc differ
modules/F0Predictor/__pycache__/crepe.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/__pycache__/crepe.cpython-38.pyc and b/modules/F0Predictor/__pycache__/crepe.cpython-38.pyc differ
modules/F0Predictor/fcpe/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/fcpe/__pycache__/__init__.cpython-38.pyc and b/modules/F0Predictor/fcpe/__pycache__/__init__.cpython-38.pyc differ
modules/F0Predictor/fcpe/__pycache__/model.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/fcpe/__pycache__/model.cpython-38.pyc and b/modules/F0Predictor/fcpe/__pycache__/model.cpython-38.pyc differ
modules/F0Predictor/fcpe/__pycache__/nvSTFT.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/fcpe/__pycache__/nvSTFT.cpython-38.pyc and b/modules/F0Predictor/fcpe/__pycache__/nvSTFT.cpython-38.pyc differ
modules/F0Predictor/fcpe/__pycache__/pcmer.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/fcpe/__pycache__/pcmer.cpython-38.pyc and b/modules/F0Predictor/fcpe/__pycache__/pcmer.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/__init__.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/__init__.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/constants.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/constants.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/constants.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/deepunet.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/deepunet.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/deepunet.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/inference.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/inference.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/inference.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/model.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/model.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/model.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/seq.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/seq.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/seq.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/spec.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/spec.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/spec.cpython-38.pyc differ
modules/F0Predictor/rmvpe/__pycache__/utils.cpython-38.pyc
CHANGED
Binary files a/modules/F0Predictor/rmvpe/__pycache__/utils.cpython-38.pyc and b/modules/F0Predictor/rmvpe/__pycache__/utils.cpython-38.pyc differ
modules/__pycache__/DSConv.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/DSConv.cpython-38.pyc and b/modules/__pycache__/DSConv.cpython-38.pyc differ
modules/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/__init__.cpython-38.pyc and b/modules/__pycache__/__init__.cpython-38.pyc differ
modules/__pycache__/attentions.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/attentions.cpython-38.pyc and b/modules/__pycache__/attentions.cpython-38.pyc differ
modules/__pycache__/commons.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/commons.cpython-38.pyc and b/modules/__pycache__/commons.cpython-38.pyc differ
modules/__pycache__/enhancer.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/enhancer.cpython-38.pyc and b/modules/__pycache__/enhancer.cpython-38.pyc differ
modules/__pycache__/losses.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/losses.cpython-38.pyc and b/modules/__pycache__/losses.cpython-38.pyc differ
modules/__pycache__/mel_processing.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/mel_processing.cpython-38.pyc and b/modules/__pycache__/mel_processing.cpython-38.pyc differ
modules/__pycache__/modules.cpython-38.pyc
CHANGED
Binary files a/modules/__pycache__/modules.cpython-38.pyc and b/modules/__pycache__/modules.cpython-38.pyc differ
preprocess_flist_config.py
CHANGED
@@ -5,12 +5,11 @@ import re
 import wave
 from random import shuffle
 
+from loguru import logger
 from tqdm import tqdm
 
 import diffusion.logger.utils as du
 
-config_template = json.load(open("configs_template/config_template.json"))
-
 pattern = re.compile(r'^[\.a-zA-Z0-9_\/]+$')
 
 def get_wav_duration(file_path):
@@ -30,13 +29,16 @@ if __name__ == "__main__":
     parser.add_argument("--source_dir", type=str, default="./dataset/44k", help="path to source dir")
     parser.add_argument("--speech_encoder", type=str, default="vec768l12", help="choice a speech encoder|'vec768l12','vec256l9','hubertsoft','whisper-ppg','cnhubertlarge','dphubert','whisper-ppg-large','wavlmbase+'")
     parser.add_argument("--vol_aug", action="store_true", help="Whether to use volume embedding and volume augmentation")
+    parser.add_argument("--tiny", action="store_true", help="Whether to train sovits tiny")
     args = parser.parse_args()
 
+    config_template = json.load(open("configs_template/config_tiny_template.json")) if args.tiny else json.load(open("configs_template/config_template.json"))
     train = []
     val = []
     idx = 0
     spk_dict = {}
     spk_id = 0
+
     for speaker in tqdm(os.listdir(args.source_dir)):
         spk_dict[speaker] = spk_id
         spk_id += 1
@@ -46,9 +48,9 @@ if __name__ == "__main__":
             if not file.endswith("wav"):
                 continue
             if not pattern.match(file):
-
+                logger.warning(f"文件名{file}中包含非字母数字下划线,可能会导致错误。(也可能不会)")
             if get_wav_duration(file) < 0.3:
-
+                logger.info("Skip too short audio:" + file)
                 continue
             new_wavs.append(file)
         wavs = new_wavs
@@ -59,13 +61,13 @@ if __name__ == "__main__":
     shuffle(train)
     shuffle(val)
 
-
+    logger.info("Writing" + args.train_list)
     with open(args.train_list, "w") as f:
         for fname in tqdm(train):
             wavpath = fname
             f.write(wavpath + "\n")
 
-
+    logger.info("Writing" + args.val_list)
     with open(args.val_list, "w") as f:
         for fname in tqdm(val):
             wavpath = fname
@@ -85,7 +87,7 @@ if __name__ == "__main__":
         config_template["model"]["ssl_dim"] = config_template["model"]["filter_channels"] = config_template["model"]["gin_channels"] = 768
         d_config_template["data"]["encoder_out_channels"] = 768
     elif args.speech_encoder == "vec256l9" or args.speech_encoder == 'hubertsoft':
-        config_template["model"]["ssl_dim"] = config_template["model"]["
+        config_template["model"]["ssl_dim"] = config_template["model"]["gin_channels"] = 256
         d_config_template["data"]["encoder_out_channels"] = 256
     elif args.speech_encoder == "whisper-ppg" or args.speech_encoder == 'cnhubertlarge':
         config_template["model"]["ssl_dim"] = config_template["model"]["filter_channels"] = config_template["model"]["gin_channels"] = 1024
@@ -97,8 +99,11 @@ if __name__ == "__main__":
     if args.vol_aug:
         config_template["train"]["vol_aug"] = config_template["model"]["vol_embedding"] = True
 
-
+    if args.tiny:
+        config_template["model"]["filter_channels"] = 512
+
+    logger.info("Writing to configs/config.json")
     with open("configs/config.json", "w") as f:
         json.dump(config_template, f, indent=2)
-
+    logger.info("Writing to configs/diffusion.yaml")
     du.save_config("configs/diffusion.yaml",d_config_template)
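Reviewer note on preprocess_flist_config.py: the channel-width logic touched by this change can be read as a small lookup plus two overrides. The sketch below is illustrative only and is not code from the commit. The helper name `apply_encoder_dims`, the `ENCODER_DIMS` table, and the plain-dict config are hypothetical. Only encoders whose branches are visible in the hunks are listed, and mapping `vec768l12` to 768 is an assumption, because the condition of that branch lies outside the diff context.

```python
import copy

# Encoder name -> embedding width, limited to branches visible in the diff.
# Assumption: "vec768l12" belongs to the 768 branch (its condition is not shown).
ENCODER_DIMS = {
    "vec768l12": 768,
    "vec256l9": 256,
    "hubertsoft": 256,
    "whisper-ppg": 1024,
    "cnhubertlarge": 1024,
}

# After this commit the 256 branch no longer assigns filter_channels,
# so the value from the template is left untouched for these encoders.
KEEP_TEMPLATE_FILTER = {"vec256l9", "hubertsoft"}


def apply_encoder_dims(template, encoder, tiny=False):
    """Return a copy of `template` with widths set for `encoder` (hypothetical helper)."""
    config = copy.deepcopy(template)  # do not mutate the caller's template
    dim = ENCODER_DIMS[encoder]
    model = config["model"]
    model["ssl_dim"] = model["gin_channels"] = dim
    if encoder not in KEEP_TEMPLATE_FILTER:
        model["filter_channels"] = dim
    if tiny:
        # Mirrors the new `--tiny` override, which is applied last in the diff.
        model["filter_channels"] = 512
    return config
```

With a template whose `filter_channels` is 768, calling `apply_encoder_dims(template, "vec256l9")` leaves that value alone, while `tiny=True` forces it to 512 regardless of the encoder.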
preprocess_hubert_f0.py
CHANGED
@@ -1,6 +1,5 @@
 import argparse
 import logging
-import multiprocessing
 import os
 import random
 from concurrent.futures import ProcessPoolExecutor
@@ -10,6 +9,8 @@ from random import shuffle
 import librosa
 import numpy as np
 import torch
+import torch.multiprocessing as mp
+from loguru import logger
 from tqdm import tqdm
 
 import diffusion.logger.utils as du
@@ -27,13 +28,10 @@ hop_length = hps.data.hop_length
 speech_encoder = hps["model"]["speech_encoder"]
 
 
-def process_one(filename, hmodel,f0p,diff=False,mel_extractor=None):
-    # print(filename)
+def process_one(filename, hmodel, f0p, device, diff=False, mel_extractor=None):
     wav, sr = librosa.load(filename, sr=sampling_rate)
     audio_norm = torch.FloatTensor(wav)
     audio_norm = audio_norm.unsqueeze(0)
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-
     soft_path = filename + ".soft.pt"
     if not os.path.exists(soft_path):
         wav16k = librosa.resample(wav, orig_sr=sampling_rate, target_sr=16000)
@@ -104,29 +102,34 @@ def process_one(filename, hmodel,f0p,diff=False,mel_extractor=None):
         if not os.path.exists(aug_vol_path):
             np.save(aug_vol_path,aug_vol.to('cpu').numpy())
 
-def process_batch(file_chunk, f0p, diff=False, mel_extractor=None):
-    print("Loading speech encoder for content...")
-    device = "cuda" if torch.cuda.is_available() else "cpu"
-    hmodel = utils.get_speech_encoder(speech_encoder, device=device)
-    print("Loaded speech encoder.")
 
+def process_batch(file_chunk, f0p, diff=False, mel_extractor=None, device="cpu"):
+    logger.info("Loading speech encoder for content...")
+    rank = mp.current_process()._identity
+    rank = rank[0] if len(rank) > 0 else 0
+    if torch.cuda.is_available():
+        gpu_id = rank % torch.cuda.device_count()
+        device = torch.device(f"cuda:{gpu_id}")
+    logger.info(f"Rank {rank} uses device {device}")
+    hmodel = utils.get_speech_encoder(speech_encoder, device=device)
+    logger.info(f"Loaded speech encoder for rank {rank}")
     for filename in tqdm(file_chunk):
-        process_one(filename, hmodel, f0p, diff, mel_extractor)
+        process_one(filename, hmodel, f0p, device, diff, mel_extractor)
 
-def parallel_process(filenames, num_processes, f0p, diff, mel_extractor):
+def parallel_process(filenames, num_processes, f0p, diff, mel_extractor, device):
     with ProcessPoolExecutor(max_workers=num_processes) as executor:
         tasks = []
         for i in range(num_processes):
             start = int(i * len(filenames) / num_processes)
             end = int((i + 1) * len(filenames) / num_processes)
             file_chunk = filenames[start:end]
-            tasks.append(executor.submit(process_batch, file_chunk, f0p, diff, mel_extractor))
-
+            tasks.append(executor.submit(process_batch, file_chunk, f0p, diff, mel_extractor, device=device))
         for task in tqdm(tasks):
             task.result()
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
+    parser.add_argument('-d', '--device', type=str, default=None)
     parser.add_argument(
         "--in_dir", type=str, default="dataset/44k", help="path to input dir"
     )
@@ -134,30 +137,36 @@ if __name__ == "__main__":
         '--use_diff',action='store_true', help='Whether to use the diffusion model'
     )
     parser.add_argument(
-        '--f0_predictor', type=str, default="dio", help='Select F0 predictor, can select crepe,pm,dio,harvest,rmvpe,
+        '--f0_predictor', type=str, default="dio", help='Select F0 predictor, can select crepe,pm,dio,harvest,rmvpe,fcpe|default: pm(note: crepe is original F0 using mean filter)'
     )
     parser.add_argument(
         '--num_processes', type=int, default=1, help='You are advised to set the number of processes to the same as the number of CPU cores'
     )
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
     args = parser.parse_args()
     f0p = args.f0_predictor
+    device = args.device
+    if device is None:
+        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
     print(speech_encoder)
-
-
+    logger.info("Using device: ", device)
+    logger.info("Using SpeechEncoder: " + speech_encoder)
+    logger.info("Using extractor: " + f0p)
+    logger.info("Using diff Mode: " + str( args.use_diff))
+
     if args.use_diff:
         print("use_diff")
         print("Loading Mel Extractor...")
-        mel_extractor = Vocoder(dconfig.vocoder.type, dconfig.vocoder.ckpt, device
+        mel_extractor = Vocoder(dconfig.vocoder.type, dconfig.vocoder.ckpt, device=device)
         print("Loaded Mel Extractor.")
     else:
         mel_extractor = None
     filenames = glob(f"{args.in_dir}/*/*.wav", recursive=True)  # [:10]
     shuffle(filenames)
-
+    mp.set_start_method("spawn", force=True)
 
     num_processes = args.num_processes
     if num_processes == 0:
         num_processes = os.cpu_count()
-
-    parallel_process(filenames, num_processes, f0p, args.use_diff, mel_extractor)
+
+    parallel_process(filenames, num_processes, f0p, args.use_diff, mel_extractor, device)
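Reviewer note on preprocess_hubert_f0.py: the new `process_batch` assigns each worker a GPU by taking its process rank modulo the GPU count, and `parallel_process` splits the file list into contiguous chunks. The sketch below isolates those two calculations as pure functions so they can be checked without torch or a GPU. The names `split_chunks` and `pick_device` are hypothetical and do not appear in the commit, and the GPU count is passed in as a plain integer instead of being queried from `torch.cuda.device_count()`. Note also that the commit reads the rank from `_identity`, a private attribute of the process object, so its values are an implementation detail.

```python
def split_chunks(items, num_processes):
    """Split `items` into `num_processes` contiguous chunks.

    Uses the same boundary arithmetic as the diff:
    start = int(i * n / p), end = int((i + 1) * n / p).
    """
    chunks = []
    for i in range(num_processes):
        start = int(i * len(items) / num_processes)
        end = int((i + 1) * len(items) / num_processes)
        chunks.append(items[start:end])
    return chunks


def pick_device(rank, gpu_count, fallback="cpu"):
    """Round-robin a worker rank over the available GPUs (hypothetical helper).

    With no GPUs the caller-supplied fallback is returned unchanged,
    which matches the diff leaving `device` as passed in.
    """
    if gpu_count > 0:
        return f"cuda:{rank % gpu_count}"
    return fallback
```

For ten files and three workers the chunks have sizes 3, 3 and 4, and every file lands in exactly one chunk. With two GPUs, ranks 1, 2, 3 map to `cuda:1`, `cuda:0`, `cuda:1`.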
resample.py
CHANGED
@@ -6,8 +6,8 @@ from multiprocessing import cpu_count
 
 import librosa
 import numpy as np
+from rich.progress import track
 from scipy.io import wavfile
-from tqdm import tqdm
 
 
 def load_wav(wav_path):
@@ -81,7 +81,7 @@ def process_all_speakers():
             if os.path.isdir(spk_dir):
                 print(spk_dir)
                 futures = [executor.submit(process, (spk_dir, i, args)) for i in os.listdir(spk_dir) if i.endswith("wav")]
-                for _ in
+                for _ in track(concurrent.futures.as_completed(futures), total=len(futures), description="resampling:"):
                     pass
 
 
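Reviewer note on resample.py: the loop over `concurrent.futures.as_completed(futures)` only drives the progress bar, and its body is `pass`. Because `Future.result()` is never called, an exception raised inside a worker would not surface in the parent. The sketch below shows one way to keep the completion loop while re-raising worker errors. It is a minimal illustration under stated assumptions: `run_all` is a hypothetical name, a `ThreadPoolExecutor` stands in for whatever executor the script creates outside the hunk, and the `rich` progress wrapper is omitted so the example needs only the standard library.

```python
import concurrent.futures


def run_all(func, inputs, max_workers=4):
    """Run func over inputs and return the results in completion order."""
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(func, item) for item in inputs]
        for future in concurrent.futures.as_completed(futures):
            # result() re-raises any exception from the worker,
            # so a failed item is not silently skipped.
            results.append(future.result())
    return results
```

The same pattern would apply inside a progress wrapper: replace `pass` with a call to `result()` on the yielded future.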
vdecoder/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/vdecoder/__pycache__/__init__.cpython-38.pyc and b/vdecoder/__pycache__/__init__.cpython-38.pyc differ
vdecoder/hifigan/__pycache__/env.cpython-38.pyc
CHANGED
Binary files a/vdecoder/hifigan/__pycache__/env.cpython-38.pyc and b/vdecoder/hifigan/__pycache__/env.cpython-38.pyc differ
vdecoder/hifigan/__pycache__/models.cpython-38.pyc
CHANGED
Binary files a/vdecoder/hifigan/__pycache__/models.cpython-38.pyc and b/vdecoder/hifigan/__pycache__/models.cpython-38.pyc differ
vdecoder/hifigan/__pycache__/utils.cpython-38.pyc
CHANGED
Binary files a/vdecoder/hifigan/__pycache__/utils.cpython-38.pyc and b/vdecoder/hifigan/__pycache__/utils.cpython-38.pyc differ
vdecoder/nsf_hifigan/__pycache__/env.cpython-38.pyc
CHANGED
Binary files a/vdecoder/nsf_hifigan/__pycache__/env.cpython-38.pyc and b/vdecoder/nsf_hifigan/__pycache__/env.cpython-38.pyc differ
vdecoder/nsf_hifigan/__pycache__/models.cpython-38.pyc
CHANGED
Binary files a/vdecoder/nsf_hifigan/__pycache__/models.cpython-38.pyc and b/vdecoder/nsf_hifigan/__pycache__/models.cpython-38.pyc differ
vdecoder/nsf_hifigan/__pycache__/nvSTFT.cpython-38.pyc
CHANGED
Binary files a/vdecoder/nsf_hifigan/__pycache__/nvSTFT.cpython-38.pyc and b/vdecoder/nsf_hifigan/__pycache__/nvSTFT.cpython-38.pyc differ
vdecoder/nsf_hifigan/__pycache__/utils.cpython-38.pyc
CHANGED
Binary files a/vdecoder/nsf_hifigan/__pycache__/utils.cpython-38.pyc and b/vdecoder/nsf_hifigan/__pycache__/utils.cpython-38.pyc differ
vencoder/__pycache__/ContentVec768L12.cpython-38.pyc
CHANGED
Binary files a/vencoder/__pycache__/ContentVec768L12.cpython-38.pyc and b/vencoder/__pycache__/ContentVec768L12.cpython-38.pyc differ
vencoder/__pycache__/__init__.cpython-38.pyc
CHANGED
Binary files a/vencoder/__pycache__/__init__.cpython-38.pyc and b/vencoder/__pycache__/__init__.cpython-38.pyc differ
vencoder/__pycache__/encoder.cpython-38.pyc
CHANGED
Binary files a/vencoder/__pycache__/encoder.cpython-38.pyc and b/vencoder/__pycache__/encoder.cpython-38.pyc differ