PyTorch
Catalan
TTS
audio
synthesis
VITS
speech
coqui.ai
Baybars committed on
Commit 309b5b6
1 Parent(s): 29a3fd7

new model added

README.md CHANGED
@@ -14,18 +14,21 @@ tags:
 - pytorch
 
 datasets:
- - mozilla-foundation/common_voice_8_0
- - openslr
+ - mozilla-foundation/common_voice_12_0
+ - projecte-aina/festcat_trimmed_denoised
+ - projecte-aina/openslr-slr69-ca-trimmed-denoised
 
 ---
 
 # Aina Project's Catalan multi-speaker text-to-speech model
 ## Model description
 
- This model was trained from scratch using the [Coqui TTS](https://github.com/coqui-ai/TTS) toolkit on a combination of 3 datasets: [Festcat](http://festcat.talp.cat/devel.php), high quality open speech dataset of [Google](http://openslr.org/69/) (can be found in [OpenSLR 69](https://huggingface.co/datasets/openslr/viewer/SLR69/train)) and [Common Voice v8](https://commonvoice.mozilla.org/ca). For the training, 101460 utterances consisting of 257 speakers were used, which corresponds to nearly 138 hours of speech.
+ This model was trained from scratch using the [Coqui TTS](https://github.com/coqui-ai/TTS) toolkit on a combination of 3 datasets: [Festcat](http://festcat.talp.cat/devel.php), [OpenSLR69](http://openslr.org/69/) and [Common Voice v12](https://commonvoice.mozilla.org/ca). For the training, we used 487 hours of recordings from 255 speakers. We trimmed and denoised the data; the processed versions of all datasets except Common Voice are available separately as [festcat_trimmed_denoised](projecte-aina/festcat_trimmed_denoised) and [openslr69_trimmed_denoised](projecte-aina/openslr-slr69-ca-trimmed-denoised).
 
 A live inference demo can be found in our spaces, [here](https://huggingface.co/spaces/projecte-aina/tts-ca-coqui-vits-multispeaker).
 
+ The model needs our fork of [espeak-ng](https://github.com/projecte-aina/espeak-ng) to work correctly. For installation and deployment, please consult the Dockerfile of our [inference demo](https://huggingface.co/spaces/projecte-aina/tts-ca-coqui-vits-multispeaker/blob/main/Dockerfile).
+
 ## Intended uses and limitations
 
 You can use this model to generate synthetic speech in Catalan with different voices.
@@ -33,7 +36,7 @@ You can use this model to generate synthetic speech in Catalan with different vo
 ## How to use
 ### Usage
 
- Requiered libraries:
+ Required libraries:
 
 ```bash
 pip install git+https://github.com/coqui-ai/TTS@dev#egg=TTS
@@ -70,8 +73,6 @@ wavs = synthesizer.tts(text, speaker_idx)
 ## Training
 ### Training Procedure
 ### Data preparation
- The data has been processed using the script [process_data.sh](https://huggingface.co/projecte-aina/tts-ca-coqui-vits-multispeaker/blob/main/data_processing/process_data.sh), which reduces the sampling frequency of the audios, eliminates silences, adds padding and structures the data in the format accepted by the framework. You can find more information [here](https://huggingface.co/projecte-aina/tts-ca-coqui-vits-multispeaker/blob/main/data_processing/README.md).
-
 ### Hyperparameter
 
 The model is based on VITS proposed by [Kim et al](https://arxiv.org/abs/2106.06103). The following hyperparameters were set in the coqui framework.
@@ -106,7 +107,7 @@ The model was trained for 730962 steps.
 ## Additional information
 
 ### Author
- Text Mining Unit (TeMU) at the Barcelona Supercomputing Center (bsc-temu@bsc.es)
+ Language Technologies Unit (LangTech) at the Barcelona Supercomputing Center
 
 ### Contact information
 For further information, send an email to aina@bsc.es
@@ -119,7 +120,7 @@ Copyright (c) 2022 Text Mining Unit at Barcelona Supercomputing Center
 [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
 ### Funding
- This work was funded by the [Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
+ This work was funded by the [Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en)) within the framework of [Projecte AINA](https://projecteaina.cat).
 
 
 ## Disclaimer
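
For readers skimming this diff, the hunk header `wavs = synthesizer.tts(text, speaker_idx)` hints at how the card drives the model. The snippet below is a minimal sketch of that inference flow, not the card's full example: the `huggingface_hub` download step, the local file names and the speaker id placeholder are assumptions, a speakers file may additionally be required, and Catalan phonemization only works once the projecte-aina espeak-ng fork mentioned above is installed.

```python
# Minimal inference sketch (assumed download step and speaker id; see the model card for the complete example).
from huggingface_hub import hf_hub_download
from TTS.utils.synthesizer import Synthesizer

repo = "projecte-aina/tts-ca-coqui-vits-multispeaker"
model_path = hf_hub_download(repo_id=repo, filename="model/best_model.pth")
config_path = hf_hub_download(repo_id=repo, filename="model/config.json")

# Depending on the checkpoint, a speakers file may also need to be passed via tts_speakers_file=...
synthesizer = Synthesizer(
    tts_checkpoint=model_path,
    tts_config_path=config_path,
    use_cuda=False,
)

text = "Bon dia, com esteu?"
speaker_idx = "some_speaker_id"  # placeholder: replace with one of the speaker names shipped with the model
wavs = synthesizer.tts(text, speaker_idx)       # same call the README's usage section builds up to
synthesizer.save_wav(wavs, "tts_output.wav")
```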
data_processing/README.md DELETED
@@ -1,40 +0,0 @@
- # Data preparation
-
- Scripts to process [festcat](http://festcat.talp.cat/devel.php) and [google_tts](http://openslr.org/69/) datasets, to make them compatible with training of modern TTS architectures
-
- ## Requirements
- `sox`, `ffmpeg`
-
- ### Processing steps
-
- #### Downloads
- Download [festcat](http://festcat.talp.cat/devel.php) and [google_tts](http://openslr.org/69/)
-
- #### Variables definition
-
- Open the shell script `.../data_processing/process_data.sh` and modify the following fields:
-
- ```bash
- ### Festcat variables ###
- export PATH_TO_FESTCAT_SHELL='.../data_processing/festcat_processing_test.sh' # Absolute path to festcat_processing_test.sh script
- export PATH_TO_FESTCAT_PY='.../data_processing/extract_festcat.py' # Absolute path to extract_festcat.py script
- export PATH_TO_FESTCAT_DATA='.../festcat/' # Path to Festcat dataset
- export FESTCAT_FINAL_PATH='.../festcat_processed' # Path where preprocessed Festcat will be stored
-
- ### Google_tts variables ###
- export PATH_TO_GOOGLE_TTS_SHELL='.../data_processing/google_tts_processing_test.sh' # Absolute path to google_tts_processing_test.sh script
- export PATH_TO_GOOGLE_TTS_PY='.../data_processing/extract_google_tts.py' # Absolute path to extract_google_tts.py script
- export PATH_TO_GOOGLE_TTS_DATA='.../google_tts' # Path to Google TTS dataset
- export GOOGLE_TTS_FINAL_PATH='.../google_tts_processed' # Path where preprocessed Google TTS will be stored
-
- ### General variables ###
- export VCTK_FORMATER_PATH='.../data_processing/ca_multi2vckt.py' # Absolute path to ca_multi2vckt.py script
- export FINAL_PATH='.../multispeaker_ca_test/' # Path where preprocessed and vctk formatted datasets will be stored.
- ```
- #### Run preprocessing
-
- Once the variables are correctly defined, execute the following command in the terminal:
-
- `sh <...>/data_processing/process_data.sh`
-
- The processed data in vctk format will be in the directory defined in `export FINAL_PATH='.../multispeaker_ca_test/'`.
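
The shell scripts removed below implement this pipeline. Condensed to its per-file operations (file names here are placeholders), the processing amounts to the following sketch, which assumes `sox` and `ffmpeg` are on the PATH; the commands are taken from the deleted scripts.

```bash
# Per-file audio processing performed by the removed scripts (placeholder file names).
sox -t raw -r 48k -e signed -b 16 -c 1 utt.raw utt_48k.wav    # Festcat only: raw PCM -> wav
ffmpeg -i utt_48k.wav -ar 22050 utt_22k.wav                   # downsample to 22.05 kHz
sox utt_22k.wav utt_trimmed.wav silence 1 0.02 0.5% reverse silence 1 0.02 0.5% reverse  # trim leading/trailing silence
sox utt_trimmed.wav utt_final.wav pad 0 0.058                 # add a short trailing pad
```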
data_processing/ca_multi2vckt.py DELETED
@@ -1,152 +0,0 @@
1
- import os
2
- import re
3
- import argparse
4
- from glob import glob
5
- from pathlib import Path
6
- from subprocess import call
7
-
8
- def main():
9
- my_parser = argparse.ArgumentParser()
10
- my_parser.add_argument('--google-path',
11
- metavar='path',
12
- type=str,
13
- help='the path to tsv file')
14
- my_parser.add_argument('--festcat-path',
15
- metavar='path',
16
- type=str,
17
- help='the path to wavs file')
18
- #my_parser.add_argument('--cv-path',
19
- # metavar='path',
20
- # type=str,
21
- # help='the path to wavs file')
22
- my_parser.add_argument('--final-path',
23
- metavar='path',
24
- type=str,
25
- help='the path to wavs file')
26
- args = my_parser.parse_args()
27
- google_path = args.google_path
28
- festcat_path = args.festcat_path
29
- #common_voice_path = args.cv_path
30
- target_base_path = args.final_path
31
-
32
- google_tts_male = google_path + "/male/"
33
- google_tts_female = google_path + "/female/"
34
- google_tts_paths = [google_tts_male, google_tts_female]
35
-
36
- #google_tts_paths = ["/gpfs/scratch/bsc88/bsc88858/google_tts/male/","/gpfs/scratch/bsc88/bsc88858/google_tts/female/"]
37
- #festcat_path = "/gpfs/scratch/bsc88/bsc88858/festcat/"
38
- #common_voice_path = "/gpfs/scratch/bsc88/bsc88858/cv-corpus-9.0-2022-04-27/ca/"
39
- #target_base_path = "/gpfs/scratch/bsc88/bsc88474/data/multispeaker_ca/"
40
-
41
- if os.path.exists(google_path):
42
- print("Converting google_tts data to vctk format")
43
- convert_google(google_tts_paths, target_base_path)
44
- else:
45
- print("Google_tts processed data not found")
46
-
47
- if os.path.exists(festcat_path):
48
- print("Converting festcat data to vctk format")
49
- convert_festcat(festcat_path, target_base_path)
50
- else:
51
- print("Festcat processed data not found")
52
-
53
- #convert_cv(common_voice_path, target_base_path)
54
-
55
- def convert_google(google_tts_paths, target_base_path):
56
- for g_path in google_tts_paths[:1]:
57
- meta_files = glob(f"{g_path}/*_*.txt")
58
- for meta_file in meta_files:
59
- print(meta_file)
60
- for line in open(meta_file).readlines():
61
- text_id, text = line.strip().split('|')
62
- text.replace('¿','')
63
- text.replace('¡','')
64
- #speaker_id = '_'.join(text_id.split('_')[:2])
65
- speaker_id = text_id.split('_')[1]
66
- target_text_file = os.path.join(target_base_path, 'txt',
67
- speaker_id, text_id+'.txt')
68
- target_wav_file = os.path.join(target_base_path, 'wav',
69
- speaker_id, text_id+'.wav')
70
- source_wav_file = os.path.join(g_path, 'wavs', text_id+'.wav')
71
-
72
- speaker_paths = [os.path.dirname(target_text_file),
73
- os.path.dirname(target_wav_file)]
74
-
75
- convert_meta(target_text_file, target_wav_file,
76
- source_wav_file, speaker_paths, text)
77
-
78
- def convert_meta(target_text_file,
79
- target_wav_file,
80
- source_wav_file,
81
- speaker_paths, text):
82
-
83
- # create directories
84
- for speaker_path in speaker_paths:
85
- if not os.path.isdir(speaker_path):
86
- os.mkdir(speaker_path)
87
-
88
- # write text file
89
- with open(target_text_file, 'w') as out:
90
- out.write(text)
91
-
92
- # copy wav file
93
- try:
94
- os.path.isfile(source_wav_file)
95
- except:
96
- raise IOError('{} does not exist'.format(source_wav_file))
97
-
98
- cp_args = ['cp', source_wav_file, target_wav_file]
99
- if not os.path.isfile(target_wav_file):
100
- #print(' '.join(cp_args))
101
- call(cp_args)
102
-
103
- def convert_festcat(festcat_path, target_base_path):
104
- meta_files = glob(f"{festcat_path}/*/*_train.txt")
105
- for meta_file in meta_files:
106
- speaker_name = meta_file.split(os.sep)[-2]
107
- print(meta_file)
108
- for line in open(meta_file).readlines():
109
- if '[' not in line:
110
- text_id, text = line.strip().split('|')
111
- text.replace('¿','')
112
- text.replace('¡','')
113
- #speaker_id = '_'.join(text_id.split('_')[:3])
114
- speaker_id = speaker_name
115
- target_text_file = os.path.join(target_base_path, 'txt',
116
- speaker_id, text_id+'.txt')
117
- target_wav_file = os.path.join(target_base_path, 'wav',
118
- speaker_id, text_id+'.wav')
119
- source_wav_file = os.path.join(festcat_path, speaker_name,
120
- 'wavs', text_id+'.wav')
121
-
122
- speaker_paths = [os.path.dirname(target_text_file),
123
- os.path.dirname(target_wav_file)]
124
-
125
- convert_meta(target_text_file, target_wav_file,
126
- source_wav_file, speaker_paths, text)
127
- else:
128
- print('line: {} skipped'.format(line))
129
-
130
- def convert_cv(common_voice_path, target_base_path):
131
- meta_files = glob(f"{common_voice_path}/*.txt")
132
- for meta_file in meta_files:
133
- print(meta_file)
134
- speaker_id = meta_file.split(os.sep)[-1].replace("ca_","").replace(".txt","")
135
- for line in open(meta_file).readlines():
136
- text_id, text = line.strip().split('|')
137
-
138
- target_text_file = os.path.join(target_base_path, 'txt',
139
- speaker_id, text_id+'.txt')
140
- target_wav_file = os.path.join(target_base_path, 'wav',
141
- speaker_id, text_id+'.wav')
142
- source_wav_file = os.path.join(common_voice_path,
143
- 'wavs', text_id+'.wav')
144
-
145
- speaker_paths = [os.path.dirname(target_text_file),
146
- os.path.dirname(target_wav_file)]
147
-
148
- convert_meta(target_text_file, target_wav_file,
149
- source_wav_file, speaker_paths, text)
150
-
151
- if __name__ == "__main__":
152
- main()
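
After preprocessing, `ca_multi2vckt.py` (above) lays the corpus out in the VCTK-style structure that the data-processing README calls "the format accepted by the framework": one transcript and one wav per utterance, grouped by speaker. A small sketch of the path construction used in `convert_meta`; the base path and IDs are illustrative only.

```python
# Illustrative only: mirrors how ca_multi2vckt.py builds its VCTK-style target paths.
import os

target_base_path = "/data/multispeaker_ca"   # assumed location (FINAL_PATH in process_data.sh)
speaker_id = "pau"                           # example Festcat speaker
text_id = "upc_ca_pau_203479"                # example utterance id from the Festcat naming scheme

txt_file = os.path.join(target_base_path, "txt", speaker_id, text_id + ".txt")  # transcript
wav_file = os.path.join(target_base_path, "wav", speaker_id, text_id + ".wav")  # audio
print(txt_file)  # /data/multispeaker_ca/txt/pau/upc_ca_pau_203479.txt
print(wav_file)  # /data/multispeaker_ca/wav/pau/upc_ca_pau_203479.wav
```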
data_processing/extract_festcat.py DELETED
@@ -1,139 +0,0 @@
1
- import os
2
- import re
3
- import json
4
- import subprocess
5
- import argparse
6
- import logging
7
-
8
- logger = logging.getLogger(__name__)
9
-
10
- def main():
11
- my_parser = argparse.ArgumentParser()
12
- my_parser.add_argument('--utterance-path',
13
- metavar='path',
14
- type=str,
15
- help='the path to utterance file')
16
- my_parser.add_argument('--wavs-path',
17
- metavar='path',
18
- type=str,
19
- help='the path to wavs file')
20
- my_parser.add_argument('--locutors',
21
- metavar='N',
22
- type=str,
23
- help='list of speakers names/id separated with commas')
24
- args = my_parser.parse_args()
25
- locutors = args.locutors
26
- locutors = locutors.replace(" ", "");
27
- locutors = locutors.split(",")
28
- utterance_path = args.utterance_path
29
- wavs_path = args.wavs_path
30
-
31
- for locutor in locutors:
32
- # get durations
33
- durations = get_durations_dict(wavs_path + '%s_sil_stats.csv'%locutor)
34
- aggregate_duration = 0
35
- rejected_duration = 0
36
- large_duration = 0
37
- total_duration = 0
38
- path = 'upc_ca_%s_utt/utt'%locutor
39
- path = utterance_path + path
40
-
41
- files = []
42
- long_files = []
43
- for filename in os.listdir(path):
44
- sentence = get_sentence(os.path.join(path, filename))
45
- audio_filename = filename.replace('.utt','.wav') # upc_ca_pep_203479.wav
46
- if sentence:
47
- target_path = 'upc_ca_%s_wav_22k_sil_pad'%locutor
48
- target_path = wavs_path + target_path
49
- source_filename = 'upc_ca_%s_wav_22k_sil/'%locutor+audio_filename
50
- source_filename = wavs_path + source_filename
51
- total_duration += durations[audio_filename]
52
-
53
- if os.path.isfile(source_filename):
54
- if durations[audio_filename] < 10.0:
55
- aggregate_duration += durations[audio_filename]
56
- files.append((os.path.join(target_path,audio_filename), sentence))
57
- #subprocess.call(['cp',source_filename, target_filename])
58
- else:
59
- long_files.append((audio_filename, sentence))
60
- large_duration += durations[audio_filename]
61
- else:
62
- print(audio_filename)
63
- else:
64
- rejected_duration += durations[audio_filename]
65
- out(args, locutor, files)
66
- out_long(args, locutor, long_files)
67
- out_long_json(args, locutor, long_files)
68
- print(locutor, aggregate_duration/3600, 'hours')
69
- print(locutor, 'rejected due to duration', large_duration/3600, 'hours')
70
- print(locutor, 'rejected', rejected_duration/60, 'minutes')
71
- print(locutor, total_duration, aggregate_duration+rejected_duration+large_duration)
72
-
73
- def get_durations_dict(filename):
74
- durations = {}
75
-
76
- for line in open(filename).readlines():
77
- d = line.split(',')
78
- durations[d[0].split('/')[-1]] = float(d[1])
79
- return durations
80
-
81
- def get_sentence(filename):
82
- utt_all = open(filename, encoding = "ISO-8859-1").read()
83
- m = re.search('(\"\\\\\")(.+)(\\\\\"\")', utt_all)
84
- sentence = m.groups()[1]
85
- # delete interword dashes
86
- sentence = re.sub('-(?=([A-Z]))', ' ', sentence)
87
- if not re.search('\d', sentence):
88
- return sentence
89
- else:
90
- #print(filename, sentence)
91
- return None
92
-
93
- def out(args, locutor, files):
94
-
95
- outname_length = [('upc_%s_test.txt'%locutor,0),
96
- ('upc_%s_val.txt'%locutor,0),
97
- ('upc_%s_train.txt'%locutor,len(files))]
98
- l_sum = sum([el[1] for el in outname_length])
99
- if len(files) != l_sum:
100
- msg = 'train vs test val distribution wrong: %i'%l_sum
101
- raise ValueError('msg')
102
-
103
- for fout, l in outname_length:
104
- open((args.wavs_path + fout), mode= 'a').close()
105
- logger.warning(f"fout: {fout}")
106
- logger.warning(f"l: {l}")
107
- logger.warning(f"Enable l: {len(files)-100}")
108
- logger.warning(f"Files: {files}")
109
- with open((args.wavs_path + fout), 'w') as out:
110
- for i in range(l):
111
- f, sentence = files.pop()
112
- out.write('%s|%s\n'%(f.split("/")[-1].split(".")[-2],sentence))
113
-
114
- def out_long(args, locutor, files):
115
- outname = '%s_longsentences.csv'%locutor
116
- outname_path = args.wavs_path + outname
117
- open(outname_path, mode= 'a').close()
118
- with open(outname_path, 'w') as out:
119
- for audio, text in files:
120
- out.write('%s,"%s"\n'%(audio, text))
121
-
122
- def out_long_json(args, locutor, files):
123
- outname = '%s_longsentences.json'%locutor
124
- source = args.wavs_path +'upc_ca_%s_wav_22k_sil/'%locutor
125
- outname_path = args.wavs_path + outname
126
- open(outname_path, mode= 'a').close()
127
- interventions = []
128
- for audio, text in files:
129
- intervention = {}
130
- intervention['text'] = [(locutor, text)]
131
- intervention['urls'] = [(locutor, os.path.join(source,audio))]
132
- interventions.append(intervention)
133
-
134
- with open(outname_path, 'w') as out:
135
- json.dump({'session': interventions}, out, indent=2)
136
-
137
- if __name__ == "__main__":
138
- main()
139
-
data_processing/extract_google_tts.py DELETED
@@ -1,168 +0,0 @@
1
- import os
2
- import re
3
- import json
4
- import argparse
5
- import logging
6
- import csv
7
- import numpy as np
8
-
9
- logger = logging.getLogger(__name__)
10
-
11
- def main():
12
- my_parser = argparse.ArgumentParser()
13
- my_parser.add_argument('--tsv-path',
14
- metavar='path',
15
- type=str,
16
- help='the path to tsv file')
17
- my_parser.add_argument('--wavs-path',
18
- metavar='path',
19
- type=str,
20
- help='the path to wavs file')
21
- my_parser.add_argument('--locutors',
22
- metavar='N',
23
- type=str,
24
- help='list of speakers names/id separated with commas')
25
- args = my_parser.parse_args()
26
- locutors = args.locutors
27
- locutors = locutors.replace(" ", "");
28
- locutors = locutors.split(",")
29
- tsv_path = args.tsv_path
30
- wavs_path = args.wavs_path
31
-
32
- for locutor in locutors:
33
- # get durations
34
- durations = get_durations_dict(wavs_path + '%s_sil_stats.csv'%locutor)
35
- aggregate_duration = 0
36
- rejected_duration = 0
37
- large_duration = 0
38
- total_duration = 0
39
- tsv_name = "line_index_%s.tsv"%locutor
40
- tsv_path = tsv_path + tsv_name
41
-
42
- tsv_file = open(tsv_path)
43
- read_tsv = csv.reader(tsv_file, delimiter="\t")
44
- files = []
45
- long_files = []
46
- for row in read_tsv:
47
- audio_filename = row[0] + ".wav"
48
- #logger.warning(f"Audio_filename {audio_filename}")
49
- sentence = row[-1]
50
- if sentence:
51
- target_path = 'ca_es_%s_22k_sil_pad'%locutor
52
- target_path = wavs_path + target_path
53
- source_filename = 'ca_es_%s_22k_sil/'%locutor+audio_filename ###
54
- source_filename = wavs_path + source_filename
55
- #logger.warning(f"source_filename {source_filename}")
56
- total_duration += durations[audio_filename]
57
- if os.path.isfile(source_filename):
58
- if durations[audio_filename] < 10.0:
59
- aggregate_duration += durations[audio_filename]
60
- files.append((os.path.join(target_path,audio_filename), sentence))
61
- #subprocess.call(['cp',source_filename, target_filename])
62
- else:
63
- long_files.append((audio_filename, sentence))
64
- large_duration += durations[audio_filename]
65
- else:
66
- print(audio_filename)
67
- else:
68
- rejected_duration += durations[audio_filename]
69
-
70
- speakers_id = find_speakers_id(wavs_path + '%s_sil_stats.csv'%locutor)
71
- for id in speakers_id:
72
- speaker_file = files_spliter(files = files, speaker_id = id)
73
- if len(speaker_file) == 0:
74
- continue
75
- else:
76
- out(args, speaker_id = id, files = speaker_file)
77
- #print(f"mv {wavs_path}ca_{id}_test.txt {wavs_path}{locutor}")
78
- #os.system(f"mv {wavs_path}ca_{id}_test.txt {wavs_path}{locutor}")
79
- #os.system(f"mv {wavs_path}ca_{id}_val.txt {wavs_path}{locutor}")
80
- #os.system(f"mv {wavs_path}ca_{id}_train.txt {wavs_path}{locutor}")
81
- #out(args, locutor, files)
82
- out_long(args, locutor, long_files)
83
- out_long_json(args, locutor, long_files)
84
- print(locutor, aggregate_duration/3600, 'hours')
85
- print(locutor, 'rejected due to duration', large_duration/3600, 'hours')
86
- print(locutor, 'rejected', rejected_duration/60, 'minutes')
87
- print(locutor, total_duration, aggregate_duration+rejected_duration+large_duration)
88
-
89
- def get_durations_dict(filename):
90
- durations = {}
91
- for line in open(filename).readlines():
92
- d = line.split(',')
93
- durations[d[0].split('/')[-1]] = float(d[1])
94
- return durations
95
-
96
- def get_sentence(filename):
97
- utt_all = open(filename, encoding = "ISO-8859-1").read()
98
- m = re.search('(\"\\\\\")(.+)(\\\\\"\")', utt_all)
99
- sentence = m.groups()[1]
100
- # delete interword dashes
101
- sentence = re.sub('-(?=([A-Z]))', ' ', sentence)
102
- if not re.search('\d', sentence):
103
- return sentence
104
- else:
105
- print(filename, sentence)
106
- return None
107
-
108
- def out(args, speaker_id, files):
109
- outname_length = [('ca_%s_test.txt'%speaker_id,0),
110
- ('ca_%s_val.txt'%speaker_id,0),
111
- ('ca_%s_train.txt'%speaker_id,len(files))]
112
- l_sum = sum([el[1] for el in outname_length])
113
- if len(files) != l_sum:
114
- msg = 'train vs test val distribution wrong: %i'%l_sum
115
- raise ValueError('msg')
116
-
117
- for fout, l in outname_length:
118
- open((args.wavs_path + fout), mode= 'a').close()
119
- with open((args.wavs_path + fout), 'w') as out:
120
- for i in range(l):
121
- f, sentence = files.pop()
122
- out.write('%s|%s\n'%(f.split("/")[-1].split(".")[-2],sentence))
123
- print(len(files))
124
-
125
- def out_long(args, locutor, files):
126
- outname = '%s_longsentences.csv'%locutor
127
- outname_path = args.wavs_path + outname
128
- open(outname_path, mode= 'a').close()
129
- with open(outname_path, 'w') as out:
130
- for audio, text in files:
131
- out.write('%s,"%s"\n'%(audio, text))
132
-
133
- def out_long_json(args, locutor, files):
134
- outname = '%s_longsentences.json'%locutor
135
- source = args.wavs_path +'ca_es_%s_22k_sil/'%locutor
136
- outname_path = args.wavs_path + outname
137
- open(outname_path, mode= 'a').close()
138
- interventions = []
139
- for audio, text in files:
140
- intervention = {}
141
- intervention['text'] = [(locutor, text)]
142
- intervention['urls'] = [(locutor, os.path.join(source,audio))]
143
- interventions.append(intervention)
144
-
145
- with open(outname_path, 'w') as out:
146
- json.dump({'session': interventions}, out, indent=2)
147
-
148
- def find_speakers_id(path_tsv):
149
- durations = {}
150
- for line in open(path_tsv).readlines():
151
- d = line.split(',')
152
- durations[d[0].split('/')[-1]] = float(d[1])
153
- keysList = list(durations.keys())
154
- for index in range(len(keysList)):
155
- keysList[index] = keysList[index].split("_")[1]
156
- keysList = np.ndarray.tolist(np.unique(np.array(keysList)))
157
- return keysList
158
-
159
- def files_spliter(files, speaker_id):
160
- out_file = []
161
- for element in files:
162
- if element[0].split("/")[-1].split("_")[1] == speaker_id:
163
- out_file.append(element)
164
- return out_file
165
-
166
- if __name__ == "__main__":
167
- main()
168
-
data_processing/festcat_processing_test.sh DELETED
@@ -1,152 +0,0 @@
1
- #!/bin/sh
2
-
3
-
4
- export FINAL_PATH=$1
5
- export SOURCE_PATH=$2
6
- export EXTRACT_PATH=$3
7
-
8
-
9
- module load gcc/8.3.0 cuda/10.2 cudnn/7.6.4 nccl/2.4.8 tensorrt/6.0.1 openmpi/4.0.1 atlas scalapack/2.0.2 fftw/3.3.8 szip/2.1.1 ffmpeg/4.2.1 opencv/4.1.1 szip/2.1.1 ffmpeg/4.2.1 opencv/4.1.1 python/3.7.4_ML torch/1.9.0a0 fairseq/2021-10-04 llvm/10.0.1 mecab/0.996
10
-
11
- for name in bet eli eva jan mar ona pau pep pol teo uri
12
- do
13
- echo "Processing $name data"
14
- export SPEAKER_NAME=$name
15
- export OUTPUT_CSV="${FINAL_PATH}/${SPEAKER_NAME}/${SPEAKER_NAME}_sil_stats.csv"
16
- export UTTERANCE_PATH="${SOURCE_PATH}/${SPEAKER_NAME}/"
17
-
18
- if [ -d "${FINAL_PATH}" ]; then
19
- ### Take action if $DIR exists ###
20
- echo "Path ${FINAL_PATH} already created"
21
- else
22
- ### Control will jump here if $DIR does NOT exists ###
23
- mkdir ${FINAL_PATH}
24
- echo "Crating: ${FINAL_PATH} "
25
- fi
26
-
27
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}" ]; then
28
- ### Take action if $DIR exists ###
29
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME} already created"
30
- else
31
- ### Control will jump here if $DIR does NOT exists ###
32
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}
33
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME} "
34
- fi
35
-
36
-
37
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav" ]; then
38
- ### Take action if $DIR exists ###
39
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav already created"
40
- else
41
- ### Control will jump here if $DIR does NOT exists ###
42
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav
43
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav "
44
- fi
45
-
46
-
47
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav/)" ]; then
48
- i=1
49
- sp="/-\|"
50
- for f in ${SOURCE_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_raw/recordings/*.raw; do
51
- t=${f%.raw}.wav; g=${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav/${t##*/}; sox -t raw -r 48k -e signed -b 16 -c 1 $f $g;
52
- printf "\r Converiting .raw audios to .wav ${sp:i++%${#sp}:1}"
53
- sleep 0.05
54
- done
55
- else
56
- echo "Already converted to .wav"
57
- fi
58
-
59
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k" ]; then
60
- ### Take action if $DIR exists ###
61
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k already created"
62
- else
63
- ### Control will jump here if $DIR does NOT exists ###
64
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k
65
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k "
66
- fi
67
-
68
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k/)" ]; then
69
- i=1
70
- sp="/-\|"
71
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav/*.wav; do
72
- t=${f##*/}; ffmpeg -i $f -ar 22050 ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k/$t -v error < /dev/null;
73
- printf "\r Converiting audios of 48kHz to 22kHz ${sp:i++%${#sp}:1}"
74
- sleep 0.05
75
- done;
76
- else
77
- echo "Already converted to 22kHz file"
78
- fi
79
-
80
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil" ]; then
81
- ### Take action if $DIR exists ###
82
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil already created"
83
- else
84
- ### Control will jump here if $DIR does NOT exists ###
85
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil
86
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil "
87
- fi
88
-
89
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil/)" ]; then
90
- i=1
91
- sp="/-\|"
92
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k/*.wav; do
93
- t=${f##*/}; sox $f ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil/$t silence 1 0.02 0.5% reverse silence 1 0.02 0.5% reverse;
94
- printf "\r Filtering silence ${sp:i++%${#sp}:1}"
95
- sleep 0.05
96
- done
97
- else
98
- echo "Silence already eliminated"
99
- fi
100
-
101
- if [ -f "${OUTPUT_CSV}" ]; then
102
- ### Take action if $DIR exists ###
103
- echo "${OUTPUT_CSV} already exists!"
104
- else
105
- ### Control will jump here if $DIR does NOT exists ###
106
- echo "Crating ${OUTPUT_CSV}"
107
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil/*.wav; do
108
- d=`ffprobe -i $f -show_entries format=duration -v quiet -of csv="p=0"`;
109
- echo $f,$d;
110
- done >> ${OUTPUT_CSV}
111
- fi
112
-
113
- if [ -f "${FINAL_PATH}/${SPEAKER_NAME}/upc_${SPEAKER_NAME}_train.txt" ]; then
114
- ### Take action if $DIR exists ###
115
- echo "Splits already created!"
116
- else
117
- ### Control will jump here if $DIR does NOT exists ###
118
- echo "Crating splits..."
119
- python ${EXTRACT_PATH} --wavs-path ${FINAL_PATH}/${SPEAKER_NAME}/ --utterance-path ${UTTERANCE_PATH} --locutors ${SPEAKER_NAME}
120
- fi
121
-
122
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil_pad" ]; then
123
- ### Take action if $DIR exists ###
124
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil_pad already created"
125
- else
126
- ### Control will jump here if $DIR does NOT exists ###
127
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/wavs
128
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/wavs"
129
- fi
130
-
131
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/wavs/)" ]; then
132
- i=1
133
- sp="/-\|"
134
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil/*.wav; do
135
- t=${f##*/}; sox $f ${FINAL_PATH}/${SPEAKER_NAME}/wavs/$t pad 0 0.058;
136
- printf "\r Adding pad ${sp:i++%${#sp}:1}"
137
- sleep 0.05
138
- done
139
- else
140
- echo "Pad already added!"
141
- fi
142
-
143
- rm -r ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k_sil
144
- rm -r ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav_22k
145
- rm -r ${FINAL_PATH}/${SPEAKER_NAME}/upc_ca_${SPEAKER_NAME}_wav
146
-
147
- done
148
- echo "Done!"
149
-
150
-
151
-
152
-
data_processing/google_tts_processing_test.sh DELETED
@@ -1,124 +0,0 @@
1
- #!/bin/sh
2
-
3
-
4
- export FINAL_PATH=$1
5
- export SOURCE_PATH=$2
6
- export EXTRACT_PATH=$3
7
-
8
-
9
-
10
- module load gcc/8.3.0 cuda/10.2 cudnn/7.6.4 nccl/2.4.8 tensorrt/6.0.1 openmpi/4.0.1 atlas scalapack/2.0.2 fftw/3.3.8 szip/2.1.1 ffmpeg/4.2.1 opencv/4.1.1 szip/2.1.1 ffmpeg/4.2.1 opencv/4.1.1 python/3.7.4_ML torch/1.9.0a0 fairseq/2021-10-04 llvm/10.0.1 mecab/0.996
11
-
12
- for name in male female
13
- do
14
- export SPEAKER_NAME=$name
15
- export OUTPUT_CSV="${FINAL_PATH}/${SPEAKER_NAME}/${SPEAKER_NAME}_sil_stats.csv"
16
- export UTTERANCE_PATH="${SOURCE_PATH}/${SPEAKER_NAME}/"
17
-
18
- if [ -d "${FINAL_PATH}" ]; then
19
- ### Take action if $DIR exists ###
20
- echo "Path ${FINAL_PATH} already created"
21
- else
22
- ### Control will jump here if $DIR does NOT exists ###
23
- mkdir ${FINAL_PATH}
24
- echo "Crating: ${FINAL_PATH} "
25
- fi
26
-
27
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}" ]; then
28
- ### Take action if $DIR exists ###
29
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME} already created"
30
- else
31
- ### Control will jump here if $DIR does NOT exists ###
32
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}
33
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME} "
34
- fi
35
-
36
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k" ]; then
37
- ### Take action if $DIR exists ###
38
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k already created"
39
- else
40
- ### Control will jump here if $DIR does NOT exists ###
41
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k
42
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k "
43
- fi
44
-
45
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k/)" ]; then
46
- i=1
47
- sp="/-\|"
48
- for f in ${SOURCE_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}/*.wav; do
49
- t=${f##*/}; ffmpeg -i $f -ar 22050 ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k/$t -v error < /dev/null;
50
- printf "\r Converiting audios of 48kHz to 22kHz ${sp:i++%${#sp}:1}"
51
- sleep 0.05
52
- done;
53
- else
54
- echo "Already converted to 22kHz file"
55
- fi
56
-
57
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil" ]; then
58
- ### Take action if $DIR exists ###
59
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil already created"
60
- else
61
- ### Control will jump here if $DIR does NOT exists ###
62
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil
63
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil "
64
- fi
65
-
66
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil/)" ]; then
67
- i=1
68
- sp="/-\|"
69
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k/*.wav; do
70
- t=${f##*/}; sox $f ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil/$t silence 1 0.02 0.5% reverse silence 1 0.02 0.5% reverse;
71
- printf "\r Filtering silence ${sp:i++%${#sp}:1}"
72
- sleep 0.05
73
- done
74
- else
75
- echo "Silence has already been filtered!"
76
- fi
77
-
78
- if [ -f "${OUTPUT_CSV}" ]; then
79
- ### Take action if $DIR exists ###
80
- echo "${OUTPUT_CSV} already exists!"
81
- else
82
- ### Control will jump here if $DIR does NOT exists ###
83
- echo "Crating ${OUTPUT_CSV}"
84
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil/*.wav; do
85
- d=`ffprobe -i $f -show_entries format=duration -v quiet -of csv="p=0"`;
86
- echo $f,$d;
87
- done >> ${OUTPUT_CSV}
88
- fi
89
-
90
- if [ -f "${FINAL_PATH}/${SPEAKER_NAME}/ca_01591_train.txt" ]; then
91
- ### Take action if $DIR exists ###
92
- echo "Splits already created!"
93
- else
94
- ### Control will jump here if $DIR does NOT exists ###
95
- echo "Crating splits..."
96
- python ${EXTRACT_PATH} --wavs-path ${FINAL_PATH}/${SPEAKER_NAME}/ --tsv-path ${SOURCE_PATH}/${SPEAKER_NAME}/ --locutors ${SPEAKER_NAME}
97
- fi
98
-
99
- if [ -d "${FINAL_PATH}/${SPEAKER_NAME}/wavs" ]; then
100
- ### Take action if $DIR exists ###
101
- echo "Path ${FINAL_PATH}/${SPEAKER_NAME}/wavs"
102
- else
103
- ### Control will jump here if $DIR does NOT exists ###
104
- mkdir ${FINAL_PATH}/${SPEAKER_NAME}/wavs
105
- echo "Crating: ${FINAL_PATH}/${SPEAKER_NAME}/wavs"
106
- fi
107
-
108
- if [ -z "$(ls -A ${FINAL_PATH}/${SPEAKER_NAME}/wavs/)" ]; then
109
- i=1
110
- sp="/-\|"
111
- for f in ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil/*.wav; do
112
- t=${f##*/}; sox $f ${FINAL_PATH}/${SPEAKER_NAME}/wavs/$t pad 0 0.058;
113
- printf "\r Adding pad ${sp:i++%${#sp}:1}"
114
- sleep 0.05
115
- done
116
- else
117
- echo "Pad already added!"
118
- fi
119
-
120
- rm -r ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k_sil
121
- rm -r ${FINAL_PATH}/${SPEAKER_NAME}/ca_es_${SPEAKER_NAME}_22k
122
-
123
- done
124
- echo "Done!"
data_processing/process_data.sh DELETED
@@ -1,56 +0,0 @@
1
- #!/bin/bash
2
-
3
- ### Festcat variables ###
4
- export PATH_TO_FESTCAT_SHELL='/gpfs/scratch/bsc88/bsc88858/data_processing/festcat_processing_test.sh'
5
- export PATH_TO_FESTCAT_PY='/gpfs/scratch/bsc88/bsc88858/data_processing/extract_festcat.py'
6
- export PATH_TO_FESTCAT_DATA='/gpfs/scratch/bsc88/bsc88858/festcat/'
7
- export FESTCAT_FINAL_PATH='/gpfs/scratch/bsc88/bsc88858/festcat_processed'
8
-
9
- ### Google_tts variables ###
10
- export PATH_TO_GOOGLE_TTS_SHELL='/gpfs/scratch/bsc88/bsc88858/data_processing/google_tts_processing_test.sh'
11
- export PATH_TO_GOOGLE_TTS_PY='/gpfs/scratch/bsc88/bsc88858/data_processing/extract_google_tts.py'
12
- export PATH_TO_GOOGLE_TTS_DATA='/gpfs/scratch/bsc88/bsc88858/google_tts'
13
- export GOOGLE_TTS_FINAL_PATH='/gpfs/scratch/bsc88/bsc88858/google_tts_processed'
14
-
15
- ### General variables ###
16
- export VCTK_FORMATER_PATH='/gpfs/scratch/bsc88/bsc88858/data_processing/ca_multi2vckt.py'
17
- export FINAL_PATH='/gpfs/scratch/bsc88/bsc88858/multispeaker_ca_test/'
18
-
19
-
20
- if [ -d "${FESTCAT_FINAL_PATH}" ]; then
21
- ### Take action if $DIR exists ###
22
- echo "Path ${FESTCAT_FINAL_PATH} already exists"
23
- else
24
- ### Control will jump here if $DIR does NOT exists ###
25
- if [ -d "${PATH_TO_FESTCAT_DATA}" ]; then
26
- source ${PATH_TO_FESTCAT_SHELL} ${FESTCAT_FINAL_PATH} ${PATH_TO_FESTCAT_DATA} ${PATH_TO_FESTCAT_PY}
27
- else
28
- echo "Fescat data not found!"
29
- fi
30
- fi
31
-
32
- if [ -d "${GOOGLE_TTS_FINAL_PATH}" ]; then
33
- ### Take action if $DIR exists ###
34
- echo "Path ${GOOGLE_TTS_FINAL_PATH} already exists"
35
- else
36
- ### Control will jump here if $DIR does NOT exists ###
37
- if [ -d "${PATH_TO_GOOGLE_TTS_DATA}" ]; then
38
- source ${PATH_TO_GOOGLE_TTS_SHELL} ${GOOGLE_TTS_FINAL_PATH} ${PATH_TO_GOOGLE_TTS_DATA} ${PATH_TO_GOOGLE_TTS_PY}
39
- else
40
- echo "Google TTS data not found!"
41
- fi
42
- fi
43
-
44
- if [ -d "${FINAL_PATH}" ]; then
45
- ### Take action if $DIR exists ###
46
- echo "Path ${FINAL_PATH} already created"
47
- else
48
- ### Control will jump here if $DIR does NOT exists ###
49
- mkdir ${FINAL_PATH}
50
- mkdir ${FINAL_PATH}/txt/
51
- mkdir ${FINAL_PATH}/wav/
52
- echo "Crating: ${FINAL_PATH}"
53
- python ${VCTK_FORMATER_PATH} --google-path ${GOOGLE_TTS_FINAL_PATH} --festcat-path ${FESTCAT_FINAL_PATH} --final-path ${FINAL_PATH}
54
- fi
55
-
56
- echo "Done!"
model/best_model.pth CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:b15fa7d2052bada1cf421e49d2d03b00e95b49fcd0e42b7af1d92da2880cdecc
- size 1038659133
+ oid sha256:7281afd683f92a46feb9068f5dcd96038b0b64b453deee25d147064b34e2dbcf
+ size 1040801013
model/config.json CHANGED
Binary files a/model/config.json and b/model/config.json differ