data_prep
This directory contains the following data preparation scripts:
- MFA data preparation: Code for extracting phone alignments with the Montréal Forced Aligner (MFA).
- Style prompt data preparation: Code for preparing synthetic annotations of style prompts.
0. Download LibriTTS_R
Before running any scripts, place the LibriTTS-R dataset in ./LibriTTS_R. You must have the following directory structure:
LibriTTS_R/
├── BOOKS.txt
├── CHAPTERS.txt
├── LICENSE.txt
├── NOTE.txt
├── README_librispeech.txt
├── README_libritts.txt
├── README_libritts_r.txt
├── SPEAKERS.txt
├── dev-clean
├── dev-other
├── reader_book.tsv
├── speakers.tsv
├── test-clean
├── test-other
├── train-clean-100
├── train-clean-360
└── train-other-500
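As a quick sanity check before running the scripts, the layout above can be verified with a few lines of Python. This is a minimal sketch: the function name `missing_subsets` is hypothetical and not part of this repository, and it only checks the subset directories listed in the tree, not the metadata files.

```python
from pathlib import Path

# Subset directories listed in the tree above.
EXPECTED_SUBSETS = [
    "dev-clean", "dev-other",
    "test-clean", "test-other",
    "train-clean-100", "train-clean-360", "train-other-500",
]


def missing_subsets(root="./LibriTTS_R"):
    """Return the expected subset directories that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_SUBSETS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subsets()
    if missing:
        print("Missing subsets:", ", ".join(missing))
    else:
        print("All expected LibriTTS-R subsets found.")
```

An empty result means every subset directory is in place.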
1. MFA data preparation
Setup for MFA
conda install -c conda-forge montreal-forced-aligner
mfa model download dictionary english_us_arpa
mfa model download acoustic english_us_arpa
Usage
See runall_mfa.sh for usage.
Note that running MFA for all the utterances in LibriTTS-R takes a long time (likely a few days).
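For orientation, the sketch below builds a single `mfa align` invocation using the dictionary and acoustic model downloaded during setup. It follows the general form `mfa align CORPUS_DIRECTORY DICTIONARY ACOUSTIC_MODEL OUTPUT_DIRECTORY`. The helper name and the example paths are hypothetical, and runall_mfa.sh may organize the corpus and call MFA differently.

```python
import shlex


def build_mfa_align_command(corpus_dir, output_dir,
                            dictionary="english_us_arpa",
                            acoustic_model="english_us_arpa"):
    """Return the argv list for one `mfa align` run on a corpus directory."""
    return ["mfa", "align", str(corpus_dir), dictionary, acoustic_model, str(output_dir)]


# Hypothetical paths, used only to show the shape of the command.
cmd = build_mfa_align_command("corpus/100", "aligned/100")
print(shlex.join(cmd))
```

To actually run the alignment, the list can be passed to `subprocess.run(cmd, check=True)` inside the conda environment where MFA is installed.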
Directory structure
After all the data preparation steps, the following directories will be created:
- libritts_r_per_spk_cleaned
  - ${spk}
    - textgrid: text grid files
    - wav24k: 24 kHz wav files
├── 100
│   ├── textgrid
│   └── wav24k
├── 1001
│   ├── textgrid
│   └── wav24k
├── 1006
│   ├── textgrid
│   └── wav24k
...
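Once the preparation has finished, it can be useful to confirm that each speaker directory holds matching numbers of alignment and audio files. The sketch below is one way to do that. The function name is hypothetical, and the file extensions (`.TextGrid` and `.wav`) are assumptions about the output, not something this README specifies.

```python
from pathlib import Path


def summarize_speakers(root="libritts_r_per_spk_cleaned"):
    """Map each speaker ID to (number of TextGrid files, number of wav files)."""
    summary = {}
    for spk_dir in sorted(Path(root).iterdir()):
        if not spk_dir.is_dir():
            continue
        # Assumption: alignments end in .TextGrid and audio ends in .wav.
        n_textgrid = len(list((spk_dir / "textgrid").glob("*.TextGrid")))
        n_wav = len(list((spk_dir / "wav24k").glob("*.wav")))
        summary[spk_dir.name] = (n_textgrid, n_wav)
    return summary
```

Speakers whose two counts differ are candidates for a closer look, for example utterances that MFA failed to align.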
2. Style prompt data preparation
Code for estimating per-utterance style tags (e.g., low pitch, normal pitch and high pitch) from the data statistics.
Usage
See runall_style_prompt_tags.sh for usage.
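To illustrate what deriving tags "from the data statistics" can look like, the sketch below assigns low / normal / high labels by splitting a per-utterance measurement at the corpus terciles. This is an assumption made for illustration only: the actual features, thresholds, and grouping (for example, per speaker or per gender) used by the scripts are not described here, and the function name and sample values are hypothetical.

```python
from statistics import quantiles


def tag_by_terciles(values, labels=("low", "normal", "high")):
    """Assign one of three labels to each value using corpus-level terciles."""
    # quantiles(..., n=3) returns the two cut points between the three bins.
    low_cut, high_cut = quantiles(values, n=3)
    tags = []
    for value in values:
        if value < low_cut:
            tags.append(labels[0])
        elif value > high_cut:
            tags.append(labels[2])
        else:
            tags.append(labels[1])
    return tags


# Hypothetical mean-F0 values in Hz, one per utterance.
mean_f0 = [95.0, 110.0, 120.0, 150.0, 180.0, 210.0]
print(tag_by_terciles(mean_f0))
```

The same function could be applied to other per-utterance statistics by passing a different `labels` tuple.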