---
datasets:
- simon3000/genshin-voice
- CSTR-Edinburgh/vctk
language:
- en
---

# So-Vits-Svc Base Model V1

The base model for generating new voices with so-vits-svc voice lab.

The training data comprised 278 English-speaking speakers. Four datasets were used:

- Genshin Voice: only speakers with more than 30 minutes of audio
- VCTK
- VocalSet
- A private scraped dataset

The model was trained for around 4 days and 16 hours on a single RTX 3090 (61 epochs / 430k steps).
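
The "more than 30 minutes of audio" rule used for the Genshin Voice speakers can be illustrated with a minimal sketch. This is not the preprocessing code actually used for this model: the function name `speakers_over_threshold`, the `(speaker, duration_in_seconds)` input format, and the example data are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_SECONDS = 30 * 60  # "more than 30 minutes of audio", as stated in this model card


def speakers_over_threshold(clips, min_seconds=MIN_SECONDS):
    """Return the sorted names of speakers whose total audio exceeds min_seconds.

    `clips` is an iterable of (speaker, duration_in_seconds) pairs.
    This input format is hypothetical, not the dataset's real schema.
    """
    totals = defaultdict(float)
    for speaker, seconds in clips:
        totals[speaker] += seconds
    # Strictly greater than the threshold, matching "more than 30 minutes".
    return sorted(name for name, total in totals.items() if total > min_seconds)


# Hypothetical example data, not taken from the real dataset.
example_clips = [
    ("speaker_a", 1200.0),
    ("speaker_a", 900.0),   # total 2100 s, kept
    ("speaker_b", 1800.0),  # exactly 30 minutes, dropped
    ("speaker_c", 600.0),   # total 600 s, dropped
]
print(speakers_over_threshold(example_clips))
```

Summing durations per speaker before filtering means a speaker with many short clips is kept as long as the total passes the threshold.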