---
datasets:
- simon3000/genshin-voice
- CSTR-Edinburgh/vctk
language:
- en
---
# So-Vits-Svc Base Model V1
The base model for generating new voices with so-vits-svc voice lab.

The training data comprises 278 English-speaking speakers, drawn from four datasets:
 - Genshin Voice: only speakers with more than 30 minutes of audio
 - VCTK
 - Vocalset
 - Private scraped dataset
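
The 30-minute cutoff applied to Genshin Voice can be illustrated with a minimal sketch. It is not the script used to build this dataset: the `(speaker_id, duration_in_seconds)` clip layout, the function name `speakers_over_threshold`, and the example values are all assumptions made for illustration.

```python
from collections import defaultdict

# 30-minute threshold, as stated for the Genshin Voice subset.
MIN_SECONDS = 30 * 60


def speakers_over_threshold(clips, min_seconds=MIN_SECONDS):
    """Return the speakers whose summed clip duration exceeds min_seconds.

    clips: iterable of (speaker_id, duration_in_seconds) pairs.
    This layout is hypothetical, not the real dataset schema.
    """
    totals = defaultdict(float)
    for speaker, seconds in clips:
        totals[speaker] += seconds
    # "More than 30 minutes" is read here as a strict comparison.
    return sorted(s for s, total in totals.items() if total > min_seconds)


# Made-up example data, not taken from the real dataset.
example_clips = [
    ("speaker_a", 1200.0),
    ("speaker_a", 700.0),
    ("speaker_b", 900.0),
]
kept = speakers_over_threshold(example_clips)
print(kept)
```

In practice the durations would come from the audio files themselves; only the aggregation and the threshold check are shown here.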

The model was trained for around 4 days and 16 hours on a single RTX 3090 (61 epochs / 430k steps).
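
For a rough sense of throughput, the training figures above can be turned into per-epoch and per-second rates. Since the duration is only stated as "around" 4 days and 16 hours and the step count is rounded, the results are estimates, and the variable names are illustrative.

```python
# Figures taken from the training summary; all are approximate.
TOTAL_STEPS = 430_000
EPOCHS = 61
TRAIN_HOURS = 4 * 24 + 16  # 4 days and 16 hours

steps_per_epoch = TOTAL_STEPS / EPOCHS
steps_per_second = TOTAL_STEPS / (TRAIN_HOURS * 3600)
hours_per_epoch = TRAIN_HOURS / EPOCHS

print(f"steps per epoch:  {steps_per_epoch:.0f}")
print(f"steps per second: {steps_per_second:.2f}")
print(f"hours per epoch:  {hours_per_epoch:.2f}")
```

This works out to roughly 7,000 steps per epoch and a little over one step per second on the single GPU.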