GTFO_Character_Voice_models
About The Project
This project provides trained models for audio-to-audio voice conversion using so-vits-svc 4.1. Training a model is resource intensive, but inferring an audio file with a trained model is not, which is why the trained models are shared here. Every model included has been trained for at least 20K steps.
This repo includes:
- All data sets used to train the models
- Default model
- Diffusion model
- Fusion model
- Samples of the models
Dataset Source:
Trained with so-vits-svc
Getting Started
To use the models, follow the setup instructions for so-vits-svc, or for so-vits-svc-fork if you want a better GUI and easier inference. No training is required.
Dragging the repo's folders into the so-vits-svc folder should work right away; otherwise, move each model into its designated folder based on the layout below (a file-copying sketch follows the tree).
so-vits-svc-4.1
│
├───configs
│   ├───config.json - config file for default training
│   └───diffusion.yaml - config file for diffusion training
│
└───logs
    └───44k
        ├───G_(name of character).pth - default model
        ├───(name of character)Kmeans.pt - fusion model
        └───diffusion
            └───(name of character).pt - diffusion model for the character

data_set - dataset used for training; audio cut into slices.
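If you prefer scripting the copy over dragging folders, the following is a minimal sketch (not part of the repo). The paths `GTFO_Character_Voice_models` and `so-vits-svc-4.1` and the character name `Dauda` are placeholders; adjust them to your own clone location, so-vits-svc installation, and the actual file names, which follow the `(name of character)` pattern shown above.

```python
# Sketch: copy the models and configs from this repo into the
# so-vits-svc layout shown above. Paths/names are assumptions.
import shutil
from pathlib import Path

REPO = Path("GTFO_Character_Voice_models")  # local clone of this repo (assumed path)
SVC = Path("so-vits-svc-4.1")               # your so-vits-svc installation (assumed path)
CHARACTER = "Dauda"                         # example character name

# Configs go into so-vits-svc's configs folder.
for cfg in ("config.json", "diffusion.yaml"):
    shutil.copy(REPO / "configs" / cfg, SVC / "configs" / cfg)

# Default and fusion (cluster) models go into logs/44k.
dest_44k = SVC / "logs" / "44k"
(dest_44k / "diffusion").mkdir(parents=True, exist_ok=True)
shutil.copy(REPO / "logs" / "44k" / f"G_{CHARACTER}.pth",
            dest_44k / f"G_{CHARACTER}.pth")
shutil.copy(REPO / "logs" / "44k" / f"{CHARACTER}Kmeans.pt",
            dest_44k / f"{CHARACTER}Kmeans.pt")

# Diffusion model goes into logs/44k/diffusion.
shutil.copy(REPO / "logs" / "44k" / "diffusion" / f"{CHARACTER}.pt",
            dest_44k / "diffusion" / f"{CHARACTER}.pt")
```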
Usage
Select the default model, the diffusion model, the fusion model, and their respective config files. Note: update the speaker name in the config file to avoid key errors; the trained speaker names are "Hacket_data_set", "Dauda_data_set", "Bishop_data_set", and "Woods_data_set". If that does not work, try running the pre-processing step on a folder with those names, then the pre-configuration step, so that all configs are set to the same voice name (a config-editing sketch follows the notes below).
- Fusion model = cluster model
- You might not see the diffusion option, as it is a new feature that is only available in some versions of so-vits-svc-fork.
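As an alternative to editing the config by hand, here is a minimal sketch for renaming the speaker key. It assumes the config.json stores speakers under a top-level "spk" mapping of name to id (typical for so-vits-svc 4.x configs); the path and the names `Dauda_data_set`/`Dauda` are examples you should adjust.

```python
# Sketch: rename the trained speaker key in config.json so it matches
# the name you select at inference time. "spk" layout is an assumption.
import json
from pathlib import Path

CONFIG = Path("so-vits-svc-4.1/configs/config.json")  # assumed location
OLD_NAME = "Dauda_data_set"  # speaker name used during training (from this repo)
NEW_NAME = "Dauda"           # name you want to use at inference time

cfg = json.loads(CONFIG.read_text(encoding="utf-8"))
spk = cfg.get("spk", {})
if OLD_NAME in spk:
    spk[NEW_NAME] = spk.pop(OLD_NAME)  # keep the same speaker id under the new key
    cfg["spk"] = spk
    CONFIG.write_text(json.dumps(cfg, indent=2, ensure_ascii=False), encoding="utf-8")
    print(f"Renamed speaker {OLD_NAME!r} -> {NEW_NAME!r}")
else:
    print(f"Speaker {OLD_NAME!r} not found; available: {list(spk)}")
```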
License
Distributed under the MIT License. See LICENSE.txt
for more information.
If you use these models, please attach a link to this repo.
Contact
NAinfini - na.infini@gmail.com
NA infini#6457 - Discord
Project Link: GTFO_Character_Voice_models