metadata
license: mit
language:
  - ja
metrics:
  - accuracy
pipeline_tag: audio-to-audio
tags:
  - rvc

RVC Genshin Impact Japanese Voice Model

model-cover.png

About Retrieval-based Voice Conversion (RVC)

Learn more about Retrieval-based Voice Conversion at the link below: RVC WebUI

How to use?

Download the pre-zipped model and place it in your RVC project.

Model test: Google Colab / RVC Models New (which is basically the same, but hosted on Spaces)
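The download step can be sketched in Python as follows. This is a minimal sketch, not part of RVC itself: `install_model` is a hypothetical helper, and the `weights` folder name is an assumption, so check where your RVC installation expects its model files (typically a `.pth` weights file and an `.index` file).

```python
import zipfile
from pathlib import Path


def install_model(archive: Path, rvc_root: Path, weights_dir: str = "weights") -> list[Path]:
    """Extract a pre-zipped model into an RVC project folder.

    `weights_dir` is an assumed folder name; adjust it to match
    the layout of your own RVC installation.
    """
    # Give each model its own subfolder, named after the archive.
    target = rvc_root / weights_dir / archive.stem
    target.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)

    # Return the extracted files so the caller can verify the contents.
    return sorted(p for p in target.rglob("*") if p.is_file())
```

For example, `install_model(Path("character-jp.zip"), Path("path/to/rvc"))` would unpack a hypothetical `character-jp.zip` into `path/to/rvc/weights/character-jp/` and return the list of extracted files.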

INFO

Models created by ArkanDash.
The voices used in these models belong to Hoyoverse.
The voice data used to train these models was ripped from the game (versions 3.6 - 4.0).

Total Models: 46 Models

V1 Models: 19
V2 Models: 27

Duplicate:

  • Zhongli (v1 & v2)
  • Nahida (v1 & v2)

Plans:

  • Characters from Fontaine.
  • Recreating the v1 models as v2 models.

Note:

  • For Faruzan, the index file is unexpectedly small, so I might retrain the model. Error message: Converged (lack of improvement in inertia) at step 1152/48215
  • Furina's dataset is only 20 minutes long. (I will update the model once the dataset reaches 1 hour.)
  • New models will be created using v2 training; I'm no longer making v1 models.

Have a request?
I accept Genshin character requests. Other requests outside of playable characters:

  • Greater Lord Rukkhadevata: 750 Epochs, 16 Batch size, 48k Sample rate. (10-minute dataset)
  • Charlotte: 400 Epochs, 16 Batch size, 48k Sample rate. (18-minute dataset)
  • La Signora: 1k Epochs, 16 Batch size, 48k Sample rate. (8-minute dataset)

Model Training Information

V1 Model Training

This was trained on Original RVC.

Pitch extraction used Harvest.
These models were trained with 100 epochs, a batch size of 10, and a 40k sample rate (some models used a 48k sample rate).

Every V1 model was trained on roughly 30 minutes of character voice.

V2 Model Training

This was trained on Mangio-Fork RVC.

Pitch extraction used Crepe.
These models were trained with 100 epochs, a batch size of 8, and a 48k sample rate (some models used a 40k sample rate).

Every V2 model was trained on roughly 60 minutes of character voice.

Warning

I'm not responsible for the output of this model. Use wisely.