---
license: mit
language:
- ja
metrics:
- accuracy
pipeline_tag: text-to-speech
tags:
- rvc
library_name: fairseq
---

# It's not my model!
# RVC Genshin Impact Japanese Voice Model

![model-cover.png](https://huggingface.co/ArkanDash/rvc-genshin-impact/resolve/main/model-cover.png)

## About Retrieval based Voice Conversion (RVC)

Learn more about Retrieval based Voice Conversion at the link below:
[RVC WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)

## How to use?

Download the pre-zipped model and put it into your RVC project.

Model test: [Google Colab](https://colab.research.google.com/drive/110kiMZTdP6Ri1lY9-NbQf17GVPPhHyeT?usp=sharing) / [RVC Models New](https://huggingface.co/spaces/ArkanDash/rvc-models-new) (the same demo, hosted on Spaces)
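If you want to script the download instead of grabbing the zip by hand, here is a minimal sketch using `huggingface_hub`. The `repo_id` and zip filename are placeholders, so substitute the actual repository and the character archive you want.

```python
# Minimal sketch: fetch one character zip and unpack it into an RVC project.
# The repo_id and filename are placeholders, not guaranteed paths.
import zipfile
from huggingface_hub import hf_hub_download

zip_path = hf_hub_download(
    repo_id="ArkanDash/rvc-genshin-impact",  # placeholder repository id
    filename="model/nahida-jp.zip",          # placeholder path inside the repo
)

# The archive contains the .pth weights and the .index file; extract them
# into your RVC project's weights folder.
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall("weights/nahida-jp")
```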
## INFO

Model created by ArkanDash.
The voice that was used in this model belongs to Hoyoverse.
The voice data used to make these models was ripped from the game (versions 3.6 - 4.0).

#### Total Models: 46 Models

V1 Models: 19
V2 Models: 27

Duplicate:
- Zhongli (v1 & v2)
- Nahida (v1 & v2)

Plans:
- Characters from Fontaine.
- v2 model recreation from v1 models.

Note:
- For Faruzan, the index file is unexpectedly small; I might retrain Faruzan. Error message: `Converged (lack of improvement in inertia) at step 1152/48215`
- Furina has only 20 minutes of dataset. (Will update the model in the future once it is 1 hour long.)
- New models will be created using v2 training; I'm no longer making v1 models.

Have a request?
I accept Genshin character requests if you want one.

Requests outside the playable characters:
- Greater Lord Rukkhadevata: 750 epochs, batch size 16, 48k sample rate. (10-minute dataset)
- Charlotte: 400 epochs, batch size 16, 48k sample rate. (18-minute dataset)
- La Signora: 1k epochs, batch size 16, 48k sample rate. (8-minute dataset)
## Model Training Information

### V1 Model Training
##### This was trained on the original RVC. Pitch extraction uses Harvest.
These models were trained for 100 epochs with a batch size of 10 and a 40k sample rate (some models used a 48k sample rate). Every V1 model was trained on roughly 30 minutes of character voice.
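For reference, Harvest is the f0 estimator from the WORLD vocoder. Below is a minimal sketch of extracting a pitch contour with the `pyworld` package; it only illustrates the pitch method named above, not RVC's actual training pipeline, and the audio file name is a placeholder.

```python
# Illustrative Harvest f0 extraction with pyworld (WORLD vocoder bindings).
import numpy as np
import soundfile as sf
import pyworld as pw

audio, sr = sf.read("sample_40k.wav")  # placeholder: a mono clip from the dataset
audio = audio.astype(np.float64)       # pyworld expects float64 samples

# Harvest returns the f0 contour plus the frame times (5 ms hop by default).
f0, times = pw.harvest(audio, sr, f0_floor=50.0, f0_ceil=1100.0)
print(f"{np.count_nonzero(f0)} voiced frames out of {len(f0)}")
```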
### V2 Model Training

##### This was trained on Mangio-Fork RVC. Pitch extraction uses Crepe.
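Crepe is a neural-network f0 estimator. For comparison with the Harvest sketch above, here is a minimal sketch using the standalone `crepe` package; it only illustrates the pitch method, not the Mangio fork's feature-extraction code, and the audio file name is a placeholder.

```python
# Illustrative Crepe f0 extraction with the standalone `crepe` package.
import crepe
import soundfile as sf

audio, sr = sf.read("sample_48k.wav")  # placeholder: a mono clip from the dataset

# Crepe resamples to 16 kHz internally; viterbi smoothing gives a cleaner contour.
time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)
for t, f, c in zip(time[:5], frequency[:5], confidence[:5]):
    print(f"t={t:.2f}s  f0={f:.1f} Hz  conf={c:.2f}")
```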
These models were trained for 100 epochs with a batch size of 8 and a 48k sample rate (some models used a 40k sample rate). Every V2 model was trained on roughly 60 minutes of character voice.

## Warning

I'm not responsible for the output of this model. Use wisely.