freyza committed on
Commit
1cfc047
1 Parent(s): 88509ba

Delete README.md

Files changed (1)
  1. README.md +0 -204
README.md DELETED
@@ -1,204 +0,0 @@
1
- # AICoverGen
2
- An autonomous pipeline to create covers with any RVC v2 trained AI voice from YouTube videos or a local audio file. For developers who want to add singing functionality to their AI assistant/chatbot/vtuber, or for people who want to hear their favourite characters sing their favourite song.
3
-
4
- Showcase: https://www.youtube.com/watch?v=2qZuE4WM7CM
5
-
6
- Setup Guide: https://www.youtube.com/watch?v=pdlhk4vVHQk
7
-
8
- ![](images/webui_generate.png?raw=true)
9
-
10
- WebUI is under constant development and testing, but you can try it out right now both locally and on Colab!
11
-
12
- ## Changelog
13
-
14
- - WebUI for easier conversions and downloading of voice models
15
- - Support for cover generations from a local audio file
16
- - Option to keep generated intermediate files, e.g. isolated vocals/instrumentals
17
- - Download suggested public voice models from table with search/tag filters
18
- - Support for Pixeldrain download links for voice models
19
- - Implement new rmvpe pitch extraction technique for faster and higher quality vocal conversions
20
- - Volume control for AI main vocals, backup vocals and instrumentals
21
- - Index Rate for Voice conversion
22
- - Reverb Control for AI main vocals
23
- - Local network sharing option for webui
24
- - Extra RVC options - filter_radius, rms_mix_rate, protect
25
- - Local file upload via file browser option
26
- - Upload of locally trained RVC v2 models via WebUI
27
- - Pitch detection method control, e.g. rmvpe/mangio-crepe
28
- - Pitch change for vocals and instrumentals together. Same effect as changing key of song in Karaoke.
29
- - Audio output format option: wav or mp3.
30
-
31
- ## Update AICoverGen to latest version
32
-
33
- Pull the latest changes and install any new requirements by opening a command line window in the `AICoverGen` directory and running the following commands.
34
-
35
- ```
36
- git pull
37
- pip install -r requirements.txt
38
- ```
39
-
40
- For Colab users, simply click `Runtime` in the top navigation bar of the Colab notebook and `Disconnect and delete runtime` in the dropdown menu.
41
- Then follow the instructions in the notebook to run the webui.
42
-
43
- ## Colab notebook
44
-
45
- For those without a powerful enough NVIDIA GPU, you may try AICoverGen out using Google Colab.
46
-
47
- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/SociallyIneptWeeb/AICoverGen/blob/main/AICoverGen_colab.ipynb)
48
-
49
- For those who want to run this locally, follow the setup guide below.
50
-
51
- ## Setup
52
-
53
- ### Install Git and Python
54
-
55
- Follow the instructions [here](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) to install Git on your computer. Also follow this [guide](https://realpython.com/installing-python/) to install Python **VERSION 3.9** if you haven't already. Using other versions of Python may result in dependency conflicts.
56
-
57
- ### Install ffmpeg
58
-
59
- Follow the instructions [here](https://www.hostinger.com/tutorials/how-to-install-ffmpeg) to install ffmpeg on your computer.
60
-
61
- ### Install sox
62
-
63
- Follow the instructions [here](https://www.tutorialexample.com/a-step-guide-to-install-sox-sound-exchange-on-windows-10-python-tutorial/) to install sox and add it to your Windows path environment.
64
-
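Before cloning, it can help to confirm that the prerequisites above are visible from the command line. The following is a minimal Python sketch and is not part of AICoverGen: the helper names `missing_tools` and `is_supported_python` are hypothetical, and it assumes the executables are called `git`, `ffmpeg` and `sox` on your system.

```python
import shutil
import sys

# Assumed executable names for the tools installed in the steps above.
REQUIRED_TOOLS = ("git", "ffmpeg", "sox")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def is_supported_python(version_info=sys.version_info):
    """Return True for Python 3.9, the version the README asks for."""
    return tuple(version_info[:2]) == (3, 9)


if __name__ == "__main__":
    print("Missing tools:", missing_tools() or "none")
    print("Running Python 3.9:", is_supported_python())
```

If `sox` shows up as missing on Windows, the most likely cause is that its folder was not added to the path environment variable.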
65
- ### Clone AICoverGen repository
66
-
67
- Open a command line window and run these commands to clone this entire repository and install the additional dependencies required.
68
-
69
- ```
70
- git clone https://github.com/SociallyIneptWeeb/AICoverGen
71
- cd AICoverGen
72
- pip install -r requirements.txt
73
- ```
74
-
75
- ### Download required models
76
-
77
- Run the following command to download the required MDXNET vocal separation models and the HuBERT base model.
78
-
79
- ```
80
- python src/download_models.py
81
- ```
82
-
83
-
84
- ## Usage with WebUI
85
-
86
- To run the AICoverGen WebUI, run the following command.
87
-
88
- ```
89
- python src/webui.py
90
- ```
91
-
92
- | Flag | Description |
93
- |--------------------------------------------|-------------|
94
- | `-h`, `--help` | Show this help message and exit. |
95
- | `--share` | Create a public URL. This is useful for running the web UI on Google Colab. |
96
- | `--listen` | Make the web UI reachable from your local network. |
97
- | `--listen-host LISTEN_HOST` | The hostname that the server will use. |
98
- | `--listen-port LISTEN_PORT` | The listening port that the server will use. |
99
-
100
- Once the following output message `Running on local URL: http://127.0.0.1:7860` appears, you can click on the link to open a tab with the WebUI.
101
-
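If you launch the WebUI from a script rather than watching the console, one way to tell that it is up is to probe the port. This is a rough sketch using only the Python standard library; the function name `is_port_open` is hypothetical, and the defaults simply mirror the `http://127.0.0.1:7860` URL quoted above. A successful connection only shows that something is listening on the port, not that it is AICoverGen.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("WebUI port reachable:", is_port_open())
```

When `--listen-host` or `--listen-port` is used, pass the same values to `is_port_open`.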
102
- ### Download RVC models via WebUI
103
-
104
- ![](images/webui_dl_model.png?raw=true)
105
-
106
- Navigate to the `Download model` tab, and paste the download link to the RVC model and give it a unique name.
107
- You may search the [AI Hub Discord](https://discord.gg/aihub), where pre-trained voice models are available for download. Refer to the examples to see what the download link should look like.
108
- The downloaded zip file should contain the .pth model file and an optional .index file.
109
-
110
- Once the 2 input fields are filled in, simply click `Download`! Once the output message says `[NAME] Model successfully downloaded!`, you should be able to use it in the `Generate` tab after clicking the refresh models button!
111
-
112
- ### Upload RVC models via WebUI
113
-
114
- ![](images/webui_upload_model.png?raw=true)
115
-
116
- This option is for people who have trained RVC v2 models locally and would like to use them for AI cover generations.
117
- Navigate to the `Upload model` tab, and follow the instructions.
118
- Once the output message says `[NAME] Model successfully uploaded!`, you should be able to use it in the `Generate` tab after clicking the refresh models button!
119
-
120
-
121
- ### Running the pipeline via WebUI
122
-
123
- ![](images/webui_generate.png?raw=true)
124
-
125
- - From the Voice Models dropdown menu, select the voice model to use. Click `Update` if you added the files manually to the [rvc_models](rvc_models) directory to refresh the list.
126
- - In the song input field, copy and paste the link to any song on YouTube or the full path to a local audio file.
127
- - Pitch should be set to either -12, 0, or 12 depending on the original vocals and the RVC AI model. This ensures the voice is not *out of tune*.
128
- - Other advanced options for Voice conversion and audio mixing can be viewed by clicking the accordion arrow to expand.
129
-
130
- Once all Main Options are filled in, click `Generate` and the AI generated cover should appear in less than a few minutes, depending on your GPU.
131
-
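The -12/0/12 advice can be checked with a little arithmetic. Assuming the WebUI pitch value is in semitones (which those values suggest, though the README does not state the unit here), equal temperament gives a frequency ratio of 2^(n/12) for a shift of n semitones. A shift of 12 is exactly one octave, so every note keeps its name and the vocals stay in key with the instrumental. The helper name below is illustrative only.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    for shift in (-12, 0, 12, 5):
        print(f"{shift:+d} semitones -> frequency x {semitones_to_ratio(shift):.4f}")
```

A shift such as 5 semitones gives a ratio that is not a power of two, which is why shifting the vocals alone by that amount clashes with the unshifted instrumental.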
132
- ## Usage with CLI
133
-
134
- ### Manual Download of RVC models
135
-
136
- Unzip (if needed) and transfer the `.pth` and `.index` files to a new folder in the [rvc_models](rvc_models) directory. Each folder should only contain one `.pth` and one `.index` file.
137
-
138
- The directory structure should look something like this:
139
- ```
140
- ├── rvc_models
141
- │ ├── John
142
- │ │ ├── JohnV2.pth
143
- │ │ └── added_IVF2237_Flat_nprobe_1_v2.index
144
- │ ├── May
145
- │ │ ├── May.pth
146
- │ │ └── added_IVF2237_Flat_nprobe_1_v2.index
147
- │ ├── MODELS.txt
148
- │ └── hubert_base.pt
149
- ├── mdxnet_models
150
- ├── song_output
151
- └── src
152
- ```
153
-
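The layout rule above (one `.pth` file per voice folder, plus one `.index` file, which the download section describes as optional) can be checked with a short script. This is a sketch under those assumptions, not code from the repository; `check_rvc_models` is a hypothetical name.

```python
from pathlib import Path


def check_rvc_models(root):
    """Return {folder_name: [problems]} for voice folders that break the layout."""
    problems = {}
    # Only sub-folders are voices; loose files such as MODELS.txt are ignored.
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        pth_count = len(list(folder.glob("*.pth")))
        index_count = len(list(folder.glob("*.index")))
        issues = []
        if pth_count != 1:
            issues.append(f"expected exactly one .pth file, found {pth_count}")
        if index_count > 1:
            issues.append(f"expected at most one .index file, found {index_count}")
        if issues:
            problems[folder.name] = issues
    return problems


if __name__ == "__main__":
    # Run from the AICoverGen directory; assumes rvc_models exists there.
    print(check_rvc_models("rvc_models") or "All voice folders look fine.")
```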
154
- ### Running the pipeline
155
-
156
- To run the AI cover generation pipeline using the command line, run the following command.
157
-
158
- ```
159
- python src/main.py [-h] -i SONG_INPUT -dir RVC_DIRNAME -p PITCH_CHANGE [-k | --keep-files | --no-keep-files] [-ir INDEX_RATE] [-fr FILTER_RADIUS] [-rms RMS_MIX_RATE] [-palgo PITCH_DETECTION_ALGO] [-hop CREPE_HOP_LENGTH] [-pro PROTECT] [-mv MAIN_VOL] [-bv BACKUP_VOL] [-iv INST_VOL] [-pall PITCH_CHANGE_ALL] [-rsize REVERB_SIZE] [-rwet REVERB_WETNESS] [-rdry REVERB_DRYNESS] [-rdamp REVERB_DAMPING] [-oformat OUTPUT_FORMAT]
160
- ```
161
-
162
- | Flag | Description |
163
- |--------------------------------------------|-------------|
164
- | `-h`, `--help` | Show this help message and exit. |
165
- | `-i SONG_INPUT` | Link to a song on YouTube or path to a local audio file. Should be enclosed in double quotes for Windows and single quotes for Unix-like systems. |
166
- | `-dir RVC_DIRNAME` | Name of folder in [rvc_models](rvc_models) directory containing your `.pth` and `.index` files for a specific voice. |
167
- | `-p PITCH_CHANGE` | Change pitch of AI vocals in octaves. Set to 0 for no change. Generally, use 1 for male to female conversions and -1 for vice-versa. |
168
- | `-k` | Optional. Can be added to keep all intermediate audio files generated. e.g. Isolated AI vocals/instrumentals. Leave out to save space. |
169
- | `-ir INDEX_RATE` | Optional. Default 0.5. Control how much of the AI's accent to leave in the vocals. 0 <= INDEX_RATE <= 1. |
170
- | `-fr FILTER_RADIUS` | Optional. Default 3. If >=3: apply median filtering to the harvested pitch results. 0 <= FILTER_RADIUS <= 7. |
171
- | `-rms RMS_MIX_RATE` | Optional. Default 0.25. Control how much to use the original vocal's loudness (0) or a fixed loudness (1). 0 <= RMS_MIX_RATE <= 1. |
172
- | `-palgo PITCH_DETECTION_ALGO` | Optional. Default rmvpe. Best option is rmvpe (clarity in vocals), then mangio-crepe (smoother vocals). |
173
- | `-hop CREPE_HOP_LENGTH` | Optional. Default 128. Controls how often it checks for pitch changes in milliseconds when using mangio-crepe algo specifically. Lower values lead to longer conversions and higher risk of voice cracks, but better pitch accuracy. |
174
- | `-pro PROTECT` | Optional. Default 0.33. Control how much of the original vocals' breath and voiceless consonants to leave in the AI vocals. Set 0.5 to disable. 0 <= PROTECT <= 0.5. |
175
- | `-mv MAIN_VOL` | Optional. Default 0. Control volume of main AI vocals. Use -3 to decrease the volume by 3 decibels, or 3 to increase the volume by 3 decibels. |
176
- | `-bv BACKUP_VOL` | Optional. Default 0. Control volume of backup AI vocals. |
177
- | `-iv INST_VOL` | Optional. Default 0. Control volume of the background music/instrumentals. |
178
- | `-pall PITCH_CHANGE_ALL` | Optional. Default 0. Change pitch/key of background music, backup vocals and AI vocals in semitones. Reduces sound quality slightly. |
179
- | `-rsize REVERB_SIZE` | Optional. Default 0.15. The larger the room, the longer the reverb time. 0 <= REVERB_SIZE <= 1. |
180
- | `-rwet REVERB_WETNESS` | Optional. Default 0.2. Level of AI vocals with reverb. 0 <= REVERB_WETNESS <= 1. |
181
- | `-rdry REVERB_DRYNESS` | Optional. Default 0.8. Level of AI vocals without reverb. 0 <= REVERB_DRYNESS <= 1. |
182
- | `-rdamp REVERB_DAMPING` | Optional. Default 0.7. Absorption of high frequencies in the reverb. 0 <= REVERB_DAMPING <= 1. |
183
- | `-oformat OUTPUT_FORMAT` | Optional. Default mp3. wav for best quality and large file size, mp3 for decent quality and small file size. |
184
-
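When driving the pipeline from another program (for example a chatbot), the command can be assembled as an argument list, which sidesteps the quoting differences between Windows and Unix-like shells mentioned for `-i`. The sketch below is illustrative and not part of AICoverGen: `build_command` and `RANGES` are hypothetical names, the ranges are copied from the table above, and `sys.executable` stands in for `python`.

```python
import sys

# Inclusive ranges taken from the flag table above.
RANGES = {
    "-ir": (0, 1),
    "-fr": (0, 7),
    "-rms": (0, 1),
    "-pro": (0, 0.5),
    "-rsize": (0, 1),
    "-rwet": (0, 1),
    "-rdry": (0, 1),
    "-rdamp": (0, 1),
}


def build_command(song_input, model_dir, pitch_change, keep_files=False, options=None):
    """Build the argument list for src/main.py, checking documented ranges."""
    cmd = [sys.executable, "src/main.py",
           "-i", song_input, "-dir", model_dir, "-p", str(pitch_change)]
    if keep_files:
        cmd.append("-k")
    for flag, value in (options or {}).items():
        if flag in RANGES:
            low, high = RANGES[flag]
            if not low <= value <= high:
                raise ValueError(f"{flag} must be between {low} and {high}, got {value}")
        cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # "song.mp3" and "John" are placeholder values.
    print(build_command("song.mp3", "John", 0, options={"-ir": 0.5, "-oformat": "wav"}))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the `AICoverGen` directory.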
185
-
186
- ## Terms of Use
187
-
188
- The use of the converted voice for the following purposes is prohibited.
189
-
190
- * Criticizing or attacking individuals.
191
-
192
- * Advocating for or opposing specific political positions, religions, or ideologies.
193
-
194
- * Publicly displaying strongly stimulating expressions without proper zoning.
195
-
196
- * Selling of voice models and generated voice clips.
197
-
198
- * Impersonation of the original owner of the voice with malicious intentions to harm/hurt others.
199
-
200
- * Fraudulent purposes that lead to identity theft or fraudulent phone calls.
201
-
202
- ## Disclaimer
203
-
204
- I am not liable for any direct, indirect, consequential, incidental, or special damages arising out of or in any way connected with the use/misuse or inability to use this software.