Upload 2 files
- README.md +7 -4
- requirements.txt +3 -4
README.md
CHANGED
@@ -4,22 +4,22 @@ license: apache-2.0
 
 # LocalSong
 
-LocalSong is
+LocalSong is a 700M parameter audio generation model focused on melodic instrumental music that uses tag-based conditioning.
 
 ## Installation
 
 ### Prerequisites
 
 - Python 3.10 or higher
-- CUDA-capable GPU recommended
+- CUDA-capable GPU recommended with 8GB of VRAM
 
 ### Setup
 
 git clone https://huggingface.co/Localsong/LocalSong
 cd localsong
-python3 -m venv venv
+python3.10 -m venv venv
 source venv/bin/activate
-pip install -
+pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --extra-index-url https://download.pytorch.org/whl/cu128
 
 ### Run
 
@@ -30,7 +30,10 @@ The interface will be available at `http://localhost:7860`
 ### Generation Advice
 
 Generations should use one of the soundtrack, soundtrack1 or soundtrack2 tags, as well as at least one other tag. They can use up to 8 tags; try combining genres and instruments.
+
 The default settings (CFG 3.5, steps 200) have been tested as optimal.
+If generation is too slow on your system, try lowering steps to 100.
+
 The first generation will be slower due to torch.compile, then speed will increase.
 The model was trained on vocals but not lyrics. Vocals will not have recognizable words.
 
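The tag rules in the Generation Advice hunk can be checked before a generation is started. The following is a minimal sketch, not code from the LocalSong repository: `check_tags`, `SOUNDTRACK_TAGS`, and `MAX_TAGS` are hypothetical names, and it assumes the limit of 8 counts every tag including the soundtrack tag. The README does not list the tag vocabulary, so `"piano"` in the usage note is only a placeholder.

```python
# Hypothetical helper; not part of LocalSong. It encodes the README's tag advice.
SOUNDTRACK_TAGS = {"soundtrack", "soundtrack1", "soundtrack2"}
MAX_TAGS = 8  # assumption: the limit counts all tags, soundtrack tag included


def check_tags(tags):
    """Return a list of problems with a tag selection (empty list means OK)."""
    problems = []
    # Ignore empty entries and surrounding whitespace.
    cleaned = [t.strip() for t in tags if t.strip()]
    if not any(t in SOUNDTRACK_TAGS for t in cleaned):
        problems.append("add one of: soundtrack, soundtrack1, soundtrack2")
    if not any(t not in SOUNDTRACK_TAGS for t in cleaned):
        problems.append("add at least one other tag, such as a genre or instrument")
    if len(cleaned) > MAX_TAGS:
        problems.append(f"use at most {MAX_TAGS} tags, got {len(cleaned)}")
    return problems
```

For example, `check_tags(["soundtrack", "piano"])` returns an empty list, while `check_tags(["piano"])` reports the missing soundtrack tag.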
requirements.txt
CHANGED
@@ -1,7 +1,6 @@
-torch
-torchaudio
-torchvision
-torchcodec>=0.8.0
+torch==2.7.1
+torchaudio==2.7.1
+torchvision==0.22.1
 accelerate>=1.9.0
 diffusers>=0.34.0
 einops>=0.8.1
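Since this change pins exact versions, it may help to confirm that the active virtual environment actually matches them. The sketch below is an assumption-laden illustration, not part of the repository: `PINS` and `check_pins` are hypothetical names, and it relies only on the standard-library `importlib.metadata`. Wheels installed from the PyTorch index usually report a local suffix such as `+cu128`, which the comparison strips.

```python
from importlib import metadata

# Pins copied from the new requirements.txt in this change.
PINS = {"torch": "2.7.1", "torchaudio": "2.7.1", "torchvision": "0.22.1"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every pin that does not match."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        # Drop a local version suffix like "+cu128" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches
```

Calling `check_pins(PINS)` inside the activated venv returns an empty dict when all three packages match; the `get_version` parameter exists only so the lookup can be replaced in tests.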