Localsong committed on
Commit d7a7e7a · verified · 1 Parent(s): 7b25443

Upload 2 files

Files changed (2)
  1. README.md +7 -4
  2. requirements.txt +3 -4
README.md CHANGED
@@ -4,22 +4,22 @@ license: apache-2.0
 
 # LocalSong
 
-LocalSong is an audio generation model focused on melodic instrumental music that uses tag-based conditioning to generate audio.
+LocalSong is a 700M parameter audio generation model focused on melodic instrumental music that uses tag-based conditioning.
 
 ## Installation
 
 ### Prerequisites
 
 - Python 3.10 or higher
-- CUDA-capable GPU recommended
+- CUDA-capable GPU recommended with 8GB of VRAM
 
 ### Setup
 
 git clone https://huggingface.co/Localsong/LocalSong
 cd localsong
-python3 -m venv venv
+python3.10 -m venv venv
 source venv/bin/activate
-pip install -r requirements.txt
+pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --extra-index-url https://download.pytorch.org/whl/cu128
 
 ### Run
 
@@ -30,7 +30,10 @@ The interface will be available at `http://localhost:7860`
 ### Generation Advice
 
 Generations should use one of the soundtrack, soundtrack1 or soundtrack2 tags, as well as at least one other tag. They can use up to 8 tags; try combining genres and instruments.
+
 The default settings (CFG 3.5, steps 200) have been tested as optimal.
+If generation is too slow on your system, try lowering steps to 100.
+
 The first generation will be slower due to torch.compile, then speed will increase.
 The model was trained on vocals but not lyrics. Vocals will not have recognizable words.
 
requirements.txt CHANGED
@@ -1,7 +1,6 @@
-torch>=2.8.0
-torchaudio>=2.8.0
-torchvision>=0.23.0
-torchcodec>=0.8.0
+torch==2.7.1
+torchaudio==2.7.1
+torchvision==0.22.1
 accelerate>=1.9.0
 diffusers>=0.34.0
 einops>=0.8.1
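
The tag rule in the README's Generation Advice (one of `soundtrack`, `soundtrack1` or `soundtrack2`, at least one other tag, up to 8 tags) could be checked before a generation with a small helper like the sketch below. This is not part of LocalSong: `check_tags`, `SOUNDTRACK_TAGS` and `MAX_TAGS` are hypothetical names, and the sketch assumes that "one of" means exactly one soundtrack tag and that the limit of 8 counts the soundtrack tag as well. The README does not state either point, and the example tags `piano` and `jazz` are placeholders that may not exist in the model's tag vocabulary.

```python
# Hypothetical helper; LocalSong does not ship this function.
SOUNDTRACK_TAGS = {"soundtrack", "soundtrack1", "soundtrack2"}
MAX_TAGS = 8  # assumption: the limit includes the soundtrack tag


def check_tags(tags):
    """Return a list of problems; an empty list means the tags follow the advice."""
    # Normalize case and whitespace, drop empty entries, remove duplicates in order.
    unique = list(dict.fromkeys(t.strip().lower() for t in tags if t.strip()))
    soundtrack = [t for t in unique if t in SOUNDTRACK_TAGS]
    others = [t for t in unique if t not in SOUNDTRACK_TAGS]

    problems = []
    if len(soundtrack) != 1:  # assumption: exactly one soundtrack tag
        problems.append("use exactly one of soundtrack, soundtrack1, soundtrack2")
    if not others:
        problems.append("add at least one other tag")
    if len(unique) > MAX_TAGS:
        problems.append(f"use at most {MAX_TAGS} tags")
    return problems


if __name__ == "__main__":
    # "piano" and "jazz" are placeholder tags for illustration only.
    print(check_tags(["soundtrack", "piano", "jazz"]))
    print(check_tags(["piano"]))
```

Returning a list of problems rather than raising lets a caller show every issue at once, for example next to the tag input in the interface.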