Surn committed on
Commit
59cc4f3
1 Parent(s): 790c9e0

Add changelog

Files changed (4):
  1. CHANGELOG.md +33 -0
  2. README.md +15 -1
  3. app.py +50 -2
  4. pre-requirements.txt +1 -2
CHANGELOG.md ADDED
@@ -0,0 +1,33 @@
+## [0.0.2a2] - 2023-07-20
+
+Music Generation set to a max of 720 seconds (12 minutes) to avoid memory issues.
+
+Video editing options (thanks @Surn and @oncorporation).
+
+Music Conditioning segment options.
+
+
+## [0.0.2a] - TBD
+
+Improved demo, fixed top p (thanks @jnordberg).
+
+Compressor tanh on output to avoid clipping with some styles (especially piano).
+Now repeating the conditioning periodically if it is too short.
+
+More options when launching Gradio app locally (thanks @ashleykleynhans).
+
+Testing out PyTorch 2.0 memory efficient attention.
+
+Added extended generation (infinite length) by slowly moving the windows.
+Note that other implementations exist: https://github.com/camenduru/MusicGen-colab.
+
+## [0.0.1] - 2023-06-09
+
+Initial release, with model evaluation only.
+
+
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
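The "repeating the conditioning periodically if it is too short" entry can be sketched with plain numpy. This is an illustrative helper, not the code used in app.py; `repeat_conditioning` is a hypothetical name:

```python
import numpy as np

def repeat_conditioning(melody: np.ndarray, target_len: int) -> np.ndarray:
    """Loop a too-short conditioning signal until it spans target_len samples."""
    if melody.shape[-1] >= target_len:
        # Already long enough: just trim to the target length.
        return melody[..., :target_len]
    reps = -(-target_len // melody.shape[-1])  # ceiling division
    # Tile along the sample axis, then trim the overshoot.
    return np.tile(melody, reps)[..., :target_len]
```

For example, a 4-sample melody repeated to 10 samples loops two and a half times.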
README.md CHANGED
@@ -4,7 +4,7 @@ emoji: 🎼
 colorFrom: white
 colorTo: red
 sdk: gradio
-sdk_version: 3.33.1
+sdk_version: 3.38.0
 app_file: app.py
 pinned: false
 license: creativeml-openrail-m
@@ -68,6 +68,20 @@ We offer a number of way to interact with MusicGen:
 updated with contributions from @camenduru and the community.
 6. Finally, MusicGen is available in 🤗 Transformers from v4.31.0 onwards, see section [🤗 Transformers Usage](#-transformers-usage) below.
 
+### More info about Top-k, Top-p, Temperature and Classifier Free Guidance from ChatGPT
+
+Top-k: Top-k is a parameter used in text generation models, including music generation models. It determines how many of the most likely next tokens are considered at each step of the generation process. The model ranks all possible tokens by their predicted probabilities, keeps the top k of them, and samples the next token from that reduced set. A smaller k gives a more focused and deterministic output, while a larger k allows more diversity in the generated music.
+
+Top-p (or nucleus sampling): Top-p considers the cumulative probability distribution of the ranked tokens instead of a fixed count. It selects the smallest set of tokens whose cumulative probability exceeds the threshold p, then samples the next token from that set. Because the number of candidate tokens varies with the shape of the distribution, this approach balances diversity and coherence.
+
+Temperature: Temperature controls the randomness of the sampling step. A higher value produces more random and diverse outputs, while a lower value leads to more deterministic and focused ones. In music generation, a higher temperature can introduce more variability and creativity, but may also yield less coherent or structured compositions; a lower temperature tends to produce more repetitive, predictable music.
+
+Classifier-Free Guidance: Classifier-free guidance steers generation toward the conditioning (e.g. the text prompt) without training a separate classifier network. The model produces both a conditional and an unconditional prediction, and the final prediction is pushed away from the unconditional one toward the conditional one by a guidance coefficient. A higher coefficient makes the music follow the prompt more closely, at the cost of some diversity.
+
+These parameters provide different ways to influence the output of a music generation model and strike a balance between creativity, diversity, coherence, and control. Their values can be tuned based on the desired outcome and user preferences.
+
 ## API
 
 We provide a simple API and 4 pre-trained models. The pre trained models are:
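The four parameters described in the README section above can be illustrated numerically. This is a minimal numpy sketch of the sampling math, not MusicGen's actual implementation; `sample_token` and `cfg_logits` are hypothetical helpers, and the 8.5 default simply mirrors the app's Classifier Free Guidance field:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_token(logits, k=None, p=None, temperature=1.0):
    # Temperature rescales logits before softmax: <1.0 sharpens, >1.0 flattens.
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]      # tokens ranked most to least likely
    if k is not None:                    # top-k: keep only the k best tokens
        order = order[:k]
    if p is not None:                    # top-p: smallest prefix whose mass reaches p
        cum = np.cumsum(probs[order])
        order = order[:int(np.searchsorted(cum, p)) + 1]
    kept = probs[order] / probs[order].sum()
    return int(rng.choice(order, p=kept))

def cfg_logits(cond, uncond, cfg_coef=8.5):
    # Classifier-free guidance: move away from the unconditional prediction
    # toward the conditional one, scaled by the guidance coefficient.
    cond, uncond = np.asarray(cond), np.asarray(uncond)
    return uncond + cfg_coef * (cond - uncond)
```

With `k=1` sampling is fully greedy; raising `temperature` or `p` widens the candidate pool, and a larger `cfg_coef` exaggerates the gap between conditional and unconditional logits.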
app.py CHANGED
@@ -11,6 +11,8 @@ import argparse
 import torch
 import gradio as gr
 import os
+import subprocess
+import sys
 from pathlib import Path
 import time
 import typing as tp
@@ -32,6 +34,7 @@ INTERRUPTED = False
 UNLOAD_MODEL = False
 MOVE_TO_CPU = False
 MAX_PROMPT_INDEX = 0
+git = os.environ.get('GIT', "git")
 
 def interrupt_callback():
     return INTERRUPTED
@@ -116,6 +119,48 @@ def get_melody(melody_filepath):
     melody = tuple(audio_data)
     return melody
 
+
+def commit_hash():
+    try:
+        return subprocess.check_output([git, "rev-parse", "HEAD"], shell=False, encoding='utf8').strip()
+    except Exception:
+        return "<none>"
+
+
+def git_tag():
+    try:
+        return subprocess.check_output([git, "describe", "--tags"], shell=False, encoding='utf8').strip()
+    except Exception:
+        try:
+            from pathlib import Path
+            changelog_md = Path(__file__).parent.parent / "CHANGELOG.md"
+            with changelog_md.open(encoding="utf-8") as file:
+                return next((line.strip() for line in file if line.strip()), "<none>")
+        except Exception:
+            return "<none>"
+
+def versions_html():
+    import torch
+
+    python_version = ".".join([str(x) for x in sys.version_info[0:3]])
+    commit = commit_hash()
+    #tag = git_tag()
+
+    import xformers
+    xformers_version = xformers.__version__
+
+    return f"""
+version: <a href="https://github.com/Oncorporation/audiocraft/commit/{"huggingface" if commit == "<none>" else commit}" target="_blank">{"huggingface" if commit == "<none>" else commit}</a>
+&#x2000;•&#x2000;
+python: <span title="{sys.version}">{python_version}</span>
+&#x2000;•&#x2000;
+torch: {getattr(torch, '__long_version__', torch.__version__)}
+&#x2000;•&#x2000;
+xformers: {xformers_version}
+&#x2000;•&#x2000;
+gradio: {gr.__version__}
+"""
+
 def load_melody_filepath(melody_filepath, title):
     # get melody filename
     #$Union[str, os.PathLike]
@@ -300,6 +345,8 @@ def ui(**kwargs):
 #btn-generate {background-image:linear-gradient(to right bottom, rgb(157, 255, 157), rgb(229, 255, 235));}
 #btn-generate:hover {background-image:linear-gradient(to right bottom, rgb(229, 255, 229), rgb(255, 255, 255));}
 #btn-generate:active {background-image:linear-gradient(to right bottom, rgb(229, 255, 235), rgb(157, 255, 157));}
+#versions {margin-top: 1em; width:100%; text-align:center;}
+.small-btn {max-width:75px;}
 """
 with gr.Blocks(title="UnlimitedMusicGen", css=css) as demo:
     gr.Markdown(
@@ -360,8 +407,8 @@ def ui(**kwargs):
     cfg_coef = gr.Number(label="Classifier Free Guidance", value=8.5, precision=None, interactive=True)
     with gr.Row():
         seed = gr.Number(label="Seed", value=-1, precision=0, interactive=True)
-        gr.Button('\U0001f3b2\ufe0f').style(full_width=False).click(fn=lambda: -1, outputs=[seed], queue=False)
-        reuse_seed = gr.Button('\u267b\ufe0f').style(full_width=False)
+        gr.Button('\U0001f3b2\ufe0f', elem_classes="small-btn").click(fn=lambda: -1, outputs=[seed], queue=False)
+        reuse_seed = gr.Button('\u267b\ufe0f', elem_classes="small-btn")
     with gr.Column() as c:
         output = gr.Video(label="Generated Music")
         wave_file = gr.File(label=".wav file", elem_id="output_wavefile", interactive=True)
@@ -408,6 +455,7 @@ def ui(**kwargs):
     inputs=[text, melody_filepath, model, title],
     outputs=[output]
 )
+gr.HTML(value=versions_html(), visible=True, elem_id="versions")
 
 # Show the interface
 launch_kwargs = {}
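The `commit_hash` helper added to app.py can be exercised on its own. This sketch reproduces its fallback behavior: it returns the current HEAD hash inside a git checkout and degrades to `"<none>"` when git or a repository is unavailable:

```python
import os
import subprocess

# Same override hook as app.py: the GIT env var can point at a git binary.
git = os.environ.get('GIT', "git")

def commit_hash():
    # Ask git for the current HEAD; any failure (no git, no repo) yields "<none>".
    try:
        return subprocess.check_output(
            [git, "rev-parse", "HEAD"], shell=False, encoding='utf8'
        ).strip()
    except Exception:
        return "<none>"

print(commit_hash())
```

Passing a list with `shell=False` avoids shell injection via the GIT variable, and catching the broad `Exception` covers both `FileNotFoundError` (git missing) and `CalledProcessError` (not a repository).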
pre-requirements.txt CHANGED
@@ -1,2 +1 @@
-pip>=23.2
-gradio_client==0.2.7
+pip>=23.2