Jeremy Hummel committed on
Commit d87960f · 1 Parent(s): f5a1e6e

Updates markdown

Files changed (1)
  1. app.py +10 -7
app.py CHANGED
@@ -35,33 +35,36 @@ network_choices = [
 description = \
 """
 Generate visualizations on an input audio file using [StyleGAN3](https://nvlabs.github.io/stylegan3/) (Karras, Tero, et al. "Alias-free generative adversarial networks." Advances in Neural Information Processing Systems 34 (2021): 852-863.).
+
 Inspired by [Deep Music Visualizer](https://github.com/msieg/deep-music-visualizer), which used BigGAN (Brock et al., 2018)
+
 Developed by Jeremy Hummel at [Lambda](https://lambdalabs.com/)
 """
 
 article = \
 """
 ## How does this work?
-The audio is transformed to a spectral representation by using Short-time Fourier transform (STFT). [librosa]()
+The audio is transformed to a spectral representation by using Short-time Fourier transform (STFT) with [librosa](https://librosa.org/doc/latest/index.html).
+
 Starting with an initial noise vector, we perform a random walk, adjusting the length of each step with the power gradient.
 This pushes the noise vector to move around more when the sound changes.
 
 ## Parameter info:
-*Network*: various pre-trained models from NVIDIA, "afhqv2" is animals, "ffhq" is faces, "metfaces" is artwork.
+**Network**: various pre-trained models from NVIDIA, "afhqv2" is animals, "ffhq" is faces, "metfaces" is artwork.
 
-*Truncation*: controls how far the noise vector can be from the origin. `0.7` will generate more realistic, but less diverse samples,
+**Truncation**: controls how far the noise vector can be from the origin. `0.7` will generate more realistic, but less diverse samples,
 while `1.2` will can yield more interesting but less realistic images.
 
-*Tempo Sensitivity*: controls the how the size of each step scales with the audio features
+**Tempo Sensitivity**: controls the how the size of each step scales with the audio features
 
-*Jitter*: prevents the same exact noise vectors from cycling repetitively, if set to `0`, the images will repeat during
+**Jitter**: prevents the same exact noise vectors from cycling repetitively, if set to `0`, the images will repeat during
 repetitive parts of the audio
 
-*Frame Length*: controls the number of audio frames per video frame in the output.
+**Frame Length**: controls the number of audio frames per video frame in the output.
 If you want a higher frame rate for visualizing very rapid music, lower the frame length.
 If you want a lower frame rate (which will complete the job faster), raise the frame length
 
-*Max Duration*: controls the max length of the visualization, in seconds. Use a shorter value here to get output
+**Max Duration**: controls the max length of the visualization, in seconds. Use a shorter value here to get output
 more quickly, especially for testing different combinations of parameters.
 """
 # Media sources:
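
The "How does this work?" text in this change describes a random walk whose step length follows the power gradient of the audio. The sketch below is one possible reading of that description, not the code in `app.py`. It assumes the per-frame spectral power has already been computed (for example from an STFT) and is passed in as a plain list. The function names `power_gradient` and `random_walk`, the default values, and the exact form of the jitter are all hypothetical, chosen only for illustration.

```python
import math
import random


def power_gradient(power):
    """Normalized absolute frame-to-frame change in power (first frame is 0)."""
    if not power:
        return []
    diffs = [0.0] + [abs(b - a) for a, b in zip(power, power[1:])]
    peak = max(diffs)
    return [d / peak if peak > 0 else 0.0 for d in diffs]


def random_walk(power, dim=4, tempo_sensitivity=0.25, jitter=0.5, seed=0):
    """Return one noise vector per frame; step length follows the power gradient."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # initial noise vector
    frames = []
    for g in power_gradient(power):
        # Draw a random unit direction for this step.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in d)) or 1.0
        # Jitter (assumed form): randomly shrink some components by (1 - jitter).
        scale = [1.0 - jitter if rng.random() < 0.5 else 1.0 for _ in range(dim)]
        # Step size is tempo_sensitivity times the normalized power gradient,
        # so the vector stays put while the sound is steady.
        z = [zi + tempo_sensitivity * g * s * x / norm
             for zi, s, x in zip(z, scale, d)]
        frames.append(list(z))
    return frames
```

With constant power the gradient is zero everywhere, so every frame reuses the initial vector. A jump in power produces a step of length at most `tempo_sensitivity`. In the real app each vector would then be fed to the generator, which this sketch leaves out.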