File size: 5,660 Bytes
9d0d223
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MelodyFlow\n",
    "Welcome to MelodyFlow's demo jupyter notebook. \n",
    "Here you will find a self-contained example of how to use MelodyFlow for music generation and editing.\n",
    "\n",
    "First, we start by initializing MelodyFlow for music generation, you can choose a model from the following selection:\n",
    "1. facebook/melodyflow-t24-30secs - 1B parameters, 30 seconds music samples.\n",
    "\n",
    "We will use the `facebook/melodyflow-t24-30secs` variant for the purpose of this demonstration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from audiocraft.models import MelodyFlow\n",
    "\n",
    "model = MelodyFlow.get_pretrained(\"facebook/melodyflow-t24-30secs\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let us configure the generation parameters. Specifically, you can control the following:\n",
    "* `solver` (str, optional): ODE solver, either euler or midpoint. Defaults to midpoint.\n",
    "* `steps` (int, optional): number of solver steps. Defaults to 64.\n",
    "\n",
    "When left unchanged, MelodyFlow will revert to its default parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.set_generation_params(\n",
    "    solver=\"midpoint\",\n",
    "    steps=64,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can go ahead and start generating music given textual prompts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Text-conditional Generation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from audiocraft.utils.notebook import display_audio\n",
    "\n",
    "###### Text-to-music prompts - examples ######\n",
    "text = \"80s electronic track with melodic synthesizers, catchy beat and groovy bass\"\n",
    "# text = \"80s electronic track with melodic synthesizers, catchy beat and groovy bass. 170 bpm\"\n",
    "# text = \"Earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves\"\n",
    "# text = \"Funky groove with electric piano playing blue chords rhythmically\"\n",
    "# text = \"Rock with saturated guitars, a heavy bass line and crazy drum break and fills.\"\n",
    "# text = \"A grand orchestral arrangement with thunderous percussion, epic brass fanfares, and soaring strings, creating a cinematic atmosphere fit for a heroic battle\"\n",
    "                   \n",
    "N_VARIATIONS = 3\n",
    "descriptions = [text for _ in range(N_VARIATIONS)]\n",
    "\n",
    "print(f\"text prompt: {text}\\n\")\n",
    "output = model.generate(descriptions=descriptions, progress=True, return_tokens=True)\n",
    "display_audio(output[0], sample_rate=model.compression_model.sample_rate)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Text-conditional Editing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let us configure the editing parameters. Specifically, you can control the following:\n",
    "* `solver` (str, optional): ODE solver, either euler or midpoint. Defaults to euler.\n",
    "* `steps` (int, optional): number of solver steps. Defaults to 25.\n",
    "* `target_flowstep` (float, optional): Target flow step. Defaults to 0.\n",
    "* `regularize` (int, optional): Regularize each solver step. Defaults to True.\n",
    "* `regularize_iters` (int, optional): Number of regularization iterations. Defaults to 4.\n",
    "* `keep_last_k_iters` (int, optional): Number of meaningful regularization iterations for moving average computation. Defaults to 2.\n",
    "* `lambda_kl` (float, optional): KL regularization loss weight. Defaults to 0.2.\n",
    "\n",
    "When left unchanged, MelodyFlow will revert to its default parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.set_editing_params(\n",
    "    solver = \"euler\",\n",
    "    steps = 25,\n",
    "    target_flowstep = 0.05,\n",
    "    regularize = True,\n",
    "    regularize_iters = 4,\n",
    "    keep_last_k_iters = 2,\n",
    "    lambda_kl = 0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can go ahead and edit the previously generated music given new textual prompts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "edited_output = model.edit(output[1],\n",
    "                        [\"Piano melody.\" for _ in range(N_VARIATIONS)],\n",
    "                        src_descriptions=[\"\" for _ in range(N_VARIATIONS)],\n",
    "                        return_tokens=True,\n",
    "                        progress=True)\n",
    "display_audio(edited_output[0], sample_rate=model.compression_model.sample_rate)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "melodyflow",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.19"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}