{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MelodyFlow\n",
    "Welcome to MelodyFlow's demo Jupyter notebook.\n",
    "Here you will find a self-contained example of how to use MelodyFlow for music generation and editing.\n",
    "\n",
    "First, we initialize MelodyFlow for music generation. You can choose a model from the following selection:\n",
    "1. facebook/melodyflow-t24-30secs - 1B parameters, 30-second music samples.\n",
    "\n",
    "We will use the `facebook/melodyflow-t24-30secs` variant for the purpose of this demonstration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from audiocraft.models import MelodyFlow\n",
    "\n",
    "model = MelodyFlow.get_pretrained(\"facebook/melodyflow-t24-30secs\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let us configure the generation parameters. Specifically, you can control the following:\n",
    "* `solver` (str, optional): ODE solver, either `euler` or `midpoint`. Defaults to `midpoint`.\n",
    "* `steps` (int, optional): number of solver steps. Defaults to 64.\n",
    "\n",
    "Any parameter left unset keeps its default value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.set_generation_params(\n",
    "    solver=\"midpoint\",\n",
    "    steps=64,\n",
    ")"
   ]
  },
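  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To build intuition for `solver` and `steps`, the next cell is a minimal sketch of fixed-step Euler and midpoint integration on a toy ODE, dx/dt = -x. The helper `integrate` and the toy ODE are illustrative assumptions, not MelodyFlow's internal solver. The sketch only shows that midpoint evaluates the derivative twice per step and is typically more accurate than Euler for the same number of steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "def integrate(f, x0, t0, t1, steps, solver='midpoint'):\n",
    "    # Hypothetical helper for illustration only; not part of audiocraft.\n",
    "    h = (t1 - t0) / steps\n",
    "    x, t = x0, t0\n",
    "    for _ in range(steps):\n",
    "        if solver == 'euler':\n",
    "            # One derivative evaluation per step.\n",
    "            x = x + h * f(t, x)\n",
    "        elif solver == 'midpoint':\n",
    "            # Two derivative evaluations per step: probe the midpoint, then take the full step.\n",
    "            x_mid = x + 0.5 * h * f(t, x)\n",
    "            x = x + h * f(t + 0.5 * h, x_mid)\n",
    "        else:\n",
    "            raise ValueError(f'unknown solver: {solver}')\n",
    "        t += h\n",
    "    return x\n",
    "\n",
    "def decay(t, x):\n",
    "    # Toy ODE dx/dt = -x, whose exact solution from x(0) = 1 is exp(-t).\n",
    "    return -x\n",
    "\n",
    "exact = math.exp(-1.0)\n",
    "for name in ('euler', 'midpoint'):\n",
    "    approx = integrate(decay, 1.0, 0.0, 1.0, steps=8, solver=name)\n",
    "    print(f'{name}: absolute error = {abs(approx - exact):.5f}')"
   ]
  },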
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can generate music from text prompts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Text-conditional Generation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from audiocraft.utils.notebook import display_audio\n",
    "\n",
    "###### Text-to-music prompts - examples ######\n",
    "text = \"80s electronic track with melodic synthesizers, catchy beat and groovy bass\"\n",
    "# text = \"80s electronic track with melodic synthesizers, catchy beat and groovy bass. 170 bpm\"\n",
    "# text = \"Earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves\"\n",
    "# text = \"Funky groove with electric piano playing blue chords rhythmically\"\n",
    "# text = \"Rock with saturated guitars, a heavy bass line and crazy drum break and fills.\"\n",
    "# text = \"A grand orchestral arrangement with thunderous percussion, epic brass fanfares, and soaring strings, creating a cinematic atmosphere fit for a heroic battle\"\n",
    "\n",
    "N_VARIATIONS = 3\n",
    "descriptions = [text for _ in range(N_VARIATIONS)]\n",
    "\n",
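    "print(f\"text prompt: {text}\\n\")\n",
    "# With return_tokens=True, generate returns a tuple: output[0] is displayed as audio here,\n",
    "# and output[1] is reused as the input to the editing example further down.\n",
    "output = model.generate(descriptions=descriptions, progress=True, return_tokens=True)\n",
    "display_audio(output[0], sample_rate=model.compression_model.sample_rate)"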
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Text-conditional Editing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let us configure the editing parameters. Specifically, you can control the following:\n",
    "* `solver` (str, optional): ODE solver, either `euler` or `midpoint`. Defaults to `euler`.\n",
    "* `steps` (int, optional): number of solver steps. Defaults to 25.\n",
    "* `target_flowstep` (float, optional): target flow step. Defaults to 0.\n",
    "* `regularize` (bool, optional): whether to regularize each solver step. Defaults to True.\n",
    "* `regularize_iters` (int, optional): number of regularization iterations. Defaults to 4.\n",
    "* `keep_last_k_iters` (int, optional): number of meaningful regularization iterations for moving average computation. Defaults to 2.\n",
    "* `lambda_kl` (float, optional): KL regularization loss weight. Defaults to 0.2.\n",
    "\n",
    "Any parameter left unset keeps its default value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.set_editing_params(\n",
    "    solver=\"euler\",\n",
    "    steps=25,\n",
    "    target_flowstep=0.05,\n",
    "    regularize=True,\n",
    "    regularize_iters=4,\n",
    "    keep_last_k_iters=2,\n",
    "    lambda_kl=0.2,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can edit the previously generated music using new text prompts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
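    "# output[1] comes from the earlier generate(..., return_tokens=True) call.\n",
    "edited_output = model.edit(\n",
    "    output[1],\n",
    "    [\"Piano melody.\" for _ in range(N_VARIATIONS)],  # new text prompt for each sample\n",
    "    src_descriptions=[\"\" for _ in range(N_VARIATIONS)],  # empty source descriptions\n",
    "    return_tokens=True,\n",
    "    progress=True,\n",
    ")\n",
    "display_audio(edited_output[0], sample_rate=model.compression_model.sample_rate)"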
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "melodyflow",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.19"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}