Al-Nay (الناي) Unconditional Diffusion

Al-Nay is one of the oldest instruments still played today. With roots in ancient Egypt nearly 5,000 years ago, it has become a staple of Arabic and Persian music. Although the number of Nayzens (the name given to skilled players of the instrument) has diminished over time, our Unconditional Diffusion model ensures that number is never zero. This project could not have been done without the following audio diffusion tools.

Usage

This model is used the same way as any other audio diffusion model on Hugging Face.

import torch
from diffusers import DiffusionPipeline

# Setup device and create generator
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device=device)

# Instantiate model
model_id = "mijwiz-laboratories/al_nay_diffusion_unconditional_256"
audio_diffusion = DiffusionPipeline.from_pretrained(model_id).to(device)

# Draw a random seed and apply it; keep `seed` to reproduce this sample later
seed = generator.seed()
generator.manual_seed(seed)

# Run inference
output = audio_diffusion(generator=generator)
image = output.images[0]  # Generated mel spectrogram (PIL image)
audio = output.audios[0, 0]  # Generated audio samples (NumPy array)
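
The snippet above leaves the result in memory as an array of samples. As a minimal sketch of one way to write it to disk, the helper below uses only the Python standard library. The name save_wav, the output filename, and the default sample rate of 22050 Hz are assumptions made for illustration and are not part of this model's API; check the rate your pipeline actually uses before relying on the default.

```python
import array
import sys
import wave


def save_wav(samples, path, sample_rate=22050):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    `save_wav` is a hypothetical helper, and 22050 Hz is an assumed default.
    Returns the number of samples written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        # Clip out-of-range values so they do not overflow 16 bits
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))

    # WAV data is little-endian; swap bytes on big-endian machines
    if sys.byteorder == "big":
        pcm.byteswap()

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)


# Example with the `audio` array from the inference snippet (not run here):
# save_wav(audio, "al_nay_sample.wav")
```

If the loaded pipeline exposes a mel component with a get_sample_rate() method, as the audio diffusion pipelines in diffusers are understood to do, passing that value as sample_rate is safer than trusting the assumed default.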

Limitations of Model

The training dataset was very small, so the diversity of snippets the model can generate is limited. Furthermore, in high-intensity segments (think of a human playing the instrument forcefully), the realism and naturalness of the generated flute sound degrade.