	---
	license: mit
	library_name: keras
	tags:
	- audio
	- music
	- generation
	- tensorflow
	---

	# Musika Audio Autoencoder

	Pretrained universal audio autoencoder for the [Musika system](https://github.com/marcoppasini/musika), which performs fast, infinite-length waveform music generation.
	The model was introduced in [this paper](https://arxiv.org/abs/2208.08706).

	## Model description

	The Musika autoencoder consists of two hierarchical stages that are trained separately. It is trained to encode and reconstruct general 44.1 kHz waveform music, and it achieves an overall time compression ratio of 4096x.
	As an example, about 23.8 seconds of 44.1 kHz audio (256 × 4096 samples) are encoded into a sequence of 256 vectors, each of dimension 64.
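
The relationship between audio duration and latent sequence length can be checked with a few lines of arithmetic. The following is a minimal sketch based only on the figures stated in this card (44.1 kHz, 4096x, 64 dimensions). The helper names `latent_shape` and `covered_duration` are hypothetical and are not part of the Musika codebase, and the floor-division handling of partial frames is an assumption.

```python
SAMPLE_RATE_HZ = 44_100   # sample rate stated in this card
TIME_COMPRESSION = 4096   # overall time compression ratio stated in this card
LATENT_DIM = 64           # dimension of each latent vector stated in this card


def latent_shape(seconds: float) -> tuple[int, int]:
    """Approximate (sequence_length, latent_dim) for a clip of the given duration."""
    num_samples = int(seconds * SAMPLE_RATE_HZ)
    # Assumption: incomplete trailing frames are dropped (floor division);
    # the actual model may pad the input instead.
    return num_samples // TIME_COMPRESSION, LATENT_DIM


def covered_duration(num_vectors: int) -> float:
    """Audio duration in seconds covered by a latent sequence of the given length."""
    return num_vectors * TIME_COMPRESSION / SAMPLE_RATE_HZ


print(round(covered_duration(256), 2))  # 23.78
print(latent_shape(23.78))              # (256, 64)
```

Under these assumptions, each latent vector summarizes 4096 samples, or roughly 93 ms of audio.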

	### How to use

	The Musika system downloads and uses this autoencoder automatically the first time it is run. Try Musika [here](https://github.com/marcoppasini/musika)!


	## Training data

	The autoencoder was trained on both the SXSW dataset (diverse music) and the VCTK dataset (speech) so that it produces general representations.