	---
	license: mit
	tags:
	- audio
	- deep-learning
	- pytorch
	- generative-adversarial-network
	- codec
	- gans
	- compression-algorithm
	- audio-compression
	- RVQ
	---


	# Descript Audio Codec

	👉 With Descript Audio Codec, you can compress 44.1 kHz audio into discrete codes at a low 8 kbps bitrate. <br>
	🤌 That's approximately 90x compression while maintaining exceptional fidelity and minimizing artifacts. <br>
	💪 Our universal model works on all domains (speech, environment, music, etc.), making it widely applicable to generative modeling of all audio. <br>
	👌 It can be used as a drop-in replacement for EnCodec in audio language modeling applications (such as AudioLMs, MusicLMs, MusicGen, etc.). <br>
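
The "approximately 90x" figure can be sanity-checked with a little arithmetic. The sketch below assumes the uncompressed reference is 16-bit mono PCM at 44.1 kHz; the card does not state the reference format, so the bit depth and channel count are assumptions, and the constant and function names are illustrative only.

```python
# Rough check of the compression ratio quoted above.
# Assumption (not stated in the model card): the uncompressed
# reference is 16-bit, single-channel PCM audio.

SAMPLE_RATE_HZ = 44_100    # from the card: 44.1 kHz audio
BIT_DEPTH = 16             # assumed bits per sample
CHANNELS = 1               # assumed mono
CODEC_BITRATE_BPS = 8_000  # from the card: 8 kbps


def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def compression_ratio(raw_bps: float, coded_bps: float) -> float:
    """How many times smaller the coded stream is than the raw stream."""
    return raw_bps / coded_bps


raw_bps = pcm_bitrate_bps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
ratio = compression_ratio(raw_bps, CODEC_BITRATE_BPS)

print(f"Uncompressed PCM: {raw_bps / 1000:.1f} kbps")  # 705.6 kbps
print(f"Compression ratio: {ratio:.1f}x")              # 88.2x
```

Under these assumptions the ratio comes out at about 88x, consistent with the rounded "approximately 90x" claim. A stereo or 24-bit reference would give a proportionally larger ratio.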

	## Model Details

	### Model Description

	- License: MIT

	### Model Sources

	- Repository: [GitHub Repo](https://github.com/descriptinc/descript-audio-codec)
	- Paper: [arXiv Paper: High-Fidelity Audio Compression with Improved RVQGAN](http://arxiv.org/abs/2306.06546)
	- Demo: [Demo Site](https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5)

	## Uses

	The model is intended for compressing audio files containing speech, music and environmental sounds.

	### Out-of-Scope Use

	It is not intended to be used for compressing other file formats such as text, images, etc.

	## Bias, Risks, and Limitations
	Our model has difficulty reconstructing some challenging audio. It
	performs best on speech and has more issues with environmental sounds. It
	does not perfectly model some musical instruments, such as glockenspiel or synthesizers.


	## How to Get Started with the Model
	This model is meant to be used with our official repo linked above. We release the model here for redundancy.
	Our code can pull the weights from their
	[original location on GitHub](https://github.com/descriptinc/descript-audio-codec/releases/download/0.0.1/weights.pth).
	Please refer to the official [README](https://github.com/descriptinc/descript-audio-codec#readme) for usage instructions.
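
The official code fetches the weights on its own, so the following is only a sketch of how the checkpoint could be downloaded manually from the release URL quoted above, using nothing but the Python standard library. The function name `fetch_weights` and the destination path are hypothetical and not part of the official repo; loading the checkpoint into a model is left to the official README.

```python
import shutil
import urllib.request
from pathlib import Path

# Release URL quoted in the model card.
WEIGHTS_URL = (
    "https://github.com/descriptinc/descript-audio-codec"
    "/releases/download/0.0.1/weights.pth"
)


def fetch_weights(dest, url: str = WEIGHTS_URL) -> Path:
    """Download the checkpoint to `dest` unless a non-empty file already exists.

    Hypothetical helper for illustration; not part of the official code.
    """
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        return dest  # reuse the cached copy

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file at the final path.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(dest)
    return dest


# Example call (performs a network download, so it is left commented out):
# path = fetch_weights("weights/weights.pth")
```

Writing to a `.part` file and renaming it afterwards is a design choice of this sketch: a later call then never mistakes a partial download for a complete checkpoint.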

	## Citation

	BibTeX:

	```
	@misc{kumar2023highfidelity,
	title={High-Fidelity Audio Compression with Improved RVQGAN},
	author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
	year={2023},
	eprint={2306.06546},
	archivePrefix={arXiv},
	primaryClass={cs.SD}
	}
	```