Spaces: Runtime error
Commit • 11bfc95 • Parent(s): a6d0216
Upload app.py

app.py CHANGED
@@ -1,5 +1,3 @@
-import subprocess
-import os
 import gradio as gr
 import json
 from utils import *
@@ -8,64 +6,27 @@ from transformers import AutoTokenizer
 
 description = """
 <div>
-<a style="display:inline-block" href='https://github.com/
-<a style='display:inline-block' href='https://
-<a style="display:inline-block
+<a style="display:inline-block" href='https://github.com/microsoft/muzic/tree/main/clamp'><img src='https://img.shields.io/github/stars/microsoft/muzic?style=social' /></a>
+<a style='display:inline-block' href='https://ai-muzic.github.io/clamp/'><img src='https://img.shields.io/badge/website-CLaMP-ff69b4.svg' /></a>
+<a style="display:inline-block" href="https://huggingface.co/datasets/sander-wood/wikimusictext"><img src="https://img.shields.io/badge/huggingface-dataset-ffcc66.svg"></a>
+<a style="display:inline-block" href="https://arxiv.org/abs/2106.01955"><img src="https://img.shields.io/badge/arXiv-2106.01955-b31b1b.svg"></a>
 </div>
-Bark is a universal text-to-audio model created by [Suno](www.suno.ai), with code publicly available [here](https://github.com/suno-ai/bark). \
-Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. \
-This demo should be used for research purposes only. Commercial use is strictly prohibited. \
-The model output is not censored and the authors do not endorse the opinions in the generated content. \
-Use at your own risk.
-"""
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-* [gasps]
-* [clears throat]
-* — or ... for hesitations
-* ♪ for song lyrics
-* capitalization for emphasis of a word
-* MAN/WOMAN: for bias towards speaker
-Try the prompt:
-```
-" [clears throat] Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as... ♪ singing ♪."
-```
-## 🎶 Music
-Bark can generate all types of audio, and, in principle, doesn't see a difference between speech and music. \
-Sometimes Bark chooses to generate text as music, but you can help it out by adding music notes around your lyrics.
-Try the prompt:
-```
-♪ In the jungle, the mighty jungle, the lion barks tonight ♪
-```
-## 🧬 Voice Cloning
-Bark has the capability to fully clone voices - including tone, pitch, emotion and prosody. \
-The model also attempts to preserve music, ambient noise, etc. from input audio. \
-However, to mitigate misuse of this technology, we limit the audio history prompts to a limited set of Suno-provided, fully synthetic options to choose from.
-## 👥 Speaker Prompts
-You can provide certain speaker prompts such as NARRATOR, MAN, WOMAN, etc. \
-Please note that these are not always respected, especially if a conflicting audio history prompt is given.
-Try the prompt:
-```
-WOMAN: I would like an oatmilk latte please.
-MAN: Wow, that's expensive!
-```
-## Details
-Bark model by [Suno](https://suno.ai/), including official [code](https://github.com/suno-ai/bark) and model weights. \
-Gradio demo supported by 🤗 Hugging Face. Bark is licensed under a non-commercial license: CC-BY 4.0 NC, see details on [GitHub](https://github.com/suno-ai/bark).
+## ℹ️ How to use this demo?
+1. Enter a query in the text box.
+2. Click "Submit" and wait for the result.
+3. It will return the most matching music score from the WikiMusictext dataset (1010 scores in total).
+
+## ❕Notice
+- The text box is case-sensitive.
+- You can enter longer text for the text box, but the demo will only use the first 128 tokens.
+- The returned results include the title, artist, genre, description, and the score in ABC notation.
+- The genre and description may not be accurate, as they are collected from the web.
+- The demo is based on CLaMP-S/512, a CLaMP model with 6-layer Transformer text/music encoders and a sequence length of 512.
+
+## 🔠👉🎵 Semantic Music Search
+Semantic search is a technique for retrieving music by open-domain queries, which differs from traditional keyword-based searches that depend on exact matches or meta-information. This involves two steps: 1) extracting music features from all scores in the library, and 2) transforming the query into a text feature. By calculating the similarities between the text feature and the music features, it can efficiently locate the score that best matches the user's query in the library.
+
 """
 examples = [
 "Jazz standard in Minor key with a swing feel.",
@@ -190,7 +151,7 @@ def semantic_music_search(query):
     key_cache = torch.load(f)
 
     # encode query
-    query_ids = encoding_data([query], QUERY_MODAL)
+    query_ids = encoding_data([unidecode(query)], QUERY_MODAL)
     query_feature = get_features(query_ids, QUERY_MODAL)
 
     key_filenames = key_cache["filenames"]
@@ -225,5 +186,4 @@ gr.Interface(
     outputs=[output_title, output_artist, output_genre, output_description, output_abc],
     title="🗜️ CLaMP: Semantic Music Search",
     description=description,
-    article=article,
     examples=examples).launch()
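The semantic music search described in the new demo text (encode the query into a text feature, then compare it against music features pre-computed for every score in the library) can be sketched as below. This is an illustrative sketch, not the Space's actual code: the random vectors stand in for CLaMP encoder outputs (what `get_features` would return), and `rank_by_similarity` plays the role of the similarity lookup inside `semantic_music_search`.

```python
import numpy as np

def rank_by_similarity(query_feature: np.ndarray, key_features: np.ndarray) -> np.ndarray:
    """Return library indices sorted by cosine similarity to the query, best first."""
    q = query_feature / np.linalg.norm(query_feature)
    k = key_features / np.linalg.norm(key_features, axis=1, keepdims=True)
    sims = k @ q               # cosine similarity of each library item vs. the query
    return np.argsort(-sims)   # descending similarity

# Placeholder features standing in for CLaMP encoder outputs.
rng = np.random.default_rng(0)
library = rng.normal(size=(5, 8))               # 5 "scores", 8-dim features
query = library[2] + 0.01 * rng.normal(size=8)  # a query very close to score 2

order = rank_by_similarity(query, library)
print(order[0])  # index of the best-matching score
```

Note the commit's one functional change inside `semantic_music_search`: the query is now passed through `unidecode(...)` before encoding, which transliterates accented Unicode input to plain ASCII, presumably so such queries tokenize consistently with the case-sensitive text encoder.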