# 🎸 πŸ₯ Rockbot 🎀 🎧 
A [GPT-2](https://openai.com/blog/better-language-models/) based lyrics generator fine-tuned on the writing styles of 16000 songs by 270 artists across MANY genres (not just rock).

**Instructions:** Type in a fake song title, pick an artist, click "Generate".

Most language models are imprecise and Rockbot is no exception. You may see NSFW lyrics unexpectedly. I have made no attempts to censor. Generated lyrics may be repetitive and/or incoherent at times, but hopefully you'll encounter something interesting or memorable.

Oh, and generation is resource-intensive and can be slow. I set governors on song length to keep generation time somewhat reasonable. You may adjust song length and other parameters on the left or check out [GitHub](https://github.com/bigjoedata/rockbot) to spin up your own Rockbot.

Just have fun.

[Demo](https://share.streamlit.io/bigjoedata/rockbot/main/src/main.py) (adjust settings to increase speed)

[GitHub](https://github.com/bigjoedata/rockbot)

[GPT-2 124M version Model page on Hugging Face](https://huggingface.co/bigjoedata/rockbot)

[DistilGPT2 version Model page on Hugging Face](https://huggingface.co/bigjoedata/rockbot-distilgpt2/) This is leaner with the tradeoff being that the lyrics are more simplistic.

🎹 🪘 🎷 🎺 🪗 🪕 🎻
## Background
With the shutdown of [Google Play Music](https://en.wikipedia.org/wiki/Google_Play_Music), I used Google's Takeout function to gather the metadata from artists I've listened to over the past several years. I wanted to take advantage of this bounty to build something fun. I scraped the top 50 lyrics for artists I'd listened to at least once from [Genius](https://genius.com/) and, after considerable post-processing, fine-tuned the 124M-parameter [GPT-2](https://openai.com/blog/better-language-models/) model using the [AITextGen](https://github.com/minimaxir/aitextgen) framework. For more on generation, see [here](https://huggingface.co/blog/how-to-generate).

### Full Tech Stack
[Google Play Music](https://en.wikipedia.org/wiki/Google_Play_Music)  (R.I.P.). 
[Python](https://www.python.org/). 
[Streamlit](https://www.streamlit.io/). 
[GPT-2](https://openai.com/blog/better-language-models/). 
[AITextGen](https://github.com/minimaxir/aitextgen). 
[Pandas](https://pandas.pydata.org/). 
[LyricsGenius](https://lyricsgenius.readthedocs.io/en/master/). 
[Google Colab](https://colab.research.google.com/) (GPU based Training). 
[Knime](https://www.knime.com/) (data cleaning). 


## How to Use The Model
Please refer to [AITextGen](https://github.com/minimaxir/aitextgen) for much better documentation.

### Training Parameters Used

    # Assumes `ai` is an aitextgen instance already loaded with GPT-2 124M.
    ai.train("lyrics.txt",
             line_by_line=False,
             from_cache=False,
             num_steps=10000,
             generate_every=2000,
             save_every=2000,
             save_gdrive=False,
             learning_rate=1e-3,
             batch_size=3,
             eos_token="<|endoftext|>",
             #fp16=True
             )
### To Use

Generate with a prompt in the following format (use Title Case):

    Song Name
    BY
    Artist Name
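
The prompt format above can be assembled in Python. This is a minimal sketch, not the app's actual code. The helper names (`title_case`, `build_prompt`, `generate_lyrics`) and the `max_length=256` default are hypothetical. Treating "Title Case" as "capitalize the first letter of each word" and joining the three lines with no trailing newline are assumptions too. The model id `bigjoedata/rockbot` comes from the Hugging Face link above, and the `aitextgen(model=...)` and `generate(...)` calls follow the AITextGen documentation.

```python
def title_case(text: str) -> str:
    """Capitalize the first letter of each whitespace-separated word.

    Unlike str.title(), the rest of each word is left untouched,
    so "don't" becomes "Don't" and "AC/DC" stays "AC/DC".
    """
    return " ".join(word[:1].upper() + word[1:] for word in text.split())


def build_prompt(song_name: str, artist_name: str) -> str:
    """Build the three-line prompt: song title, the literal line BY, artist."""
    return "\n".join([title_case(song_name), "BY", title_case(artist_name)])


def generate_lyrics(song_name: str, artist_name: str, max_length: int = 256) -> None:
    """Print one generated song (downloads the model on first use)."""
    # Imported lazily so the prompt helpers work without aitextgen installed.
    from aitextgen import aitextgen

    # Assumption: load the fine-tuned model by its Hugging Face id.
    ai = aitextgen(model="bigjoedata/rockbot")
    ai.generate(n=1,
                prompt=build_prompt(song_name, artist_name),
                max_length=max_length)


if __name__ == "__main__":
    # Shows only the prompt; call generate_lyrics(...) to run the model.
    print(build_prompt("midnight in the garage", "the beatles"))
```

Calling `generate_lyrics("midnight in the garage", "the beatles")` would then hand that prompt to the model. Lowering `max_length` should shorten generation time, at the cost of shorter songs.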