metadata
title: NLP Song Generator Guessing Game
emoji: 🎤🤖
colorFrom: gray
colorTo: yellow
sdk: gradio
sdk_version: 4.26.0
app_file: app.py
pinned: true
license: apache-2.0

Song Generator Guessing Game

User Guide

Introduction

This program generates a song using a language model and then presents a multiple-choice question to the user to guess the artist of the generated song.

Usage

  1. Select a language model from the dropdown.
  2. Click the 'Generate Song' button to generate a song.
  3. Guess the artist of the generated song by selecting an option from the radio buttons.
  4. Click the 'Submit Answer' button to submit your guess.
  5. The correct answer will be displayed below the radio buttons.
  6. Repeat steps 2-5 to generate and guess more songs.
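For orientation, here is a minimal sketch of how this flow could be wired up in Gradio; the component names, placeholder artists, and the two handler functions are illustrative, not the app's actual code.

```python
import gradio as gr

MODELS = ["Fine-tuned GPT-2", "GPT2-medium", "facebook/bart-base", "GPT-Neo"]
ARTISTS = ["Artist A", "Artist B", "Artist C", "Artist D"]  # placeholders

def generate_song(model_name):
    # Placeholder: the real app picks a random artist and runs the chosen model.
    return f"(lyrics generated by {model_name})"

def check_answer(guess):
    # Placeholder: the real app compares the guess to the randomly chosen artist.
    correct = "Artist A"
    return "Correct!" if guess == correct else f"Incorrect - the answer was {correct}."

with gr.Blocks() as demo:
    model_dd = gr.Dropdown(MODELS, label="Language model")
    generate_btn = gr.Button("Generate Song")
    song_box = gr.Textbox(label="Generated song", lines=8)
    guess_radio = gr.Radio(ARTISTS, label="Who is the artist?")
    submit_btn = gr.Button("Submit Answer")
    result_box = gr.Textbox(label="Result")

    generate_btn.click(generate_song, inputs=model_dd, outputs=song_box)
    submit_btn.click(check_answer, inputs=guess_radio, outputs=result_box)

demo.launch()
```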

Documentation

This section documents the models, data, and frameworks used: how the demo works, the core components involved, how data is processed from user input to output, and details about the pretrained models and datasets.

Models

  • The core component of this application is a fine-tuned GPT-2 language model. The model was trained using the datasets listed below.
  • In addition to this fine-tuned model, the demo also uses the stock (not fine-tuned) versions of GPT2-medium, facebook/bart-base, and GPT-Neo for comparison.
  • You can access the model here
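As a rough sketch, loading these models through the Transformers library might look like the following; the fine-tuned model's repo id is left as a placeholder (the link above is authoritative), and the GPT-Neo checkpoint size is an assumption.

```python
from transformers import (AutoModelForCausalLM, AutoModelForSeq2SeqLM,
                          AutoTokenizer)

# Placeholder id; the actual fine-tuned model is linked above.
FINETUNED_REPO = "SpartanCinder/..."

CAUSAL_IDS = {
    "GPT2-medium": "gpt2-medium",
    "GPT-Neo": "EleutherAI/gpt-neo-125m",  # assumed checkpoint size
}

def load_causal(repo_id):
    """Load a decoder-only model and its tokenizer."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model

# BART is an encoder-decoder model, so it goes through the seq2seq class:
bart_tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
bart_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
```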

Data

The data for this project was gathered using the Genius API. I created two datasets of the same size, approximately 26,000 tokens each, with each entry containing an artist and a portion of one of their top songs. The datasets are linked below. They cover approximately 135 different artists spanning a range of decades and genres. When the program runs, it picks randomly from this set of artist labels, and that artist is used to generate the song snippet. A sketch of this selection step follows the dataset links below.

https://huggingface.co/datasets/SpartanCinder/song-lyrics-artist-classifier
https://huggingface.co/datasets/SpartanCinder/artist-lyrics-dataset
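A minimal sketch of the random artist selection described above, assuming the datasets load through the Hugging Face datasets library; the split and the column name "artist" are assumptions about the schema.

```python
import random
from datasets import load_dataset

# Split and column names are assumptions about the dataset schema.
ds = load_dataset("SpartanCinder/artist-lyrics-dataset", split="train")

artists = sorted(set(ds["artist"]))       # roughly 135 artists across decades and genres
target_artist = random.choice(artists)    # the label the player will have to guess
examples = ds.filter(lambda row: row["artist"] == target_artist)
```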

Frameworks

This application uses the following frameworks:

  1. Gradio: Gradio is used to create the user interface for the application. It provides various components such as buttons, dropdowns, and radio buttons for user input, and text boxes for displaying output.
  2. PyTorch: PyTorch is used as the backend for the language model. It provides the necessary functions and classes for loading the model, processing the input, and generating the output.
  3. Transformers: The Transformers library, developed by Hugging Face, is used for loading and using the different models. It provides a high-level API for using transformer models.
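Putting the three together, the path from user input to generated text might look roughly like this sketch; the prompt format and sampling settings are assumptions, and the stock GPT2-medium checkpoint stands in for the fine-tuned model. A function like this is what the Gradio 'Generate Song' button would invoke.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
model = AutoModelForCausalLM.from_pretrained("gpt2-medium")
model.eval()

def generate_song(artist: str) -> str:
    """Tokenize a prompt (Transformers), run generation (PyTorch), return text."""
    prompt = f"Lyrics in the style of {artist}:"  # hypothetical prompt format
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():  # PyTorch handles the actual forward passes
        output_ids = model.generate(
            **inputs,
            max_new_tokens=100,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```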

Contributions

My main contributions to this project were building the two datasets and training and fine-tuning the GPT-2 model. When I started the project, I realized that no existing dataset would let me train a model on the kind of lyrics I wanted to generate, so I knew I would have to create one myself. The overall intention was to build two models: a text-generation model that would be good at generating lyrics, and a companion classifier that would classify the generated lyrics so that the application could display the likelihood of each selectable answer. The second model was created, but due to time constraints I was not able to get it to work properly with the specially trained transformer model. I detail why in the Limitations section.

Limitations

I did not want to use GPT-2 for this project because of my past experience and my struggles getting it to work, or even generate text at all, after training it. However, after much fiddling I ended up going with it for the final implementation, because I simply wasn't able to get another lightweight language model to output anything close to what I wanted. I think this was in part because of the limitations of my dataset: after doing some digging, I realized that other GPT-2 models that output more accurate text have significantly larger training sets. The dataset I created had only 26,000 tokens, and I only trained my model for 5 epochs. While I think a larger dataset would aid the training, I also think repeating the training multiple times with the same dataset would yield different results. What's more, I found other models previously trained to generate songs in the style of a real artist; those GPT-2 models were each trained on a single artist and were very accurate to the source material. They didn't say how much they trained, but in the future I think this might be a better avenue for training toward specifically what I want to do.
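For context, a condensed sketch of the kind of fine-tuning run described above (5 epochs over the small lyrics dataset), using the Transformers Trainer API; the base checkpoint, the "lyrics" column name, and all hyperparameters other than the epoch count are illustrative.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

ds = load_dataset("SpartanCinder/artist-lyrics-dataset", split="train")

def tokenize(batch):
    # "lyrics" is an assumed column name for the song-snippet text.
    return tokenizer(batch["lyrics"], truncation=True, max_length=128)

tokenized = ds.map(tokenize, batched=True, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-lyrics", num_train_epochs=5,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```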

When I first trained the model, it simply would not output anything, making it less useful than a default model. After some tuning, I realized part of this was due to how I had the model generating text: once I switched generation to beam search, it started producing lyrics much more frequently. However, it only produces lyrics in the style of the chosen artist about 56% of the time. Part of this limitation is due to the dataset, but I also think part of the reason it struggles to generate relevant lyrics depends on which artist is randomly chosen. Because GPT-2 was trained on WebText, a corpus of pages linked from Reddit, it naturally has more information to go on for newer pop artists. In my testing I used Taylor Swift as an example, and about 70% of the time the model could generate relevant lyrics for her. If an older artist is picked, however, the model might just ramble about not having heard the song before. There are also cases where the program picks an artist the model simply doesn't know what to do with, and it generates either numbers or YouTube links to nonexistent channels.
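The beam-search fix described above essentially amounts to changing the decoding arguments passed to model.generate. A self-contained sketch, with the stock GPT-2 checkpoint standing in for the fine-tuned model and illustrative parameter values:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for the fine-tuned model
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Lyrics in the style of Taylor Swift:", return_tensors="pt")

# Beam search keeps several candidate continuations alive at each step and
# returns the highest-scoring one, which produced usable lyrics far more
# often than the original decoding setup did.
output = model.generate(
    **inputs,
    max_new_tokens=100,
    num_beams=5,               # illustrative beam width
    no_repeat_ngram_size=2,    # curb the repetition beam search is prone to
    early_stopping=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```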

Citations