Create Album Art From Lyrics; How do GANs work?
asterbyte
September 8, 2021
“I am fascinated by every genre of music, and whatever tracks I have listened to, the one memory that stays with me is their album art; it is what makes them unique,” says a melophile. Album art, in other words, is as much a part of a musician’s identity as the music itself, and it keeps a record of the musical journey: think of Andy Warhol’s banana or Pink Floyd’s prism. “Memory is the diary that we all carry about with us.” (Oscar Wilde)
Before getting to the project, let me take you on a quick tour of GANs. When Ian J. Goodfellow coined the term Generative Adversarial Network in 2014, he gave data scientists a machine learning technique for unsupervised learning. Call it a boon: that year marked the starting point for the further development of GANs. Today GANs show up in music genre classification, credit card fraud detection, sign language recognition, and more.
Let’s have a look at how to generate album cover art using GANs.
GAN
Obviously, the first step for any album cover is to design it. But why worry, when there is an automated solution to try?
In a nutshell, a GAN trains on existing images and learns to create new, unique ones. GANs come with known challenges, such as keeping the generator and the discriminator in balance during training and positioning objects correctly in the image. With those in mind, our Python engineer came up with a project that aims to generate unique cover art from lyrics. Sounds interesting?
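To make the tug-of-war between generator and discriminator a little more concrete, here is a minimal sketch of the two losses in their standard binary cross-entropy form. It is an illustration only, not the project's training code: the discriminator scores are made-up numbers, and the generator loss shown is the commonly used non-saturating variant.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy loss for the discriminator.

    d_real: discriminator outputs (probabilities in (0, 1)) on real images
    d_fake: discriminator outputs on generated images
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when fakes are rated as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores, for illustration only.
confident_d = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled_d = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"discriminator loss when it tells real from fake: {confident_d:.3f}")
print(f"discriminator loss when it can only guess:       {fooled_d:.3f}")
```

The discriminator's loss is low when it separates real from generated images, while the generator's loss is low when its images are scored as real. Training alternates between the two, which is where the stability problem comes from.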
Well, time to chalk out how to play with GANs.
First comes data collection. Effective research is the baseline: deciding what data to collect, which methodologies to follow, and which GAN models to select. Start by exploring reputable sources for the data. Here, the work began with compiling a list of album names and artist names. To get the original cover art and the track names, the Spotify API is queried with the album and artist name (SpotiPy is the library used to interface with the Spotify API). Then the Genius API leads you to the full lyrics webpage for each track, and the BeautifulSoup library is used to parse the links and extract the lyrics from the tag that holds them. Linking all of this information together is equally important. The final step is to scrape the Genius webpages and save the lyrics to the database. To summarize, data acquisition worked quite well.
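The linking step can be sketched roughly as follows. This is a simplified illustration and not the project's actual code: the row layout, the field names, and the `AlbumRecord` class are all hypothetical, and the rows stand in for whatever the Spotify and Genius steps return.

```python
from dataclasses import dataclass

@dataclass
class AlbumRecord:
    artist: str
    album: str
    cover_url: str
    lyrics: dict  # track name -> lyrics text

def normalize(name):
    """Lowercase and collapse whitespace so names from both sources match."""
    return " ".join(name.lower().split())

def link_records(spotify_rows, genius_rows):
    """Join cover-art rows with lyrics rows on (artist, album)."""
    lyrics_by_album = {}
    for row in genius_rows:
        key = (normalize(row["artist"]), normalize(row["album"]))
        lyrics_by_album.setdefault(key, {})[row["track"]] = row["lyrics"]

    records = []
    for row in spotify_rows:
        key = (normalize(row["artist"]), normalize(row["album"]))
        if key in lyrics_by_album:  # keep only albums that have lyrics
            records.append(
                AlbumRecord(row["artist"], row["album"],
                            row["cover_url"], lyrics_by_album[key])
            )
    return records

# Hypothetical rows, for illustration only.
spotify_rows = [
    {"artist": "Example Artist", "album": "Example Album",
     "cover_url": "https://example.com/cover.jpg"},
    {"artist": "Other Artist", "album": "No Lyrics Found",
     "cover_url": "https://example.com/other.jpg"},
]
genius_rows = [
    {"artist": "example artist", "album": "Example  Album",
     "track": "Track One", "lyrics": "placeholder lyrics"},
]

records = link_records(spotify_rows, genius_rows)
print(records)
```

Normalizing the names before joining matters because the two sources rarely agree on capitalization and spacing. Albums without any lyrics are dropped, since they cannot be used to train a lyrics-to-cover model.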
Three top GAN methods were on the table: #1 CGAN, the Conditional GAN, which models a conditional probability; #2 DCGAN, the Deep Convolutional Generative Adversarial Network, which builds the GAN from convolutional networks and replaces pooling layers with strided convolutions; and #3 AttnGAN, the Attentional Generative Adversarial Network, which combines attention with a GAN at every stage to iteratively add detail until the final image is reached.
DCGAN was the one chosen here to train on the album dataset, after first testing it on MNIST.
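A DCGAN generator typically grows a small feature map into a full image with strided transposed convolutions. The sketch below only computes how the image side length changes from layer to layer. The kernel size 4, stride 2, padding 1, the 4-pixel starting size, and the four layers are assumptions borrowed from common DCGAN implementations, not settings confirmed for this project.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output side length of a transposed convolution
    (no output padding, no dilation)."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_sizes(start=4, layers=4):
    """Side lengths after each upsampling layer, starting from `start`."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(conv_transpose_out(sizes[-1]))
    return sizes

sizes = generator_sizes()
print(sizes)  # [4, 8, 16, 32, 64]
```

With these assumed settings each layer doubles the side length, so four layers turn a 4x4 feature map into a 64x64 image.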
DCGAN
Still, there were issues, and it took more than two techniques to get the model to train.
Two main evaluation metrics, the inception score (IS) and the Fréchet inception distance (FID), help in assessing performance. Of the two, the latter is the more popular.
The scraped lyrics, on the other hand, come loaded with irrelevant information, so the task is to cut what is unwanted and bring the text into a consistent format. Another matter to think about is choosing an appropriate word embedding model: bag of words (BOW), Word2Vec, or Doc2Vec. Because Doc2Vec creates a fixed-length numeric representation of a document, it eases the burden of training the model.
Analyzing the results from the various models across epochs, the conclusion is that the generated images improve as training progresses. The colors became solid enough to reflect the real image.
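As a small illustration of what a fixed-length numeric representation of lyrics means, here is a sketch of bag of words, the simplest of the three embedding options mentioned above. It is not Doc2Vec and not the project's code, and the two "lyrics" are made-up placeholder strings.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents):
    """Map every distinct word across all documents to a column index."""
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {word: i for i, word in enumerate(vocab)}

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for word, n in counts.items():
        if word in vocabulary:
            vector[vocabulary[word]] = n
    return vector

# Hypothetical lyrics, for illustration only.
lyrics = ["sun sun rise", "rise and shine"]
vocabulary = build_vocabulary(lyrics)
vectors = [bag_of_words(text, vocabulary) for text in lyrics]

print(vocabulary)  # {'and': 0, 'rise': 1, 'shine': 2, 'sun': 3}
print(vectors)     # [[0, 1, 0, 2], [1, 1, 1, 0]]
```

Every lyric ends up as a vector of the same length, whatever its word count, which is the property that makes it usable as conditioning input for a model. Bag of words ignores word order and meaning, and the vector grows with the vocabulary, which is one reason to look at Word2Vec or Doc2Vec instead.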
DCGAN Output
However, to generate a higher-quality result, it would be a good idea to work with AttnGAN and to port our dataset to the AttnGAN library.
AttnGAN
Exploration of GANs continues. Hopefully, the work in progress leads to better results in generating unique album art with AttnGAN and beyond.
The video is out, here it is!
REFERENCE:
Nick Nieman, “Generating Album Artwork from Lyrics”, https://medium.com/@nickn9715/generating-album-artwork-from-lyrics-699b7d57a92d