This blog summarized work that has been posted here, here, and here.
Iain of degeneratestate.org wrote a three-piece series where he applied text mining to the lyrics of 222,623 songs from 7,364 heavy metal bands spread over 22,314 albums that he scraped from darklyrics.com. He applied a broad range of different analyses in Python, the code of which you can find here on Github.
For example, he starts part 1 by calculated the difficulty/complexity of the lyrics of each band using the Simple Measure of Gobbledygook or SMOG and contrasted this to the number of swearwords used, finding a nice correlation.
Furthermore, he ran some word importance analysis, looking at word frequencies, log-likelihood ratios, and TF-IDF scores. This allowed him to contrast the word usage of the different bands, finding, for instance, one heavy metal band that was characterized by the words “oh yeah baby got love“: fans might recognize either Motorhead, Machinehead, or Diamondhead.
Using cosine distance measures, Iain could compare the word vectors of the different bands, ultimately recognizing band similarity, and song representativeness for a band. This allowed interesting analysis, such as a clustering of the various bands:
However, all his analysis worked out nicely. While he also applied t-SNE to visualize band similarity in a two-dimensional space, the solution was uninformative due to low variance in the data.
He could predict the band behind a song by training a one-vs-rest logistic regression classifier based on the reduced lyric space of 150 dimensions after latent semantic analysis. Despite classifying a song to one of 120 different bands, the classifier had a precision and recall both around 0.3, with negligible hyper parameter tuning. He used the classification errors to examine which bands get confused with each other, and visualized this using two network graphs.
In part 2, Iain tried to create a heavy metal lyric generator (which you can now try out).
His first approach was to use probabilistic distributions known as language models. Basically he develops a Markov Chain, in his opinion more of a “unsmoothed maximum-likelihood language model“, which determines the next most probable word based on the previous word(s). This model is based on observed word chains, for instance, those in the first two lines to Iron Maiden’s Number of the Beast:
Another approach would be to train a neural network. Iain used Keras, which ran on an amazon GPU instance. He recognizes the power of neural nets, but says they also come at a cost:
“The maximum likelihood models we saw before took twenty minutes to code from scratch. Even using powerful libraries, it took me a while to understand NNs well enough to use. On top of this, training the models here took days of computer time, plus more of my human time tweeking hyper parameters to get the models to converge. I lack the temporal, financial and computational resources to fully explore the hyperparameter space of these models, so the results presented here should be considered suboptimal.” – Iain
He started out with feed forward networks on a character level. His best try consisted of two feed forward layers of 512 units, followed by a softmax output, with layer normalisation, dropout and tanh activations, which he trained for 20 epochs to minimise the mean cross-entropy. Although it quickly beat the maximum likelihood Markov model, its longer outputs did not look like genuine heavy metal songs.
So he turned to recurrent neural network (RNN). The RNN Iain used contains two LSTM layers of 512 units each, followed by a fully connected softmax layer. He unrolled the sequence for 32 characters and trained the model by predicting the next 32 characters, given their immediately preceding characters, while minimizing the mean cross-entropy:
“To generate text from the RNN model, we step character-by-character through a sequence. At each step, we feed the current symbol into the model, and the model returns a probability distribution over the next character. We then sample from this distribution to get the next character in the sequence and this character goes on to become the next input to the model. The first character fed into the model at the beginning of generation is always a special start-of-sequence character.” – Iain
This approach worked quite well, and you can compare and contrast it with the earlier models here. If you’d just like to generate some lyrics, the models are hosted online at deepmetal.io.
In part 3, Iain looks into emotional arcs, examining the happiness and metalness of words and lyrics.
When applied to the combined lyrics of albums, you could examine how bands developed their signature sound over time. For example, the lyrics of Metallica’s first few albums seem to be quite heavy metal and unhappy, before moving to a happier place. The Black album is almost sentiment-neutral, but after that they became ever more darker and more metal, moving back to the style to their first few albums. He applied the same analysis on the text of the Harry Potter books, of which especially the first and last appear especially metal.