Tag: textanalytics

Harry Plotter: Network analysis of spell usage

Harry Plotter: Network analysis of spell usage

Apparently, I was not the only geek who decided to celebrate the 20th anniversary of the Harry Potter saga with statistical analysis. Students Moritz Haine and Markus Dienstknecht of the Data Science for Decision Making Master at Maastricht University started their own celebratory project as part of a course Information Retrieval and Text Mining. Students in … Continue reading Harry Plotter: Network analysis of spell usage

Regular Expressions in R – Part 1: Introduction and base R functions

Regular Expressions in R – Part 1: Introduction and base R functions

The following is the first part of my introduction to regular expression (regex), in general, and the use of regex in R, in specific. It is loosely inspired on the swirl() tutorial by Jon Calder. I created it in R Markdown and uploaded it to RPubs, for an easier read. Regular expression A regular expression, regex or regexp … Continue reading Regular Expressions in R – Part 1: Introduction and base R functions

Text Mining: Pythonic Heavy Metal

Text Mining: Pythonic Heavy Metal

This blog summarized work that has been posted here, here, and here. Iain of degeneratestate.org wrote a three-piece series where he applied text mining to the lyrics of 222,623 songs from 7,364 heavy metal bands spread over 22,314 albums that he scraped from darklyrics.com. He applied a broad range of different analyses in Python, the code of which … Continue reading Text Mining: Pythonic Heavy Metal

Analysis of Media Coverage on Refugees

Analysis of Media Coverage on Refugees

Hannah Yan Han is doing #100dayprojects on data science and visual storytelling and I can only recommend that you take a look yourself. Below you find her R text analysis (#41) of UNHCR speeches and TV coverage on refugees. Unsurprisingly, nouns like asylum, repatriation, displacement, persecution, plight, and crisis appear significantly more often in UNHCR speeches on refugees than … Continue reading Analysis of Media Coverage on Refugees

Visualizing #IRMA Tweets

Visualizing #IRMA Tweets

Reddit user LucasCu90 used the R package twitteR to retrieve all tweets that were sent with #Irma and a Geocode of central Miami (25 mile radius) from Saturday September 9, to Sunday September 10, 2017 (the period of Irma’s approach and initial landfall on the Florida Keys and the mainland). From the 29,000 tweets he collected, Lucas then … Continue reading Visualizing #IRMA Tweets