Tag: tidyverse

Generating Book Covers By Their Words — My Dissertation Cover

As some of you might know, I am defending my PhD dissertation later this year. It's titled "Data-Driven Human Resource Management: The rise of people analytics and its application to expatriate management" and, over the past few months, I was tasked with designing its cover. Now, I didn't want to buy some random stock photo …

Become a data-driven Sommelier by text mining wine reviews

Aleszu Bajak at Storybench.org published a great demonstration of the power of text mining. He used the R tidytext package to analyse 150,000 wine reviews which Zach Thoutt had scraped from Wine Enthusiast in November 2017. Aleszu started his analysis on only the French wines, with a simple word count per region. Next, he applied TF-IDF to surface the …
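The two steps mentioned in this teaser, a word count per region followed by TF-IDF, can be sketched with tidytext roughly as follows. This is not Aleszu's code: the `reviews` table, its column names, and the four review texts are made-up stand-ins for the real data set, and removing stop words is an extra assumption.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data; the real Wine Enthusiast reviews are not reproduced here.
reviews <- tibble(
  region = c("Bordeaux", "Bordeaux", "Champagne", "Champagne"),
  description = c(
    "Firm tannins and black currant fruit",
    "Ripe tannins with dark fruit",
    "Crisp apple and fine bubbles",
    "Toasty with crisp citrus fruit"
  )
)

# Step 1: split each review into words and count them per region.
word_counts <- reviews %>%
  unnest_tokens(word, description) %>%
  anti_join(stop_words, by = "word") %>%  # assumption: drop common stop words
  count(region, word, sort = TRUE)

# Step 2: weight the counts by TF-IDF, treating each region as one "document".
word_tf_idf <- word_counts %>%
  bind_tf_idf(word, region, n) %>%
  arrange(desc(tf_idf))

print(word_tf_idf)
```

In this toy example a word that occurs in both regions, such as "fruit", gets a TF-IDF of zero, while a word found in only one region, such as "tannins", gets a positive score. That is why TF-IDF tends to surface region-specific vocabulary rather than words that are common everywhere.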

Sentiment Analysis of Stranger Things Seasons 1 and 2

Jordan Dworkin, a Biostatistics PhD student at the University of Pennsylvania, is one of the few million fans of Stranger Things, an 80s-themed Netflix series combining drama, fantasy, mystery, and horror. Awaiting the third season, Jordan was curious as to the emotional voyage viewers went through during the series, and he decided to examine this …

Functional programming and why not to “grow” vectors in R

For fresh R programmers, vectorization can sound awfully complicated. Consider two math problems, one vectorized, and one not: Why on earth should R spend more time calculating one over the other? In both cases there are the same three addition operations to perform, so why the difference? This is what we will try to illustrate …
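A minimal base R sketch of the contrast the post title points at, not the original post's code. The function names `grow_vector`, `prealloc_vector`, and `vectorized`, the squaring task, and the input size are all assumptions chosen for illustration.

```r
# "Growing": every c() call allocates a new, longer vector and copies the old one.
grow_vector <- function(n) {
  out <- c()
  for (i in seq_len(n)) {
    out <- c(out, i^2)
  }
  out
}

# Preallocating: the result vector is created once at full length, then filled.
prealloc_vector <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- i^2
  }
  out
}

# Vectorized: a single call; the loop over elements runs in compiled code.
vectorized <- function(n) {
  seq_len(n)^2
}

n <- 20000  # illustrative size; timings vary by machine
print(system.time(grow_vector(n)))
print(system.time(prealloc_vector(n)))
print(system.time(vectorized(n)))
```

All three functions return the same values. The growing version usually falls behind as `n` increases, because the repeated copying makes its total work grow roughly quadratically.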

Harry Plotter: Part 2 – Hogwarts Houses and their Stereotypes

Two weeks ago, I started the Harry Plotter project to celebrate the 20th anniversary of the first Harry Potter book. I could not have imagined that the first blog post would be so well received. It reached over 4,000 views in a matter of days thanks to the lovely people in the data science and #rstats community who were kind enough to share it …