Search Results for: tidytext

Harry Plotter: Celebrating the 20 year anniversary with tidytext and the tidyverse in R

Harry Plotter: Celebrating the 20 year anniversary with tidytext and the tidyverse in R

It has been twenty years since the first Harry Potter novel, the sorcerer’s/philosopher’s stone, was published. To honour the series, I started a text analysis and visualization project, which my other-half wittily dubbed Harry Plotter. In several blogs, I intend to demonstrate how Hadley Wickham’s tidyverse and packages that build on its principles, such as tidytext (free book), have taken programming in R to an … Continue reading Harry Plotter: Celebrating the 20 year anniversary with tidytext and the tidyverse in R

Two years of paulvanderlaken.com

Yesterday was the second anniversary of my website. I also reflected on this moment last year, and I thought to continue the tradition in 2019. Let me start with a great, big THANK YOUto all my readers for continuing to visit my website! You are the reason I continue to write down what I read. … Continue reading Two years of paulvanderlaken.com

Chatterplots

Chatterplots

I’ve mentioned before that I dislike wordclouds (for instance here, or here) and apparently others share that sentiment. In his recent Medium blog, Daniel McNichol goes as far as to refer to the wordcloud as the pie chart of text data! Among others, Daniel calls wordclouds disorienting, one-dimensional, arbitrary and opaque and he mentions their lack of order, … Continue reading Chatterplots

Become a data-driven Sommelier by text mining wine reviews

Become a data-driven Sommelier by text mining wine reviews

Aleszu Bajak at Storybench.org published a great demonstration of the power of text mining. He used the R tidytext package to analyse 150,000 wine reviews which Zach Thoutt had scraped from Wine Enthusiast in November of 2017. Aleszu started his analysis on only the French wines, with a simple word count per region: Next, he applied TF-IDF to surface the … Continue reading Become a data-driven Sommelier by text mining wine reviews

One year of paulvanderlaken.com

One year ago, I registered the domain paulvanderlaken.com with three reasons in mind: (1) I wanted an online environment to store and showcase my pet projects, (2) to share and promote some of the great blogs and research others had been writing, and (3) to show others what I was doing on my path to … Continue reading One year of paulvanderlaken.com

Sentiment Analysis: Analyzing Lexicon Quality and Estimation Errors

Sentiment Analysis: Analyzing Lexicon Quality and Estimation Errors

Sentiment analysis is a topic I cover regularly, for instance, with regard to Harry Plotter, Stranger Things, or Facebook. Usually I stick to the three sentiment dictionaries (i.e., lexicons) included in the tidytext R package (Bing, NRC, and AFINN) but there are many more one could use. Heck, I’ve even tried building one myself using a synonym/antonym … Continue reading Sentiment Analysis: Analyzing Lexicon Quality and Estimation Errors