Simple Correlation Analysis in R using Tidyverse Principles

R’s standard correlation functionality (base::cor) seems very impractical to the new programmer: it returns a matrix and has some pretty shitty defaults it seems. Simon Jackson thought the same so he wrote a tidyverse-compatible new package: corrr! Simon wrote some practical R code that has helped me out greatly before (e.g., color palette’s), but this new package is…

Harry Plotter: Celebrating the 20 year anniversary with tidytext and the tidyverse in R

It has been twenty years since the first Harry Potter novel, the sorcerer’s/philosopher’s stone, was published. To honour the series, I started a text analysis and visualization project, which my other-half wittily dubbed Harry Plotter. In several blogs, I intend to demonstrate how Hadley Wickham’s tidyverse and packages that build on its principles, such as tidytext (free book), have taken programming in R to an…

tidyverse 101: Simplifying life for useRs

Hadley Wickham‘s tidyverse has improved the workflow of analysts / data scientists, makes coding errors less likely and code more transparent. You’ve got to love the figure below, representing a simplified workflow of the average analysis project. The tidyverse provides assistance in each of the stages. Various packages provide functionality to perform analytical tasks more effectively, in…

tidyverse: Example: Trump Approval Rate

For those of you unfamiliar with the tidyverse, it is a collection of R packages that share common philosophies and are designed to work together. Most if not all, are created by R-god Hadley Wickham, one of the leads at RStudio. I was introduced to the tidyverse-packages such as ggplot2 and dplyr in my second…

Top-19 articles of 2019

With only one day remaining in 2019, let’s review the year. 2019 was my third year of blogging and it went by even quicker than the previous two! Personally, it has been a busy year for me: I started a new job, increased my speaking and teaching activities, bought and moved to my new house,…

Online Workshop Tidy Data Science in R, by Jake Thompson

Here’s a website hosting for a five-day hands-on workshop based on the book “R for Data Science”. The workshop was originally offered as part of the Stats Camp: Summer Statistical Institute in Lawrence, KS and hosted by the Center for Research Methods and Data Analysis and the Achievement and Assessment Instituteat the University of Kansas. It is designed for those who…

Creating plots with custom icons for data points

Data visualizations that make smart use of icons have a way of conveying information that sticks. Dataviz professionals like Moritz Stefaner know this and use the practice in their daily work. A recent #tidytuesday entry by Georgios Karamanis demonstrates how easy it is to integrate visual icons in your data figures when you write code…

Factors in R: forcats to help

Hugo Toscano wrote a great blog providing an overview of all the helpful functionalities of the R forcats package. The package includes functions that help handling categorical data, by setting a random or fixed order, and by recategorizing or anonymizing. These functions are specifically helpful when visualizing data with R’s ggplot2. A comprehensive overview is…

Two years of paulvanderlaken.com

Yesterday was the second anniversary of my website. I also reflected on this moment last year, and I thought to continue the tradition in 2019. Let me start with a great, big THANK YOUto all my readers for continuing to visit my website! You are the reason I continue to write down what I read….