Recommended Books on Data Visualization

Data visualization and the (in)effective communication of information are salient topics on this blog. I just love to read and write about best practices related to data visualization (or bad practices), or to explore novel types of complex graphs. However, I am not always online, and I am equally fond of reading about data visualization…

StatQuest: Statistical concepts, clearly explained

Josh Starmer is assistant professor at the genetics department of the University of North Carolina at Chapel Hill. But more importantly: Josh is the mastermind behind StatQuest! StatQuest is a Youtube channel (and website) dedicated to explaining complex statistical concepts — like data distributions, probability, or novel machine learning algorithms — in simple terms. Once…

Books for the modern, data-driven HR professional (incl. People Analytics)

With great pleasure I’ve studied and worked in the field of people analytics, where we seek to leverage employee, management-, and business information to better organize and manage our personnel. Here, data has proven valuable itself indispensible for the organization of the future. Data and analytics have not traditionally been high on the list of…

GoalKicker: Free Programming Books

This specific link has been on my to-do list for so-long now that I’ve decided to just share it without any further ado. The people behind GoalKicker, for whatever reason, decided to compile nearly 100 books on different programming languages based on among others StackOverflow entries. Their open access library contains books on languages from…

#100DaysOfCode: Machine Learning & Data Visualization

2018 seemed to be the year of challenges going viral on the web. Most of them were plain stupid and/or dangerous. However, one viral challenge I did like: #100DaysOfCode 1. Code minimum an hour every day for the next 100 days. 2. Tweet your progress every day with the #100DaysOfCode hashtag. 3. Each day, reach…

Avoid bar plots for continuous data! Do this instead:

Tracey Weissgerber, Natasa Milic, Stacey Winham, and Vesna Garovic wrote this interesting 2015 paper on bar graphs. By a systematic review of physiology research, they demonstrate we need to reconsider how we present continuous data in small samples. Bar and line plots are commonly used to display continuous data. This is problematic, as many different data…

Visualizing the inner workings of the k-means clustering algorithm

Originally, I wrote this blog to share this interactive visualization of the k-means algorithm (wiki) which I was all enthusiastic about. However, then I imagined that not everybody may be familiar with k-means, hence, I wrote the whole blog below.  Next thing I know, u/dashee87 on r/datascience points me to these two other blogs that had already…

Your own personalized motivational GIF

One of the R OpenSci ozunconference 2018 projects was all about gganimate. The guys and gals dubbed this project learngganimate and dedicated an expansive GitHub repository to it. As part of this project, they created some — let’s call it — very creative GIFS. One of their GIFs I particularly liked, copied below. Using the OpenSci syn…

Learning Functional Programming & purrr

The R for Data Science (R4DS) book by Hadley Wickham is a definite must-read for every R programmer. Amongst others, the power of functional programming is explained in it very well in the chapter on Iteration. I wrote about functional programming before, but I recently re-read the R4DS book section after coming across some new valuable…