Visualizing the inner workings of the k-means clustering algorithm

Originally, I wrote this blog to share this interactive visualization of the k-means algorithm (wiki), which I was very enthusiastic about. Then I realized that not everybody may be familiar with k-means, so I wrote the whole blog below. Next thing I know, u/dashee87 on r/datascience points me to these two other blogs that had already…
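
For readers new to k-means, the core loop (often called Lloyd's algorithm) can be sketched in a few lines. This is a minimal Python sketch and not the code behind the interactive visualization; the function name `kmeans`, the toy points, and the hand-picked starting centroids are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm: assign points, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (hypothetical): two well-separated groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids)
```

In practice the starting centroids are usually chosen at random, so different runs can converge to different clusterings.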

Animated Citation Gates turned into Selection Gates

Bret Beheim, senior researcher at the Max Planck Institute for Evolutionary Anthropology, posted a great GIF animation of the response to his research survey. He calls the figure citation gates, relating the year of scientific publication to the likelihood that the research materials are published open-source or are otherwise accessible. To generate the visualization, Bret used…

Chatterplots

I’ve mentioned before that I dislike wordclouds (for instance here, or here), and apparently others share that sentiment. In his recent Medium blog, Daniel McNichol goes as far as to refer to the wordcloud as the pie chart of text data! Among other things, Daniel calls wordclouds disorienting, one-dimensional, arbitrary, and opaque, and he mentions their lack of order,…

100 amazing color palettes including their Hex codes

TJ Mahr pointed to this Canva webpage on Twitter. It contains 100 beautiful color palettes including their hexadecimal color codes. For instance, these three below. The great thing is that these color palettes are included in the ggthemes package in R. Hence, the following code uses this Nightlife palette directly in an R script, resulting in…

Add a self-explanatory legend to your ggplot2 boxplots

Laura DeCicco found that non-R users keep asking her what exactly her box plots mean or demonstrate. In a recent blog post, she therefore breaks down the calculations into easy-to-follow chunks of code. Even better, she included the source code to make boxplots that come with a very elaborate default legend: As you can see,…
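
As a rough idea of the calculations a box plot summarizes, here is a minimal Python sketch. It is not Laura's code; the function name `boxplot_stats`, the sample data, and the use of the common 1.5 × IQR whisker rule are assumptions made for illustration.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Summary numbers behind a Tukey-style box plot (1.5 * IQR whiskers)."""
    # Quartiles with linear interpolation between data points.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Anything beyond the fences is drawn as an individual point.
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical sample with one clearly extreme value.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

Here the box spans the first to third quartile, the line inside it marks the median, and the value 30 falls outside the upper fence, so it would appear as a separate outlier point.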

ggstatsplot: Creating graphics including statistical details

This pearl had been resting in my inbox for quite a while before I was able to add it to my R resources list. Citing its GitHub page, ggstatsplot is an extension of ggplot2 package for creating graphics with details from statistical tests included in the plots themselves and targeted primarily at behavioral sciences community to provide a one-line code…

(Time Series) Forecasting: Principles & Practice (in R)

I stumbled across this open access book by Rob Hyndman, the god of time series, and George Athanasopoulos, a fellow statistician/econometrician at Monash University in Melbourne, Australia. Hyndman and Athanasopoulos provide a comprehensive introduction to forecasting methods, accessible and relevant to, among others, business professionals without any formal training in the area. All R examples…

A Categorical Spatial Interpolation Tutorial in R

Timo Grossenbacher works as a reporter/coder for SRF Data, the data journalism unit of Swiss Radio and TV. He analyzes and visualizes data and investigates data-driven stories. On his website, he hosts a growing list of cool projects. One of his recent blogs covers categorical spatial interpolation in R. The end result of that blog looks amazing: This map…