Getting started with Python in Visual Studio Code

After several years of procrastinating, the inevitable finally happened: three months ago, I committed to learning Python! I must say that getting started was not easy. One afternoon, I sat down, motivated to get started. Obviously, the first step was to download and install Python as well as something to write actual…

Overview of built-in colors in R

I create most of my data visualizations in R, as you might have noticed from the content of my website. Though I am colorblind myself, I love to work with colors and color palettes in my visualizations. And I’ve come across quite a few neat tricks in my time. For instance, did you know it’s…

Learn Programming Project-Based: Build-Your-Own-X

Last week, this interesting Reddit thread was filled with overviews of cool projects that may help you learn a programming language. The top entries are: Build Your Own X, by Dani Stefanovic; Project-based Learning, by Tu Tran; Projects from Scratch, by Algory L.; Project-based Tutorials in C, by Robby; Awesome DIY Software, by Cameron Eagans…

Tidy Machine Learning with R’s purrr and tidyr

Jared Wilber posted this great walkthrough where he codes a simple R data pipeline using purrr and tidyr to train a large variety of models and methods on the same base data, all in a non-repetitive, reproducible, clean, and thus tidy fashion. Really impressive workflow!

Comparison between R dplyr and data.table code

Atrebas created this extremely helpful overview page showing how to program standard data manipulation and data transformation routines in R’s famous packages dplyr and data.table. The document was inspired by this Stack Overflow question and by the data.table cheat sheet published by Karlijn Willems. Resources for data.table can be found on the data.table wiki, in the data.table vignettes,…

ROC, AUC, precision, and recall visually explained

A receiver operating characteristic (ROC) curve displays how well a model can classify binary outcomes. An ROC curve is generated by plotting a model's true positive rate (y-axis) against its false positive rate (x-axis) at each possible cutoff value. Often, the area under the curve (AUC) is calculated and used as a metric showing how well…
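As a rough illustration of the idea above, here is a minimal Python sketch that builds the ROC points by hand and approximates the AUC with the trapezoidal rule. The function names (`roc_points`, `auc`) and the toy labels and scores are made up for this example; they are not taken from the linked post, and in practice you would more likely use an existing library routine.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per cutoff.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model scores, where a higher score means "more likely positive".
    """
    positives = sum(labels)
    negatives = len(labels) - positives

    # Lowering the cutoff step by step is the same as walking through the
    # observations from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]  # cutoff above every score: nothing predicted positive
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all observations tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical): three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
# 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9.
print(round(auc(points), 3))  # 0.889
```

The AUC here equals the share of positive/negative pairs in which the positive observation receives the higher score, which is one common way to read the metric.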

Python for R users

Want to broaden your scope and learn a new programming language? This great workshop was given at EARL 2018 by Mango Solutions and helps R programmers transition to Python by building on their existing R knowledge. The workshop includes exercises that introduce you to the key concepts of Python and some of its most powerful packages…

Putting R in Production, by Heather Nolis & Mark Sellors

It is often said that R is hard to put into production. Fortunately, there are numerous talks demonstrating the contrary. Here’s one by Heather Nolis, who productionizes R models at T-Mobile. Her team even shares open-source versions of some of their productionized TensorFlow models on GitHub. Read more about that model here. There’s another great…

Recreating graphics from the Fundamentals of Data Visualization

Claus Wilke wrote the Fundamentals of Data Visualization, a great resource that’s definitely high on my list of recommended data visualization books. In a recent post, Claus shared the link to a GitHub repository where he hosts some of the R code he used to make the graphics for his dataviz book. The…