Overview of built-in colors in R

I create most of my data visualizations in R, as you might have noticed from the content of my website. Though I am colorblind myself, I love to work with colors and color palettes in my visualizations, and I’ve come across quite a few neat tricks in my time. For instance, did you know it’s…

Simulate Datasets with DrawData.xyz

Vincent Warmerdam shared his new tool to quickly simulate artificial datasets: http://www.drawdata.xyz. The drawdata.xyz tool allows you to easily create your own line and scatter plots with different groups of data points following specific x-y patterns. After drawing your data, you can just click to export your new dataset to CSV or JSON format. x y 106.04…

17 Principles of (Unix) Software Design

I came across this 1999-2003 e-book by Eric Raymond, The Art of Unix Programming. It contains several relevant overviews of the basic principles behind the Unix philosophy, which are probably useful for anybody working in hardware, software, or other algorithmic design. First up is a great list of 17 design rules, explained in more…

Dynamic Programming MIT Course

Cover image by xkcd. Over the last months, I’ve been working my way through Project Euler in my spare time. I wanted to learn Python programming, and what better way than solving mini-problems and -projects?! Well, Project Euler has a ton of these, listed in increasing order of difficulty. It starts out simple: to solve…

Learn Programming Project-Based: Build-Your-Own-X

Last week, this interesting reddit thread was filled with overviews of cool projects that may help you learn a programming language. The top entries are: Build Your Own X, by Dani Stefanovic; Project-based Learning, by Tu Tran; Projects from Scratch, by Algory L.; Project-based Tutorials in C, by Robby; Awesome DIY Software, by Cameron Eagans…

Data Visualization Style Guide Repositories

Amy Cesal put together (1) this great overview of style guides for data visualization practice. Moreover, in the original tweet, Amy refers to other great repositories such as (2) this PolicyViz one and (3) this humongous one by Adele. Amy’s list includes many references to the best practices used by some of the leading data…

Causal Random Forests, by Mark White

I stumbled across this incredibly interesting read by Mark White, who discusses the (academic) theory behind, inner workings, and example (R) applications of causal random forests: “Explicitly Optimizing on Causal Effects via the Causal Random Forest: A Practical Introduction and Tutorial” (by Mark White). These so-called “honest” forests seem like a great technique to identify opportunities…

2019 Shortlist for the Royal Society Prize for Science Books

Since 1988, the Royal Society has celebrated outstanding popular science writing and authors. Each year, a panel of expert judges choose the book that they believe makes popular science writing compelling and accessible to the public. Over the decades, the Prize has celebrated some notable winners including Bill Bryson and Stephen Hawking. The author of the winning…

Data Engineering Reading List, by Mapflat

Lars Albertsson, former software engineer at Spotify and Google and currently freelance data engineer via mapflat, maintains this list of data engineering resources. It includes many links to videos and courses about data pipelines, batch processing, Kafka, NoSQL, Clojure, Scala, Parquet, Luigi, Storm, Spark, Hadoop, Cassandra, and other tools I am not too familiar with….