Need to save R's lm() or glm() models? Trim the fat!

I was training a predictive model at work for use in a Shiny app. However, as the training set was quite large (700k+ observations), the saved model object was also quite large (500 MB). This slows down your operation significantly! Basically, all you really need are the coefficients (and a link function, in…
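A minimal sketch of the idea, assuming a logistic `glm()` with numeric predictors only. The simulated data and the names `slim` and `predict_slim` are hypothetical and only serve as illustration; factor predictors would need their levels and contrasts stored as well.

```r
# Simulated data standing in for a large training set (hypothetical).
set.seed(1)
n <- 1000
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * d$x1 - 0.8 * d$x2))

fit <- glm(y ~ x1 + x2, family = binomial(), data = d)

# Keep only what prediction needs: the coefficients, the right-hand side
# of the formula (as plain text), and the name of the link function.
slim <- list(
  coefficients = coef(fit),
  rhs = paste("~", paste(attr(terms(fit), "term.labels"), collapse = " + ")),
  link = family(fit)$link
)

# Predict on the response scale from the slim object.
# Assumption: numeric predictors only, no offsets or weights.
predict_slim <- function(slim, newdata) {
  X <- model.matrix(as.formula(slim$rhs), data = newdata)
  eta <- drop(X %*% slim$coefficients)
  make.link(slim$link)$linkinv(eta)
}

# Compare sizes and check that the predictions agree.
print(object.size(fit), units = "Kb")
print(object.size(slim), units = "Kb")
all.equal(unname(predict(fit, newdata = d, type = "response")),
          unname(predict_slim(slim, d)))
```

Saving `slim` with `saveRDS()` instead of the full fit avoids storing the training data, residuals, and QR decomposition that the model object otherwise carries.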

A Visual Introduction to Hierarchical Models, by Michael Freeman

I have covered hierarchical models before on this blog. These models are super relevant in practice. For instance, in HR, employee data is always nested within teams, which are in turn nested within organizational units. Also, in my current field of insurance, claims are always nested within policies, which can in turn be nested within…

Online Workshop Tidy Data Science in R, by Jake Thompson

Here’s a website hosting a five-day hands-on workshop based on the book “R for Data Science”. The workshop was originally offered as part of the Stats Camp: Summer Statistical Institute in Lawrence, KS and hosted by the Center for Research Methods and Data Analysis and the Achievement and Assessment Institute at the University of Kansas. It is designed for those who…

Comprehensive Introduction to Command Line for R Users

Too little time, too many things of interest. Here’s a resource that’s still on my to-do list: A Comprehensive Introduction to Command Line for R Users by rsquaredacademy.com. In this tutorial, you will be introduced to the command line. We have selected a set of commands we think will be useful in general to a…

How Do I…? R Code Snippets by Sharon Machlis

Sharon Machlis is the author of Practical R for Mass Communication and Journalism. In writing this book, she obviously wrote a lot of R code. Now, Sharon has been nice enough to share all 195 tricks and tips she came across during her writing with us, via this handy table. Sharon’s list contains many neat…

The Causal Inference Book: DAGS and more

Harvard (bio)statisticians Miguel Hernan and Jamie Robins just released their new book, online and accessible for free! The Causal Inference book provides a cohesive presentation of causal inference, its concepts and its methods. The book is divided into 3 parts of increasing difficulty: causal inference without models, causal inference with models, and causal inference from…

An Introduction to Docker for R Users, by Colin Fay

In this awesome 8-minute read, R prodigy Colin Fay explains in layman’s terms what Docker images, Docker containers, and Volumes are; what Rocker is; and how to set up a Docker container with an R image and run code on it: On your machine, you’re going to need two things: images, and containers. Images are the definition…

Overview of built-in colors in R

I create most of my data visualizations using R, as you might have noticed from the content of my website. Though I am colorblind myself, I love to work with colors and color palettes in my visualizations. And I’ve come across quite a few neat tricks in my time. For instance, did you know it’s…

Causal Random Forests, by Mark White

I stumbled across this incredibly interesting read by Mark White, who discusses the (academic) theory behind, inner workings of, and example (R) applications of causal random forests: EXPLICITLY OPTIMIZING ON CAUSAL EFFECTS VIA THE CAUSAL RANDOM FOREST: A PRACTICAL INTRODUCTION AND TUTORIAL (By Mark White) These so-called “honest” forests seem a great technique to identify opportunities…