## CodeWars: Learn programming through test-driven development

As I wrote about Project Euler and CodingGame before, someone recommended me CodeWars. CodeWars offers free online learning exercises to develop your programming skills through fun daily challenges. In line with Project Euler, you are tasked with solving increasingly complex programming challenges. At CodeWars, these little problems you need to solve with code are called…

## Why Gordon Shotwell uses R

This blog by Gordon Shotwell has passed my Twitter feed a couple of times now and I thought I’d share it here: blog.shotwell.ca/posts/why_i_use_r It in, Gordon present his reasons for using R, describing R’s four unique selling point, and outlining a discussion full of perfectly quotable thoughts and opinions. Do have a look at the…

## Visualizing Sampling Distributions in ggplot2: Adding area under the curve

Thank you ggplot2tutor for solving one of my struggles. Apparently this is all it takes: I can’t begin to count how often I have wanted to visualize a (normal) distribution in a plot. For instance to show how my sample differs from expectations, or to highlight the skewness of the scores on a particular variable….

## Anomaly Detection Resources

Carnegie Mellon PhD student Yue Zhao collects this great Github repository of anomaly detection resources: https://github.com/yzhao062/anomaly-detection-resources The repository consists of tools for multiple languages (R, Python, Matlab, Java) and resources in the form of: Books & Academic Papers Online Courses and Videos Outlier Datasets Algorithms and Applications Open-source and Commercial Libraries/Toolkits Key Conferences & Journals…

## Need to save R's lm() or glm() models? Trim the fat!

I was training a predictive model for work for use in a Shiny App. However, as the training set was quite large (700k+ obs.), the model object to save was also quite large in size (500mb). This slows down your operation significantly! Basically, all you really need are the coefficients (and a link function, in…

## A Visual Introduction to Hierarchical Models, by Michael Freeman

Hierarchical models I have covered before on this blog. These models are super relevant in practice. For instance, in HR, employee data is always nested within teams which are in turn nested within organizational units. Also in my current field of insurances, claims are always nested within policies, which can in turn be nested within…

## Online Workshop Tidy Data Science in R, by Jake Thompson

Here’s a website hosting for a five-day hands-on workshop based on the book “R for Data Science”. The workshop was originally offered as part of the Stats Camp: Summer Statistical Institute in Lawrence, KS and hosted by the Center for Research Methods and Data Analysis and the Achievement and Assessment Instituteat the University of Kansas. It is designed for those who…

## Comprehensive Introduction to Command Line for R Users

Too little time, too many things of interest. Here’s a resource that’s still on my to-do list: A Comprehensive Introduction to Command Line for R Users by rsquaredacademy.com In this tutorial, you will be introduced to the command line. We have selected a set of commands we think will be useful in general to a…

## How Do I…? R Code Snippets by Sharon Machlis

Sharon Machlis is the author of Practical R for Mass Communication and Journalism. In writing this book, she obviously wrote a lot of R code. Now, Sharon has been nice enough to share all 195 tricks and tips she came across during her writing with us, via this handy table. Sharon’s list contains many neat…