Simulating data with Bayesian networks, by Daniel Oehm

Daniel Oehm wrote this interesting blog about how to simulate realistic data using a Bayesian network. Bayesian networks are a type of probabilistic graphical model that uses Bayesian inference for probability computations. Bayesian networks aim to model conditional dependence, and therefore causation, by representing conditional dependence by edges in a directed graph. Through these relationships, one…

Bayes theorem, and making probability intuitive – by 3Blue1Brown

This video I’ve been meaning to watch for a while now. It another great visual explanation of a statistics topic by the 3Blue1Brown Youtube channel (which I’ve covered before, multiple times). This time, it’s all about Bayes theorem, and I just love how Grant Sanderson explains the concept so visually. He argues that rather then…

E-Book: Probabilistic Programming & Bayesian Methods for Hackers

The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chapters of slow, mathematical analysis. Nevertheless, mathematical analysis is only one way to “think Bayes”. With cheap computing power, we can now afford to take an alternate route via probabilistic programming. Cam Davidson-Pilon wrote the book Bayesian Methods for…

Helpful resources for A/B testing

Brandon Rohrer — (former) data scientist at Microsoft, iRobot, and Facebook — asked his network on Twitter and LinkedIn to share their favorite resources on A/B testing. It produced a nice list, which I summarized below. The order is somewhat arbitrary, and somewhat based on my personal appreciation of the resources. Course: A/B-testing by Google…

PyData, London 2018

PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The communities approach data science using many languages, including (but not limited to) Python, Julia, and R. April 2018, a PyData conference was held in London, with three days of super…

Bayesian data analysis for newcomers

Professor John Kruschke and Torrin Liddell – one of his Ph.D. students at Indiana University – wrote a fantastically useful scientific paper introducing Bayesian data analysis to the masses. Kruschke and Liddell explain the main ideas behind Bayesian statistics, how Bayesians deal with continuous and binary variables, how to use and set meaningful priors, the differences between…

Advanced GIFs in R

Rafa Irizarry is a biostatistics professor and one of the three people behind SimplyStatistics.org (the others are Jeff Leek, Roger Peng). They post ideas that they find interesting and their blog contributes greatly to discussion of science/popular writing. Rafa is the creator of many data visualization GIFs that have recently trended on the web, and in a recent post…

Data Science, Machine Learning, & Statistics resources (free courses, books, tutorials, & cheat sheets)

Welcome to my repository of data science, machine learning, and statistics resources. Software-specific material has to a large extent been listed under their respective overviews: R Resources & Python Resources. I also host a list of SQL Resources and datasets to practice programming. If you have any additions, please comment or contact me! LAST UPDATED: 21-05-2018 Courses: Udacity: Introduction to Descriptive Statistics…

Must read: Computer Age Statistical Inference (Efron & Hastie, 2016)

Statistics, and statistical inference in specific, are becoming an ever greater part of our daily lives. Models are trying to estimate anything from (future) consumer behaviour to optimal steering behaviours and we need these models to be as accurate as possible. Trevor Hastie is a great contributor to the development of the field, and I…