t-SNE, the Ultimate Drum Machine and more

This blog post explains t-SNE (t-Distributed Stochastic Neighbor Embedding) through the story of programmers joining forces with musicians to create the ultimate drum machine (if you are here just for the fun, you may start playing right away). Kyle McDonald, Manny Tan, and Yotam Mann had difficulty pinpointing to what extent sounds are similar (ding, dong) …

Google Facets: Interactive Visualization for Everybody

Last week, Google released Facets, their new open-source visualization tool. Facets consists of two interfaces that allow users to investigate their data at different levels. Facets Overview provides users with a quick understanding of the distribution of values across the variables in their dataset. Overview is especially helpful in detecting unexpected values, missing values, unbalanced …

Computing and visualizing PCA in R

Thiago G. Martins

Following my introduction to PCA, I will demonstrate how to apply and visualize PCA in R. There are many packages and functions that can apply PCA in R. In this post I will use the function prcomp from the stats package. I will also show how to visualize PCA in R using base R graphics. However, my favorite visualization function for PCA is ggbiplot, which is implemented by Vince Q. Vu and available on GitHub. Please let me know if you have better ways to visualize PCA in R.

Computing the Principal Components (PC)

I will use the classical iris dataset for the demonstration. The data contain four continuous variables, which correspond to physical measurements of flowers, and a categorical variable describing the flowers’ species.

We will apply PCA to the four continuous variables and use the categorical variable to visualize the PCs later. Notice that in…


Statistics Visually Explained

Statistical literacy is essential to our data-driven society. Analytics has been and continues to be a game changer in many business fields, among them Human Resources. Yet, for all the increased importance of and demand for statistical competence, the pedagogical approaches in statistics have barely changed. Seeing Theory is a project designed and created by Daniel …

Veritasium: Bayes’ Theorem explained

Veritasium makes educational videos, mostly about science, and recently they recorded one offering an intuitive explanation of Bayes' Theorem. They guide the viewer through Bayes' thought process in arriving at the theorem, explain how it works, and also acknowledge some of the issues with applying Bayesian statistics in society. "The thing we forget in Bayes' Theorem is …

Multi-Armed Bandits: The Smart Alternative for A/B Testing

Just like humans, computers learn by experience. The purpose of A/B testing is often to collect data to decide whether intervention A or B is better. As such, we provide one group with intervention A whereas another group receives intervention B. With the data of these two groups coming in, the computer can statistically estimate which …
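
To make the contrast with a fixed A/B split concrete, here is a minimal sketch of one common bandit strategy, epsilon-greedy, written in Python. It is not necessarily the approach the full post describes. The function name `epsilon_greedy`, its parameters, and the simulated success rates are all assumptions made up for illustration.

```python
import random


def epsilon_greedy(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over interventions ("arms").

    true_rates holds hypothetical success probabilities, one per arm.
    In a real experiment these are unknown; here they only drive the simulation.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms      # how often each arm was assigned
    successes = [0] * n_arms  # how often each arm produced a success

    for _ in range(n_rounds):
        # Explore at random with probability epsilon, or while any arm is untried.
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(n_arms)
        else:
            # Otherwise exploit the arm with the best observed success rate.
            arm = max(range(n_arms), key=lambda a: successes[a] / pulls[a])

        # Simulated outcome: 1 (success) with the arm's true probability, else 0.
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        successes[arm] += reward

    return pulls, successes


# Two made-up interventions: A succeeds 10% of the time, B 60% of the time.
pulls, successes = epsilon_greedy([0.1, 0.6])
print("assignments per arm:", pulls)
print("successes per arm:  ", successes)
```

Unlike an A/B test that splits participants evenly for the whole experiment, this sketch shifts most assignments toward the arm that looks better as evidence accumulates, while still exploring the other arm a fraction `epsilon` of the time.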