Survival of the Best Fit: A webgame on AI in recruitment

Survival of the Best Fit is a webgame that simulates what happens when companies automate their recruitment and selection processes. You – playing as the CEO of a tech startup – are asked to select your favorite candidates from a line-up, based on their résumés. As your simulated company grows, the time pressure increases,…

Logistic regression is not fucked, by Jake Westfall

Recently, I came across a social science paper that used linear probability regression. I had never heard of linear probability models (LPM), but they seem to be simply ordinary least squares regression applied to a binary dependent variable. According to some, the LPM is a commonly used alternative to logistic regression, which is what…

Propensity Score Matching Explained Visually

Propensity score matching (wiki) is a statistical matching technique that attempts to estimate the effect of a treatment (e.g., an intervention) by accounting for the factors that predict whether an individual would be eligible to receive the treatment. The Wikipedia page provides a good example setting: Say we are interested in the effects of smoking on…

Recommended Books on Data Visualization

Data visualization and the (in)effective communication of information are salient topics on this blog. I just love to read and write about best practices related to data visualization (or bad practices), or to explore novel types of complex graphs. However, I am not always online, and I am equally fond of reading about data visualization…

Learn from the Pros: How media companies visualize data

Over the past months, multiple companies shared their approaches to data visualization and their lessons learned. Click the companies in the list below to jump to their respective section: The Financial Times; The British Broadcasting Corporation; The Economist; FiveThirtyEight. Financial Times: The Financial Times (FT) released a searchable database of the many data visualizations they produced over…

18 Pitfalls of Data Visualization

Maarten Lambrechts is a data journalist I closely follow online, with great delight. Recently, he shared on Twitter his slide deck on the 18 most common data visualization pitfalls. You will probably already be familiar with most, but some (like #14) were new to me: Save pies for dessert; Don’t cut bars; Don’t cut time axes…

Avoid bar plots for continuous data! Do this instead:

Tracey Weissgerber, Natasa Milic, Stacey Winham, and Vesna Garovic wrote this interesting 2015 paper on bar graphs. Through a systematic review of physiology research, they demonstrate that we need to reconsider how we present continuous data in small samples. Bar and line plots are commonly used to display continuous data. This is problematic, as many different data…

Papers with Code: State-of-the-Art

OK, this is a really great find! The website PapersWithCode.com lists scientific publications whose code is open-sourced on GitHub. Moreover, you can sort these papers by the stars they accumulated on GitHub over the past days. The authors, @rbstojnic and @rosstaylor90, just made this in their spare time. Thank you, sirs! Papers with Code allows you to quickly…