Overviews of Graph Classification and Network Clustering methods

Thanks to Sebastian Raschka I am able to share this great GitHub overview page of relevant graph classification techniques, and the scientific papers behind them. The overview divides the algorithms into four groups: Factorization Spectral and Statistical Fingerprints Deep Learning Graph Kernels Moreover, the overview contains links to similar collections on community detection, classification/regression trees and gradient boosting papers…

Glossary of Statistical Terminology

Frank Harrel shared this 16-page glossary of statistical terminology created by the Department of Biostatistics of Vanderbilt University School of Medicine. The overview touches on everything from Bayes’ Theorem to p-values, explaining matters in just the right detail. Various study designs and model types are also discussed so it might just come in handy for…

Avoid bar plots for continuous data! Do this instead:

Tracey Weissgerber, Natasa Milic, Stacey Winham, and Vesna Garovic wrote this interesting 2015 paper on bar graphs. By a systematic review of physiology research, they demonstrate we need to reconsider how we present continuous data in small samples. Bar and line plots are commonly used to display continuous data. This is problematic, as many different data…

Papers with Code: State-of-the-Art

OK, this is a really great find! The website PapersWithCode.com lists all scientific publications of which the codes are open-sourced on GitHub. Moreover, you can sort these papers by the stars they accumulated on Github over the past days. The authors, @rbstojnic and @rosstaylor90, just made this in their spare time. Thank you, sirs! Papers with Code allows you to quickly…

A/B Testing a New Look

This WordPress blogger I came across — let’s call him “John” for now — has a very peculiar way of testing out his looks. Using dating-apps like Tinder, John conducted A/B-tests to find out whether people would prefer him romantically with or without a beard.  Via a proper experimental setup, John found out that bearded John receives much…

Checklist to Optimize Training Transfer in Organizations

Ashley Hughes, Stephanie Zajac, Jacqueline Spencer, and Eduardo Salas wrote a recent research note for the International Journal of Training and Development. The research note is build around an evidence-based checklist of actionable insights for practitioners that will help to enhance the effectiveness of training interventions. These actionable insights would help to prevent ‘transfer problem’, meaning that…

Privacy, Compliance, and Ethical Issues with Predictive People Analytics

November 9th 2018, I defended my dissertation on data-driven human resource management, which you can read and download via this link. On page 149, I discuss several of the issues we face when implementing machine learning and analytics within an HRM context. For the references and more detailed background information, please consult the full dissertation. More interesting reads on ethics in machine learning can be found here….

Tidy Missing Data Handling

A recent open access paper by Nicholas Tierney and Dianne Cook — professors at Monash University — deals with simpler handling, exploring, and imputation of missing values in data.They present new methodology building upon tidy data principles, with a goal to integrating missing value handling as an integral part of data analysis workflows. New data structures are…

12 Guidelines for Effective A/B Testing

I wrote about Emily Robinson and her A/B testing activities at Etsy before, but now she’s back with a great new blog full of practical advice: Emily provides 12 guidelines for A/B testing that help to setup effective experiments and mitigate data-driven but erroneous conclusions: Have one key metric for your experiment. Use that key…