Animated Machine Learning Classifiers

Ryan Holbrook made awesome animated GIFs in R of several classifiers learning a decision rule boundary between two classes. Basically, what you see is a machine learning model in action, learning how to distinguish data of two classes, say cats and dogs, using some X and Y variables. These visuals can be great to understand…

AI Book Review: You look like a thing and I love you

The following are my summary and take-aways from Janelle Shane’s 2019 book named You look like a thing and I love you. Most of the below are excerpts from Janelle’s book, combined, or rewritten by me. For the sake of copyright, just consider everything Janelle’s : ) AI weirdness You look like a thing and…

How Booking.com deals with Selection Bias

I came across this PyData 2018 talk by Lucas Bernardi of Booking.com where he talks about the importance of selection bias for practical applications of machine learning. We can’t just throw data into machines and expect to see any meaning […], we need to think [about this]. I see a strong trend in the practitioners…

Finland's free online AI crash course

Finland developed a crash course on AI to educate its citizens. The course was arguably a great local success, with over 50 thousand Finns taking the course (1% of the population). Now, as a gift to the European Union, Finland has opened up the course for the rest of Europe and the world to enjoy…

Anomaly Detection Resources

Carnegie Mellon PhD student Yue Zhao maintains this great GitHub repository of anomaly detection resources: https://github.com/yzhao062/anomaly-detection-resources The repository consists of tools for multiple languages (R, Python, Matlab, Java) and resources in the form of: Books & Academic Papers, Online Courses and Videos, Outlier Datasets, Algorithms and Applications, Open-source and Commercial Libraries/Toolkits, Key Conferences & Journals…

Calibrating algorithmic predictions with logistic regression

I found this interesting blog by Guilherme Duarte Marmerola where he shows how the predictions of algorithmic models (such as gradient boosted machines, or random forests) can be calibrated by stacking a logistic regression model on top of them: by using the predicted leaves of the algorithmic model as features / inputs in a subsequent…
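The idea in this excerpt can be illustrated with a minimal sketch. It is not the code from the original blog: the synthetic data, the three-way split, the model settings, and the helper name `leaf_indices` are all assumptions made for illustration, using scikit-learn. A gradient boosted model is trained first, each sample is then described by the leaf it lands in for every tree, and a logistic regression is fitted on the one-hot encoded leaves of a held-out calibration set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic binary classification data (assumption: stands in for real data).
X, y = make_classification(n_samples=3000, n_features=10, random_state=0)

# Three splits: one to train the tree model, one to fit the calibrator,
# and one to check the calibrated predictions.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.6, random_state=0
)
X_calib, X_test, y_calib, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

# Step 1: fit the "algorithmic" model.
gbm = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
gbm.fit(X_train, y_train)


def leaf_indices(model, features):
    """Return, per sample, the index of the leaf reached in each tree."""
    # apply() has shape (n_samples, n_estimators, 1) for binary problems,
    # so it is flattened to (n_samples, n_estimators).
    return model.apply(features).reshape(features.shape[0], -1)


# Step 2: one-hot encode the leaves so each leaf becomes a binary feature.
encoder = OneHotEncoder(handle_unknown="ignore")
leaves_calib = encoder.fit_transform(leaf_indices(gbm, X_calib))

# Step 3: stack a logistic regression on top of the leaf features.
calibrator = LogisticRegression(max_iter=1000)
calibrator.fit(leaves_calib, y_calib)

# Calibrated probabilities for unseen data.
leaves_test = encoder.transform(leaf_indices(gbm, X_test))
calibrated = calibrator.predict_proba(leaves_test)[:, 1]
raw = gbm.predict_proba(X_test)[:, 1]
```

Comparing `raw` and `calibrated` with a reliability plot or a Brier score would show whether the stacked model actually improves calibration on a given dataset; that is not guaranteed in general.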

Neural Synesthesia: GAN AI dreaming of music

Xander Steenbrugge shared his latest work on LinkedIn yesterday, and I was completely stunned! Xander had been working on what he called a “fun side-project”, but which was, in my eyes, absolutely awesome. He had used two generative adversarial networks (GANs) to teach one another how to respond visually to changing audio cues. This resulted…

Podcasts for Data Science Start-Ups

Christopher of Neurotroph.de compiled this short list of data science podcasts worth listening to. See Chris’ original article for more details on the podcasts, but the links below take you to them directly: Data Skeptic, DataFramed, Not So Standard Deviations, Linear Digressions, Rework

Overviews of Graph Classification and Network Clustering methods

Thanks to Sebastian Raschka I am able to share this great GitHub overview page of relevant graph classification techniques, and the scientific papers behind them. The overview divides the algorithms into four groups: Factorization, Spectral and Statistical Fingerprints, Deep Learning, and Graph Kernels. Moreover, the overview contains links to similar collections on community detection, classification/regression trees and gradient boosting papers…