Tag: production

Repository of Production Machine Learning

The Institute for Ethical Machine Learning compiled this amazing curated list of open source libraries that will help you deploy, monitor, version, scale, and secure your production machine learning.

  • 🔍 Explaining predictions & models
  • 🔏 Privacy preserving ML
  • 📜 Model & data versioning
  • 🏁 Model Training Orchestration
  • 💪 Model Serving and Monitoring
  • 🤖 Neural Architecture Search
  • 📓 Reproducible Notebooks
  • 📊 Visualisation frameworks
  • 🔠 Industry-strength NLP
  • 🧵 Data pipelines & ETL
  • 🏷️ Data Labelling
  • 🗞️ Data storage
  • 📡 Functions as a service
  • 🗺️ Computation distribution
  • 📥 Model serialisation
  • 🧮 Optimized calculation frameworks
  • 💸 Data Stream Processing
  • 🔴 Outlier and Anomaly Detection
  • 🌀 Feature engineering
  • 🎁 Feature Stores
  • ⚔ Adversarial Robustness
  • 💰 Commercial Platforms

Direct links to the sections of the GitHub repo

The Institute for Ethical Machine Learning is a think-tank that brings together technology leaders, policymakers & academics to develop standards for ML.

ML Model Degradation, and why work only just starts when you reach production

The assumption that a Machine Learning (ML) project is done when a trained model is put into production is quite faulty. Nevertheless, according to Alexandre Gonfalonieri, artificial intelligence (AI) strategist at Philips, this assumption is among the most common mistakes of companies taking their AI products to market.

Actually, in the real world, we see pretty much the opposite of this assumption. People like Alexandre therefore strongly recommend that companies keep their best data scientists and engineers on an ML project, especially after it reaches production!


If you’ve ever productionized a model and really started using it, you know that, over time, your model will start performing worse.

In order to maintain the original accuracy of an ML model that is interacting with real-world customers or processes, you will need to continuously monitor and/or tweak it!

In the best case, algorithms are retrained with each new data delivery. This creates a maintenance burden that is not fully automatable. According to Alexandre, tending to machine learning models demands the close scrutiny, critical thinking, and manual effort that only highly trained data scientists can provide.

This means that there’s a higher marginal cost to operating ML products compared to traditional software, even though the whole reason we implement these products is often to decrease (the) costs (of human labor)!

What causes this?

Your model’s accuracy will often be at its best just as it leaves the training grounds.

Building a model on relevant and available data and coming up with accurate predictions is a great start. However, for how long do you expect those data, which age by the day, to continue to provide accurate predictions?

Chances are that each day, the model’s latent performance will go down.

This phenomenon is called concept drift, and is heavily studied in academia but less often considered in business settings. Concept drift means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways.

In simpler terms, your model is no longer modelling the outcome that it used to model. This causes problems because the predictions become less accurate as time passes.

Particularly, models of human behavior seem to suffer from this pitfall.

The key is that, unlike a simple calculator, your ML model interacts with the real world. And the data it generates and that reaches it is going to change over time. A key part of any ML project should be predicting how your data is going to change over time.

Read more about concept drift here.


How do we know when our models fail?

You need to create a monitoring strategy before reaching production!

According to Alexandre, as soon as you feel confident with your project after the proof-of-concept stage, you should start planning a strategy for keeping your models up to date.

How often will you check in?

On the whole model, or just some features?

What features?

In general, sensible model surveillance combined with a well thought out schedule of model checks is crucial to keeping a production model accurate. Prioritizing checks on the key variables and setting up warnings for when a change has taken place will ensure that you are never caught by surprise by a change to the environment that robs your model of its efficacy.
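To make the idea of "setting up warnings" for a key variable a bit more concrete, here is a minimal sketch in Python. It is only one possible check, not the approach from the original blog: it compares the mean of a recent batch of a feature against a reference sample from training time, measured in standard errors. The function name `drift_alert`, the threshold of 3, and the example ages are all my own assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def drift_alert(reference, recent, threshold=3.0):
    """Return True when the mean of `recent` lies more than `threshold`
    standard errors away from the mean of `reference`.

    A deliberately simple check on one feature; the threshold is an
    assumption and should be tuned per feature and business context.
    """
    ref_mean = mean(reference)
    standard_error = stdev(reference) / sqrt(len(recent))
    if standard_error == 0:
        # Reference values are all identical: any change in mean is a shift.
        return mean(recent) != ref_mean
    z_score = abs(mean(recent) - ref_mean) / standard_error
    return z_score > threshold


# Hypothetical example: customer age as seen during training vs. two later weeks.
reference_ages = [31, 35, 29, 40, 33, 37, 30, 36, 34, 32]
stable_week = [33, 36, 30, 35, 34]
shifted_week = [48, 52, 45, 50, 47]

print(drift_alert(reference_ages, stable_week))   # False
print(drift_alert(reference_ages, shifted_week))  # True
```

A check like this only looks at the mean of a single input. Changes in spread, in the relationship between features, or in the target itself would need their own checks.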

Alexandre via

Your strategy will strongly differ based on your model and your business context.

Moreover, there are many different types of concept drift that can affect your models, so thinking through the right strategy for your specific case should be a key element!

Different types of model drift (via)

Let’s solve it!

Once you observe degraded model performance, you will need to redesign your model (pipeline).

One solution is referred to as manual learning. Here, we provide the newly gathered data to our model and re-train and re-deploy it just like the first time we built the model. If you think this sounds time-consuming, you are right. Moreover, the tricky part is not refreshing and retraining a model, but rather thinking of new features that might deal with the concept drift.

A second solution could be to weight your data. Some algorithms allow for this very easily. For others you will need to build it in yourself. One recommended weighting schema is to use weights that are inversely proportional to the age of the data. This way, more attention will be paid to the most recent data (higher weight) and less attention to the oldest data (smaller weight) in your training set. In this sense, if there is drift, your model will pick it up and correct accordingly.
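As a rough illustration of age-based weighting, the sketch below fits a straight line with weighted least squares in plain Python. The weight formula `1 / (1 + age)`, the function names, and the toy data (an old relationship y = x and a recent one y = 2x) are assumptions of mine, chosen only to show the effect. In practice many libraries accept per-row weights directly, for instance the `sample_weight` argument that many scikit-learn estimators take in `fit`.

```python
def age_weights(ages_in_days):
    """Weight each row by 1 / (1 + age): today's rows get 1.0, older rows less."""
    return [1.0 / (1.0 + age) for age in ages_in_days]


def weighted_line_fit(xs, ys, weights):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total
    cov = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = cov / var
    return y_bar - slope * x_bar, slope


# Toy data: a year ago y = x held, in the last few days y = 2x holds.
xs = [1, 2, 3, 4, 1, 2, 3, 4]
ys = [1, 2, 3, 4, 2, 4, 6, 8]
ages = [365, 364, 363, 362, 3, 2, 1, 0]

_, unweighted_slope = weighted_line_fit(xs, ys, [1.0] * len(xs))
_, weighted_slope = weighted_line_fit(xs, ys, age_weights(ages))

print(unweighted_slope)  # halfway between the old and the new relationship
print(weighted_slope)    # close to the recent slope of 2
```

How quickly the weights should decay is a modelling choice. If they decay too fast, the model forgets useful history. If they decay too slowly, it keeps averaging over a world that no longer exists.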

According to Alexandre and many others, the third and best solution is to build your productionized system in such a way that you continuously evaluate and retrain your models. The benefit of such a continuous learning system is that it can be automated to a large extent, thus reducing the (human labor) maintenance costs.
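A continuous evaluation loop could look roughly like the sketch below. It is my own minimal take and not taken from the original blog: a small monitor keeps a rolling window of whether recent predictions turned out correct and flags when accuracy falls below a floor. The class name `RetrainMonitor`, the window size, and the accuracy floor are hypothetical, and the actual retraining step is left out.

```python
from collections import deque


class RetrainMonitor:
    """Track rolling accuracy on labelled feedback and flag when to retrain."""

    def __init__(self, window=100, min_accuracy=0.8):
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        """Store whether a prediction matched the outcome observed later."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Rolling accuracy, or None while no feedback has arrived yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below the floor."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = RetrainMonitor(window=5, min_accuracy=0.8)
for _ in range(5):
    monitor.record(prediction=1, actual=1)   # model is doing fine
print(monitor.needs_retraining())            # False

for _ in range(3):
    monitor.record(prediction=1, actual=0)   # the world has changed
print(monitor.accuracy())                    # 0.4
print(monitor.needs_retraining())            # True
```

In a real system the flag would kick off a retraining job and a comparison of the new model against the old one before deployment, and the true outcomes often arrive with a delay.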

Although Alexandre doesn’t expand on how to do these, he does formulate the three steps below:

Via the original blog

In my personal experience, if you have your model retrained (automatically) every now and then, using a smart weighting schema, and keep monitoring the changes in the parameters and in the predictions for several “unit-test” cases, you will go a long way.

If you’re feeling more adventurous, you could improve on matters by having your model perform some exploration (at random or rule-wise) of potential new relationships in your data (see for instance multi-armed bandits). This will definitely take you a long way!
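For readers who have not met multi-armed bandits before, here is a small epsilon-greedy sketch in Python. It is a generic textbook strategy and not something from Alexandre's blog: most of the time it picks the option with the best observed reward, and a fraction `epsilon` of the time it explores a random one. The "arms" could stand for candidate models or decision rules. The success rates, the seed, and the function name are made up for the simulation.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit and return how often each arm was pulled.

    `true_rates` are hidden success probabilities, used only to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    rewards = [0.0] * n_arms

    for _ in range(rounds):
        if rng.random() < epsilon or 0 in pulls:
            # Explore: try a random arm (also used until every arm has been tried).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best average reward so far.
            arm = max(range(n_arms), key=lambda i: rewards[i] / pulls[i])
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward

    return pulls


pulls = epsilon_greedy([0.2, 0.5, 0.8])
print(pulls)  # the arm with the highest true rate should collect most pulls
```

Because a share of the traffic keeps going to the other arms, the system still notices when a previously weaker option starts to outperform the current favourite, which is exactly what you want under drift.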

Solving concept drift (via)

An Introduction to Docker for R Users, by Colin Fay

In this awesome 8-minute read, R prodigy Colin Fay explains in layman’s terms what Docker images, Docker containers, and Volumes are; what Rocker is; and how to set up a Docker container with an R image and run code on it:

On your machine, you’re going to need two things: images, and containers. Images are the definition of the OS, while the containers are the actual running instances of the images. […] To compare with R, this is the same principle as installing vs loading a package: a package is to be downloaded once, while it has to be launched every time you need it. And a package can be launched in several R sessions at the same time easily.

Colin Fay, via https://colinfay.me/docker-r-reproducibility

In his blog, Colin also refers to some great additional resources on Rocker/Docker…

… as well as a reading list for those interested in learning more about Docker:

Putting R in Production, by Heather Nolis & Mark Sellors

It is often said that R is hard to put into production. Fortunately, there are numerous talks demonstrating the contrary.

Here’s one by Heather Nolis, who productionizes R models at T-Mobile. Her team even shares an open-source version of some of their productionized TensorFlow models on GitHub. Read more about that model here.

There’s another great talk on the RStudio website. In this talk, Mark Sellors discusses some of the misinformation around the idea of what “putting something into production” actually means, and provides some tips on overcoming obstacles.

Cover image via Fotolia.

PyData, London 2018

PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

In April 2018, a PyData conference was held in London, with three days of super interesting sessions and hackathons. While I couldn’t attend in person, I very much enjoy reviewing the sessions at home, as all are shared open access on the YouTube channel PyDataTV!

In the following section, I will outline some of my favorites as I progress through the channel:

Winning with simple, even linear, models:

One talk that really resonated with me is Vincent Warmerdam’s talk on “Winning with Simple, even Linear, Models”. Working at GoDataDriven, a data science consultancy firm in the Netherlands, Vincent is quite familiar with deploying deep learning models, but is also mildly annoyed by all the hype surrounding deep learning and neural networks, particularly when less complex models perform equally well or only slightly worse. One of his quotes nicely sums it up:

“Tensorflow is a cool tool, but it’s even cooler when you don’t need it!”

— Vincent Warmerdam, PyData 2018

In only 40 minutes, Vincent goes to show the finesse of much simpler (linear) models in all different kinds of production settings. Among others, Vincent shows:

  • how to solve the XOR problem with linear models
  • how to win at timeseries with radial basis features
  • how to use weighted regression to deal with historical overfitting
  • how deep learning models introduce a new theme of horror in production
  • how to create streaming models using passive aggressive updating
  • how to build a real-time video game ranking system using mere histograms
  • how to create a well performing recommender with two SQL tables
  • how to rock at data science and machine learning using Python, R, and even Stan
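To give a flavour of the first bullet, here is one common way to get a linear model to solve XOR: add the interaction term x1 * x2 as an extra feature. I am not claiming this is exactly how Vincent presents it in the talk. It is a minimal sketch with a plain perceptron, and all names in it are my own.

```python
def features(x1, x2):
    """Bias term, both inputs, and their interaction x1 * x2."""
    return [1.0, x1, x2, x1 * x2]


def predict(weights, x1, x2):
    """Linear score on the expanded features, thresholded at zero."""
    score = sum(w * f for w, f in zip(weights, features(x1, x2)))
    return 1 if score > 0 else 0


def train_perceptron(data, epochs=1000):
    """Classic perceptron updates; stops after the first epoch without mistakes."""
    weights = [0.0] * 4
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), target in data:
            error = target - predict(weights, x1, x2)
            if error:
                mistakes += 1
                weights = [w + error * f
                           for w, f in zip(weights, features(x1, x2))]
        if mistakes == 0:
            break
    return weights


xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
weights = train_perceptron(xor_data)
predictions = [predict(weights, x1, x2) for (x1, x2), _ in xor_data]
print(predictions)  # [0, 1, 1, 0]
```

The model is still linear in its weights. The only "trick" is the feature engineering, which makes the four XOR points linearly separable in the expanded feature space.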