Category: machine learning

Neural Networks 101

Last month, a video by 3Blue1Brown has been trending on YouTube, accumulating already over a quarter of a million views. It only lasts 10 minutes but provides a very good and intuitive explanation of the inner workings of Neural Networks (NN):

The Machine Learning & Deep Learning book I wrote about recently provides a more substantial explanation of the different NNs and their inner workings. Neural nets come in various different flavors and my list of Data Science, Machine Learning, & Statistics Resources includes useful cheatsheets and other information, such as the architecture map below.

If you still haven’t had enough, Daniel Shiffman demonstrates how to code Neural Networks in Processing (Java), and the video displays precisely what happens behind the scenes. Finally, MIT has made their AI course material open-source, and it includes two 45 minute lectures on NNs. The lecturing professor – Patrick Winston – isn’t much of a fan of these “bulldozer” algorithms. He has a stronger preference for “more sophisticated” mathematical learning through, for instance, Support Vector Machines.

Video: Human-Computer Interactions in Reinforcement Learning

Reinforcement learning is an area of machine learning inspired by behavioral psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward (Wikipedia, 2017). Normally, reinforcement learning occurs autonomously. Here, algorithms will seek to minimize/maximize a score that is estimated via predefined constraints. As such, algorithms can thus learn to perform the most effective actions (those that minimize/maximize the score) by repeatedly experimenting and assessing strategies.

The approach in the video below is radically different. Instead of a pre-defined scoring, human-computer interaction is used to assign each action sequence (each iteration/experiment) a score. This approach is particularly useful for complex behaviors, such as a back-flip, for which it is hard to pre-define the constraints and actions that lead to the “most effective” back-flip. However, for us humans, it is relatively easy to recognize a good back-flip when we see one. The video below shows how the researchers therefore integrated a human-computer interaction in their reinforcement learning algorithm. After observing the algorithm perform a sequence of actions, a human actor indicates to what extent the goal (i.e., a backflip) is achieved or not. This human assessment thus functions as the score which the algorithm will try to minimize/maximize.

This approach can be really valuable for organizations seeking to improve their machine learning application. The paper on the principle (Deep Reinforcement Learning from Human Preferences) can be found here. The scholars conclude that this supervised approach based on human preferences has very good training results whereas the cost are similar the simple bulldozer approach of training a neural net from scratch using GPU servers.

Machine Learning & Deep Learning book

The Deep Learning textbook helps students and practitioners enter the field of machine learning in general and deep learning in particular. Its online version is available online for free whereas a hardcover copy can be ordered here on Amazon. You can click on the topics below to be redirected to the book chapter:

Part I: Applied Math and Machine Learning Basics

Part II: Modern Practical Deep Networks

Part III: Deep Learning Research

Generating 3D Faces from 2D Photographs

Aaron Jackson, Adrian Bulat, Vasileios Argyriou and Georgios Tzimiropoulos
of the Computer Vision Laboratory of the University of Nottingham built a neural network that generates a full 3D image of a single portrait photograph. They turn a photograph like this…

… into an accurately creepy 3D image like this.

You can try it with your own or other photographs here. I found that images with white background get the best results. On their project website you can read more about the underlying convolutional neural network.

Update 21-10-2017: One of my favorite YouTube channels explains how the models were trained and the data used:

IBM’s Watson for Oncology: A Biased and Unproven Recommendation System in Cancer Treatment?

The below reiterates and summarizes this Stat article.

Recently, I addressed how bias may slip into Machine Learning applications and this weekend I came across another real-life example: IBM’s Watson, specifically Watson for Oncology. With a single machine, IBM intended to tackle humanity’s most vexing diseases and revolutionize medicine and they quickly zeroed in on a high-profile target: cancer.

However, three years later now, a STAT investigation has found that the supercomputer isn’t living up to the lofty expectations IBM created for it. IBM claims that, through Artificial Intelligence, Watson for Oncology can generate new insights and identify “new approaches” to cancer care. However, the STAT investigation (video below) concludes that the system doesn’t create new knowledge and is artificially intelligent only in the most rudimentary sense of the term. Similarly, cancer specialists using the product argue Watson is still in its “toddler stage” when it comes to oncology.

Let’s start with the positive side. For specific treatments, Watson can scan academic literature, immediately providing the “best data” about a treatment — survival rates, for example — thereby relieving doctors of tedious literature searches. Due to this transparency, Watson may level the hierarchy commonly found in hospital settings, by holding (senior) doctors accountable to the data and empowering junior physicians to back up their arguments. Furthermore, Watson’s information may empower patients as they can be offered a comprehensive packet of treatment options, including potential treatment plans along with relevant scientific articles. Patients can do their own research about these treatments, and maybe even disagree with the doctor about the right course of action.

Although study results demonstrate that Watson saves doctors time and can have a high concordance rate with their treatment recommendations, much more research is needed. The studies were all conference abstracts, which haven’t been published in peer-reviewed journals — and all but one was either conducted by a paying customer or included IBM staff on the author list, or both. More importantly, IBM has failed to exposed Watson for Oncology to critical review by outside scientists nor have they conducted clinical trials to assess its effectiveness. It would be very interesting to examine whether Watson’s implementation is actually saving lives or making healthcare more efficient/effective.

imaging-video[1].jpg — IBM Watson Health

Such validation is especially necessary because several issues are identified. First, the actual capabilities of Watson for Oncology are not well-understood by the public, and even by some of the hospitals that use it. It’s taken nearly six years of painstaking work by data engineers and doctors to train Watson in just seven types of cancer, and keep the system updated with the latest knowledge. Moreover, because of the complexity of the underlying machine learning algorithms, the recommendations Watson puts out are a black box, and Watson can not provide the specific reasons for picking treatment A over treatment B.

Second, the system is essentially Memorial Sloan Kettering in a portable box. IBM celebrates Memorial Sloan Kettering’s role as the only trainer of Watson. After all, who better to educate the system than doctors at one of the world’s most renowned cancer hospitals? However, doctors claim that Memorial Sloan Kettering’s training has caused bias in the system, because the treatment recommendations it puts into Watson don’t always comport with the practices of doctors elsewhere in the world. When users ask Watson for advice, the system also searches published literature — some of which is curated by Memorial Sloan Kettering — to provide relevant studies and background information to support its recommendation. But the recommendation itself is derived from the training provided by the hospital’s doctors, not the outside literature.

Doctors at Memorial Sloan Kettering acknowledged their influence on Watson. “We are not at all hesitant about inserting our bias, because I think our bias is based on the next best thing to prospective randomized trials, which is having a vast amount of experience,” said Dr. Andrew Seidman, one of the hospital’s lead trainers of Watson. “So it’s a very unapologetic bias.”

However, this bias causes serious problems when Watson for Oncology is implemented in other countries/hospitals. The generally affluent population treated at Memorial Sloan Kettering doesn’t reflect the diversity of people around the world. According to Martijn van Oijen, an epidemiologist and associate professor at Academic Medical Center in the Netherlands, Watson has not been implemented in because of country level differences in treatment approaches. Similarly, oncologists at one hospital in Denmark said they have dropped implementation altogether after finding that local doctors agreed with Watson in only about 33 percent of cases. Different problems occurred in South Korea, where researchers reported that the treatment Watson most often recommended for breast cancer patients simply wasn’t covered by their national insurance system.

Kris, the lead trainer at Memorial Sloan Kettering, says nobody wants to hear the problems. “All they want to hear is that Watson is the answer. And it always has the right answer, and you get it right away, and it will be cheaper. But like anything else, it’s kind of human.”

AI at MIT (2010/2015): Part 1 – Introduction

Massachusetts Institute of Technology (MIT) hosts their entire 2010 course on artificial intelligence / machine learning by Professor Patrick Winston on YouTube. Although some parts seem already kind of dated seven years later, the videos on several evolving topics (e.g., Neural Networks) have been updated in the fall of 2015. The tutorial assignments you can find at the course website. Requirements for the course include experience with Python programming and an understanding of search algorithms (depth-first, breadth-first, uniform-cost, A*), basic probability, state estimation, the chain rule, partial derivatives, and dot products.

Below is the first, introductory lecture, which provides a short introduction to the history and concept of artificial intelligence:
AI is about algorithms enabled by constraints exposed by representations that support models targeted at loops that tie together thinking, perception and action.