This video I’ve been meaning to watch for a while now. It another great visual explanation of a statistics topic by the 3Blue1Brown Youtube channel (which I’ve covered before, multiple times).
This time, it’s all about Bayes theorem, and I just love how Grant Sanderson explains the concept so visually. He argues that rather then memorizing the theorem, we’d rather learn how to draw out the context. Have a look at the video, or read my summary below:
Grant Sanderson explains the concept very visually following an example outlined in Daniel Kahneman’s and Amos Tversky’s book Thinking Fast, Thinking Slow:
Steve is very shy and withdrawn, invariably helpful but with very little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.”
Is Steve more likely to be a librarian or a farmer?
Kahneman and Tversky argue that people take into account Steve’s disposition and therefore lean towards librarians.
However, few people take into account that librarians are quite scarce in our society, which is rich with farmers. For every librarian, there are 20+ farmers. Hence, despite the disposition, Steve is probably more like to be a farmer.
Rather than remembering the upper theorem, Grant argues that it’s often easier to just draw out the rectangle of probabilities below.
Try it out for yourself using another example by Kahneman and Tversky:
First, Raymond discusses chunking and aliasing. He brings up the theory that the human mind can only handle/remember 7 pieces of information at a time, give or take 2. Anything above proves to much cognitive load, and causes discomfort as well as errors. Hence, in a programming context, we need to make sure programmers can use all 7 to improve the code, rather than having to decypher what’s in front of them. In a programming context, we do so by modularizing and standardizing through functions, modules, and packages. Raymond uses the Python random module to hightlight the importance of chunking and modular code. This part was quite long, but still interesting.
For the next two strategies, Raymond quotes the Feinmann method of solving problems: “(1) write down a clear problem specification; (2) think very, very hard; (3) write down a solution”. Using the example of a tree walker, Raymond shows how the strategies of incremental development and solving simpler programs can help you build programs that solve complex problems. This part only lasts a couple of minutes but really underlines the immense value of these strategies.
Next, Raymond touches on the DRY principle: Don’t Repeat Yourself. But in a context I haven’t seen it in yet, object oriented programming [OOP], classes, and inherintance.
Raymond continues to build his arsenal of programming strategies in the next 10 minutes, where he argues that programmers should repeat tasks manually until patterns emerge, before they starting moving code into functions. Even though I might not fully agree with him here, he does have some fun examples of file conversion that speak in his case.
Lastly, Raymond uses the graph below to make the case that OOP is a graph traversal problem. According to Raymond, the Python ecosystem is so rich that there’s often no need to make new classes. You can simply look at the graph below. Look for the island you are currently on, check which island you need to get to, and just use the methods that are available, or write some new ones.
While there were several more strategies that Raymond wanted to discuss, he doesn’t make it to the end of his list of strategies as he spend to much time on the first, chunking bit. Super curious as to the rest? Contact Raymond on Twitter.
Xander Steenbrugge shared his latest work on LinkedIn yesterday, and I was completely stunned!
Xander had been working on, what he called, a “fun side-project”, but which was in my eyes, absolutely awesome. He had used two generative adversarial networks (GANs) to teach one another how to respond visually to changing audio cues.
This resulted in the generation of stunning audio-visual fanatasy worlds that are complete brain porn. You just can’t stop staring. So much is happening in these video’s; everything looks familiar, whereas nothing really represent anything realistic. There’s always a sliver of reality before the visual shapes morph to their next form.
This is my favorite video, but there are more below.
Amazing how the image responds to changes in the music, right? I suspect Xander let’s the algorithm traverse some latent space with spaces that are determined by the bass, trebble, and other audio-cues.
Here’s another one of Xander’s videos, with the same audio track as background:
But Xander didn’t limit his GANs to generating landscapes and still paintings, but he also dared to do some human faces. These also turned out amazing.
Both the left and right face seem to start out in about the same position/seed in the latent space, but traverse in different, though still similar directions, morphing into all kinds of reaslistic and more alien forms. The result is simply out of this world!
Brandon Rohrer — (former) data scientist at Microsoft, iRobot, and Facebook — asked his network on Twitter and LinkedIn to share their favorite resources on A/B testing. It produced a nice list, which I summarized below.
The order is somewhat arbitrary, and somewhat based on my personal appreciation of the resources.
Josh Starmer is assistant professor at the genetics department of the University of North Carolina at Chapel Hill.
But more importantly: Josh is the mastermind behind StatQuest!
StatQuest is a Youtube channel (and website) dedicated to explaining complex statistical concepts — like data distributions, probability, or novel machine learning algorithms — in simple terms.
Once you watch one of Josh’s “Stat-Quests”, you immediately recognize the effort he put into this project. Using great visuals, a just-about-right pace, and relateable examples, Josh makes statistics accessible to everyone. For instance, take this series on logistic regression:
And do you really know what happens under the hood when you run a principal component analysis? After this video you will:
Or are you more interested in learning the fundamental concepts behind machine learning, then Josh has some videos for you, for instance on bias and variance or gradient descent:
With nearly 200 videos and counting, StatQuest is truly an amazing resource for students ‘and teachers on topics related to statistics and data analytics. For some of the concepts, Josh even posted videos running you through the analysis steps and results interpretation in the R language.
StatQuest started out as an attempt to explain statistics to my co-workers – who are all genetics researchers at UNC-Chapel Hill. They did these amazing experiments, but they didn’t always know what to do with the data they generated. That was my job. But I wanted them to understand that what I do isn’t magic – it’s actually quite simple. It only seems hard because it’s all wrapped up in confusing terminology and typically communicated using equations. I found that if I stripped away the terminology and communicated the concepts using pictures, it became easy to understand.
Over time I made more and more StatQuests and now it’s my passion on YouTube.
In the video below, one of my favorite YouTube channels (Two Minute Papers) discusses a new super resolution project where academic scholars taught a neural network to improve low quality photo’s. The researchers took the same picture with multiple camera’s of varying quality and allowed a neural network to learn how the lowest quality pictures can be adjusted to more closely resemble their high quality counterparts. A very interesting approach and the results are just mind-boggling:
Super-resolution imaging is a class of techniques that enhance the resolution of an imaging system (Wikipedia). The entertainment series CSI has been ridiculed for relying on exaggerated and unrealistic applications of it:
As a result, there are now several applications where machines have learned to literallyfill in the blanks in imagery. Most notable seems the method developed by Google: Rapid and Accurate Image Super Resolution, or RAISR is short. In contrast to other approaches, RAISR does not rely on (adversarial) neural network(s) and is thus not as resource-demanding to train. Moreover, it’s performance is quite remarkable:
I guess you’re eager to test this super resolution out yourself?! letsenhance.io let’s you enhance the resolution of five images for free, after which it charges you $5 per twenty pictures processed. The website feeds the input image to a neural net and puts out an image of which the resolution has been increased four fold! I tested it with this random blurry picture I retrieved from Google/Pinterest.
Do you see how much more detailed (though still blurry) the second image is? Nevertheless, upscaling four times seems about the limit as that is the default factor for both RAISR and Let’s Enhance. I am very curious to see how this super resolution is going to develop in the future, how it will be used to decrease memory or network demands, whether it will be integrated with video platforms like YouTube or Netflix, and which algorithm will ultimately take the crown!