Tag: computerscience

A free, self-taught education in Computer Science!

The Open Source Society University offers a complete education in computer science using online materials.

They offer a proper introduction to the fundamental concepts for all computing disciplines. Evyerthing form algorithms, logic, and machine learning, up to databases, full stack web development, and graphics is covered. Moreover, you will acquire skills in a variety of languages, including Python, Java, C, C++, Scala, JavaScript, and many more.

According to their GitHub page, the curriculum is suited for people with the discipline, will, and good habits to obtain this education largely on their own, but who’d still like support from a worldwide community of fellow learners.

Curriculum

Intro CS: for students to try out CS and see if it’s right for them
Core CS: corresponds roughly to the first three years of a computer science curriculum, taking classes that all majors would be required to take
Advanced CS: corresponds roughly to the final year of a computer science curriculum, taking electives according to the student’s interests
Final Project: a project for students to validate, consolidate, and display their knowledge, to be evaluated by their peers worldwide
Pro CS: graduate-level specializations students can elect to take after completing the above curriculum if they want to maximize their chances of getting a good job

It is possible to finish Core CS within about 2 years if you plan carefully and devote roughly 18-22 hours/week to your studies. Courses in Core CS should be taken linearly if possible, but since a perfectly linear progression is rarely possible, each class’s prerequisites are specified so that you can design a logical but non-linear progression based on the class schedules and your own life plans.

Links to the contents

Links to the curriculum (v8.0.0)

Turning the Traveling Salesman problem into Art

Robert Bosch is a professor of Natural Science at the department of Mathematics of Oberlin College and has found a creative way to elevate the travelling salesman problem to an art form.

For those who aren’t familiar with the travelling salesman problem (wiki), it is a classic algorithmic problem in the field of computer science and operations research. Basically, we want are looking for a mathematical solution that is cheapest, shortest, or fastest for a given problem. Most commonly, it is seen as a graph (network) describing the locations of a set of nodes (elements in that network). Wikipedia has a description I can’t improve on:

The Travelling Salesman Problem describes a salesman who must travel between N cities. The order in which he does so is something he does not care about, as long as he visits each once during his trip, and finishes where he was at first. Each city is connected to other close by cities, or nodes, by airplanes, or by road or railway. Each of those links between the cities has one or more weights (or the cost) attached. The cost describes how “difficult” it is to traverse this edge on the graph, and may be given, for example, by the cost of an airplane ticket or train ticket, or perhaps by the length of the edge, or time required to complete the traversal. The salesman wants to keep both the travel costs, as well as the distance he travels as low as possible.
Wikipedia

Here’s a visual representation of the problem and some algorithmic approaches to solving it:

Now, Robert Bosch has applied the traveling salesman problem to well-know art pieces, trying to redraw them by connecting a series of points with one continuous line. Robert even turned it into a challenge so people can test out how well their travelling salesman algorithms perform on, for instance, the Mona Lisa, or Vincent van Gogh.

Just look at the detail on these awesome Dutch classics:

Read more about this awesome project here: http://www.math.uwaterloo.ca/tsp/data/art/

P.S. Why do Brits and Americans have this spelling feud?! As a non-native, I never know what to pick. Should I write modelling or modeling, travelling or traveling, tomato or tomato? I got taught the U.K. style, but the U.S. style pops up whenever I google stuff, so I am constantly confused! Now I subconciously intertwine both styles in a single text…

The wondrous state of Computer Vision, and what the algorithms actually “see”

The field of computer vision tries to replicate our human visual capabilities, allowing computers to perceive their environment in a same way as you and I do. The recent breakthroughs in this field are super exciting and I couldn’t but share them with you.

In the TED talk below by Joseph Redmon (PhD at the University of Washington) showcases the latest progressions in computer vision resulting, among others, from his open-source research on Darknet – neural network applications in C. Most impressive is the insane speed with which contemporary algorithms are able to classify objects. Joseph demonstrates this by detecting all kinds of random stuff practically in real-time on his phone! Moreover, you’ve got to love how well the system works: even the ties worn in the audience are classified correctly!

PS. please have a look at Joseph’s amazing My Little Pony-themed resumé.

The second talk, below, is more scientific and maybe even a bit dry at the start. Blaise Aguera y Arcas (engineer at Google) starts with a historic overview brain research but, fortunately, this serves a cause, as ~6 minutes in Blaise provides one of the best explanations I have yet heard of how a neural network processes images and learns to perceive and classify the underlying patterns. Blaise continues with a similarly great explanation of how this process can be reversed to generate weird, Asher-like images, one could consider creative art:

An example of a reversed neural network thus “*estimating*” an image of a bird [via Youtube]

Blaise’s colleagues at Google took this a step further and used t-SNE to visualize the continuous space of animal concepts as perceived by their neural network, here a zoomed in part on the Armadillo part of the map, apparently closely located to fish, salamanders, and monkeys?

A zoomed view of part of a t-SNE map of latent animal concepts generated by reversing a neural network [via Youtube]

We’ve seen these latent spaces/continua before. This example Andrej Karpathy shared immediately comes to mind:

0_o pic.twitter.com/M3yB6rzTiO

— Andrej Karpathy (@karpathy) 28 oktober 2017

Blaise’s presentaton you can find here:

If you want to learn more about this process of image synthesis through deep learning, I can recommend the scientific papers discussed by one of my favorite Youtube-channels, Two-Minute Papers. Karoly’s videos, such as the ones below, discuss many of the latest developments:

Let me know if you have any other video’s, papers, or materials you think are worthwhile!

Sorting Algorithms 101: Visualized

Sorting is one of the central topic in most Computer Science degrees. In general, sorting refers to the process of rearranging data according to a defined pattern with the end goal of transforming the original unsorted sequence into a sorted sequence. It lies at the heart of successful businesses ventures — such as Google and Amazon — but is also present in many applications we use daily — such as Excel or Facebook.

Many different algorithms have been developed to sort data. Wikipedia lists as many as 45 and there are probably many more. Some work by exchanging data points in a sequence, others insert and/or merge parts of the sequence. More importantly, some algorithms are quite effective in terms of the time they take to sort data — taking only $n$ time to sort $n$ datapoints — whereas others are very slow — taking as much as $n^2$ . Moreover, some algorithms are stable — in the sense that they always take the same amount of time to process $n$ datapoints — whereas others may fluctuate in terms of processing time based on the original order of the data.

I really enjoyed this video by TED-Ed on how to best sort your book collection. It provides a very intuitive introduction into sorting strategies (i.e., algorithms). Moreover, Algorithms to Live By (Christian & Griffiths, 2016) provided the amazing suggestion to get friends and pizza in whenever you need to sort something, next to the great explanation of various algorithms and their computational demand.

The main reason for this blog is that I stumbled across some nice video’s and GIFs of sorting algorithms in action. These visualizations are not only wonderfully intriguing to look at, but also help so much in understanding how the sorting algorithms process the data under the hood. You might want to start with the 4-minute YouTube video below, demonstrating how nine different sorting algorithms (Selection Sort, Shell Sort, Insertion Sort, Merge Sort, Quick Sort, Heap Sort, Bubble Sort, Comb Sort, & Cocktail Sort) process a variety of datasets.

This interactive website toptal.com allows you to play around with the most well-known sorting algorithms, putting them to work on different datasets. For the grande finale, I found these GIFs and short video’s of several sorting algorithms on imgur. In the visualizations below, each row of the image represents an independent list being sorted. You can see that Bubble Sort is quite slow:

Cocktail Shaker Sort already seems somewhat faster, but still takes quite a while.

For some algorithms, the visualization clearly shows that the settings you pick matter. For instance, Heap Sort is much quicker if you choose to shift down instead of up.

In contrast, for Merge Sort it doesn’t matter whether you sort by breadth first or depth first.

The imgur overview includes many more visualized sorting algorithms but I don’t want to overload WordPress or your computer, so I’ll leave you with two types of Radix Sort, the rest you can look up yourself!

Multi-Armed Bandits: The Smart Alternative for A/B Testing

Just as humans, computers learn by experience.The purpose of A/B testing is often to collect data to decide whether intervention A or B is better. As such, we provide one group with intervention A whereas another group receives intervention B. With the data of these two groups coming in, the computer can statistically estimate which intervention (A or B) is more effective. The more data the computer has, the more certain the estimate is. Here, a trade-off exists: we need to collect data on both interventions to be certain which is best. But we don’t want to conduct an inefficient intervention, say B, if we are quite sure already that intervention A is better.

In his post, Corné de Ruijt of Endouble writes about multi-armed bandit algorithms, which try to optimize this trade-off: “Multi-armed bandit algorithms try to overcome the high missed opportunity cost involved in learning, by exploiting and exploring at the same time. Therefore, these methods are in particular interesting when there is a high lost opportunity cost involved in the experiment, and when exploring and exploiting must be performed during a limited time interval.“

In the full article, you can read Corné’s comparison of this multi-armed bandit approach to the traditional A/B testing approach using a recruitment and selection example. For those of you who are interested in reading how anyone can apply this algorithm and others to optimize our own daily decisions, I highly recommend the book Algorithms to Live By: The Computer Science of Human Decisions available on Amazon or the Dutch bol.com.