Last May, Tony Beltramelli of UIzard Technologies presented his latest algorithm, pix2code, at the NIPS conference. Put simply, the algorithm looks at a picture of a graphical user interface (i.e., the layout of an app) and determines, via an iterative process, what the underlying code likely looks like.

Graphical user interface examples (Google Images)

Please watch UIzard’s pix2code demo video or the third-party summary at the bottom of this blog. My understanding is that pix2code is based on convolutional and recurrent neural networks (long explanation video), with the recurrent part using long short-term memory (short explanation video). Based on a single input image, pix2code can generate code with 77% accuracy, and it works for three of the larger platforms (i.e., iOS, Android, and web-based technologies).
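To make the "iterative process" a bit more concrete, here is a minimal Python sketch of how token-by-token code generation from an image can work in general. It is an illustration under assumptions, not pix2code's actual implementation: the function names, the `<START>`/`<END>` markers, and the token names are all hypothetical, and the trained neural network is replaced by a scripted stand-in so the example runs on its own.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical markers for the start and end of the generated code.
START, END = "<START>", "<END>"

# A predictor maps (image features, tokens so far) to the next token.
Predictor = Callable[[Optional[Sequence[float]], List[str]], str]


def generate_tokens(image_features: Optional[Sequence[float]],
                    predict_next: Predictor,
                    max_len: int = 50) -> List[str]:
    """Build the output one token at a time, feeding every prediction
    back in as context for the next one (greedy decoding)."""
    tokens = [START]
    while len(tokens) < max_len:
        next_token = predict_next(image_features, tokens)
        if next_token == END:
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the START marker


# Stand-in for a trained model (assumption): it ignores the image and
# replays a fixed script. A real system would put the neural network here.
SCRIPT = ["row", "{", "button", "}", END]


def toy_predict_next(image_features: Optional[Sequence[float]],
                     tokens: List[str]) -> str:
    produced = len(tokens) - 1  # tokens generated so far, excluding START
    return SCRIPT[produced]


if __name__ == "__main__":
    print(" ".join(generate_tokens(None, toy_predict_next)))
```

In a real system, the stand-in predictor would be the trained network, which looks at both the screenshot and the tokens produced so far; the loop around it stays essentially the same.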

The input and output of pix2code

Obviously, this is groundbreaking technology. Once further developed, pix2code could not only increase the speed at which society is automated and robotized, but also extend automation to more complex and much-needed tasks, such as programming and web/app development.

Here you can read the full academic paper on pix2code.

Below is the official demo, reviewed by another data enthusiast who adds commentary and some additional food for thought.

Here are some of my other blog posts on neural networks and robotization: