Tag: synthesis

Making Pictures 3D using Context-aware Layered Depth Inpainting

Making Pictures 3D using Context-aware Layered Depth Inpainting

Several Chinese Ph.D. students wrote a PyTorch program that can turn your holiday pictures into 3D sceneries. They call it 3D photo inpainting. Here are some examples

And here’s the new method compares to previous techniques:

Here are several links to more detailed resources: [Paper] [Project Website] [Google Colab] [GitHub]

We propose a method for converting a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view. We use a Layered Depth Image with explicit pixel connectivity as underlying representation, and present a learning-based inpainting model that iteratively synthesizes new local color-and-depth content into the occluded region in a spatial context-aware manner. The resulting 3D photos can be efficiently rendered with motion parallax using standard graphics engines. We validate the effectiveness of our method on a wide range of challenging everyday scenes and show fewer artifacts when compared with the state-of-the-arts.

Via github.com/vt-vl-lab/3d-photo-inpainting
3D Photography Inpainting: Exploring Art with AI. - Towards Data ...
I loved this one as well, but it could be a different technique used, via Medium
Neural Synesthesia: GAN AI dreaming of music

Neural Synesthesia: GAN AI dreaming of music

Xander Steenbrugge shared his latest work on LinkedIn yesterday, and I was completely stunned!

Xander had been working on, what he called, a “fun side-project”, but which was in my eyes, absolutely awesome. He had used two generative adversarial networks (GANs) to teach one another how to respond visually to changing audio cues.

This resulted in the generation of stunning audio-visual fanatasy worlds that are complete brain porn. You just can’t stop staring. So much is happening in these video’s; everything looks familiar, whereas nothing really represent anything realistic. There’s always a sliver of reality before the visual shapes morph to their next form.

Have a look yourself at the video’s on Xander’s new Youtube channel “Neural Synesthesia dedicated to this project. The videos are also hosted here on Vimeo, where they are rendered in higher resolution even.

This is my favorite video, but there are more below.

Amazing how the image responds to changes in the music, right? I suspect Xander let’s the algorithm traverse some latent space with spaces that are determined by the bass, trebble, and other audio-cues.

The audio behind the above video is also just enticing. The track is called Raindrops, by Kupla X j’san.

Here’s another one of Xander’s videos, with the same audio track as background:

But Xander didn’t limit his GANs to generating landscapes and still paintings, but he also dared to do some human faces. These also turned out amazing.

Both the left and right face seem to start out in about the same position/seed in the latent space, but traverse in different, though still similar directions, morphing into all kinds of reaslistic and more alien forms. The result is simply out of this world!

The music behind this video is by Phantom Studies, by Dettmann | Klock.

Curious to see where this project and others head as we continue to see development in this GAN field. This must turn the world of design and art up side down in the coming decade…

A beautiful machine-generated still from the Neural Synthesia videos (link)
The wondrous state of Computer Vision, and what the algorithms actually “see”

The wondrous state of Computer Vision, and what the algorithms actually “see”

The field of computer vision tries to replicate our human visual capabilities, allowing computers to perceive their environment in a same way as you and I do. The recent breakthroughs in this field are super exciting and I couldn’t but share them with you.

In the TED talk below by Joseph Redmon (PhD at the University of Washington) showcases the latest progressions in computer vision resulting, among others, from his open-source research on Darknet – neural network applications in C. Most impressive is the insane speed with which contemporary algorithms are able to classify objects. Joseph demonstrates this by detecting all kinds of random stuff practically in real-time on his phone! Moreover, you’ve got to love how well the system works: even the ties worn in the audience are classified correctly!

PS. please have a look at Joseph’s amazing My Little Pony-themed resumé.

The second talk, below, is more scientific and maybe even a bit dry at the start. Blaise Aguera y Arcas (engineer at Google) starts with a historic overview brain research but, fortunately, this serves a cause, as ~6 minutes in Blaise provides one of the best explanations I have yet heard of how a neural network processes images and learns to perceive and classify the underlying patterns. Blaise continues with a similarly great explanation of how this process can be reversed to generate weird, Asher-like images, one could consider creative art:

neuralnetart1.png
An example of a reversed neural network thus “estimating” an image of a bird [via Youtube]
Blaise’s colleagues at Google took this a step further and used t-SNE to visualize the continuous space of animal concepts as perceived by their neural network, here a zoomed in part on the Armadillo part of the map, apparently closely located to fish, salamanders, and monkeys?

neuralnetart2.png
A zoomed view of part of a t-SNE map of latent animal concepts generated by reversing a neural network [via Youtube]
We’ve seen these latent spaces/continua before. This example Andrej Karpathy shared immediately comes to mind:

Blaise’s presentaton you can find here:

If you want to learn more about this process of image synthesis through deep learning, I can recommend the scientific papers discussed by one of my favorite Youtube-channels, Two-Minute Papers. Karoly’s videos, such as the ones below, discuss many of the latest developments:

Let me know if you have any other video’s, papers, or materials you think are worthwhile!