Overview

This is one of several projects I worked on while taking and TA’ing for Smart Machines, an Interactive Arts class about artificial intelligence and machine learning.

Works Cited

I want to preface this project breakdown by saying that the vast majority of the machine learning code in this project is courtesy of TensorFlow. I worked off code from their examples repository, and referred to this article by TensorFlow’s Raymond Yuan. My primary labor on this project was learning to work with AI systems, understand them, and successfully modify and deploy them without breaking anything too badly :)

Concept

I wanted to make a bot that could create bizarre hybrid artwork for the user, on command, given two images of the user’s choosing. So I started looking into neural style transfer.

The idea of style transfer is not a new one, but an algorithm that lets a neural network perform it was not introduced until relatively recently. As you can read in A Neural Algorithm of Artistic Style, the paper by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge, their algorithm uses a deep neural network to separate the content and the style of the provided images, and then recombines them to create a hybrid of the two.
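To make "separating content and style" a little more concrete, here is a minimal sketch of the two measurements the paper builds on. It is not the code from this project: the tiny hand-written feature maps stand in for the activations a real network would produce, the function names are my own, and the exact normalization is an assumption (implementations differ on it). Content is compared position by position, while style is compared through a Gram matrix, which records how strongly channels fire together regardless of where in the image they fire.

```python
def gram_matrix(features):
    """Channel-by-channel correlations, averaged over spatial positions.

    `features` is a list of channels; each channel is a flat list holding
    that channel's activation at every spatial position.
    """
    n_positions = len(features[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / n_positions for ch_j in features]
        for ch_i in features
    ]


def content_loss(generated, target):
    """Mean squared difference between activations at matching positions."""
    diffs = [
        (g - t) ** 2
        for g_ch, t_ch in zip(generated, target)
        for g, t in zip(g_ch, t_ch)
    ]
    return sum(diffs) / len(diffs)


def style_loss(generated, target):
    """Mean squared difference between the two Gram matrices."""
    gram_gen, gram_tgt = gram_matrix(generated), gram_matrix(target)
    diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(gram_gen, gram_tgt)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical 2-channel feature map with 4 spatial positions...
features = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
# ...and the same activations with the positions reversed.
shuffled = [[4.0, 3.0, 2.0, 1.0], [1.0, 0.0, 1.0, 0.0]]

print(style_loss(features, shuffled))    # 0.0
print(content_loss(features, shuffled))  # 3.0
```

Rearranging where things sit changes the content measurement but leaves the style measurement untouched, which is roughly why the two can be pulled apart and recombined.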

Huh?

Yeah, I know, I had a similar reaction at first. Here are some examples of what my iteration of this system can do.

Example 1: Otto Dix and Cosima Niehaus

In this example, the content image is War Cripples, a drypoint print and political cartoon by Otto Dix, and the style image is the DNA-strand design from the laptop decal of Cosima Niehaus, one of my favorite characters from the BBC America show Orphan Black.

War Cripples by Otto Dix on the left, our content source for this example; and Cosima’s laptop decal on the right, as our style source.

The system takes our content and style images and performs a series of operations on them to identify which parts of each image it considers important in terms of content and style. For more detail on this, see Yuan’s article. Once done, it generates a hybrid of our two source images over and over, grading each result itself and attempting to improve with every iteration. After 1000 iterations, we receive our final product:
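The "generate, self-grade, improve" loop described above can be sketched as plain gradient descent. This is a toy under heavy assumptions, not the TensorFlow code the project runs: the "images" are four-number lists, raw pixel values stand in for network features, a single mean-square value stands in for the Gram matrix, and the names, weights (alpha, beta), and learning rate are all made up for illustration.

```python
def mean_square(pixels):
    """One-channel stand-in for a Gram matrix: the average squared value."""
    return sum(p * p for p in pixels) / len(pixels)


def total_loss(image, content, style_target, alpha, beta):
    """The 'grade': weighted sum of a content term and a style term."""
    n = len(image)
    content_term = sum((x - c) ** 2 for x, c in zip(image, content)) / n
    style_term = (mean_square(image) - style_target) ** 2
    return alpha * content_term + beta * style_term


def loss_gradient(image, content, style_target, alpha, beta):
    """Analytic gradient of total_loss with respect to each pixel."""
    n = len(image)
    style_gap = mean_square(image) - style_target
    return [
        alpha * 2 * (x - c) / n + beta * 2 * style_gap * 2 * x / n
        for x, c in zip(image, content)
    ]


def stylize(content, style, steps=1000, learning_rate=0.1, alpha=1.0, beta=10.0):
    """Repeatedly nudge the image downhill on the combined loss."""
    style_target = mean_square(style)
    image = list(content)  # assumption: start from a copy of the content image
    history = [total_loss(image, content, style_target, alpha, beta)]
    for _ in range(steps):
        grad = loss_gradient(image, content, style_target, alpha, beta)
        image = [x - learning_rate * g for x, g in zip(image, grad)]
        history.append(total_loss(image, content, style_target, alpha, beta))
    return image, history


content = [0.2, 0.4, 0.6, 0.8]  # hypothetical content "image"
style = [0.9, 0.9, 0.1, 0.1]    # hypothetical style "image"
result, history = stylize(content, style)
print(f"loss went from {history[0]:.4f} to {history[-1]:.4f}")
```

The balance between alpha and beta decides how far the result drifts from the content image toward the style statistics; the real system makes the same trade-off, just with a deep network supplying the features.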

The final composite image

While a little muddled, you can still pretty clearly see the mutilated forms of Dix’s soldiers, but the style in which they’ve been depicted has become reminiscent of an acid trip, complete with Day-Glo neon colors. If you look at Cosima’s laptop decal, you can see where it derived the colors and the spiraling and curved forms from, and where it chose to merge them with the content of War Cripples. Several key elements of the print have been preserved, most notably the boot in the store display (near the middle of the print), and the window frame in the upper right.

Example 2: Sekiro and Edvard Munch

For this piece, I used a gameplay image from the undeniably stylish Sekiro: Shadows Die Twice, the latest game from FromSoftware, as the content image, and Edvard Munch’s The Scream as the style image.

A screenshot from Sekiro: Shadows Die Twice on the left, our content source for this example; and Edvard Munch’s The Scream on the right, our style source.

I was curious to see how well Munch’s distinct brushstrokes and colors transferred over to the final product. I was very happy when I showed the final piece to others without explicitly mentioning The Scream as the source, and most people were able to recognize Munch’s style anyway!

Sekiro Scream composite

Reflections

Challenges

Funnily enough, while the code was complex, I think the most difficult part by far was setting up an environment with TensorFlow, GPU support (which requires a bunch of NVIDIA tools), and all the bells and whistles required to make this project run. Even on a Linux box it took a lot of tweaking. Extremely frustrating to say the least.
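For anyone fighting the same setup, a quick sanity check like the one below can at least tell you whether TensorFlow is importable and whether it can see the GPU. This is a sketch rather than part of the project: the function name is my own, and it assumes TensorFlow 2.1 or newer, where tf.config.list_physical_devices is available (older releases expose this differently).

```python
def gpu_status():
    """Report whether TensorFlow is importable and whether it can see a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed in this environment"
    # Assumes TensorFlow 2.1+; returns a list of visible GPU devices.
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        return f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s)"
    return f"TensorFlow {tf.__version__} is installed but sees no GPU"


print(gpu_status())
```

If it reports no GPU even though one is installed, the NVIDIA driver and CUDA/cuDNN versions are the usual place to start looking.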

Next Steps

This project still needs to have a public presence, where people can input their own images and get a mashup on demand. Currently it only lives on my Linux box, and that needs to change. I’m just not sure about how/where I can host a thing like this, given the processing power necessary…something to think about.