Generating Art with Creative Adversarial Networks (CAN)
Computationally recreating creativity is a significant step towards Artificial General Intelligence (AGI). Generative Adversarial Networks (GANs) are AI algorithms that can create novel art mimicking a given distribution. The system described here learns the styles of the images it is exposed to and generates art by deviating from those learned styles, increasing the work’s arousal potential.
The most significant arousing properties for aesthetics are novelty, surprisingness, complexity, ambiguity, and puzzlingness. Creative Adversarial Networks (CANs) modify the GAN objective to create art that maximizes deviation from established styles while minimizing deviation from the overall art distribution. The model tries to generate art that is novel, but not too novel.
People prefer stimuli with moderate arousal potential: too little is boring, and too much may trigger an aversive response. Creative pieces also counter habituation well, the decreased arousal in response to repetitions of a stimulus. CANs aim to generate art with stylistic ambiguity and high arousal potential without pushing it into the negative hedonic range.
A Generative Adversarial Network (GAN) has two sub-networks: a generator and a discriminator. The discriminator has access to the images the model is trained on and tries to discriminate between “real” images (from the training set) and “fake” images produced by the generator.
The generator tries to generate images similar to the training set without seeing these images. Its purpose is to create fake images that the discriminator would deem to be from the training set. The generator starts by generating random images and receives a signal from the discriminator telling whether the discriminator finds them real or fake.
At equilibrium, the discriminator can’t tell the difference between the images generated by the generator and the actual images in the training set, hence the generator succeeds in generating images that come from the same distribution as the training set. The generator isn’t creative as it generates images that look like already existing art.
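This adversarial game can be sketched with the standard GAN losses. The NumPy toy below is a minimal sketch, not the paper's implementation, and the function names are illustrative; it shows the binary cross-entropy losses each network minimizes, and why equilibrium corresponds to the discriminator outputting 0.5 everywhere:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    eps = 1e-12
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(fake) toward 1."""
    eps = 1e-12
    return -np.mean(np.log(d_fake + eps))

# Early in training: D confidently separates real from fake.
d_real = np.array([0.90, 0.80, 0.95])
d_fake = np.array([0.10, 0.20, 0.05])
print(discriminator_loss(d_real, d_fake))  # low: D wins this round
print(generator_loss(d_fake))              # high: G is penalized

# At equilibrium, D can no longer tell the two apart: D(x) = 0.5
# everywhere, and the discriminator loss settles at 2 * log(2).
d_eq = np.full(3, 0.5)
print(discriminator_loss(d_eq, d_eq))
```

Note that at that equilibrium the generator has matched the training distribution, which is exactly why, as stated above, a plain GAN imitates rather than creates.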
However, in a Creative Adversarial Network (CAN), the generator receives two signals from the discriminator that act as two contradictory forces to achieve three goals: 1) generate novel works, 2) ensure the novel work is not too novel, i.e., not too far from the art distribution, and 3) increase the stylistic ambiguity of the generated work.
The first signal is the discriminator’s classification of “art or not art”. The second signal the generator receives is a signal about how well the discriminator can classify the generated art into established styles. The creative generator will purposely try to generate ambiguous art that confuses the discriminator.
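The style-ambiguity signal can be made concrete: in the CAN formulation, the generator is rewarded when the discriminator's posterior over the K known style classes is close to uniform. The NumPy sketch below illustrates such a loss (the function name is illustrative, not from the paper's code):

```python
import numpy as np

def style_ambiguity_loss(style_probs):
    """Cross-entropy between the discriminator's style posterior and the
    uniform distribution over K styles. It is minimal (log K) when every
    style is equally likely, i.e. the work is maximally style-ambiguous."""
    eps = 1e-12
    k = style_probs.shape[-1]
    uniform = np.full(k, 1.0 / k)
    return -np.sum(uniform * np.log(style_probs + eps), axis=-1)

confident = np.array([0.97, 0.01, 0.01, 0.01])  # clearly one style
ambiguous = np.array([0.25, 0.25, 0.25, 0.25])  # confuses the classifier

print(style_ambiguity_loss(confident))  # high: easy to pin to a style
print(style_ambiguity_loss(ambiguous))  # log(4) ~ 1.386, the minimum
```

Minimizing this loss pushes the generator toward works that the discriminator accepts as art (the first signal) but cannot pin to any established style (the second signal).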
These models, trained on 81,449 paintings by 1,119 artists spanning the fifteenth to the twentieth century, were put to the test. The novel images shown above were generated by GANs with increasingly complex network architectures; the top row consists of the most-liked paintings and the bottom row of the least-liked paintings, as judged by humans.
The above CAN implementation aims to classify a generated image into a style and so takes into account a style-classification loss, leading to more style-conforming, less ambiguous works. To induce ambiguity, the proposed CAN model instead implements a style-ambiguity loss, which generates aesthetically appealing images, as seen below, that can be characterized as novel without merely emulating the existing art distribution.
Two sets of real paintings, an Abstract Expressionist set and an Art Basel 2016 set, were compared to the images generated by the AI networks. The former is a collection of 25 paintings by Abstract Expressionist masters made between 1945 and 2007; the latter consists of 25 paintings shown at the renowned art fair Art Basel in 2016.
Human subjects, without knowing the creator of each image, were asked whether the presented images were created by a human or a computer. 85% of the Abstract Expressionist images, 53% of the CAN images, 41% of the Art Basel images, and 35% of the GAN images were deemed human-made. In another experiment, subjects were asked to rate the image sets on various parameters; the results are captured below.
There is only a weak correlation between the liking rating and whether subjects think an image is by an artist or a computer. The parameters above were rated sequentially from left to right, which led to different results for the human/computer classification.
Then, the subjects were asked to rate the paintings on more intangible yet intrinsic artistic parameters. The results below indicate that the CAN-generated images were rated higher than the human paintings in all aspects, so humans do see these images as art.
When inquired about the novelty, 59.47% of the subjects selected CAN images as more novel and 60% of them found CAN images more aesthetically appealing. The highest ranked paintings created by CAN are displayed below with the ratings decreasing from top to bottom.
A main characteristic of the proposed system is that it learns about the history of art in its process to create art. However, it does not have any semantic understanding of subject matters, explicit models of elements, or principles of art. The learning here is based only on exposure to art and concepts of styles, which the system learns implicitly.