Written by Nicholas Wolf
What does it mean for a computer to learn, read or write? Can we compute creativity into a formulaic system? This blog discusses these questions by analyzing the neural networks behind Google's Project Magenta, software that mimics creativity and generates music.
Google released its first musical single last week --- a 90-second-long piano melody composed entirely by a computer. Give it a listen and the first impression you'll (probably) have is that it’s pretty bad. It reminds me of the ring-tone options I had when my cellphone still had a 12-button slide-out keyboard.
The software behind this song is a recently open-sourced music and art generation toolkit developed by the Google Brain team called Project Magenta --- "a research project to advance the state of the art in machine intelligence for music and art generation... [and] an attempt to build a community of artists, coders and machine learning researchers". Anyone can download, use, and contribute to the development of Project Magenta. You can pass the software a database of music and it will listen to it and learn how to write music from it.
When I say that this computer program is 'listening', 'learning' and 'writing', I don't mean to imply that it is conscious or self-aware. Instead, I mean to use some familiar actions as analogues for understanding the complex things it is doing. 'Listening' really means processing some kind of input, 'learning' really means recognizing and remembering patterns and 'writing' really means creating some kind of output. Recent research in artificial intelligence has pushed the boundaries of what we can do with computers and this practice of understanding by analogy is useful in a world where they are increasingly capable of doing 'human' things.
The field of artificial intelligence is a body of work that spans a broad range of subjects --- mathematics and statistics, cognitive science and psychology, and engineering to name a few. It has been developed and refined by thousands of people over hundreds of years and a comprehensive history of artificial intelligence would fill volumes. Instead of a broad overview, I'll give an introduction to one small subset of the technology that has recently become popular and is behind Google's recent musical efforts --- something called neural networks.
In terms of elegance, modern computing technology still pales in comparison to the human brain --- nature has a way of creating extremely well designed structures. The desire to replicate the brain's structures and functions has been a driving force in the development of computers since they were first conceived. 'Neural networks' are digital networks of interconnected 'neurons' inspired by the structure of the human brain. Each 'neuron' represents a mathematical function that takes some input, alters it, and then passes it along to other 'neurons'. This simple approach is wildly powerful and has applications in everything from linguistics to medical diagnosis to image recognition.
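To make the idea of a 'neuron' as a mathematical function concrete, here is a minimal sketch in Python. It is an illustration only, not code from Project Magenta: the function names and the weight values are made up for this example, and the sigmoid is just one common choice of function. In a real network the weights are learned from data rather than typed in by hand.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial 'neuron': take some inputs, alter them, return one output.

    Here the alteration is a weighted sum squashed into the range (0, 1)
    by a sigmoid function (an assumption; other functions are also used).
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def tiny_network(inputs):
    """Three connected neurons. All weights below are arbitrary, hypothetical values."""
    # Two neurons each look at the same raw inputs...
    h1 = neuron(inputs, [0.5, -0.6], 0.1)
    h2 = neuron(inputs, [-0.3, 0.8], 0.0)
    # ...and pass their outputs along to a third neuron.
    return neuron([h1, h2], [1.2, -0.7], 0.05)

print(tiny_network([1.0, 0.0]))
```

'Learning', in this picture, roughly means nudging those weights until the network's outputs match the patterns in its training examples.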
An intuitive and entertaining way to get a grasp on how neural networks work is to watch one learn. Let's look at an example from Andrej Karpathy, who built a neural network that teaches itself how to write English. When it was first run, the computer program had no prior knowledge of grammar. It had no concept of nouns or verbs. It didn't even know that a group of letters is a word. Its only resource for learning English was (a translation of) Leo Tolstoy's War and Peace. It scanned the book letter-by-letter looking for patterns. At first it found simple patterns, e.g. a comma is usually followed by a space. Eventually it found more complex patterns: a quotation mark is followed eventually, but not immediately, by another quotation mark; the letter 't' is often followed by the letter 'h', which in turn is often followed by the letter 'e'. In this way the computer program gradually built its knowledge about the English language. The program needed to scan the book many times before it had a good grasp of English grammar, but every time it finished, it was asked to write a sentence based on what it had learned so far. Here are the sentences it wrote after reading the book 100, 300, 500, 700, 1200 and 2000 times:
So it's still a bit of gibberish. But it's impressive when you consider how much you could learn about a foreign language in the same 30 seconds.
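The simplest patterns mentioned above, like "a comma is usually followed by a space", can be found with nothing more than counting. The sketch below is a deliberately crude stand-in, not how Karpathy's network works internally: it just tallies which character follows which. The function names are hypothetical and the sample sentence is made up for this example rather than quoted from War and Peace.

```python
from collections import Counter, defaultdict

def next_char_counts(text):
    """For every character, count which characters come immediately after it."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts

def most_likely_next(counts, char):
    """Return the character seen most often right after `char`.

    Assumes `char` appeared in the text with at least one follower.
    """
    return counts[char].most_common(1)[0][0]

# A made-up sample sentence standing in for the book.
sample = "Well, Prince, the war is here, and the time, they say, is short."
counts = next_char_counts(sample)

print(repr(most_likely_next(counts, ",")))
print(repr(most_likely_next(counts, "t")))
print(repr(most_likely_next(counts, "h")))
```

On this sample, the tally reports that a comma is most often followed by a space, 't' by 'h', and 'h' by 'e'. A counter like this only ever looks one letter back, so it could never notice that an opening quotation mark needs a closing one many letters later. Picking up those longer-range patterns is where a neural network earns its keep.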
The neural network behind Google's piano melody learned about music in a similar way. From listening to music, it began to recognize patterns like repetition between phrases and regular spacing of notes (i.e. tempo). It also learned about higher-level structures like verse-bridge-conclusion and dissonance resolution. Now, give it another listen with that in mind. It's still not very good music. But when you think about how 1) this computer program taught itself, with no human intervention, about basic musical patterns and how 2) this technology is still in its infancy and there are many talented people --- artists in their own right --- working on developing it, I think it's promising music.