The Infinite Monkey Theorem

2013 January 13
No one doubts that Leo Tolstoy was a singular genius. His novel War and Peace is considered one of history’s great works, and another, Anna Karenina, was recently adapted for a major motion picture, nearly 150 years after it was written.

Famous lines such as “Happy families are all alike; each unhappy family is unhappy in its own way,” the opening of Anna Karenina, show not only uncommon insight into the human condition, but a superlative talent for communication that spans barriers of time and language.

Yet if you sat an infinite number of monkeys at an infinite number of keyboards, the problem of producing masterworks like Tolstoy’s would be one of curation, not creation. Up till now, the concept of infinite monkeys has been an interesting notion and nothing more, but amazing advances in artificial intelligence are making it a reality.

Our Own Molecular Monkeys

The infinite monkey theorem has long been prominent in literary circles, most famously in the Borges story The Library of Babel. It’s compelling because it’s so disturbing. Creativity, like spirituality, is something we’ve come to consider uniquely human and therefore completely separate from the rote, automated processes of machines.

Our brains work relatively slowly. Nerve signals travel at about 200 MPH, which is no match for the speed-of-light calculations of computer chips. However, we are able to do many things they cannot. Even something as simple as the way a child catches a ball is beyond the reach of the most advanced robots.

The reason is that our brains are, in technological jargon, massively parallel.  Each one of our billions of neurons can, potentially at least, communicate directly with every other one. So while the processing is done very slowly, it can work on millions of problems at a time, coordinating our senses and movements as well as autonomic processes such as breathing.

The connections between our neurons are far from random, but evolve through two primary processes. The first, called Hebbian plasticity, is based on the principle that “neurons that fire together, wire together”: the more we use a pathway, the stronger it gets. The second is feedback. We tend to repeat pathways that result in successful outcomes.

Notably, both of these processes can be replicated algorithmically.
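
Here is a minimal sketch of what that might look like, assuming a toy network of three units. The function name, the learning rate and the reward multiplier are all made up for illustration, and real neural models are far richer than this.

```python
def hebbian_step(weights, activity, reward=1.0, rate=0.1):
    """Return new weights after one Hebbian update.

    weights[i][j] is the strength of the link between units i and j.
    activity[i] is 1 if unit i fired on this step, else 0.
    reward scales the update, standing in for feedback on the outcome.
    """
    n = len(activity)
    updated = [row[:] for row in weights]
    for i in range(n):
        for j in range(n):
            if i != j:
                # "Fire together, wire together": only co-active pairs change.
                updated[i][j] += rate * reward * activity[i] * activity[j]
    return updated


weights = [[0.0] * 3 for _ in range(3)]
for _ in range(5):
    # Units 0 and 1 fire together; unit 2 stays silent.
    weights = hebbian_step(weights, [1, 1, 0])
# A successful outcome (reward=2.0) reinforces the same pathway more strongly.
weights = hebbian_step(weights, [1, 1, 0], reward=2.0)
print(weights[0][1], weights[0][2])
```

The link between the two units that keep firing together grows with every step, while the link to the silent unit never moves off zero.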

Recognizing Patterns

Computers work in a fundamentally different way. They tend to have very poor internal connectivity. Each circuit works largely in isolation from the others and each step in a process is coordinated by an internal clock. They are, however, amazingly fast, so even though much of their processing is wasteful, they can still get a whole lot done.

As our technology works at exponentially increasing speeds, it is progressively able to approximate human behavior through simple brute force. Computers can simulate literally millions of instances and, through human-provided feedback, learn to recognize patterns. In a sense, they learn like a child would, eliminating mistakes over time.

This has been going on for decades and we are beginning to see tangible results.  Our phones now do a passable job of recognizing what our thumbs want to text and can correct for their clumsiness.  Natural language processing applications like Siri are able to understand the patterns of our speech to an almost uncanny degree.

The process continues countless times every day, mostly without us realizing that it’s even going on.  As technology historian George Dyson puts it, “Are we searching the search engines or are the search engines searching us?”

Creating Patterns

Our machines are starting to evolve beyond simple computational aids: they are beginning to emulate our creative intelligence as well. While they can’t yet match our massively parallel architecture, they are becoming able to match our ability to turn basic data into understandable content.

To see what I mean, take a look at this recent earnings preview posted on Forbes:

Wall Street is high on Costco Wholesale (COST), expecting it to report earnings that are up 16.3% from a year ago when it reports its first quarter earnings on Wednesday, December 12, 2012. The consensus estimate is 93 cents per share, up from earnings of 80 cents per share a year ago.

For the fiscal year, analysts are expecting earnings of $4.49 per share. Revenue is expected to be $23.48 billion for the quarter, 8.6% higher than the year-earlier total of $21.63 billion. For the year, revenue is projected to come in at $106.04 billion.

The company has reported increasing profit for three straight quarters. The 27.4% year-over-year growth in net income in the most recent quarter came after the 19.1% profit growth in the third quarter of the last fiscal year and the 13.2% rise in the second quarter of the last fiscal year…

Okay, it’s not exactly Tolstoy, but it is competently done, by a computer, with no human involvement.  The company that produced it, Narrative Science, uses powerful algorithms to turn data into stories.  So next time you read a box score or an earnings report, don’t be so sure a human wrote it.

And that’s not all. Philip M Parker, a Professor at INSEAD, uses a similar algorithm that has published over 100,000 books, which you can buy on Amazon. Music scholar and composer David Cope has been able to create algorithms that produce original works of music which are so good that even experts can’t tell the difference.
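
To make the idea of turning data into stories a little more concrete, here is a deliberately simple sketch that fills a sentence template from two numbers. It is not Narrative Science’s method, which isn’t described here. The function name and the wording rules are assumptions for illustration only.

```python
from decimal import Decimal, ROUND_HALF_UP


def earnings_preview(company, ticker, eps_cents, prior_eps_cents):
    """Fill a fixed sentence template from two earnings-per-share figures."""
    # Percentage change, rounded to one decimal place.
    change = (Decimal(eps_cents) - Decimal(prior_eps_cents)) * 100 / Decimal(prior_eps_cents)
    change = change.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    rising = change >= 0
    # The "story" is just a choice of words driven by the data.
    mood = "high on" if rising else "cautious about"
    direction = "up" if rising else "down"
    return (
        f"Wall Street is {mood} {company} ({ticker}), expecting it to report "
        f"earnings that are {direction} {abs(change)}% from a year ago. "
        f"The consensus estimate is {eps_cents} cents per share, {direction} "
        f"from earnings of {prior_eps_cents} cents per share a year ago."
    )


print(earnings_preview("Costco Wholesale", "COST", 93, 80))
```

Fed the two per-share figures from the Costco preview, the template reproduces the gist of its opening paragraph. A production system would need far more rules, but the principle of letting the numbers pick the words is the same.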

A Short Guide to How It All Works

The types of creative output that computers are now capable of generating are eerily close to what humans produce. However, the methods they use to do it are, at least in principle, surprisingly simple:

Nearest Neighbor: One key aspect of patterns is that similar entities tend to be grouped together, and that’s the core concept of the nearest neighbor algorithm: a new item is given the label of the known examples it most closely resembles. Apply that technique to reference data at a very granular level and you have a great way of matching similar objects, like handwritten letters or pictures of faces.
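
A minimal sketch of the idea, assuming each object has already been reduced to a pair of numbers. The points and the labels "A" and "B" are invented for illustration.

```python
import math
from collections import Counter


def nearest_neighbors(query, examples, k=3):
    """Return the most common label among the k examples closest to query."""
    # Rank every known example by its straight-line distance to the query.
    ranked = sorted(examples, key=lambda ex: math.dist(query, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference data: (features, label) pairs forming two clusters.
examples = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((4.0, 4.2), "B"), ((4.1, 3.9), "B"), ((3.8, 4.0), "B"),
]

print(nearest_neighbors((1.1, 1.0), examples))
print(nearest_neighbors((4.0, 4.0), examples))
```

For handwriting or faces the features would be thousands of pixel values rather than two coordinates, but the matching step stays the same.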

Markov Chains:  Imagine someone who does three things:  eats, works and sleeps.  We can infer his life is a chain of these events with varying probabilities that depend on what came before.  If he just slept, he might decide to sleep some more, but is more likely to eat or work.  Additional factors, such as time of day, can improve accuracy further.

These sequences of probabilities are called Markov chains and, with enough data (remember that computers have an almost unlimited capacity to run simulations), we can infer probabilities based on thousands of variables.
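
The eat, work and sleep example can be sketched in a few lines. The diary below is invented, and the model looks only at the previous activity, ignoring extras such as time of day.

```python
import random
from collections import Counter, defaultdict


def learn_transitions(sequence):
    """Estimate the probability of each next state given the current one."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


def generate(transitions, start, length, rng):
    """Walk the chain, picking each next state by its learned probability."""
    state, out = start, [start]
    for _ in range(length - 1):
        options = transitions[state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
        out.append(state)
    return out


# Hypothetical observations of one person's day-to-day activity.
diary = ["sleep", "eat", "work", "eat", "work", "eat",
         "sleep", "sleep", "eat", "work", "sleep"]

transitions = learn_transitions(diary)
print(transitions["sleep"])
print(generate(transitions, "sleep", 10, random.Random(0)))
```

In this diary, sleep is followed by eating twice and by more sleep once, so the learned chain makes eating the likelier next step. Swap the activities for words and the same code starts stringing together plausible-looking sentences.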

Bayesian Nets:  These are somewhat similar to Markov chains in that they can handle thousands of variables, but include chains of causality based on probability distributions. The really cool thing about Bayesian nets, however, is that they can learn from new data.

For instance, a medical algorithm can suggest a diagnosis for a new patient based on data from the general population, but is also capable of adjusting as data specific to that patient becomes available.  It will also use that individual’s data to slightly adjust the model for the population as a whole.

In a similar vein, as you continue to speak into Siri, it is not only learning your specific speech patterns, but also how others might want to string words and phrases together.
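
The building block behind this kind of learning is Bayes’ rule. The sketch below is a single-variable simplification rather than a full network, and every probability in it is a made-up number. It starts from a population-wide base rate and revises the diagnosis as evidence about one patient arrives.

```python
def update(prior, p_evidence_if_sick, p_evidence_if_healthy):
    """Bayes' rule: revise the probability of illness after one piece of evidence."""
    sick = prior * p_evidence_if_sick
    healthy = (1 - prior) * p_evidence_if_healthy
    return sick / (sick + healthy)


# Hypothetical base rate of the condition in the general population.
belief = 0.01
# A positive test: seen in 90% of sick patients and 5% of healthy ones.
belief = update(belief, 0.90, 0.05)
print(round(belief, 3))
# A symptom: seen in 80% of sick patients and 10% of healthy ones.
belief = update(belief, 0.80, 0.10)
print(round(belief, 3))
```

With these invented figures the estimate climbs from 1% to about 15% after the test, and to about 59% once the symptom is added. A real Bayesian net chains many such updates together across linked variables.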

The Ultimate Question

In the Borges story The Circular Ruins, a man creates a clone of himself through the power of his own mind only to realize, in the end, that he too was originally dreamt by another.

As we continue to advance both our artificial intelligence capabilities and our understanding of our own neurology, we are in a very real sense creating copies of ourselves, albeit ones that are millions of times more efficient than we are.

The time is drawing near when we will have to ask ourselves what it means to be human, not in the philosophical sense, but as a practical matter. Now that our machines have started showing common sense, can understand art and can even operate deadly military equipment, where will we draw the line?

How much control should we hand over to machines?  What kind of cybernetic implants should and shouldn’t we enhance ourselves with?  What will we do in 20 years, when our technology is thousands of times more powerful?  It is a future that is at once both inspiring and terrifying.

The inescapable conclusion is this:  The infinite monkey theorem is no longer theoretical. Our machines have begun to surpass us and we don’t have the first idea about what to do about it.

– Greg

4 Responses
  1. January 18, 2013

    Greg I have to tell you. Sometimes I can’t get here to read and learn. Sometimes it is for a few weeks and I start to feel guilty and anxious that I will miss something new and/or useful. And then I get here and am gobsmacked by what you are teaching me and the clarity of that basic skill you have…writing. Sorry this sounds like I am sucking up to you. It is more that when I grow up I want to write like you. Don’t worry I am old and broken so it will never happen but it is something to strive for.

    Greg Reply:

    Thank you Mark. That’s incredibly kind of you to say. I’ll try to live up to it.

    Have a great weekend!

    – Greg

  2. gregorylent
    January 29, 2013

    reading, words, don’t have enough density of meaning per unit of time to meet what consciousness is capable of absorbing .. because of that, words are losing importance, newspapers, long form writing, it’s heading towards short, concise (twitter), and multi media

    rule by algorithm? a day late, and a dollar short, as grandpa used to say. not to mention the institutionalization of lowest common denominator thinking and confirmation bias. :-)

    Greg Reply:

    Thanks for your input Gregory.

    – Greg
