50,000 years from now Mankind will spread throughout the galaxy, united in a great Galactic Empire. That is the premise behind the Foundation series of books by science-fiction author Isaac Asimov. Asimov introduces a character named Hari Seldon into this speculative future version of the Roman Empire in space. Seldon creates the science of Psychohistory. Psychohistory depends on the idea that, while one cannot foresee the actions of a particular individual, the laws of statistics as applied to large groups of people could predict the general flow of future events.
Using the science of psychohistory, Seldon foresees the collapse of the Empire. Using this foreknowledge, he assembles an ark of humanity called The Foundation, the future equivalent of the monastic settlements off the coast of Ireland and Scotland where our knowledge of the science of the ancient world was preserved throughout the Dark Ages.
In the 19th Century of our timeline, Pierre-Simon Laplace envisioned a hypothetical demon. If it knew the precise location and momentum of every atom in the universe then it could use Newton’s laws to reveal the entire course of cosmic events, past and future.
We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.
Singularities are a thorn in the side of Laplace’s demon. If it is possible for information to be destroyed upon venturing beyond the event horizon of a black hole, as the cosmic censorship hypothesis formulated by Roger Penrose seems to indicate, then the universe is no longer a closed system. If a black hole can act as a sort of Maxwell’s Demon, screwing up the conservation of information, then the clockwork universe required by Laplace is no longer viable. Whether black holes can and do destroy information is the basis of a long-running bet. John Preskill bet Stephen Hawking and Kip Thorne that information was not lost in black holes. Hawking conceded the bet in 2004, giving Preskill an encyclopaedia of baseball from which information can be retrieved at will. But Thorne is still holding out.
There’s a more fundamental problem with the idea of a clockwork universe based upon Newton’s laws. Newton’s predictive theory, and indeed Einstein’s theory of relativity which superseded it, simply doesn’t work at the quantum level. The Copenhagen interpretation of quantum mechanics doesn’t deal with particles at all. Instead, everything in the universe exists only as a waveform of probabilities only collapsing into a particle state when a value is measured. But if you measure one value, you simply cannot know another value of that same particle. The more accurately you measure a particle’s speed, the less you know about its location and vice-versa.
You can ask Werner Heisenberg the time or you can ask him for directions but you can’t ask him for both at once.
Psychohistory isn’t quite as ambitious as Laplace’s demon. Hari Seldon doesn’t need to know the precise location and momentum of every atom in the universe. Instead, he focuses on the tiny subset of atoms that are clumped together to form the Ugly Bags Of Mostly Water we call human beings. If we know the chain of past and present human events, then can we determine future behaviour?
Our natural reaction is to rebel against this deterministic approach to human affairs that apparently leaves little room for free will or morality. (Oddly, the very religions that place such weight on individual moral behaviour are the same ones to posit deterministic first causes like Original Sin).
In An Inquiry into the Nature and Causes of the Wealth of Nations, Adam Smith replaces the religious construct of a predeterministic or interventionist deity with the equally intangible Invisible Hand. The market will regulate itself, he argues, because people act in their own best interest.
“It is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest. We address ourselves, not to their humanity but to their self-love, and never talk to them of our own necessities but of their advantages.”
Smith’s Invisible Hand reminds me of the Leviathan of Thomas Hobbes. But whereas Hobbes’s Leviathan (which is essentially government itself) is a deliberately-created construct of society to avoid the natural state of man (nasty, brutish and short), Smith proposes a policy of non-interference in human affairs.
Smith’s fellow Scotsman Charles Mackay would disagree. In Extraordinary Popular Delusions and the Madness of Crowds, he chronicled the history of popular folly such as witch-hunts, crusades and economic bubbles like the infamous Tulip mania of the 17th Century.
“Men, it has been well said, think in herds; it will be seen that they go mad in herds, while they only recover their senses slowly, and one by one!”
Francis Galton, the father of eugenics, would certainly have agreed with Mackay’s evidence that the mob can’t be trusted to make good decisions. Leave the decision making to your genetically superior betters, he argued. But then experience and evidence proved him wrong.
Galton attended a country fair that featured a betting competition: guess the weight of the ox. Our modern equivalent would be guess the number of jelly beans in the jar. 800 people from various walks of life attempted to guess the weight of the ox. The person with the closest estimate won a prize. Just for shits and giggles, Galton averaged out all 800 estimates. Galton expected the average result would be way off the mark because so few people in the crowd were experts in the meat industry. In fact, he discovered that the average of all the guesses was much more accurate than any single guess. (The average estimate was 1,197 pounds — the ox weighed 1,198 pounds).
Galton had stumbled upon the phenomenon of The Wisdom of Crowds, so named by James Surowiecki in his 2004 book.
Surowiecki proposes a number of factors required to make a crowd “wise” (rather than “mad”):
A large enough crowd.
Hari Seldon would have an excellent sampling with his galaxy-spanning pool of actors.
Diversity of opinion.
A crowd made up entirely of experts is as useless as a crowd made up entirely of amateurs. Perhaps even more useless. Experts are very bad at estimating their own fallibility.
Independence of opinion.
If everyone’s guesses are visible to everyone else in the crowd, the crowd will soon descend into a cascading feedback loop of imitation.
To see the wisdom of crowds in action, just watch Who Wants To Be A Millionaire? and wait for a contestant to use the “ask the audience” card.
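The averaging trick that surprised Galton can be sketched in a few lines of Python. This is a minimal simulation, not a reconstruction of the fair: the crowd size of 800 and the ox’s weight of 1,198 pounds come from the story above, but the idea that every guess is independent, unbiased and normally distributed with a spread of 100 pounds is purely my assumption.

```python
import random
import statistics


def crowd_estimate(true_weight, crowd_size, spread, seed=0):
    """Simulate independent noisy guesses (an assumed model of the crowd).

    Returns the error of the crowd's average guess and the error of a
    typical (median) individual guess.
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_weight, spread) for _ in range(crowd_size)]
    crowd_error = abs(statistics.fmean(guesses) - true_weight)
    individual_errors = [abs(guess - true_weight) for guess in guesses]
    return crowd_error, statistics.median(individual_errors)


crowd_error, typical_error = crowd_estimate(true_weight=1198, crowd_size=800, spread=100)
print(f"crowd average is off by {crowd_error:.1f} lb")
print(f"typical individual is off by {typical_error:.1f} lb")
```

The individual errors only cancel out because the guesses are independent. If you made each guess drift towards the guesses that came before it, you would be modelling the feedback loop of imitation, and the advantage of the average would shrink.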
Smith’s Invisible Hand and Surowiecki’s Wisdom of Crowds only work in the aggregate: they attempt to predict the behaviour of groups of people but they make no claims about predicting the actions of any one person. This way of studying something by observing only its aggregate effects runs counter to the usual scientific method of reductionism: breaking something down into its smallest constituent parts and studying those parts in order to better understand the larger system. Here we have the opposite. But there is a scientific precedent.
If you want to study the diffusion of gas, it does you no good to study individual gas particles. At the atomic level, particles display the random movements of Brownian motion. There’s no way of predicting the future behaviour of any single particle. But it is entirely possible to predict how a gas will diffuse to fill a container.
There is no intelligence, no invisible hand, no wisdom of particles behind the diffusion process. Instead it is raw probability that makes possible the aggregate prediction of randomly moving particles.
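Here is a rough sketch of that idea in Python. It is a toy model under my own assumptions: a one-dimensional container divided into 20 cells, 1,000 particles that all start in the leftmost cell, and a coin flip deciding each particle’s move at every tick. None of those numbers come from physics; they are just small enough to run quickly.

```python
import random


def diffuse(num_particles=1000, width=20, steps=1500, seed=1):
    """Let particles take random left/right steps inside a closed container."""
    rng = random.Random(seed)
    positions = [0] * num_particles  # every particle starts against the left wall
    for _ in range(steps):
        for i in range(num_particles):
            new_position = positions[i] + rng.choice((-1, 1))
            # A particle that would pass through a wall stays where it is.
            if 0 <= new_position < width:
                positions[i] = new_position
    return positions


width = 20
positions = diffuse(width=width)
right_half = sum(1 for p in positions if p >= width // 2) / len(positions)
print(f"fraction of particles in the right half: {right_half:.2f}")
```

I can’t tell you where any one particle will end up, but I can tell you that roughly half of them will be in the right-hand side of the container.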
Emergent behaviour can be found throughout the natural world at multiple scales.
Consider the slime mold. Slime mold cells are ridiculously simple life forms with no brains. Yet, when times are hard, they gather together to form a single entity: a slug-like mold that can move, forage and even traverse a maze. It may be that this is how complex life emerged on our planet: not through the appearance of a smarter protein one day, but by lots of dumb proteins acting together.
The neurons in our brains are relatively simple clumps of cells. But when enough of them are put together in a lump of grey matter, consciousness emerges. There is no single neuron in charge of consciousness. It is the collective action of all my neurons working in concert that forms the invisible hand we call intelligence.
Bees and ants display eerily intelligent group behaviour, discovering the most efficient routes to food sources and building complex structures. Don’t be fooled by the terminology when we refer to a queen in this context. The queen is little more than a breeding chamber for more ants and bees. Beehives and ant colonies are Leviathans without leaders.
Shoals of fish move as one.
The flocking motions of birds appear so co-ordinated that it’s hard for us to accept that there is no single entity leading the movement.
Naturally occurring ecosystems like rain forests and coral reefs display the same emergent efficiency as the unplanned systems of Man: the growth of our cities, the flow of traffic on the roads and the expansion of the internet.
The spirit of the beehive and the ghost in the machine are one and the same.
All of these different systems have one thing in common. The individual components of the system (the slime mold cells, the neurons, the bees, the ants) can communicate with one another within the system. They are connected. They are nodes in a network.
Until quite recently, networks were thought to fall into one of two categories. Either the nodes of the network were connected in a very structured, lattice-like way with every node having the same number of connections. Or else the nodes were connected at random. This model of a random graph was proposed by Paul Erdős in 1959 (remember that name). At the time, it was thought that this pattern of random connections would occur in the natural networks of bees, ants, birds and fish. But the observed emergent behaviour just didn’t fit the random graph model.
If we take a sampling of data from the world and plot it on a graph, we might expect to see a bell curve indicating normal distribution. Take, for example, measurements of height or weight. The measurements will vary but never by a huge degree. Even if the tallest or the heaviest person in the world is included in our sampling, the bell curve distribution can accommodate that variance.
But now let’s take a sampling of wealth. Let’s say we gather our data from a thousand different people. We might expect a bell curve to describe the distribution of wealth amongst those thousand people. But if one of those people is Bill Gates then our graph will look very, very different. Bill Gates is the statistical equivalent of finding someone who is 10 miles tall.
Unlike height or weight, wealth is not distributed evenly. The Italian economist Vilfredo Pareto measured the distribution of wealth and found that 80% of the wealth was distributed amongst 20% of the population. If you plot this distribution on a graph, you don’t get a bell curve.
A power law distribution is characterised by a fat head (that’s Bill Gates) and a long tail (that’s you and me).
When we plot samples of data from nature, we don’t get a bell curve of normal distribution and we don’t get a random graph. We get a power law distribution.
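To make the contrast between the bell curve and the power law concrete, here is a small Python sketch that asks the same question of both: how much of the total is held by the top 20%? The data is simulated, not measured. The heights are drawn from a normal distribution (a mean of 170cm and a spread of 10cm are my assumptions) and the wealth is drawn from a Pareto distribution with a shape of 1.16, a value I chose because in theory it produces the 80/20 split.

```python
import random


def top_share(values, fraction=0.2):
    """Return the share of the total held by the top `fraction` of values."""
    ordered = sorted(values, reverse=True)
    cutoff = int(len(ordered) * fraction)
    return sum(ordered[:cutoff]) / sum(ordered)


rng = random.Random(6)
sample_size = 50_000

# Assumed model of heights: a bell curve.
heights = [rng.gauss(170, 10) for _ in range(sample_size)]
# Assumed model of wealth: a power law (Pareto) with shape 1.16.
wealth = [rng.paretovariate(1.16) for _ in range(sample_size)]

height_share = top_share(heights)
wealth_share = top_share(wealth)
print(f"tallest 20% hold {height_share:.0%} of all the height")
print(f"richest 20% hold {wealth_share:.0%} of all the wealth")
```

The tallest fifth of the population holds only slightly more than a fifth of the height. The richest fifth holds most of the wealth, although the exact figure jumps around from run to run because a single Bill Gates in the sample can move it.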
When I say that power law distributions occur naturally, I don’t just mean bees, ants, fish and birds. Let’s say we took a sampling of 100 measurements: distance, size, whatever, it really doesn’t matter. We would intuitively expect a fairly equal distribution of numbers: there should be about the same amount of numbers between 10 and 20 as there are between 20 and 30 or 80 and 90. We expect these equal ranges to be distributed equally. In fact what we find is that the first digit is 1 almost one third of the time. Numbers beginning with 9 account for barely more than four percent of the total.
This is Benford’s Law. It runs totally counter to our intuition about the nature of the world around us but this distribution shows up again and again: the sizes of mountains and lakes, the lengths of rivers, populations in cities, births and deaths. Wherever we might intuitively expect to see a random graph, we instead find a power law distribution. Numbers with a leading digit of 1 are the fat head. The long tail is made up of higher numbers.
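Benford’s Law says that a leading digit d should show up with a frequency of log10(1 + 1/d). The Python sketch below compares that prediction with a stand-in data set. I don’t have the lengths of rivers to hand, so as an assumption I am using the first thousand powers of two, a sequence whose leading digits are known to follow Benford’s Law.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Frequency that Benford's Law predicts for a leading digit from 1 to 9."""
    return math.log10(1 + 1 / digit)


def leading_digit_frequencies(values):
    """Measure how often each digit from 1 to 9 leads in a list of positive integers."""
    counts = Counter(int(str(value)[0]) for value in values)
    total = sum(counts.values())
    return {digit: counts.get(digit, 0) / total for digit in range(1, 10)}


# Stand-in data set (an assumption, not data from the talk): 2, 4, 8, ... 2**1000.
powers_of_two = [2 ** n for n in range(1, 1001)]
observed = leading_digit_frequencies(powers_of_two)

for digit in range(1, 10):
    print(digit, f"observed {observed[digit]:.3f}", f"predicted {benford_expected(digit):.3f}")
```

The predicted frequencies match the figures above: a little over 30% for a leading 1 and a little over 4.5% for a leading 9.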
Benford’s Law shows us that not all numbers are created equal. When we examine emergent networks, we find that not all nodes are created equal. Some nodes are more connected than others. If you plot the connections per node on a graph, you get… yes, you guessed it: a power law distribution.
Networks that exhibit this power law distribution are called scale-free networks.
The well-connected nodes that make up the fat head of a scale-free network are the hubs. Most nodes in a scale-free network will have very few connections but an elite monarchy of hubs will have many connections. The internet is a scale-free network. So is the network of airports that enables the flow of air traffic around the world. Think of an airport as being defined by the number of routes it offers. These routes are the connections. Most airports are small, offering a limited number of routes. But just occasionally you get a monster like Heathrow, responsible for a disproportionately large number of routes.
It appears that hubs form because new nodes in a network exhibit a behaviour known as preferential attachment. Basically, a new node in a network is more likely to join a well-connected node than a less connected node. The rich get richer.
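Preferential attachment is simple enough to simulate. The Python sketch below grows a network one node at a time, and each newcomer picks a single existing node to connect to with a probability proportional to how many connections that node already has. That is the approach of the Barabási–Albert model, which I am borrowing as an assumption; the network size of 5,000 and the function name are mine too.

```python
import random
import statistics
from collections import Counter


def grow_network(num_nodes, seed=2):
    """Grow a network where each new node attaches to one existing node,
    chosen with probability proportional to its current connection count."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per connection it has, so a
    # uniform random pick from the list favours well-connected nodes.
    endpoints = [0, 1]
    for new_node in range(2, num_nodes):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend((new_node, target))
    return edges


edges = grow_network(5000)
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

print("median connections per node:", statistics.median(degree.values()))
print("connections of the biggest hub:", max(degree.values()))
```

Nobody designed the hubs. They fall out of a rule that only ever says "prefer the popular".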
Hubs are both a strength and a weakness of scale-free networks. Because most nodes in a scale-free network are relatively unimportant, removing a node won’t affect the overall efficiency of a network. Shutting down Shoreham airport isn’t going to upset the flow of air traffic around the world.
But if a hub is removed from a network, the network can be crippled. If Heathrow airport were to shut down, there would be a chaos of cancellations and delays at airports all around the globe.
You will have often heard it said that the internet was designed to withstand nuclear attack. It’s true that the underlying architecture of the net is remarkably resilient. That’s because of its scale-free nature. Most attacks on a network (or failures within a network) are random in nature. Because most nodes in a scale-free network are relatively unimportant, chances are that an attack or a failure will occur on a fairly insignificant node. But if someone were to make a deliberately-targeted attack on the small number of hubs on the internet, the network would quickly collapse.
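The difference between random failure and targeted attack can be sketched in Python. This is a toy, not a model of the real internet: I am assuming a network of 2,000 nodes grown by preferential attachment with two links per new node, and then comparing what is left of the largest connected cluster after knocking out 100 random nodes versus the 100 best-connected hubs.

```python
import random
from collections import defaultdict


def grow_network(num_nodes, links_per_node=2, seed=3):
    """Grow a network by preferential attachment (an assumed growth model)."""
    rng = random.Random(seed)
    neighbours = defaultdict(set)
    endpoints = []  # each node appears once per connection it has

    def connect(a, b):
        neighbours[a].add(b)
        neighbours[b].add(a)
        endpoints.extend((a, b))

    connect(0, 1)
    connect(1, 2)
    connect(0, 2)
    for new_node in range(3, num_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            connect(new_node, target)
    return neighbours


def largest_component(neighbours, removed):
    """Size of the biggest connected cluster once `removed` nodes are gone."""
    seen = set(removed)
    best = 0
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


network = grow_network(2000)
random_failures = random.Random(4).sample(sorted(network), 100)
hubs = sorted(network, key=lambda node: len(network[node]), reverse=True)[:100]

after_random = largest_component(network, random_failures)
after_attack = largest_component(network, hubs)
print("largest cluster after 100 random failures:", after_random)
print("largest cluster after losing the 100 biggest hubs:", after_attack)
```

In this sketch, losing the hubs leaves a smaller connected cluster than losing the same number of random nodes. How quickly a real network would fall apart depends on details this toy ignores.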
A lattice-like structure would be more stable, with every node having the same number of connections. But it wouldn’t be nearly as efficient.
We often think of natural networks like ecosystems as being stable. But in fact they exist in a state of self-organised criticality. Removing just one species from a food web could cause the collapse of the entire system, if that species is a hub.
The scale-free networks that are of most interest to those of us designing the social web are social networks.
The idea that social relationships follow the Pareto principle can strike us as unfair. Entire nations and philosophies have been founded on the principle that we are all equal. But when it comes to our social connections, some are far more equal than others.
We all know someone who seems to know everyone, right?
Remember Paul Erdős, the originator of the random graph? He was quite an eccentric character. He was a brilliant mathematician of no fixed abode. He spent most of his life living out of a suitcase, showing up on some other mathematician’s doorstep, crashing on their sofa and collaborating on writing papers with them. Because he travelled so widely and collaborated so much, the idea of the Erdős number was born.
If you were lucky enough to have co-authored a paper with Paul Erdős, you would have an Erdős number of one. Just over 500 people have an Erdős number of one. If you co-authored a paper with someone who co-authored a paper with Erdős, then your Erdős number would be two. And so on. For mathematicians, the average Erdős number is around four or five. But the Erdős number has spread beyond the field of mathematics. Because Noam Chomsky has an Erdős number of four and Chomsky is so well-connected in the field of linguistics, many linguists can boast an Erdős number.
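Working out an Erdős number is a breadth-first search: start at Erdős, visit his co-authors, then their co-authors, and note how many steps it took to reach each person. Here is a minimal Python sketch. The co-authorship data is entirely made up for illustration ("Author A" to "Author E" are placeholders, not real mathematicians) and the function name is my own.

```python
from collections import deque


def collaboration_numbers(coauthors, source):
    """Breadth-first search outwards from `source` through co-authorship links.

    Returns a dictionary mapping each reachable person to their number of hops.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in coauthors.get(person, ()):
            if other not in distance:
                distance[other] = distance[person] + 1
                queue.append(other)
    return distance


# Hypothetical data: who has written a paper with whom.
coauthors = {
    "Erdős": ["Author A", "Author B"],
    "Author A": ["Erdős", "Author C"],
    "Author B": ["Erdős"],
    "Author C": ["Author A", "Author D"],
    "Author D": ["Author C"],
    "Author E": [],  # never collaborated, so no Erdős number at all
}

numbers = collaboration_numbers(coauthors, "Erdős")
for person, number in numbers.items():
    print(person, number)
```

Anyone missing from the result simply has no chain of collaborations leading back to Erdős.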
It’s bacon Friday and I’d like to point out that in the world of entertainment, an actor’s equivalent to the Erdős number is the Bacon number. You’ve probably all played this game. Given a random actor’s name, you have to link them to Kevin Bacon. If someone appeared in a film with Kevin Bacon, they have a Bacon number of one. If they appeared in a film with someone who appeared in a film with Kevin Bacon, they have a Bacon number of two …and so on. As with the Erdős number, the average number of connections is surprisingly small.
Kevin Bacon isn’t the most prolific actor. What’s important isn’t the number of films he has appeared in but the diversity. If sheer number of appearances were all that mattered, then Peter North and Ron Jeremy would be the most significant actors of our time. But because the films they appeared in are somewhat same-y, a game of Six Degrees of Ron Jeremy is going to be quite limited.
Kevin Bacon isn’t the centre of the acting world. He isn’t even the most well connected actor. A game of Six Degrees of Rod Steiger or Six Degrees of Donald Sutherland would result in even shorter connections.
Some people have an Erdős number and a Bacon number. The sum of these two numbers together gives you their Erdős-Bacon number.
Remember Winnie Cooper on The Wonder Years? She was played by the actress Danica McKellar. She has a Bacon number of two. She appeared in the film The Year That Trembled with Marin Hinkle who appeared with Kevin Bacon in Rails and Ties.
As well as being an actress, Danica McKellar is an author. She wrote the bestsellers Math Doesn’t Suck and Kiss My Math, currently on the New York Times bestseller list. While she was still in college, she co-authored a scientific paper with her professor who has an Erdős number of three. That gives McKellar an Erdős number of four resulting in a combined Erdős-Bacon number of six.
The Erdős number is limited to people who have published academic papers. In the grand scheme of things, that’s a fairly restricted group of people. But we see the same level of interconnectedness when we look at society in general.
Six Degrees of Kevin Bacon is an offshoot from a series of experiments originally conducted by Stanley Milgram in the late 60s. He conducted quite a few whacky experiments in his time.
Milgram wanted to find out how many degrees of separation there were between any two randomly selected people. He created the small world experiment which he conducted several times with several variations.
Typically, Milgram would choose people from Omaha, Nebraska or Wichita, Kansas as the starting points and someone in Boston as the end point (these points are socially, as well as geographically, very far apart). The people in Omaha or Wichita received a packet telling them about the experiment and asking them if they knew this person in Boston. If they did, they forwarded the package directly to that person and the experiment was over. That was pretty unlikely, so the other option was to ask the participant to think of someone they knew who they thought might have a better chance of knowing this person in Boston and forward the package to them instead. That person receives the package and goes through the same process. Lather, rinse, repeat. When the package reaches the intended recipient in Boston, the experiment is over.
At each stage of this game of pass the parcel, each participant added their name to a roster so that the number of “hops” could be counted at the end of the experiment. Sometimes the connections were as short as two or three hops. Often the letters never reached their target because one of the participants in the chain simply refused to play ball.
But of the 64 letters that did reach their target, the average number of connections was around six.
In 2001, Duncan Watts repeated Milgram’s experiment but he was able to do it on a much larger scale by using email as the medium. This experiment spanned 157 countries. The average number of connections was around six.
A Facebook application that calculates the degree of connectivity in its install base of 4.5 million users has found an average of around six degrees of separation.
Statistical analysis on the 240 million users of MSN Messenger found an average degree of connectivity between users of around six.
Our minds rebel against the idea that such vast numbers of people are typically connected by such a small number. It is as unintuitive as Benford’s Law. That’s because we intuitively expect connections between people to be roughly equal. Our minds yearn for a bell curve or a random graph even when the world keeps showing us power laws. Our moral centre tells us that everyone has value. But experiment shows that, when it comes to social networks, some people are much more valuable than others.
Just as some nodes in a network are an order of magnitude more important than others, the importance of a connection can also vary.
We all have social ties that we would consider to be very important: our connections to our immediate family members; perhaps a handful of people that we would consider Best Friends Forever. But although we might consider these strong ties to be the most important connections in our social networks, it turns out that the weak ties are far more useful for communication in a social network.
If you’re looking for a new job, it probably won’t do you much good to tell your nearest and dearest friends. Chances are they mostly know the same people as you. But if you reach out to your acquaintances and contacts, there’s a much greater chance that one of them will know somebody who can help you out. That’s the strength of weak ties, a term coined by Mark Granovetter.
It is the strength of weak ties that enables rapid communication through a scale-free network and creates such low degrees of separation between nodes. This is how rumours can spread so quickly. The spread of sexually transmitted disease is enabled through the weak ties of casual sex and the existence of a few very sexually active hubs.
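The effect of weak ties on degrees of separation can be sketched in Python. The model here is my own simplification, in the spirit of small world network theory rather than a scale-free network: 200 people stand in a ring, each with strong ties to the two neighbours on either side, and then 20 weak ties are added between randomly chosen pairs. All of those numbers are assumptions.

```python
import random
from collections import deque


def ring_of_strong_ties(num_people, reach=2):
    """Everyone knows only the `reach` nearest people on either side."""
    neighbours = {person: set() for person in range(num_people)}
    for person in range(num_people):
        for step in range(1, reach + 1):
            other = (person + step) % num_people
            neighbours[person].add(other)
            neighbours[other].add(person)
    return neighbours


def average_separation(neighbours):
    """Average number of hops between every pair of connected people."""
    total_hops = 0
    pairs = 0
    for source in neighbours:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            person = queue.popleft()
            for other in neighbours[person]:
                if other not in distance:
                    distance[other] = distance[person] + 1
                    queue.append(other)
        total_hops += sum(distance.values())
        pairs += len(distance) - 1
    return total_hops / pairs


num_people = 200
before = average_separation(ring_of_strong_ties(num_people))

# Add a handful of weak ties between randomly chosen pairs of people.
rng = random.Random(5)
network = ring_of_strong_ties(num_people)
for _ in range(20):
    a, b = rng.sample(range(num_people), 2)
    network[a].add(b)
    network[b].add(a)
after = average_separation(network)

print(f"average separation with strong ties only: {before:.1f}")
print(f"average separation with 20 weak ties added: {after:.1f}")
```

A few acquaintances on the far side of the ring shorten the path for everybody, not just for the people who happen to have them.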
In his book The Tipping Point, Malcolm Gladwell categorises these hubs of the social networks, giving them names like connectors, mavens and salesmen. According to Gladwell, these well-connected people are not just responsible for spreading sexually transmitted diseases. They are also key factors in any rapid expansion of popularity: fads, memes, bestselling books by first-time authors.
Duncan Watts, one of the formulators of small world network theory, takes issue with the importance that Gladwell places on the connectors in a social network. According to his research, the phase transitions we call tipping points will be reached anyway. But connectors can hasten the process.
There’s something else missing from this discussion of fads and bestsellers and that’s the fad or the bestseller itself. These are what Jyri Engeström calls Social Objects.
Still, even though Gladwell doesn’t put much weight in Social Objects, concentrating instead on the power of hubs and the strength of weak ties, he makes a compelling case by recounting many stories of products and services that exploded in popularity.
You can usually find The Tipping Point in the Business section of your local bookshop. The implication is clear: buy this book, absorb these stories of success and, if you can understand the secret to their success, you too can engineer the Next Big Thing.
Gladwell is offering a kind of psychohistory-lite; Laplace’s Demon writ small.
But all of the success stories in The Tipping Point have one thing in common. They all occurred in the past. There is no prediction of future trends.
A Black Swan is an unpredictable event with large consequences …like, say, a fad, a bestselling book from a first-time author or an unexpectedly successful service or product. It is quite likely that every single success story recounted in The Tipping Point is actually an example of a Black Swan.
A Black Swan is characterised by three qualities:
- It cannot be predicted in advance.
- Though the causes of a Black Swan might be small, the effects are very large.
- Once a Black Swan has occurred, we retroactively explain its appearance and treat the event as though it were predictable.
The attacks on the World Trade Centre on September 11th, 2001 are a classic example of a Black Swan. But Harry Potter and The Da Vinci Code are equally good examples. They came out of nowhere, had a huge impact and now we like to fool ourselves into thinking they were inevitable.
Tipping Points and Black Swans are basically the same thing: they are the fat head of power law distributions. Bill Gates is a Black Swan of wealth. While the rest of us are down in the long tail, he’s the fat head. Most authors in the world form the long tail of publishing, with just a few Black Swans like JK Rowling and Dan Brown experiencing the tipping points that put them in the fat head.
The existence of Black Swans sounds the death knell for psychohistory. In fact, in the Foundation series of books, Hari Seldon’s plan unravels when a Black Swan known as The Mule is introduced into his universe.
The idea of The Black Swan was put forward by Nassim Nicholas Taleb in his book of the same name. It’s a terrible book, very badly and smugly written. But it does contain a napkin’s worth of useful advice. While we cannot predict Black Swans themselves, it is a certainty that Black Swans will occur. Once we know that events are distributed according to a power law, not a bell curve, we can stop fooling ourselves into thinking that we can predict the unpredictable.
Where does that leave us designers of the Social Web? Our task is to construct scale-free small world networks. We know that when these networks grow, they will exhibit emergent behaviour but we also know that because the growth of these networks depends on hubs formed by tipping points—Black Swans, in other words—they are unpredictable.
We can learn from the wise words of Donald Rumsfeld:
There are known knowns. There are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don’t know. But there are also unknown unknowns. There are things we do not know we don’t know.
What a Zen Master! He’s absolutely right. We are very good at fooling ourselves into thinking we understand how events unfold. We see patterns in the past and we think that these patterns can be used to predict the future. We all have a little bit of Hari Seldon in us.
Here’s a simple example of retroactive pattern matching, one that I’ve heard far too often:
- MySpace is a successful social networking site.
- MySpace is fugly.
- If I make a fugly social networking site, it will be successful.
This is cargo cult thinking. Even if we knew what made one particular social networking site successful, that knowledge wouldn’t necessarily help us repeat the success with another social networking site.
Some ridiculously large percentage of the population in the United Arab Emirates has a Flickr account. Orkut is disproportionately popular in Brazil. For years now, almost every single person in Ireland has had a Bebo account. These countries and these social networking sites have no obvious connection. We could pore over the log files and discover the cascade of signals transmitted through weak ties that led to these geo-specific tipping points but why bother? It wouldn’t help us predict the success or failure of a new social networking site in any particular country.
I don’t want to sound like a naysayer. I do think that there are things we can do to increase the chances of growing a successful social network. Remember that small causes can trigger large events. Small improvements to the design of an interface or the flow of the user experience could reap huge benefits.
I think it’s important to remember that, while it just isn’t possible to design a complex system like a social network, it is perfectly reasonable to design good initial conditions to nurture the growth of a scale-free small world network. We are no longer creating static artefacts like pages, we are creating frameworks within which people can interact with one another and create connections.
We cannot predict the topography of the resulting small world network but there are some tell-tale signs to be aware of. When people start using your system for purposes other than those you expected, that’s a good thing. When people start playing games within your system, that’s a good thing. It’s a sign of emergent behaviour that should be fostered rather than repressed.
The concept of @ replies on Twitter was a ground-up emergent phenomenon. People started to use this syntax to indicate that a message was targeted at a specific user. Once this behaviour hit a critical mass, the engineers behind Twitter did the right thing by encoding this behaviour into the system, making the username behind the @ symbol clickable.
The web is full of systems that encourage emergent behaviour: Amazon, eBay, Delicious, Flickr …these are all frameworks within which small world networks can bloom. Yet all of these systems are themselves just nodes in one of the most beautiful scale-free networks in existence: the World Wide Web.
Dial F For Frankenstein
I began today by talking about a science fiction scenario that dealt with predicting future events. A far more common science fiction scenario is the unpredictable emergence of intelligent behaviour from a network. The Terminator films, inspired by the work of Harlan Ellison, warn us of the spontaneous emergence of self-awareness of the SkyNet system.
I want to finish today by telling you about another of these shaggy dog stories of emergent intelligence. It was written by that other giant of the golden age of science fiction, Arthur C. Clarke, who died earlier this year. He wrote a story called Dial F For Frankenstein (a tip to science fiction authors hoping to write timeless stories: don’t include device-specific interface details like “dial” in your titles).
In this story, the world’s separate computer networks are linked together by satellite. Suddenly, every phone in the world starts to ring. It is the birth cry of a new intelligence.
This short story would be a fairly unremarkable addition to the science fiction canon except for one unexpected consequence.
A teenager who read the story was inspired to pursue the idea in later life. His name was Tim Berners-Lee.
When he created the World Wide Web on top of the gloriously dumb network of the internet, he gave us all a tool of unimaginable power. We haven’t even begun to scratch the surface of the web’s potential. I can’t predict what incredible stories will emerge from the systems that you are building on the web today. But I know that they will be wondrous.
This presentation is licensed under a Creative Commons attribution licence. You are free to:
- Copy, distribute and transmit this presentation.
- Adapt the presentation.
Under the following conditions:
- You must attribute the presentation to Jeremy Keith.