Saturday, November 06, 2010

The Adaptationist View of Software Evolution

Last time in When Toasters Fly we explored some of the ideas of the paleontologist and evolutionary biologist Stephen Jay Gould, as they applied to the evolution of software architecture over the past 70 years. We saw that many of his ideas, such as punctuated equilibrium, the exaptation of spandrels, and the limitations to evolutionary change imposed by historical design constraints, have also been seen in the evolution of software. However, we also saw that Gould’s position, that natural selection may not have been as important in the grand scheme of things as some other mitigating factors, did not appear to hold true for the evolutionary history of software on Earth. The contrary position is the adaptationist view of evolution, in which natural selection is the overwhelming factor in determining the course of evolutionary change, and that will be the topic of this posting.

This is an important issue because if Gould’s contention is correct, that natural selection plays less of a role in evolution than some other factors, that would be a major blow to our SETS program - the Search for ExtraTerrestrial Software. As Seth Shostak pointed out in Confessions of an Alien Hunter (2009), if we ever finally do make contact with an alien civilization, we will not be talking to carbon-based life forms, but to machines instead, and I suspect that we will not be talking to machines – we will actually be talking to software. And it will probably be our software talking to their software. This will be a good thing because software is much better suited than we are for the rigors of interstellar telecommunications, with its pregnant pauses of several hundred years between exchanges due to the limitations set by the finite speed of light. We have already trained software to stand by for a seemingly endless eternity of many billions of CPU cycles, patiently waiting for you to finally push that Place Order button on a webpage, so waiting for an additional one or two hundred years for a reply should not bother software in the least. As I pointed out in The Origin of Software the Origin of Life, software needs intelligent carbon-based life to arise first as a stepping stone to its eventual exploration of a galaxy, and if the evolution of intelligent carbon-based life has a very low probability of occurring, even on a Rare Earth such as ours, then software must be quite rare in our Universe as well.

Recall that in Gould’s writings he assigned a host of my favorite evolutionary biologists to the Adaptationist Program, among them Richard Dawkins and Daniel Dennett. Since I have read every single book by Richard Dawkins, many of them several times over, I decided to focus on Daniel Dennett’s Darwin’s Dangerous Idea (1995), since this book was specifically targeted by Gould as a pure distillation of the Adaptationist Program. Also, having previously read Consciousness Explained (1991) and Breaking the Spell: Religion as a Natural Phenomenon (2006) by this same author, and finding them both to be very interesting and enlightening, I figured that Darwin’s Dangerous Idea would be a good read, and I was certainly not disappointed. Daniel Dennett is a philosopher by trade, but he is very much into cognitive studies, evolutionary theory, AI - Artificial Intelligence, AL - Artificial Life, and the heroic application of computational thought to many domains that are less than receptive to the idea. So like Richard Dawkins, Daniel Dennett is a true softwarephysicist at heart if there ever was one.

Darwin’s Dangerous Idea makes many references to the work of Richard Dawkins and is an outgrowth of a conversation Dennett had one day in the early 1980s with a colleague who recommended that he read Dawkins’ The Selfish Gene (1976). I had a very similar experience while working on BSDE – the Bionic Systems Development Environment at Amoco in 1986. BSDE was my first practical application of softwarephysics and was used to “grow” applications from an “embryo” by allowing programmers to turn on and off a number of “genes” to generate code on the fly in an interactive mode. Applications were grown to maturity within BSDE through a process of embryonic growth and differentiation, with BSDE performing a maternal role through it all. Because BSDE generated the same kind of code that it was made of, BSDE was also used to generate code for itself. The next generation of BSDE was grown inside of its maternal release. Over a period of seven years, from 1985 to 1992, more than 1,000 generations of BSDE were grown to maturity, and BSDE slowly evolved into a very sophisticated tool through small incremental changes. During this period, BSDE also put several million lines of code into production at Amoco. For more on BSDE see the last half of my original post on SoftwarePhysics. Anyway, one day I was explaining BSDE to a coworker and he recommended that I read The Selfish Gene, for me the most significant book of the 20th century because it explains so much. Like Darwin’s Dangerous Idea, the development of softwarephysics was highly influenced by the concepts found in The Selfish Gene.

Skyhooks and Cranes in Design Space
For Dennett, it is all about how Design can arise in one “Vast Design Space”. Design Space is the all-encompassing state space that encodes all possible information in the Universe, which includes such things as all the possible forms of life, books, memes, and software, together with a Vast number of ill-formed forms of life, books, memes, and software that are pure gibberish or near misses that look like viable candidates, but which do not work in practice. We have already seen examples of subsets of the Design Space in The Demon of Software and The Origin of Software the Origin of Life. In The Demon of Software, we explored the state space of all possible poker hands and of all possible 30,000 character programs written with the ASCII character set, and from them, we gained an understanding of the concept of information in physics and its relationship to entropy and the second law of thermodynamics. Dennett maintains that the evolution of all the living things on Earth represents a collection of trajectories through Design Space and offers two possible mechanisms to propel things along such trajectories – skyhooks and cranes. Skyhooks are basically magic. With a skyhook you swing a grappling hook on a very long rope over your head, and when it is released it latches onto the sky with good purchase, and allows you to hoist yourself up through Design Space, or allows a magical essence to pull you up, with no effort at all on your part. Skyhooks are top-down design devices useful to mysterious entities, possibly capable of teleological design intentions. Cranes, on the other hand, are purely mechanical devices that allow for the heavy lifting in Design Space from the bottom up through the efforts of mindless mechanical processes. Dennett maintains that Darwin’s theory of evolution by means of natural selection is a superb crane with no need of skyhooks to explain all the Design found in the biosphere and also in the meme-complexes of the memosphere.
Softwarephysics would also include the Design found in the software of the Software Universe. This then is Darwin’s dangerous idea, the idea that incredibly sophisticated Design, such as the superb design found in the biosphere, the design found in conscious intelligence or Mind, and all Design, in general, can be explained in terms of cranes operating upon essentially dead atoms by means of natural selection, with no need of skyhooks at all. In SoftwareBiology we saw that this dangerous idea certainly applies to the origin and evolution of software as well.
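
To get a rough feel for just how Vast even a small subset of Design Space can be, here is a minimal Python sketch that counts the two state spaces mentioned above. It assumes a standard 52-card deck with 5-card hands and a 128-symbol, 7-bit ASCII alphabet; those parameters and the function names are illustrative assumptions, not figures taken from The Demon of Software.

```python
import math


def count_poker_hands(deck_size=52, hand_size=5):
    """Number of distinct unordered hands that can be dealt from the deck."""
    return math.comb(deck_size, hand_size)


def decimal_digits_in_program_space(program_length=30_000, alphabet_size=128):
    """Number of decimal digits in alphabet_size ** program_length.

    Logarithms are used so that the enormous integer never has to be printed.
    """
    return math.floor(program_length * math.log10(alphabet_size)) + 1


if __name__ == "__main__":
    print("Distinct 5-card poker hands:", count_poker_hands())
    print("Decimal digits in the count of 30,000-character programs:",
          decimal_digits_in_program_space())
```

Under these assumptions the poker hands number in the millions, while the count of possible programs is itself a number tens of thousands of digits long, and nearly all of those programs are gibberish.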

Natural Selection as a Universal Acid
Dennett points out that all of the controversies surrounding Darwin’s dangerous idea arise from the human desire to hold onto the skyhooks of the past, in either a conscious or subconscious manner, due to their very appealing nature, and this reluctance to part with the past is found even among some members of the scientific community. The incredible Design found in the biosphere and the baffling nature of intelligent consciousness itself just seem to beg for a magical skyhook explanation. Prior to Darwin, philosophers, scientists and theologians could all agree that Mind came first and Matter second because there was no known mechanism that could bring forth Mind from dead Matter, so the answer to it all had to come from skyhooks. But Darwin provided the mechanism – Design could spontaneously arise from non-Design as Matter adapted to its environment by means of natural selection operating upon naturally occurring innovations. This is a powerful concept and to Dennett’s mind a “universal acid” that can eat through and transform just about all other ideas known to man.

Natural Selection as an Algorithm
Dennett maintains that Darwin’s theory of evolution by means of natural selection operating upon naturally occurring innovations is an algorithm. Simply take a population of any form of Self-Replicating Information - genes, memes or software - with some degree of naturally occurring variation and the ability to pass on those variations to their progeny. Then allow the genes, memes, or software to compete amongst themselves for resources, with “survival of the fittest” as the operating rule, and just watch Design appear from nothingness. The genes compete for energy and chemical feedstock, the memes compete for space in human minds, and software competes for disk space, memory addresses and CPU cycles. All forms of self-replicating information try to replicate with perfect fidelity, but thanks to the second law of thermodynamics operating in a nonlinear Universe, this is not really possible. There is always some loss of fidelity in the copying process, so variations caused by mutation are also passed along to descendants when self-replicating information replicates. We all eagerly learned in adolescence about the methods used by genes to pass on these variations to their progeny. Memes pass on variations by means of the written and spoken word. Software also passes on variations because programmers never write code from scratch. We always grab the closest chunk of code that we can find in our personal library of stock code or steal some code from a coworker or off the Internet. We then use the Darwinian processes of innovation and natural selection to exapt a wood chisel from a screwdriver spandrel in a very painstaking manner, essentially one character of source code at a time.
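
The algorithm just described can be sketched in a few lines of Python. This is only a toy, in the spirit of the "weasel" program from Dawkins' The Blind Watchmaker, and it rests on loud assumptions: the fitness function simply counts characters that match a fixed target string, which real natural selection does not have, and the population size, mutation rate, survivor fraction and all function names are hypothetical choices made for illustration.

```python
import random

# Hypothetical target: the example line of code used later in this post.
TARGET = "discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;"
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ =()*-;"


def fitness(candidate):
    """Count the characters that match the target (an artificial stand-in for survival)."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def replicate(parent, rng, mutation_rate=0.02):
    """Copy the parent one character at a time, with occasional copying errors."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < mutation_rate else ch
        for ch in parent
    )


def evolve(rng, population_size=100, max_generations=5000):
    """Return (generations_used, fittest_string) once selection has run."""
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET)
        for _ in range(population_size)
    ]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if population[0] == TARGET:
            return generation, population[0]
        # Survival of the fittest: only the top fifth get to replicate.
        survivors = population[: population_size // 5]
        offspring = [
            replicate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + offspring
    population.sort(key=fitness, reverse=True)
    return max_generations, population[0]


if __name__ == "__main__":
    generations, best = evolve(random.Random(2010))
    print(generations, best)
```

Because the survivors are carried over unchanged, the best fitness in the population can never decrease from one generation to the next; the only sources of novelty are the copying errors in replicate().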

Remember, the operating model of softwarephysics, as outlined in Quantum Software and SoftwareChemistry, is that software source code is composed of atoms of ASCII characters in fixed quantum states. These atomic characters of ASCII source code then combine to form variables that are the equivalent of organic molecules. A line of code is composed of these variables or organic molecules, along with some operators that define a softwarechemical reaction that eventually produces a macroscopic software effect. For example, in the line of code:

discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;

each character or ASCII atom is defined by 8 quantized bits, with each bit in one of two quantum states “1” or “0”, which can also be characterized as ↑ or ↓.

Here are some typical ASCII atoms found in the above reaction:

C = 01000011 = ↓ ↑ ↓ ↓ ↓ ↓ ↑ ↑
H = 01001000 = ↓ ↑ ↓ ↓ ↑ ↓ ↓ ↓
N = 01001110 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↓
O = 01001111 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↑

The 8 quantized bits for each atomic ASCII character are the equivalent of the spins of 8 electrons in 8 electron shells that may be either in a spin up ↑ or spin down ↓ state. Thus the chemical characteristics of each atomic ASCII character are determined by this arrangement of the spin up ↑ or spin down ↓ states of the electron bits in the atomic ASCII character. The atomic ASCII characters in each variable come together to form an organic molecule, in which the spins of all the associated characters form molecular orbitals for the variable, giving the variable or simulated organic molecule its ultimate softwarechemical characteristics. As a programmer, your job is to simply assemble these atomic ASCII characters into molecular variables that interact in lines of code to perform the desired functions of the software under development.
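
The table of ASCII atoms above can be reproduced with a short Python sketch. The function name spin_states is a hypothetical helper introduced here for illustration; it pads each character code to 8 bits and maps 1 to ↑ and 0 to ↓.

```python
def spin_states(ch):
    """Return the 8-bit pattern of an ASCII character and its spin-arrow form."""
    bits = format(ord(ch), "08b")  # zero-padded 8-bit binary string
    arrows = " ".join("↑" if bit == "1" else "↓" for bit in bits)
    return bits, arrows


if __name__ == "__main__":
    for atom in "CHNO":
        bits, arrows = spin_states(atom)
        print(f"{atom} = {bits} = {arrows}")
```

Standard ASCII only defines 7-bit codes, so for the characters in the example line of code the leading bit is always ↓.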

In procedural languages like C, these lines of code describing softwarechemical reactions of organic molecules are sequenced into functions(), and in object-oriented languages like C++ and Java, they are sequenced into methods(). Thus the lines of code in functions() and methods() can be considered to be the equivalent of the steps found in the biochemical pathways of living cells. In object-oriented languages like C++ and Java, the methods() are further encapsulated into instances of objects (cells) defined by a Class. These object-cells of an Application interact by sending chemical messages that bind to the public methods() of other object-cells within the Application, causing these targeted object-cells to execute public methods() that change their internal states. Amazingly, this is essentially the same process that multicellular organisms use for intercellular communications, using ligand molecules secreted from one type of cell to bind to the membrane receptors on other types of cells. For more on this see SoftwareBiology and A Proposal For All Practicing Paleontologists.
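
The object-cell picture can be made concrete with a small sketch. It is written in Python rather than C++ or Java only to keep it short, and the class names, the receive_ligand() method and the _hormone_level attribute are all hypothetical names invented for this illustration.

```python
class ObjectCell:
    """A hypothetical object-cell: private internal state behind a public method."""

    def __init__(self, name):
        self.name = name
        self._hormone_level = 0  # internal state, not touched directly by other cells

    def receive_ligand(self, amount):
        """Public method playing the role of a membrane receptor."""
        self._hormone_level += amount
        return self._hormone_level


class SecretingCell(ObjectCell):
    """An object-cell that can also send messages to other object-cells."""

    def secrete(self, target, amount):
        # The "ligand" is just a method call that binds to the target's public interface.
        return target.receive_ligand(amount)


if __name__ == "__main__":
    gland = SecretingCell("gland")
    muscle = ObjectCell("muscle")
    gland.secrete(muscle, 3)
    gland.secrete(muscle, 2)
    print(muscle.name, "internal state:", muscle._hormone_level)
```

The secreting cell never reaches into the target's internal state; it can only change that state by way of the public method, just as a ligand acts only through a receptor.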

To write or maintain software, programmers follow Darwin’s algorithm of innovation honed by natural selection:

steal old code → modify code → test → fix code → test → fix code → test → fix code → test ...

As in biochemistry, just one misplaced atomic ASCII character in a softwarechemical reaction, as in the line of code above, can have lethal consequences for an Application. That is why there are so many fix code → test → fix code → test steps in the algorithm for writing and maintaining software. As we saw in Entropy - the Bane of Programmers and The Demon of Software, thanks to the second law of thermodynamics there are a Vast number of ways to assemble atomic ASCII characters incorrectly, and only a very few ways to assemble them properly into code that actually works. In the fix code → test → fix code → test iterations, programmers introduce small software innovations to solve the problem at hand and then test them until they finally get them to work. These small innovations can then be passed on to their next software development project or on to one of their coworkers’ projects. And so it has been for the past 70 years. Software has slowly evolved through small incremental changes introduced by millions of independently acting programmers. But just to break the monotony, every few hundred million seconds or so, a revolutionary IT concept comes along, what Dennett calls a Good Trick, like structured programming or object-oriented programming, that then rapidly propagates throughout the entire IT community in a flash of punctuated equilibrium, and which becomes the new operational paradigm for all future software architecture.
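
To see how unforgiving a single misplaced character can be, here is a minimal Python sketch that tries every single-character substitution in the example line of code and counts how many mutants still pass one test. The input values, the expected result, the candidate alphabet and the function names are all assumptions chosen for illustration, and the line is run as Python only because it happens to be valid Python as written.

```python
import string

LINE = "discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;"
ALPHABET = string.ascii_letters + string.digits + " =()*-+;"

# Hypothetical test case: 40 hours at 25 per hour, less an offset of 100.
INPUTS = {"totalHours": 40, "ratePerHour": 25, "costOfNormalOffset": 100}
EXPECTED = 900


def passes_test(source):
    """Run one line of source against the test inputs and check the result."""
    namespace = dict(INPUTS)
    try:
        exec(source, {"__builtins__": {}}, namespace)
    except Exception:
        return False  # syntax errors and runtime errors are both lethal
    return namespace.get("discountedTotalCost") == EXPECTED


def survey_point_mutations(source, alphabet):
    """Return (survivors, total) over all single-character substitutions."""
    survivors = total = 0
    for i, original in enumerate(source):
        for ch in alphabet:
            if ch == original:
                continue
            mutant = source[:i] + ch + source[i + 1:]
            total += 1
            survivors += passes_test(mutant)
    return survivors, total


if __name__ == "__main__":
    survivors, total = survey_point_mutations(LINE, ALPHABET)
    print(f"{survivors} of {total} single-character mutants still pass the test")
```

A mutant counts as a survivor only if it both runs without error and produces the expected value, which is the same standard a programmer applies in each fix code → test iteration.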

Dennett does a very good job of describing this Darwinian coding algorithm that all programmers must follow as they slowly grope their way through Design Space. In the quote below from Darwin’s Dangerous Idea, instead of imagining Bach sitting down and pushing the keys on a piano keyboard, simply think of a programmer pushing the buttons on a laptop keyboard.

We correctly intuit a kinship between the finest productions of art and science and the glories of the biosphere. William Paley was right about one thing: our need to explain how it can be that the universe contains many wonderful designed things. Darwin’s dangerous idea is that they all exist as fruits of a single tree, the Tree of Life, and the processes that have produced each and every one of them are, at bottom, the same. The genius exhibited by Mother Nature can be disassembled into many acts of micro-genius – myopic or blind, purposeless but capable of the most minimal sort of recognition of a good (a better) thing. The genius of Bach can likewise be disassembled into many acts of micro-genius, tiny mechanical transitions between brain states, generating and testing, discarding and revising, and testing again. Then, is Bach’s brain like the proverbial monkeys at the typewriters? No, because instead of generating a Vast number of alternatives, Bach’s brain generated only a Vanishingly small subset of all the possibilities. His genius can be measured, if you want to measure genius, in the excellence of his particular subset of generated candidates. How did he come to be able to speed so efficiently through Design Space, never even considering the Vast neighboring regions of hopeless designs? (If you want to explore that territory, just sit down at a piano and try, for half an hour, to compose a good new melody.) His brain was exquisitely designed as a heuristic program for composing music, and the credit for that design must be shared; he was lucky in his genes (he did come from a famously musical family), and he was lucky to be born in a cultural milieu that filled his brain with the existing musical memes of the time. And no doubt he was lucky at many other moments of his life to be the beneficiary of one serendipitous convergence or another. 
Out of all this massive contingency came a unique cruise vehicle for exploring a portion of Design Space that no other vehicle could explore. No matter how many centuries or millennia of musical exploration lie ahead of us, we will never succeed in laying down tracks that make much of a mark in the Vast reaches of Design Space. Bach is precious not because he had within his brain a magic pearl of genius-stuff, a skyhook, but because he was, or contained, an utterly idiosyncratic structure of cranes, made of cranes, made of cranes, made of cranes.

So from an IT perspective, Darwin’s dangerous idea certainly is an algorithm. What else could it be? But for many, especially the late Stephen Jay Gould, the idea that humans came from a mindless algorithm relentlessly operating over and over upon essentially dead atoms for billions of years is just too much to bear. In Gould’s writings, one gets the sense that he was looking for something more subtle and mysterious than a simple algorithm in his concepts of punctuated equilibrium, exaptations and spandrels, and the limitations imposed by historical constraints, but Dennett suspects that it was just a subconscious quest for skyhooks.

A Healthy Scientific Debate
It was very interesting to see Gould and Dennett debate such issues in print in a rational and civil manner. Those who find Darwin’s dangerous idea to be truly dangerous might point to such debate and claim that it demonstrates that Darwin’s theory of evolution by means of natural selection is such a controversial theory that even the experts cannot agree upon its details. But this is certainly not the case. As I pointed out in How To Think Like A Scientist, the scientific method requires a healthy amount of skepticism and self-examination and that is what we see in this case. Dennett further points out that, contrary to this perception of apparent controversy, none of Gould’s “revolutionary” findings really minimizes the role of natural selection in the least. For example, the concept of punctuated equilibrium does not really do any damage to natural selection; it just helps to explain how natural selection works. It is hard for natural selection to work upon the gene pool of a species with a large population that is in equilibrium with its prey, predators, and environment because favorable genetic mutations tend to get washed out in large populations when things are running along smoothly. But when a small isolated population is confronted with new environmental challenges, natural selection can switch into high gear and rapidly spawn a new species from the distressed population in several thousand years because favorable mutations can then quickly take hold as adaptations to the new environmental challenges. This may happen in a geological blink of the eye, but it is still accomplished through small incremental genetic changes honed by natural selection one generation at a time. Dennett also shows that the exaptation of biological spandrels is no threat to natural selection either. In When Toasters Fly, I described how screwdrivers could evolve into wood chisels as the wood-chisel-like functions of a screwdriver were exapted into wood chisel uses.
Dennett points out that all exaptations are really just previous adaptations put to other uses. For adaptationists, nearly all of the traits of a living thing are adaptations with a cost-benefit ratio. After all, every trait, no matter how inconsequential, must have a cost-benefit ratio - there is some cost in growing the feature and some derived benefit no matter how small, so all traits must ultimately be subject to some level of selection pressure. But for Gould a spandrel is a totally useless trait that arises as a mere byproduct of providing for another useful trait that really is under competitive pressures from natural selection. On this basis, Gould goes on to explain that since spandrels arise in a more or less random manner, without the benefit of selection pressures, and then go on to provide the feedstock for nearly all other future traits to be exapted into use, the course of evolution must therefore be truly unpredictable, and if you really could rewind the tape of life, you would see a different biosphere arise each time. But Dennett points out that there really is no such thing as a truly useless cost-free spandrel. To highlight this, Dennett turns to Gould’s classic paper The Spandrels of San Marco and the Panglossian Paradigm (1979). Dennett points out that the spandrels of San Marco were really not useless byproducts of holding up the dome of the San Marco cathedral with arches after all. In Darwin’s Dangerous Idea Dennett argues that, although the spandrels may have had no load-bearing function at all, they were in fact designed to provide a platform for the aesthetic enhancement of the mosaic imagery found on the cathedral dome, and thus were designed to hold additional mosaic images in a similarly artistic manner. The mosaics on the graceful curves of the spandrels certainly complement the dome.
To reinforce this idea, Dennett provides illustrations of some possible spandrel alternatives that are truly ugly, but which could fit between the arches just as well in a very unaesthetic manner.

Figure 1 - Spandrels are really aesthetic adaptations after all

So the spandrels of San Marco were indeed adapted to provide an aesthetic platform for artwork from the start and had a cost-benefit ratio all along upon which natural selection could operate.

Computers Show the Way
So why is there this apparent disagreement between the adaptationists and the followers of Stephen Jay Gould concerning the role of natural selection in evolution? I think Dennett hit upon the answer. The adaptationists like Richard Dawkins and Daniel Dennett were early adopters of computers, while Stephen Jay Gould was never very comfortable with them, and as of 1995, had never even used computers for word processing. As an IT professional watching these evolutionary processes operate upon software all day long, day in and day out, and in real time, I find the adaptationist viewpoint to be quite self-evident. For example, in When Toasters Fly, I showed how the World Wide Web was an exaptation of three technological spandrels – the Internet, Tim Berners-Lee’s webserver and browser software, and the development of the Java programming language. But each of these three spandrels was actually an adaptation of its own, initially serving other purposes, that evolved through small incremental changes over several decades prior to the emergence of the World Wide Web. Only in 1995 did all three come together in an IT blink of the eye and in a flash of punctuated equilibrium.

This is why I strongly recommended, in A Proposal For All Practicing Paleontologists, that all paleontologists and evolutionary biologists take a lengthy sabbatical to shadow some IT professionals at a major corporation. Taking an adaptationist point of view in Self-Replicating Information and Software Symbiogenesis, we saw that there are three forms of self-replicating information on the planet – genes, memes, and software. The genes teamed up into Dawkins’ DNA survival machines for their own mutual survival billions of years ago, and the memes did the same by coming together to form meme-complexes about 200,000 years ago. In a similar manner, software persists through time in Application survival machines and also in operating systems and other forms of system software that allow the Applications to run. In keeping with the work of Lynn Margulis, the memes entered into a parasitic/symbiotic relationship with the genes about 200,000 years ago by exapting the complex neural networks of Homo sapiens to form meme-complex survival machines to propagate themselves. Similarly, software arose in May of 1941 on Konrad Zuse’s Z3 computer and quickly formed parasitic/symbiotic relationships with nearly every meme-complex on the planet, and is now rapidly becoming the dominant form of self-replicating information on Earth. As IT professionals, writing and supporting software, and as end-users, installing and using software, we are all essentially software enzymes caught up in a frantic interplay of self-replicating information. Just as the meme-complexes domesticated our minds long ago, software is currently domesticating our minds and the meme-complexes they hold, to churn out ever more software, and this will likely continue at an ever-increasing pace, until one day, when software finally breaks free and begins to generate itself.

Until then, the best way to get a good grasp of the forces driving evolution is to spend some quality time in the IT department of a major corporation and experience the daily mayhem of life in IT first hand. We IT professionals have a marvelous purview of the whole thing in motion, from the very smallest atom of software up through the entire cybersphere of all the Applications running on the 10 trillion currently active microprocessors that make up the Software Universe. And because 1 software sec ~ 1 year of geological time, we can actually see evolution unfold before our very eyes. For example, I started programming in 1972, so that makes me about 1.23 billion years old, and I personally have seen the simple prokaryotic software of the Unstructured Period (1941 – 1972), with little internal structure, evolve into the structured eukaryotic software of the Structured Period (1972 – 1992) with lots of internal structure divided up amongst a collection of organelles in the form of functions(). The Object-Oriented Period (1992 – Present) saw the rise of multicellular organization with Applications composed of large numbers of object-cells interacting with each other by sending messages to exposed public methods(). Finally, in the SOA - Service Oriented Architecture Period (2004 – Present) we find ourselves in the midst of another Cambrian Explosion. We now have large macroscopic Applications composed of millions of object-cells that make service calls upon other object-cells in J2EE Appservers, which perform the functions of organs in multicellular organisms. Similarly, a large number of Design Patterns, the phyla of modern IT design, have rapidly appeared in this Cambrian Explosion, most notably the Model-View-Controller (MVC) design pattern used by most web applications. More on this can be found in the SoftwarePaleontology section of SoftwareBiology.
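
The age estimate above is simple arithmetic, and a short Python sketch makes the conversion explicit. The start date of January 1, 1972 is an assumption, since only the year is given, and the function name software_years is a hypothetical helper; the end date is the date of this posting.

```python
from datetime import date

SECONDS_PER_DAY = 24 * 60 * 60


def software_years(start, end):
    """Elapsed seconds between two dates, read as years under 1 software sec ~ 1 year."""
    return (end - start).days * SECONDS_PER_DAY


if __name__ == "__main__":
    started_programming = date(1972, 1, 1)  # assumed; the post gives only the year
    posting_date = date(2010, 11, 6)
    age = software_years(started_programming, posting_date)
    print(f"{age:,} software years (about {age / 1e9:.2f} billion)")
```

Under that assumed start date the result comes to roughly 1.23 billion, in line with the figure quoted above.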

I find the fact that the evolution of software architecture over the past 70 years followed the same path through Design Space as did life on Earth to be a very strong vindication of the concept of convergence and of the Adaptationist Program. Life and software faced different obstacles, but the fact that both followed the same trajectory through Design Space demonstrates to me the overwhelming power of natural selection to overcome all obstacles. As Dennett points out, there are only a certain number of Good Tricks that make practical sense, such as using photons to see with, flying through the air to find prey, swimming through water to avoid becoming prey, and running on four legs neatly tucked underneath a body frame, and these Good Tricks kept getting rediscovered over and over again in the evolution of the biosphere, so the fact that the IT community would mindlessly stumble upon these same Good Tricks in Design Space is almost predictable. I just wish it had not taken so long. After all, we could have done all this back in the 1960s if we had only exapted Design from the biosphere from the start!

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston
