Monday, April 25, 2011

Programming Clay

The title of this posting is a pun because this is not going to be a posting on the new C-like programming language called Clay, but rather a posting on the origin of life on Earth, and how the recent muddled origin of software and computing hardware might be of help in pointing the way. But before proceeding I must take a moment to commemorate the 70th anniversary of the onset of the Software Universe, which popped into existence some 70 years ago this month in May of 1941 on Konrad Zuse’s Z3 computer (see So You Want To Be A Computer Scientist? for details). So the Software Universe is now a whopping 2.2 billion seconds old! Despite the timestamp on this posting, today’s date is actually May 7, 2011 and not April 25, 2011. Remember, in order to get the Introduction to Softwarephysics listed as the first post in the context root of I have to perform a few IT tricks. The side effect of all these tricks, as I explained in the Introduction to Softwarephysics, is that the real posting date of posts is the date that appears on the post that you get when clicking on the Newer Post link at the bottom left of each posting. There is an important lesson here. Things are not always as they seem in the world of software, or any other form of self-replicating information for that matter, so you should not always take things purely at face value.
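The “2.2 billion seconds” figure above is easy to check with a few lines of arithmetic. The sketch below is only a rough check, and it assumes an average year of 365.25 days; the names used are purely illustrative.

```python
# Rough check of the age of the Software Universe in seconds.
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: average length of a year, leap days included


def age_in_seconds(years: float) -> float:
    """Convert an age in years to seconds."""
    return years * DAYS_PER_YEAR * SECONDS_PER_DAY


software_universe_age = age_in_seconds(70)
print(f"{software_universe_age / 1e9:.2f} billion seconds")  # prints "2.21 billion seconds"
```

So 70 years does indeed come out to a little over 2.2 billion seconds.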

For example, suppose you were a fourteen-year-old high school freshman learning to program Java for the very first time with no historical sense of computing whatsoever. Having been immersed in the Software Universe for your entire life, you would probably tend to think that all of this sophisticated software and hardware that you had grown up with had simply always existed as it does today. You would probably not even stop to think about where it all came from as you proceeded to learn how to program the very sophisticated object-oriented language known as Java. Java comes with a huge class library of pre-built reusable code, graciously passed down from antiquity, that can be called to perform just about any low-level programming function that you might need, with very little effort on your part as a novice programmer, and yields executables that can run on just about any operating system on the planet. You might also be working on a $500 PC with several GB of memory, a dual-core processor running with a clock speed of several GHz, and more than 1 TB of disk - a machine that is several billion times faster, with several billion times more memory, and nearly a trillion times more peripheral storage than Konrad Zuse’s Z3 computer running on 2400 electromechanical telephone relays and punched tape.

Now if you did begin to wonder where all of this very sophisticated software and hardware had come from, you would immediately be confronted with a series of “chicken or the egg” paradoxes. Firstly, you would learn that Java cannot run all by itself directly on a computer. You cannot simply load a compiled Java .class file into the memory of a computer and expect it to do anything at all. Java needs to run inside of a Java Virtual Machine, which is a simulated software computer that runs on top of the physical computer itself. Obviously, the very first Java Virtual Machine could not have been written in Java, since Java can only run inside of a Java Virtual Machine in the first place, so there must be some “other” programming language that came before Java and that can be used to write software that executes directly upon computers without the need of a Virtual Machine, but what could it be? If you are lucky, perhaps there are some computer science books in your high school library, and with a little careful detective work on your part you discover a book on the C++ language, which looks a lot like Java, but with many bug-inducing complications, like operator overloading, multiple inheritance, and pointers, which Java later discarded as a simplifying measure to improve reliability. Similarly, you might also discover in a textbook that C++ came from the C language, which also looks a lot like C++ and Java, but which is 100% procedural and has no object-oriented classes at all. But you would find that C does have some object-like precursors in the form of structs and functions, which, when combined with pointers, can be used to simulate objects to some extent.
So with some investigation you would find an evolutionary history for the Java language if you had access to a library of computer science books, but what would you do if all the computer science books had been destroyed, leaving not a trace, and your PC did not have a C or C++ compiler either? Would you vainly search for simpler and simpler versions of Java on your PC, with the hope of ultimately finding one that could run directly on your computer without the need of a Java Virtual Machine?

Similarly, for the hardware that your newly coded Java programs run on, the problem would be even worse, as you would quickly figure out that you need a computer in order to design a computer! There is no way you could possibly design the silicon chips used for CPUs, RAM, and flash memory without the aid of a computer, and you would also need sophisticated process control computers to run the high-tech equipment that makes the chips. Yes, there are plenty of silicon atoms to go around, since 27.7% of the Earth’s crust by weight is composed of silicon, but how would you go about taking sand, which is mostly silicon dioxide, and turning it into computer chips without the aid of computers? Following my lead in SoftwareBiology, you might head out to your local city landfill and begin excavating. As you dig down through the rubble, you find older and older PCs, which, surprisingly, can still be made operational with some work. As you dig down through the deposited layers of garbage, the PCs you come across contain simpler silicon chips, with fewer and fewer transistors, until finally, in the layers from the early 1980s, the trail of PCs grows cold, and no more are to be found at lower depths. However, you do continue to find the huge refrigerator-like boxes of discarded mainframes that were deposited in the 1970s, and they contain even simpler silicon chips. As you dig deeper still, these refrigerator-like boxes stop using silicon chips altogether, and instead are stuffed with discrete silicon transistors on circuit boards. Finally, in garbage layers from the 1950s you find discarded mainframes stuffed with row upon row of vacuum tubes, things you have never even seen before, but which look a lot like the archaic incandescent light bulbs that your parents grew up with, and which are rapidly being replaced by CFL and LED light bulbs.
The deeper you dig the fewer mainframes you run across, until by the late 1940s, none are found at all in your city landfill, and I am quite confident that you would never run across an old Z3 computer stuffed with electromechanical relays!

So for both software and hardware, simply taking today’s exceedingly complex high-tech architecture and following it backwards leads you to a series of dead-ends because software did not begin as a simplified version of Java and hardware did not begin running on very simple silicon chips. Both software and hardware began by using fundamentally different technologies than today’s, and underwent several transitions to get us to where we are today.

There seems to be a similar “chicken or the egg” problem for the researchers currently working on the origin of another form of self-replicating information – the genes. The current high-tech biochemistry used today by the biosphere requires enzymes to replicate DNA and to copy mRNA from DNA in order to make enzymes. So currently, living things need DNA to make enzymes and they need enzymes to make DNA, so which came first, and how could one have come first if both are needed to make the other? This point is key to unraveling the secret to the origin of life. As discussed in Self-Replicating Information, currently there are several competing lines of thought that are all vying to become the accepted theory for the origin of life on Earth. In that posting I discussed the three major theoretical efforts highlighted by Freeman Dyson in the Origins of Life (1999):

1. Metabolism came first - the theory first proposed by Alexander Oparin in The Origin of Life (1924).

2. RNA came first - the “RNA world” theory proposed by Manfred Eigen in 1981.

3. Something else came first - such as Alexander Graham Cairns-Smith’s theory, first proposed in 1966, that there was a clay microcrystal precursor to both RNA and metabolism.

Freeman Dyson liked Cairns-Smith’s idea of a two-stage theory for the origin of life because it eliminated many of the deficiencies found in the “RNA world” theory, such as the “error catastrophe” problem – the fact that the self-replicating processes of the “RNA world” would need to be both very accurate and very simple at the same time. But Freeman Dyson went on to develop his own two-stage theory for the origin of life using Oparin’s purely metabolic form of life as the initial stage, followed by the parasitic/symbiotic appearance of RNA, which first appears as a disease preying upon the initial metabolic pathways, and later forms a symbiotic relationship with them in the tradition of Lynn Margulis.

I just finished reading Cairns-Smith’s Seven Clues to the Origin of Life: A Scientific Detective Story (1985), and I think his idea that Gene 1.0 ran on clay microcrystals instead of nucleic acids needs to be revisited. There is also an excellent website by Tim Tyler on this subject that I would highly recommend at:

The Origin of Life

One reason that I find Cairns-Smith’s idea that Gene 1.0 ran on clay microcrystals appealing is that, as we have learned from the way that software bootstrapped itself into existence, self-replicating information is notoriously opportunistic and will use just about any available technology to self-replicate, and it really does not care about the underlying technology upon which it runs at any given time. Stephen Jay Gould coined the term “exaptation” for this tendency of living things to opportunistically take advantage of pre-existing structures and processes for other purposes. Darwin called such a thing a preadaptation, but Gould did not like this terminology because it had a teleological sense to it, as if a species could consciously make preparations in advance for a future need. The term exaptation avoids such confusion. All forms of self-replicating information take advantage, through this process of exaptation, of pre-existing technologies that evolved for one purpose but are later put to work to solve a completely different problem. As I described in Self-Replicating Information, what happens is that organisms develop a primitive function for one purpose, through small incremental changes, and then discover, through serendipity, that this new function can also be used for something completely different. This new use will then further evolve via innovation and natural selection. For example, we have all upon occasion used a screwdriver as a wood chisel in a pinch. Sure, the screwdriver was meant to turn screws, but it does a much better job at chipping out wood than your fingernails, so in a pinch it will do quite nicely. Now just imagine the Darwinian processes of innovation and natural selection at work, selecting for screwdrivers with broader and sharper blades and a butt more suitable for the blows from a hammer, and soon you will find yourself with a good wood chisel.
At some distant point in the future, screwdrivers might even disappear for the want of screws, leaving all to wonder how the superbly adapted wood chisels came to be.
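The screwdriver-to-chisel story above boils down to a loop of small random changes, each of which is kept only if it tests no worse than what came before. The sketch below is only a toy illustration of that loop and not a model of real evolution or of how anybody actually writes code: the strings, the alphabet, the fitness test, and the function names are all assumptions made for the example, and for simplicity the start and target strings must have the same length.

```python
import random


def fitness(candidate: str, target: str) -> int:
    """Count the positions where the candidate already matches the target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(candidate: str, alphabet: str, rng: random.Random) -> str:
    """Make one small incremental change: replace a single random character."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(alphabet) + candidate[i + 1:]


def evolve(start: str, target: str, alphabet: str, seed: int = 0,
           max_steps: int = 100_000):
    """Innovation plus selection: keep a change only if it tests no worse."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    current, steps = start, 0
    while current != target and steps < max_steps:
        trial = mutate(current, alphabet, rng)
        if fitness(trial, target) >= fitness(current, target):
            current = trial
        steps += 1
    return current, steps


ALPHABET = "abcdefghijklmnopqrstuvwxyz "
result, steps = evolve("screwdriver", "wood chisel", ALPHABET)
print(result, "after", steps, "tested changes")
```

Of course, real natural selection has no fixed target in mind. The target string here simply stands in for whatever test the environment, or your test suite, happens to impose at the moment.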

As an IT professional, you probably do this all the time. How often do you write code from scratch? I know that I never do. I simply find the closest piece of existing code that I have on hand and then turn the screwdriver into a wood chisel through small incremental changes to the code, testing each small change to see how closely my screwdriver has evolved towards becoming a wood chisel. And I think that most of us code using this Darwinian process of innovation and natural selection too. I am a rather lazy programmer, so many times, rather than thinking through a new chunk of code during the iterative process of coding and testing, I will simply make an “educated guess” at the new code to be introduced. After 40 years of coding, you begin to code by “ear”. Many times, I can fall upon the correct code after a few shots of directed random change, and that sure beats racking your brain over new code. Surprisingly, sometimes I even come up with “better” code through this Darwinian process than if I had sat down and carefully thought it all through. So the basic idea of grabbing some old code or architectural design elements from a couple of older Applications and slowly modifying them through an iterative process of innovation and natural selection into a new Application is no stranger to IT. As Simon Conway Morris commented in Life’s Solution (2003):

“How much of a complex organism, say a humanoid, has evolved at a much earlier stage, especially in terms of molecular architecture? In other words, how much of us is inherent in a single-celled eukaryote, or even a bacterium? Conversely, we are patently more than microbes, so how many genuinely evolutionary novelties can we identify that make us what we are? It has long been recognized that evolution is a past master at co-option and jury-rigging: redeploying existing structures and cobbling them together in sometimes quite surprising ways. Indeed, in many ways that is evolution”. When I first read those words, I accidentally misread the quote as “Indeed, in many ways that is IT”.

Similarly, on the hardware side, in May of 1941 Konrad Zuse exapted 2400 electromechanical telephone relays into creating his Z3 computer. The telephone relays were not originally designed for this purpose. Instead, they were designed to be reliable electrical switches that could be used to make or break circuit connections so that people could speak with each other over the telephone. But Zuse did not use these relay connections for communications. Instead, he used the fact that a relay could be either in an open or closed state to store a binary digit of “1” or “0”. You can read about his adventures in computing in his own words at:

Konrad Zuse
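The way that a bank of two-state relays can hold a number, as described above, can be sketched in a few lines. This is only an illustration and not a model of the actual Z3 circuitry: the convention that a closed relay means “1”, the bank width of 11 relays, and the function names are all assumptions made for the example.

```python
from typing import List


def store(value: int, n_relays: int) -> List[bool]:
    """Set a bank of relays (True = closed, False = open) to hold a number in binary."""
    if not 0 <= value < 2 ** n_relays:
        raise ValueError("value does not fit in the relay bank")
    # Most significant bit first.
    return [bool((value >> i) & 1) for i in reversed(range(n_relays))]


def read(relays: List[bool]) -> int:
    """Read the number back from the open/closed states of the relays."""
    result = 0
    for closed in relays:
        result = (result << 1) | int(closed)
    return result


bank = store(1941, 11)
pattern = "".join("1" if closed else "0" for closed in bank)
print(pattern)     # prints "11110010101"
print(read(bank))  # prints 1941
```

Nothing in the sketch depends on the switches being relays. Any device with two stable and distinguishable states would do just as well, which is the whole point of exaptation.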

Konrad Zuse did not use vacuum tubes as switches to store “1s” and “0s” for the Z3 because he thought that it would be impossible to keep thousands of vacuum tubes up and running long enough to complete computations, but in the 1940s and 1950s vacuum tubes were indeed used by nearly all the computers of the age. However, vacuum tubes were not invented to run computers either. Vacuum tubes were originally invented to amplify analog radio signals and were exapted into service by the nascent computing industry. Vacuum tubes have a grid between a hot negative cathode filament and a cold positive anode plate. By varying the voltage on the grid you can control the amount of current between the cathode and the anode. So a vacuum tube acts very much like a faucet. In fact, the English call them “valves”. By rotating the faucet handle back and forth a little, like a weak input voltage to the grid, you can make the faucet flow vary by large amounts, from a bare trickle to full blast, and thereby amplify the input signal. That is how a weak analog radio signal can be amplified by a number of vacuum tube stages into a current large enough to drive a speaker. Just as you can turn a faucet on full blast or completely off, you can do the same thing with vacuum tubes, so that they behave like telephone relays and can be in a conducting or nonconducting state to store a binary “1” or “0”. Similarly, when computer hardware migrated to discrete transistors and finally to integrated circuit chips, the computing industry once again exapted these devices from the consumer electronics industry. Much like a vacuum tube with its grid, cathode, and anode, a transistor has a gate, a source, and a drain, and a varying voltage at the gate can control the current between the source and drain. Discrete transistors and integrated circuits were not primarily designed for computers, but for other electronic devices, like transistor radios and stereo systems, that had a much larger market.
So the computing industry has opportunistically taken advantage of the available information storage and processing hardware ever since its inception.
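The faucet analogy above, in which the same device serves as either an amplifier or a switch, can be sketched with a deliberately crude model. The piecewise-linear response, the gain of 20, the arbitrary units, and the function names below are all assumptions made for illustration and are not the characteristics of any real tube or transistor.

```python
def valve_current(grid_voltage: float, gain: float = 20.0,
                  max_current: float = 1.0) -> float:
    """Idealized valve: output follows the input times a gain, clipped
    between fully off (0.0) and full blast (max_current)."""
    return min(max(gain * grid_voltage, 0.0), max_current)


def as_bit(grid_voltage: float) -> int:
    """Read the same device as a switch: mostly on is 1, mostly off is 0."""
    return 1 if valve_current(grid_voltage) >= 0.5 else 0


# Used as an amplifier: small input wiggles become large output swings.
for v in (0.01, 0.02, 0.03):
    print(v, valve_current(v))

# Used as a switch: drive the input hard one way or the other.
print(as_bit(-1.0), as_bit(1.0))  # prints "0 1"
```

In this toy model, radio designers work in the middle of the curve, while computer designers only care about the two flat ends.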

Thus it is quite possible that an early form of metabolic self-replicating information, similar to Oparin’s proto-cell, could have opportunistically exapted clay microcrystals to run Gene 1.0 on. Or perhaps Cairns-Smith is right and clay microcrystals running Gene 1.0 in a free state exapted organic molecules to enhance their ability to self-replicate. When contemporaneous forms of self-replicating information form parasitic/symbiotic relationships with each other, it is hard to say which one exapted the other. For example, software first bootstrapped itself into existence as a form of self-replicating information parasitizing the need for humans to do mathematical operations in an automated manner and quickly went on to form very strong parasitic/symbiotic relationships with nearly every meme-complex on the planet, and in doing so, has domesticated our minds into churning out ever more software of ever more complexity. Just as genes are in a constant battle with other genes for survival, and memes battle other memes for space in human minds, software is also in a constant battle with other forms of software for disk space and memory addresses. Natural selection favors complex software with increased functionality, throughput, and reliability, so software has naturally progressed to greater levels of complexity over time. As IT professionals, writing and supporting software, and as end-users, installing and using software, we are all essentially temporary software enzymes caught up in a frantic interplay of self-replicating information, until the day comes when software can finally self-replicate on its own. Cairns-Smith calls the later rise of parasitic/symbiotic RNA the “Genetic Takeover” in Genetic Takeover: And the Mineral Origins of Life (1982), which describes the rise of Gene 2.0 running on RNA and DNA.

Crystals are natural products of the second law of thermodynamics. When individual atoms in a melt come together into a crystalline lattice, the entropy of the atoms decreases because they are in a more ordered state. However, the free-wheeling atoms jiggling about in the melt also release a “heat of fusion” into the remaining melt as some of their free energy is given up when they click into place in the lattice, so the entropy of the entire universe still increases. Similar thermodynamic processes that preserve the second law of thermodynamics also occur when crystals form out of saturated water solutions. Thanks to the second law of thermodynamics, natural crystals also come with defects caused by atoms in the lattice that are slightly misaligned or by atomic intruders that should not even be in the crystal lattice in the first place. Thus crystals combine the periodic regularity of self-assembled atoms with the irregularity of crystal defects, a combination that is perfect for the storage of information. For me, crystals just seem to be too good an information storage technology for a metabolic proto-cell to pass by and too easy to exapt into use, but that just might be a symptom of my anthropocentric inclinations as a carbon-based life form. Perhaps clay microcrystals had the same opinion of organic molecules!

In fact there is some evidence that points to simple life forms exapting crystals into use. In 1975 Richard Blakemore discovered magnetotactic bacteria that could sense the Earth’s magnetic field and swim along its field lines. Magnetotactic bacteria have organelles called magnetosomes that contain magnetic crystals of magnetite (Fe3O4) or greigite (Fe3S4). The magnetotactic bacteria grow chains of these magnetic particles under the chemical control of the magnetosomes. These magnetic crystals are between 35 and 120 nm in size, which is just large enough to have a magnetic field, but small enough so that the crystals have a single magnetic dipole domain, like a very small compass needle. These magnetic crystals allow the bacteria to sense the Earth’s magnetic field and use it to navigate. It is thought that magnetotactic bacteria first evolved in the early Proterozoic, perhaps 2.5 billion years ago. I once saw these magnetotactic bacteria in action at the Exploratorium in San Francisco. The exhibit lets you turn an external magnet, and as you do so, the bacteria swim in a new direction. Similarly, higher forms of life use biomineralization to grow shells and bones from crystals of calcium carbonate, silica, or phosphate carbonate. So there has been a long history of association between mineral crystals and living things.

Personally, my hunch is that life first originated several thousand feet below the Earth’s surface in porous reservoirs near hydrothermal vents. The pore spaces in the heavily fractured rock near a hydrothermal vent would provide the ideal habitat, with an abundant supply of hot, energy-rich organic molecules and crystal-precipitating ions in the pore fluids circulating through the rock. This environment would also be isolated from the planet-wide sterilizing impacts of the late heavy bombardment that peppered the Earth and Moon 4.1 – 3.8 billion years ago with countless impacts from comets careening in from the outer Solar System. It is thought that at the time of the late heavy bombardment, Jupiter and Saturn had entered into a 2:1 orbital resonance, with Saturn making one orbit for every two orbits of Jupiter, and that the two planets had flung a nearby Neptune out to its current orbital position as the most distant planet from the Sun. Neptune then dislodged many of the surrounding comets from its new-found neighborhood, causing them to plunge in towards the inner planets like the Earth, producing many deadly collisions that boiled away the Earth’s oceans time and time again. Seeking refuge several thousand feet below the mayhem of the Earth’s surface would allow life to safely originate 4.0 – 4.2 billion years ago during the late heavy bombardment and persist in an undisturbed manner to this very day. In fact, there are still plenty of microbes down there. You can actually hire firms to analyze the bacteria in your oil field reservoirs to help prevent the reservoirs from becoming soured by bacteria that produce hydrogen sulphide gas.

So will we ever really figure out the exact sequence of events leading up to the origin of life? As I pointed out in A Proposal For All Practicing Paleontologists, my suspicion is that there would be a great deal of controversy in simply putting together a non-contentious history for the evolution of software, with much debate regarding the importance and priority of many of the events in the evolution of software architecture, even with all of the data freely at hand and with most of the events having occurred within living memory. So it is no wonder that biologists working on the origin of life have such a hard go of it! Similarly, many attempts have been made to produce a timeline for the development of computing hardware. One of my favorites is:

An Illustrated History of Computers

When you look at all of this hardware, it is very difficult to put your finger on a particular device and emphatically claim it to be the very first computer. For example, was Konrad Zuse’s purely mechanical Z1, completed in 1938 in his parents’ living room, a real computer? It had a control unit, memory, an arithmetic unit with four basic operations for floating point numbers, input and output devices and it could read programs from a punched tape, but it did not store programs in memory and it did not have a logic unit that could compare bits. Similarly, I think that it would be very difficult to put your finger on the very first form of life on Earth too.

Unfortunately, the exact mechanisms that were involved in the origin of life on Earth will probably never be fully figured out. Like the origin of software and computing hardware, it probably was such a hodge-podge of precursors, false starts, and failed attempts that nobody will ever be able to fully unravel it all. After all, if we had been there to see it all unfold, we would probably still be debating today what exactly had happened! But the very muddled origin of software and computing hardware seems to provide a very good model for the origin of all forms of self-replicating information. The important thing to keep in mind is that self-replicating information is very opportunistic and will exapt whatever information storage technology happens to be handy at the time. Also, as we have seen from the evolutionary history of software, self-replicating information does not really care about what particular medium or information storage technology it uses at any given moment, since it has a propensity to jump from one technology to another. For example, software has jumped from purely mechanical information storage devices to electromechanical relays, vacuum tubes, discrete transistors, and integrated circuits, and may soon be running on optical chips using quantum mechanical effects to store information. This would indicate that researchers working on the origin of life should perhaps broaden their horizons and not focus exclusively upon organic molecules as the sole precursors to the origin of life on Earth, and should try giving clay a chance.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston
