Saturday, July 05, 2014

Introduction to Softwarephysics

Softwarephysics is a simulated science for the simulated Software Universe that we are all immersed in. It is an approach to software development, maintenance, and support based upon concepts from physics, chemistry, biology, and geology that I have been using on a daily basis for over 35 years as an IT professional. For those of you not in the business, IT is short for Information Technology, commercial computer science. The purpose of softwarephysics is to explain why IT is so difficult, to suggest possible remedies, and to provide a direction for thought. If you are an IT professional, general computer user, or simply an individual interested in computer science, physics, chemistry, biology, or geology then softwarephysics might be of interest to you, if not in an entirely serious manner, perhaps at least in an entertaining one.

From 1975 to 1979, I was an exploration geophysicist exploring for oil, first with Shell, and then with Amoco. In 1979, I made a career change into IT, and spent about 20 years in development. For the past 14 years, I have been in IT operations, supporting middleware on WebSphere, JBoss, Tomcat, and ColdFusion. When I transitioned into IT from geophysics, I figured that if you could apply physics to geology, why not apply physics to software? So like the exploration team at Amoco that I had just left, consisting of geologists, geophysicists, paleontologists, geochemists, and petrophysicists, I decided to take all the physics, chemistry, biology, and geology that I could muster and throw it at the problem of software. The basic idea was that many concepts in physics, chemistry, biology, and geology suggested to me that the IT community had accidentally created a pretty decent computer simulation of the physical Universe on a grand scale, a Software Universe so to speak, and that I could use this fantastic simulation in reverse, to better understand the behavior of commercial software, by comparing software to how things behaved in the physical Universe. Softwarephysics depicts software as a virtual substance, and relies upon our understanding of the current theories in physics, chemistry, biology, and geology to help us model the nature of software behavior. So in physics we use software to simulate the behavior of the Universe, while in softwarephysics we use the Universe to simulate the behavior of software. Along these lines, we use the Equivalence Conjecture of Softwarephysics as an aid; it allows us to shift back and forth between the Software Universe and the physical Universe, and hopefully to learn something about one by examining the other:

The Equivalence Conjecture of Softwarephysics
Over the past 70 years, through the uncoordinated efforts of over 50 million independently acting programmers to provide the world with a global supply of software, the IT community has accidentally spent more than $10 trillion creating a computer simulation of the physical Universe on a grand scale – the Software Universe.

Logical Positivism and Effective Theories
Many IT professionals have a difficult time with softwarephysics because they think of physics as being limited to the study of real things like electrons and photons, and since software is not “real”, how can you possibly apply concepts from physics and the other sciences to software? To address this issue, softwarephysics draws heavily upon two concepts from physics that have served physics quite well over the past century – the concept of logical positivism and the concept of effective theories. This was not always the case. In the 17th, 18th, and 19th centuries, physicists mistakenly thought that they were actually discovering the fundamental laws of the Universe, which they thought were based upon real tangible things like particles, waves, and fields. Classical Newtonian mechanics (1687), thermodynamics (1850), and classical electrodynamics (1864) did a wonderful job of describing the everyday world at the close of the 19th century, but early in the 20th century it became apparent that the models upon which these very successful theories were based did not work very well for small things like atoms or for objects moving at high velocities or in strong gravitational fields. This provoked a rather profound philosophical crisis within physics at the turn of the century, as physicists worried that perhaps 300 years of work was about to go down the drain. The problem was that classical physicists confused their models of reality with reality itself, and when their classical models began to falter, their confidence in physics began to falter as well. This philosophical crisis was resolved with the adoption of the concepts of logical positivism and some new effective theories in physics. Quantum mechanics (1926) was developed for small things like atoms, the special theory of relativity (1905) was developed for objects moving at high velocities and the general theory of relativity (1915) was developed for objects moving in strong gravitational fields.

Logical positivism, usually abbreviated simply to positivism, is an enhanced form of empiricism, in which we do not care about how things “really” are; we are only interested in how things are observed to behave. With positivism, physicists only seek out models of reality - not reality itself. When we study quantum mechanics, we will find that the concept of reality gets rather murky in physics anyway, so this is not as great a loss as it might at first seem. By concentrating on how things are observed to behave, rather than on what things “really” are, we avoid the conundrum faced by the classical physicists. In retrospect, this idea really goes all the way back to the very foundations of physics. In Newton’s Principia (1687) he outlined Newtonian mechanics and his theory of gravitation, which held that the gravitational force between two objects was proportional to the product of their masses divided by the square of the distance between them. Newton knew that he was going to take some philosophical flak for proposing a mysterious force between objects that could reach out across the vast depths of space with no apparent mechanism, so he took a very positivistic position on the matter with the famous words:

I have not as yet been able to discover the reason for these properties of gravity from phenomena, and I do not feign hypotheses. For whatever is not deduced from the phenomena must be called a hypothesis; and hypotheses, whether metaphysical or physical, or based on occult qualities, or mechanical, have no place in experimental philosophy. In this philosophy particular propositions are inferred from the phenomena, and afterwards rendered general by induction.

Instead, Newton focused on how things were observed to move under the influence of his law of gravitational attraction, without worrying about what gravity “really” was.

The second concept, that of effective theories, is an extension of positivism. An effective theory is an approximation of reality that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand. For example, Newtonian mechanics is an effective theory that makes very good predictions for the behavior of objects moving less than 10% of the speed of light and which are bigger than a very small grain of dust. These limits define the effective range over which Newtonian mechanics can be applied to solve problems. For very small things we must use quantum mechanics, and for very fast things or things moving in strong gravitational fields, we must use relativity theory. So all of the current theories of physics, such as Newtonian mechanics, classical electrodynamics, thermodynamics, statistical mechanics, the special and general theories of relativity, quantum mechanics, and the quantum field theories of QED and QCD are effective theories that are based upon models of reality, and all these models are approximations - all these models are fundamentally "wrong", but at the same time, these effective theories make exceedingly good predictions of the behavior of physical systems over the limited ranges in which they apply. That is the goal of softwarephysics – to provide for an effective theory of software behavior that makes useful predictions of software behavior that are applicable to the day-to-day activities of IT professionals. So in softwarephysics, we adopt a very positivistic viewpoint of software; we do not care what software “really is”, we only care about how software is observed to behave and try to model those behaviors with an effective theory of software behavior that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand.
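To make the idea of an effective range a little more concrete, here is a minimal Python sketch that computes the Lorentz factor of special relativity at several speeds. The Lorentz factor is exactly 1 in Newtonian mechanics, so its departure from 1 is one rough measure of how far the Newtonian model has drifted from the relativistic one. The function names and the choice of sample speeds are assumptions made purely for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(speed_m_per_s: float) -> float:
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed below c."""
    beta = speed_m_per_s / C
    return 1.0 / math.sqrt(1.0 - beta * beta)


def relativistic_correction_percent(fraction_of_c: float) -> float:
    """Percent by which gamma exceeds the Newtonian value of exactly 1."""
    return (lorentz_factor(fraction_of_c * C) - 1.0) * 100.0


# Illustrative speeds, expressed as fractions of the speed of light.
for fraction in (0.001, 0.01, 0.1, 0.5, 0.9):
    correction = relativistic_correction_percent(fraction)
    print(f"v = {fraction:5.3f} c  ->  correction of about {correction:.4f} %")
```

At 10% of the speed of light the correction is about half a percent, which is why that speed makes a reasonable rule-of-thumb boundary for Newtonian mechanics. Well above it the correction grows rapidly, and the Newtonian model stops being a useful approximation.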

GPS satellites provide a very good example of positivism and effective theories at work. There are currently 31 GPS satellites orbiting at an altitude of 12,600 miles above the Earth, and each contains a very accurate atomic clock. The signals from the GPS satellites travel to your GPS unit at the speed of light, so by knowing the travel time of the signals from at least 4 of the GPS satellites, it is possible to determine your position on Earth very accurately. In order to do that, it is very important to have very accurate timing measurements. Newtonian mechanics is used to launch the GPS satellites to an altitude of 12,600 miles and to keep them properly positioned in orbit. Classical electrodynamics is then used to beam the GPS signals back down to Earth to the GPS unit in your car. Quantum mechanics is used to build the transistors on the chips onboard the GPS satellites and to understand the quantum tunneling of electrons in the flash memory chips used to store GPS data on the satellites. The special theory of relativity predicts that the onboard atomic clocks on the GPS satellites will run slower and lose about 7.2 microseconds per day due to their high velocities relative to an observer on the Earth. But at the same time, the general theory of relativity also predicts that because the GPS satellites are further from the center of the Earth and in a weaker gravitational field, where spacetime is less deformed than on the surface of the Earth, their atomic clocks also run faster and gain 45.9 microseconds per day due to the weaker gravitational field out there. The net effect is a gain of 38.7 microseconds per day, so the GPS satellite atomic clocks have to be purposefully built to run slow by 38.7 microseconds per day before they are launched, so that they will keep in sync with clocks on the surface of the Earth. If this correction were not made, the timing error alone would translate into a position error that grows by roughly 7 miles per day, since light travels about 7 miles in 38.7 microseconds.
The end result of the combination of all these fundamentally flawed effective theories is that it is possible to pinpoint your location on Earth to an accuracy of 16 feet or better for as little as $100. But physics has done even better than that with its fundamentally flawed effective theories. By combining the effective theories of special relativity (1905) with quantum mechanics (1926), physicists were able to produce a new effective theory for the behavior of electrons and photons called quantum electrodynamics, or QED (1948), which was able to predict the gyromagnetic ratio of the electron, a measure of its intrinsic magnetic field, to an accuracy of 11 decimal places. As Richard Feynman has pointed out, this was like predicting the exact distance between New York and Los Angeles accurate to the width of a human hair!
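The two relativistic clock corrections for the GPS satellites can be roughly checked with a few lines of Python. This is only a back-of-the-envelope sketch under stated assumptions: a perfectly circular orbit at the 12,600-mile altitude given above, a spherical non-rotating Earth with a mean radius of 6,371 km, and the first-order approximations v²/2c² for the velocity effect and ΔΦ/c² for the gravitational effect. The constant and function names are made up for illustration.

```python
C = 299_792_458.0            # speed of light, m/s
GM_EARTH = 3.986004418e14    # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6            # assumed mean radius of the Earth, m
ALTITUDE = 12_600 * 1609.344 # 12,600 miles converted to meters
R_ORBIT = R_EARTH + ALTITUDE # orbital radius measured from Earth's center, m
SECONDS_PER_DAY = 86_400.0


def velocity_loss_us_per_day() -> float:
    """Special relativity: microseconds lost per day due to orbital speed."""
    v_squared = GM_EARTH / R_ORBIT  # v^2 for an assumed circular orbit
    return v_squared / (2.0 * C**2) * SECONDS_PER_DAY * 1e6


def gravity_gain_us_per_day() -> float:
    """General relativity: microseconds gained per day in the weaker field."""
    potential_difference = GM_EARTH * (1.0 / R_EARTH - 1.0 / R_ORBIT)
    return potential_difference / C**2 * SECONDS_PER_DAY * 1e6


loss = velocity_loss_us_per_day()
gain = gravity_gain_us_per_day()
print(f"Lost to velocity  : {loss:5.1f} microseconds/day")
print(f"Gained to gravity : {gain:5.1f} microseconds/day")
print(f"Net gain          : {gain - loss:5.1f} microseconds/day")
```

Under these simplifications the results should land within a few tenths of a microsecond of the figures quoted above. The small differences come from ignoring the Earth's rotation, its oblateness, and the slight eccentricity of the real orbits.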

So Newtonian mechanics makes great predictions for the macroscopic behavior of GPS satellites, but it does not work very well for small things like the behavior of individual electrons within transistors, where quantum mechanics is required, or for things moving at high speeds or in strong gravitational fields where relativity theory must be applied. And all three of these effective theories are based upon completely contradictory models. General relativity maintains that spacetime is curved by matter and energy, but that matter and energy are continuous, while quantum mechanics maintains that spacetime is flat, but that matter and energy are quantized into chunks. Newtonian mechanics simply states that space and time are mutually independent dimensions and universal for all, with matter and energy being continuous. The important point is that all effective theories and scientific models are approximations – they are all fundamentally "wrong". But knowing that you are "wrong" gives you a great advantage over people who know that they are "right", because knowing that you are "wrong" allows you to seek improved models of reality. So please consider softwarephysics to simply be an effective theory of software behavior that is based upon models that are fundamentally “wrong”, but at the same time, fundamentally useful for IT professionals. So as you embark upon your study of softwarephysics, please always keep in mind that the models of softwarephysics are just approximations of software behavior, they are not what software “really is”. It is very important not to confuse models of software behavior with software itself, if one wishes to avoid the plight of the 19th century classical physicists.

If you are an IT professional and many of the above concepts are new to you, do not be concerned. This blog on softwarephysics is aimed at a diverse audience, but with IT professionals in mind. All of the above ideas will be covered at great length in the postings in this blog on softwarephysics and in a manner accessible to all IT professionals. Now it turns out that most IT professionals have had some introduction to physics in high school or in introductory college courses, but that presents an additional problem. The problem is that such courses generally only cover classical physics, and leave the student with a very good picture of physics as it stood in 1864! It turns out that the classical physics of Newtonian mechanics, thermodynamics, and classical electromagnetic theory were simply too good to discard and are still quite useful, so they are taught first to beginners and then we run out of time to cover the really interesting physics of the 20th century. Now imagine the problems that the modern world would face if we only taught similarly antiquated courses in astronomy, metallurgy, electrical and mechanical engineering, medicine, economics, biology, or geology that happily left students back in 1864! Since many of the best models for software behavior stem from 20th century physics, we will be covering a great deal of 20th century material in these postings – the special and general theories of relativity, quantum mechanics, quantum field theories, and chaos theory, but I hope that you will find that these additional effective theories are quite interesting on their own, and might even change your worldview of the physical Universe at the same time.

Unintended Consequences for the Scientific Community
As I mentioned at the close of my original posting on SoftwarePhysics, my initial intention for this blog on softwarephysics was to fulfill a promise I made to myself about 20 years ago to approach the IT community with the concept of softwarephysics a second time, following my less than successful attempt to do so in the 1980s, with the hope of helping the IT community to better cope with the daily mayhem of life in IT. However, in laying down the postings for this blog an unintended consequence arose in my mind as I became profoundly aware of the enormity of this vast computer simulation of the physical Universe that the IT community has so graciously provided to the scientific community free of charge and also of the very significant potential scientific value that it provides. One of the nagging problems for many of the observational and experimental sciences is that many times there is only one example readily at hand to study or experiment with, and it is very difficult to do meaningful statistics with a population of N=1.

But the computer simulation of the physical Universe that the Software Universe presents provides another realm for comparison. For example, both biology and astrobiology only have one biosphere on Earth to study and even physics itself has only one Universe with which to engage. Imagine the possibilities if scientists had another Universe readily at hand in which to work! This is exactly what the Software Universe provides. For example, in SoftwareBiology and A Proposal For All Practicing Paleontologists we see that the evolution of software over the past 70 years has closely followed the same path as life on Earth over the past 4.0 billion years, in keeping with Simon Conway Morris’s contention that convergence has played the dominant role in the evolution of life on Earth. In When Toasters Fly, we also see that software has evolved in fits and starts as portrayed by the punctuated equilibrium of Stephen Jay Gould and Niles Eldredge, and in The Adaptationist View of Software Evolution we explore the overwhelming power of natural selection in the evolution of software. In keeping with Peter Ward’s emphasis on mass extinctions dominating the course of evolution throughout geological time, we also see in SoftwareBiology that there have been several dramatic mass extinctions of various forms of software over the past 70 years as well, that have greatly affected the evolutionary history of software, and that between these mass extinctions, software has also tended to evolve through the gradual changes of Hutton’s and Lyell’s uniformitarianism. In Software Symbiogenesis and Self-Replicating Information, we also see the very significant role that parasitic/symbiotic relationships have played in the evolution of software, in keeping with the work of Lynn Margulis and also of Freeman Dyson’s two-stage theory of the origin of life on Earth. 
In The Origin of Software the Origin of Life, we explore Stuart Kauffman’s ideas on how Boolean nets of autocatalytic chemical reactions might have kick-started the whole thing as an emergent behavior of an early chaotic pre-biotic environment on Earth, and that if Seth Shostak is right, we will never end up talking to carbon-based extraterrestrial aliens, but to alien software instead. In Is the Universe Fine-Tuned for Self-Replicating Information? we explore the thermodynamics of Brandon Carter’s Weak Anthropic Principle (1973), as it relates to the generation of universes in the multiverse that are capable of sustaining intelligent life. Finally, in Programming Clay we revisit Alexander Graham Cairns-Smith’s theory (1966) that Gene 1.0 did not run on nucleic acids, but on clay microcrystal precursors instead.

Similarly for the physical sciences, in Is the Universe a Quantum Computer? we find a correspondence between TCP/IP and John Cramer’s Transactional Interpretation of quantum mechanics. In SoftwarePhysics and Cyberspacetime, we also see that the froth of CPU processes running with a clock speed of 10^9 Hz on the 10 trillion currently active microprocessors that comprise the Software Universe can be viewed as a slowed down simulation of the spin-foam froth of interacting processes of loop quantum gravity running with a clock speed of 10^43 Hz that may comprise the physical Universe. And in Software Chaos, we examine the nonlinear behavior of software and some of its emergent behaviors and follow up in CyberCosmology with the possibility that vast quantities of software running on large nonlinear networks might eventually break out into consciousness in accordance with the work of George Dyson and Daniel Dennett. Finally, in Model-Dependent Realism - A Positivistic Approach to Realism we compare Steven Weinberg’s realism with the model-dependent realism of Stephen Hawking and Leonard Mlodinow and how the two worldviews affect the search for a Final Theory.

The chief advantage of doing fieldwork in the Software Universe is that, unlike most computer simulations of the physical Universe, it is an unintended and accidental simulation, without any of the built-in biases that most computer simulations of the physical Universe suffer. So you will truly be able to do fieldwork in a pristine and naturally occurring simulation, just as IT professionals can do fieldwork in the wild and naturally occurring simulation of software that the living things of the biosphere provide. Secondly, the Software Universe is a huge simulation that is far beyond the budgetary means of any institution or consortium by many orders of magnitude. So if you are an evolutionary biologist, astrobiologist, or paleontologist working on the origin and evolution of life in the Universe, or a physicist or economist working on the emergent behaviors of nonlinear systems and complexity theory, or a neurobiologist working on the emergence of consciousness in neural networks, or even a frustrated string theorist struggling with quantum gravity, it would be well worth your while to pay a friendly call upon the local IT department of a major corporation in your area. Start with a visit to the Command Center for their IT Operations department to get a global view of their IT infrastructure and to see how it might be of assistance to the work in your area of interest. From there you can branch out to the applicable area of IT that will provide the most benefit.

The Impact of Self-Replicating Information Upon the Planet
One of the key findings of softwarephysics is concerned with the magnitude of the impact upon the planet of self-replicating information.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Basically, we have seen several waves of self-replicating information dominate the Earth:
1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Note that because the self-replicating autocatalytic metabolic pathways of organic molecules, RNA and DNA have become so heavily intertwined over time, I now simply refer to them as the “genes”. Over the past 4.0 billion years, the surface of the Earth has been totally reworked by three forms of self-replicating information – the genes, memes and software, with software rapidly becoming the dominant form of self-replicating information on the planet. For more on this see:

A Brief History of Self-Replicating Information
How to Use an Understanding of Self-Replicating Information to Avoid War
How to Use Softwarephysics to Revive Memetics in Academia
Is Self-Replicating Information Inherently Self-Destructive?
Is the Universe Fine-Tuned for Self-Replicating Information?
Self-Replicating Information

Softwarephysics and the Real World of Human Affairs
Having another universe readily at hand to explore, even a simulated universe like the Software Universe, necessarily has an impact upon one's personal philosophy of life, and allows one to draw certain conclusions about the human condition and what’s it all about, so as you read through the postings in this blog you will stumble across a bit of my own personal philosophy - definitely a working hypothesis still in the works. Along these lines you might be interested in a few postings where I try to apply softwarephysics to the real world of human affairs:

MoneyPhysics – my impression of the 2008 world financial meltdown.

The Fundamental Problem of Everything – If you Google "the fundamental problem of everything", this will be the only hit you get on the entire Internet, which is indicative of the fundamental problem of everything!

What’s It All About? and Genes, Memes and Software – my current working hypothesis on what’s it all about.

How to Use an Understanding of Self-Replicating Information to Avoid War – my current working hypothesis for how the United States can avoid getting bogged down again in continued war in the Middle East.

Some Specifics About These Postings
The postings in this blog are supplemental reading for my course on softwarephysics for IT professionals entitled SoftwarePhysics 101 – The Physics of Cyberspacetime, which was originally designed to be taught as a series of seminars at companies where I was employed. Since softwarephysics essentially covers the simulated physics, chemistry, biology, and geology of an entire simulated universe, the slides necessarily just provide a cursory skeleton upon which to expound. The postings in this blog go into much greater depth. Because each posting builds upon its predecessors, the postings in this blog should be read in reverse order from the oldest to the most recent, beginning with my original posting on SoftwarePhysics. In addition, several universities also now offer courses on Biologically Inspired Computing which cover some of the biological aspects of softwarephysics, and the online content for some of these courses can be found by Googling for "Biologically Inspired Computing" or "Natural Computing". At this point we will finish up with my original plan for this blog on softwarephysics with a purely speculative posting on CyberCosmology that describes the origins of the Software Universe, cyberspacetime, software and where they all may be heading. Since CyberCosmology will be purely speculative in nature, it will not be of much help to you in your IT professional capacities, but I hope that it might be a bit entertaining. Again, if you are new to softwarephysics, you really need to read the previous posts before taking on CyberCosmology. I will probably continue on with some additional brief observations about softwarephysics in the future, but once you have completed CyberCosmology, you can truly consider yourself to be a bona fide softwarephysicist.

For those of you following this blog, the posting dates on the posts may seem to behave in a rather bizarre manner. That is because in order to get the Introduction to Softwarephysics listed as the first post in the context root of I have to perform a few IT tricks. When publishing a new posting, I simply copy the contents of the Introduction to Softwarephysics to a new posting called the New Introduction to Softwarephysics. Then I update the original Introduction to Softwarephysics entry with the title and content of the new posting to be published. I then go back and take “New” out of the title of the New Introduction to Softwarephysics. This way the Introduction to Softwarephysics always appears as the first posting in the context root of The side effect of all this is that the real posting date of posts is the date that appears on the post that you get when clicking on the Newer Post link at the bottom left of the posting webpage.
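The posting trick described above can be sketched as a small Python model. It is only an illustration of the three steps and of why the displayed dates end up shifted. The Post class, the publish function, and the sample dates and text are all hypothetical, and nothing here reflects how the blogging software itself is implemented.

```python
from dataclasses import dataclass
from datetime import date

INTRO_TITLE = "Introduction to Softwarephysics"


@dataclass
class Post:
    posted: date  # date stamp fixed when the blog entry is first created
    title: str
    body: str


def publish(posts: list[Post], title: str, body: str, today: date) -> None:
    """Publish a new posting while keeping the introduction on top."""
    intro = next(p for p in posts if p.title == INTRO_TITLE)
    # Step 1: copy the introduction into a brand-new entry stamped with today's date.
    new_intro = Post(today, "New " + INTRO_TITLE, intro.body)
    posts.append(new_intro)
    # Step 2: overwrite the old introduction entry with the new posting.
    # The entry keeps its old date stamp.
    intro.title, intro.body = title, body
    # Step 3: take "New" out of the copied introduction's title.
    new_intro.title = INTRO_TITLE


def newest_first(posts: list[Post]) -> list[Post]:
    """Order the entries the way the blog lists them, most recent date first."""
    return sorted(posts, key=lambda p: p.posted, reverse=True)


# Illustrative data only: one existing introduction entry and one new posting.
blog = [Post(date(2014, 6, 14), INTRO_TITLE, "intro text")]
publish(blog, "A New Posting", "new text", today=date(2014, 7, 5))
for post in newest_first(blog):
    print(post.posted, post.title)
```

In this model the introduction always carries the newest date stamp, while the new posting inherits the older date stamp of the entry it overwrote. That is why the real posting date of any post is the one shown on the next newer entry.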

For those of you who would rather listen to the postings in this blog as a podcast, you may listen to the most recent postings at:

thanks to the efforts of Rudolf Penninkhof who registered this blog with Odiogo.

For audio versions of older postings, please contact me at the email address below, and I will send you the MP3 files. The postings in this blog were converted to streaming audio by a company called Odiogo, which has some really great software for automatically “reading” text from the Internet and converting it to dynamic audio that sounds very much like a human voice, complete with very accurate pronunciations and tonal inflections. In fact, I have recommended that my current employer look into using Odiogo text-to-audio services for our external websites. I have a DSL broadband connection with a 3 MB download. Some of the audio postings on the Odiogo website above come in at over 70 MB and run for several hours. My DSL seems to have trouble with buffering these very large postings because they are probably much larger than Odiogo anticipated for dynamic news feeds. If you have the same difficulty, just download the corresponding MP3 files, and you will be able to listen to my very long-winded postings that run for several hours. You will probably need to take a break anyway. I know that even I cannot listen to them all in one sitting! Once again, I would like to thank Google for hosting this blog and providing some really great software for creating and maintaining the postings, and also Rudolf Penninkhof and Odiogo for converting the postings to a podcast as well.

SoftwarePhysics 101 – The Physics of Cyberspacetime is now available on Google Drive. Please note that some of the formulas do not render properly, especially exponents which do not display as superscripts, so please use your imagination.

Part 1 - Part 1 of the original PowerPoint document.
Part 2 - Part 2 of the original PowerPoint document.
Entropy – A spreadsheet referenced in Part 1
– A 1989 document describing how to use BSDE - the Bionic Systems Development Environment - to grow applications from genes and embryos within the maternal BSDE software.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston

Saturday, June 14, 2014

How to Use an Understanding of Self-Replicating Information to Avoid War

Periodically, events in the “real world” of human affairs seem to intervene in our lives, and so once again, we must take a slight detour along our path to IT enlightenment, as we did with MoneyPhysics in the fall of 2008 with the global financial meltdown, and with The Fundamental Problem of Everything as it relates to the origins of war. The 100-year anniversary of the onset of World War I in August of 1914 is close at hand. That war led to about 40 million casualties, including some 20 million deaths, for apparently no particular reason at all, and once again we see growing turmoil in the world, specifically in the Middle East, with a multitude of conflicts converging. World War I basically shattered the entire 20th century because it led to the Bolshevik Revolution in Russia in 1917 and to the rise of fascism in Europe in the 1930s that led to World War II, and the ensuing Cold War of the latter half of the 20th century. This ongoing turmoil has continued on well into the 21st century in the Middle East because the end of World War I brought with it a number of manufactured countries in the Middle East that were arbitrarily carved out of the remains of the Ottoman Empire, which, unfortunately, had aligned itself with the Central Powers, and thus chose to be on the losing side of World War I. With such rampant mass insanity once again afoot in the Middle East, one must naturally ask why the real world of human affairs is so absurd, and why it has always been so. I think I know why.

In the analysis that follows there will be no need to mention any current names in the news because, as in The Fundamental Problem of Everything, this is a human problem that is not restricted to any particular group or subgroup of people. It is a problem that stems from the human condition and applies to all sides of all conflicts for all times.

It’s The Fundamental Problem of Everything Again
In The Fundamental Problem of Everything, I left it to the readers to make the final determination for themselves, but for me, the fundamental problem of everything is ignorance. Let me explain.

About 15 years ago it dawned upon me that I only had a finite amount of time left and that it sure would be a shame to have lived my whole life without ever having figured out what’s it all about or where I had been, so I started reading a popular book on science each week or a scientific college textbook over a span of several months in an attempt to figure it all out as best I could. The conclusion I came to was that it is all about self-replicating information, and that there are currently three forms of self-replicating information on the Earth – the genes, memes and software, with software rapidly becoming the dominant form of self-replicating information on the planet. As human beings, it seems that our entire life, from the moment of conception, to that last gasp, is completely shaped by the competitive actions of these three forms of self-replicating information. So as a sentient being, in a Universe that has become self-aware, if you want to take back control of your life, it is important to confront them now and know them well. Before proceeding, let us review what self-replicating information is and how it behaves.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics.

1. All self-replicating information evolves over time through the Darwinian processes of innovation and natural selection, which endows self-replicating information with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.

2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.

3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.

4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.

5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.

6. Most hosts are also forms of self-replicating information.

7. All self-replicating information has to be a little bit nasty in order to survive.

8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this last characteristic.

For a good synopsis of how self-replicating information has dominated the Earth for the past 4 billion years, and also your life, take a quick look at A Brief History of Self-Replicating Information. Basically, we have seen several waves of self-replicating information dominate the Earth:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Note that because the self-replicating autocatalytic metabolic pathways of organic molecules, RNA and DNA have become so heavily intertwined over time, I now simply call them the “genes”.

The Ongoing Battle Between the Genes, Memes and Software For World Domination
In school you were taught that your body consists of about 100 trillion cells, and that these cells use DNA to create proteins that you need to replicate and operate your cells. The problem, as always, is that this is an entirely anthropocentric point of view. As Richard Dawkins explains in The Selfish Gene (1976), this is totally backwards. We do not use genes to protect and replicate our bodies; genes use our bodies to protect and replicate genes, so in Dawkins’ view we are DNA survival machines, and so are all other living things. Darwin taught us that natural selection was driven by survival of the fittest. But survival of the fittest what? Is it survival of the fittest species, species variety, or possibly the fittest individuals within a species? Dawkins notes that none of these things actually replicate, not even individuals. All individuals are genetically unique, so it is impossible for individuals to truly replicate. What does replicate are genes, so for Dawkins, natural selection operates at the level of the gene. These genes have evolved over time to team up with other genes to form bodies or DNA survival machines that protect and replicate DNA, and that is why the higher forms of life are so “inefficient” when it comes to how genetic information is stored in DNA. For example, the human genome consists of about 23,000 genes stored on a few percent of the 6 feet of DNA found within each human cell, which is a rather inefficient way to store genetic information because it takes a lot of time and resources to replicate all that DNA when human cells divide. But that is the whole point: the DNA in higher forms of life is not trying to be an “efficient” genetic information storage system; rather, it is trying to protect and replicate as much DNA as possible, and then build a DNA survival machine to house it by allocating a small percentage of the DNA to encode for the genes that produce the proteins needed to build the DNA survival machine.
From the perspective of the DNA, these genes are just a necessary evil, like the taxes that must be paid to build roads and bridges.

Prokaryotic bacteria are small DNA survival machines that cannot afford the luxury of taking on any “passenger” junk DNA. Only large multicellular cruise ships like ourselves can afford that extravagance. If you have ever been a “guest” on a small sailing boat, you know exactly what I mean. There are no “guest passengers” on a small sailboat; it's always "all hands on deck" - and that includes the "guests"! Individual genes have been selected for one overriding trait, the ability to replicate, and they will do just about anything required to do so, like seeking out other DNA survival machines to mate with and rear new DNA survival machines. In Blowin’ in the Wind, Bob Dylan asked the profound question, “How many years can a mountain exist; Before it's washed to the sea?” Well, the answer is a few hundred million years. But some of the genes in your body are billions of years old, and as they skip down through the generations largely unscathed by time, they spend about half their time in female bodies and the other half in male bodies. If you think about it, all of your physical needs and desires are geared to ensuring that your DNA survives and gets passed on, with little regard for you as a disposable DNA survival machine. I strongly recommend that all IT professionals read The Selfish Gene, for me the most significant book of the 20th century because it explains so much. For a book written in 1976, it makes many references to computers and data processing that you will find extremely interesting.

As DNA survival machines, our genes create our basic desires to survive and to replicate our genes through sexual activity in a Dawkinsian manner. When you factor in the ensuing human desires for food and comfort, and for the wealth that provides for them, together with the sexual tensions that arise in the high school social structures that seem to go on to form the basis for all human social structures, the genes alone probably account for at least 50% of the absurdity of the real world of human affairs because life just becomes a never-ending continuation of high school. This is all part of my general theory that nobody ever really graduates from their culturally equivalent form of high school. We all just go on to grander things in our own minds. Certainly the success of Facebook and Twitter is testament to this observation.

Our Minds were formed next by the rise of the memes over the past 2.5 million years, an idea first proposed by Richard Dawkins in The Selfish Gene. The concept of memes was later advanced by Daniel Dennett in Consciousness Explained (1991) and Richard Brodie in Virus of the Mind: The New Science of the Meme (1996), and was finally formalized by Susan Blackmore in The Meme Machine (1999). For those of you not familiar with the term meme, it rhymes with the word “cream”. Memes are cultural artifacts that persist through time by making copies of themselves in the minds of human beings. Dawkins introduced them this way: “Examples of memes are tunes, ideas, catch-phrases, clothes fashions, ways of making pots or of building arches. Just as genes propagate themselves in the gene pool by leaping from body to body via sperms or eggs, so memes propagate themselves in the meme pool by leaping from brain to brain via a process which, in the broad sense, can be called imitation.” Just as genes come together to build bodies, or DNA survival machines, for their own mutual advantage, memes also come together from the meme pool to form meme-complexes for their own joint survival. DNA survives down through the ages by inducing disposable DNA survival machines, in the form of bodies, to produce new disposable DNA survival machines. Similarly, memes survive in meme-complexes by inducing the minds of human beings to reproduce memes in the minds of others. Meme-complexes come in a variety of sizes and can become quite large and complicated with a diverse spectrum of member memes.
Examples of meme-complexes of increasing complexity and size would be Little League baseball teams, clubs and lodges, corporations, political and religious movements, tribal subcultures, branches of the military, governments and cultures at the national level, and finally the sum total of all human knowledge in the form of all the world cultures, art, music, religion, and science put together.

To the genes and memes, human bodies are simply disposable DNA survival machines housing disposable minds that come and go with a lifespan of less than 100 years. The genes and memes, on the other hand, continue on largely unscathed by time as they skip down through the generations. However, both genes and memes do evolve over time through the Darwinian mechanisms of innovation and natural selection. You see, the genes and memes that do not come together to build successful DNA survival machines, or meme-complexes, are soon eliminated from the gene and meme pools. So both genes and memes are selected for one overriding characteristic – the ability to survive. Once again, the “survival of the fittest” rules the day. Now it makes no sense to think of genes or memes as being either “good” or “bad”; they are just mindless forms of self-replicating information bent upon surviving with little interest in you as a disposable survival machine. So in general, these genes and memes are not necessarily working in your best interest, beyond keeping you alive long enough so that you can pass them on to somebody else.
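The way less successful replicators are "soon eliminated" from a pool can be illustrated with a toy model. The following is only a minimal sketch under my own assumptions: three hypothetical variants with made-up replication rates, and a simple rule in which each variant's share of the pool grows in proportion to its rate. It models selection only, not innovation, and it is not taken from Dawkins or Blackmore.

```python
def select(shares, fitness):
    """Apply one generation of selection.

    Each variant's share of the pool is multiplied by its replication
    rate, and the shares are then renormalized so they sum to 1.
    """
    weighted = [s * f for s, f in zip(shares, fitness)]
    total = sum(weighted)
    return [w / total for w in weighted]


def run(shares, fitness, generations):
    """Repeat the selection step for a fixed number of generations."""
    for _ in range(generations):
        shares = select(shares, fitness)
    return shares


# Hypothetical pool: three variants that start out equally common.
initial = [1 / 3, 1 / 3, 1 / 3]
# Made-up replication rates; the third variant copies itself 10% better
# per generation than the first.
fitness = [1.00, 1.05, 1.10]

final = run(initial, fitness, generations=200)
for name, share in zip(["variant A", "variant B", "variant C"], final):
    print(f"{name}: {share:.6f}")
```

Even a modest edge in replication rate compounds from generation to generation, so in this sketch the best replicator ends up with nearly the entire pool while the others dwindle toward zero, which is the sense in which poor replicators are eliminated.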

According to Susan Blackmore, we are not so much thinking machines, as we are copying machines. For example, Blackmore maintains that memetic-drive was responsible for creating our extremely large brains and also our languages and cultures as well, in order to store and spread memes more effectively. Many researchers have noted that the human brain is way over engineered for the needs of a simple hunter-gatherer. After all, even a hundred years ago, people did not require the brain-power to do IT work, yet today we find many millions of people earning their living doing IT work, or at least trying to. Blackmore then points out that the human brain is a very expensive and dangerous organ. The brain is only 2% of your body mass, but burns about 20% of your calories each day. The extremely large brain of humans also kills many mothers and babies at childbirth, and also produces babies that are totally dependent upon their mothers for survival and that are totally helpless and defenseless on their own. Blackmore asks the obvious question of why the genes would build such an extremely expensive and dangerous organ that was definitely not in their own self-interest. Blackmore has a very simple explanation – the genes did not build our exceedingly huge brains, the memes did. Her reasoning goes like this. About 2.5 million years ago, the predecessors of humans slowly began to pick up the skill of imitation. This might not sound like much, but it is key to her whole theory of memetics. You see, hardly any other species learns by imitating other members of their own species. Yes, there are many species that can learn by conditioning, like Pavlov’s dogs, or that can learn through personal experience, like mice repeatedly running through a maze for a piece of cheese, but a mouse never really learns anything from another mouse by imitating its actions. Essentially, only humans do that. 
If you think about it for a second, nearly everything you do know, you learned from somebody else by imitating or copying their actions or ideas. Blackmore maintains that the ability to learn by imitation required a bit of processing power by our distant ancestors because one needs to begin to think in an abstract manner by abstracting the actions and thoughts of others into the actions and thoughts of their own. The skill of imitation provided a great survival advantage to those individuals who possessed it, and gave the genes that built such brains a great survival advantage as well. This caused a selection pressure to arise for genes that could produce brains with ever increasing capabilities of imitation and abstract thought. As this processing capability increased there finally came a point when the memes, like all of the other forms of self-replicating information that we have seen arise, first appeared in a parasitic manner. Along with very useful memes, like the meme for making good baskets, other less useful memes, like putting feathers in your hair or painting your face, also began to run upon the same hardware in a manner similar to computer viruses. The genes and memes then entered into a period of coevolution, where the addition of more and more brain hardware advanced the survival of both the genes and memes. But it was really the memetic-drive of the memes that drove the exponential increase in processing power of the human brain way beyond the needs of the genes.

A very similar thing happened with software over the past 70 years. When I first started programming in 1972, million dollar mainframe computers typically had about 1 MB (about 1,000,000 bytes) of memory with a 750 kHz system clock (750,000 ticks per second). Remember, one byte of memory can store something like the letter “A”. But in those days, we were only allowed 128 K (about 128,000 bytes) of memory for our programs because the expensive mainframes were also running several other programs at the same time. It was the relentless demands of software for memory and CPU-cycles over the years that drove the exponential explosion of hardware capability. For example, today the typical $600 PC comes with 8 GB (about 8,000,000,000 bytes) of memory and has several CPUs running with a clock speed of about 3 GHz (3,000,000,000 ticks per second). Last year, I purchased Redshift 7 for my personal computer, a $60 astronomical simulation application, and it alone uses 382 MB of memory when running and reads 5.1 GB of data files, a far cry from my puny 128 K programs from 1972. So the hardware has improved by a factor of about 10 million since I started programming in 1972, driven by the ever increasing demands of software for more powerful hardware. For example, in my current position in Middleware Operations for a major corporation we are constantly adding more application software each week, so every few years we must upgrade all of our servers to handle the increased load.
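The growth figures quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: the byte and clock values are the round numbers given in the text, and multiplying the memory and clock ratios into a single "combined" index is my own simplifying assumption, not a standard benchmark. It also ignores the price difference and the multiple CPUs in a modern PC.

```python
# Round figures quoted in the text (1972 mainframe vs. a typical 2014 PC).
MEM_1972_BYTES = 1_000_000          # about 1 MB of mainframe memory
MEM_2014_BYTES = 8_000_000_000      # about 8 GB of PC memory
CLOCK_1972_HZ = 750_000             # 750 kHz system clock
CLOCK_2014_HZ = 3_000_000_000       # about 3 GHz clock speed
PROGRAM_1972_BYTES = 128_000        # 128 K allowed per program in 1972
REDSHIFT_BYTES = 382_000_000        # 382 MB used by Redshift 7


def growth_factor(old, new):
    """Return how many times larger the new value is than the old one."""
    return new / old


memory_factor = growth_factor(MEM_1972_BYTES, MEM_2014_BYTES)
clock_factor = growth_factor(CLOCK_1972_HZ, CLOCK_2014_HZ)
# Assumption: treat memory and clock speed as independent multipliers.
combined_factor = memory_factor * clock_factor
program_factor = growth_factor(PROGRAM_1972_BYTES, REDSHIFT_BYTES)

print(f"memory grew by a factor of {memory_factor:,.0f}")
print(f"clock speed grew by a factor of {clock_factor:,.0f}")
print(f"combined index grew by a factor of {combined_factor:,.0f}")
print(f"program footprint grew by a factor of {program_factor:,.0f}")
```

Under these assumptions the combined index comes out in the tens of millions for a single CPU, the same order of magnitude as the "about 10 million" estimate, while a single application's memory footprint has grown by a factor of roughly three thousand.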

The memes then went on to develop languages and cultures to make it easier to store and pass on memes. Yes, languages and cultures also provided many benefits to the genes as well, but with languages and cultures, the memes were able to begin to evolve millions of times faster than the genes, and the poor genes were left straggling far behind. Given the growing hardware platform of an ever increasing number of Homo sapiens on the planet, the memes then began to cut free of the genes and evolve capabilities on their own that only aided the survival of memes, with little regard for the genes, to the point of even acting in a very detrimental manner to the survival of the genes, like developing the capability for global thermonuclear war and global climate change. The memes have since modified the entire planet. They have cut down the forests for agriculture, mined minerals from the ground for metals, burned coal, oil, and natural gas for energy, releasing the huge quantities of carbon dioxide that their genetic predecessors had sequestered within the Earth, and have even modified the very DNA, RNA, and metabolic pathways of their predecessors.

We can now see these very same processes at work today with the evolution of software. Software is currently being written by memes within the minds of programmers. Nobody ever learned how to write software all on their own. Just as with learning to speak or to read and write, everybody learned to write software by imitating teachers, other programmers, imitating the code written by others, or by working through books written by others. Even after people do learn how to program in a particular language, they never write code from scratch; they always start with some similar code that they have previously written, or others have written, in the past as a starting point, and then evolve the code to perform the desired functions in a Darwinian manner (see How Software Evolves). This crutch will likely continue for another 20 – 50 years, until the day finally comes when software can write itself, but even so, “we” do not currently write the software that powers the modern world; the memes write the software that does that. This is just a reflection of the fact that “we” do not really run the modern world either; the memes in meme-complexes really run the modern world because the memes are currently the dominant form of self-replicating information on the planet. In The Meme Machine, Susan Blackmore goes on to point out that the memes at first coevolved with the genes during their early days, but have since outrun the genes because the genes could simply not keep pace when the memes began to evolve millions of times faster than the genes. The same thing is happening before our very eyes to the memes, with software now rapidly outpacing the memes. Software is now evolving thousands of times faster than the memes, and the memes can simply no longer keep up.

As with all forms of self-replicating information, software began as a purely parasitic mutation within the scientific and technological meme-complexes, initially running onboard Konrad Zuse’s Z3 computer in May of 1941 (see So You Want To Be A Computer Scientist? for more details). It was spawned out of Zuse’s desire to electronically perform calculations for aircraft designs that were previously done manually in a very tedious manner. So initially software could not transmit memes, it could only perform calculations, like a very fast adding machine, and so it was a pure parasite. But then the business and military meme-complexes discovered that software could also be used to transmit memes, and software then entered into a parasitic/symbiotic relationship with the memes. Software allowed these meme-complexes to thrive, and in return, these meme-complexes heavily funded the development of software of ever increasing complexity, until software became ubiquitous, forming strong parasitic/symbiotic relationships with nearly every meme-complex on the planet. In the modern day, the only way memes can now spread from mind to mind without the aid of software is when you directly speak to another person next to you. Even if you attempt to write a letter by hand, the moment you drop it into a mailbox, it will immediately fall under the control of software. The poor memes in our heads have become Facebook and Twitter addicts.

So in the grand scheme of things, the memes have replaced their DNA predecessor, which replaced RNA, which replaced the original self-replicating autocatalytic metabolic pathways of organic molecules as the dominant form of self-replicating information on the Earth. Software is the next replicator in line, and is currently feasting upon just about every meme-complex on the planet, and has formed very strong parasitic/symbiotic relationships with all of them. How software will merge with the memes is really unknown, as Susan Blackmore pointed out in her TED presentation which can be viewed at:

Once established, software then began to evolve based upon the Darwinian concepts of innovation and natural selection, which endowed software with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity. Successful software, like MS Word and Excel, competed for disk and memory address space with WordPerfect and VisiCalc and out-competed these once dominant forms of software to the point of extinction. In less than 70 years, software has rapidly spread across the face of the Earth and outward to every planet of the Solar System and many of its moons, with a few stops along the way at some comets and asteroids. And unlike us, software is now leaving the Solar System for interstellar space onboard the Pioneer 10 & 11 and Voyager 1 & 2 probes.

Currently, software manages to replicate itself with the support of you. If you are an IT professional, then you are directly involved in some, or all of the stages in this replication process, and act sort of like a software enzyme. No matter what business you support as an IT professional, the business has entered into a parasitic/symbiotic relationship with software. The business provides the budget and energy required to produce and maintain the software, and the software enables the business to run its processes efficiently. The ultimate irony in all this is the symbiotic relationship between computer viruses and the malevolent programmers who produce them. Rather than being the clever, self-important, techno-nerds that they picture themselves to be, these programmers are merely the unwitting dupes of computer viruses that trick these unsuspecting programmers into producing and disseminating computer viruses! And if you are not an IT professional, you are still involved with spreading software around because you buy gadgets that are loaded down with software, like smartphones, notepads, laptops, PCs, TVs, DVRs, cars, refrigerators, coffeemakers, blenders, can openers and just about anything else that uses electricity.

The Genes, Memes and Software of War
In times of war, successful meme-complexes appeal primarily to two gene-induced emotions – the desire for social status and the fear of a perceived enemy. Social status in a group of similar DNA survival machines is always a good thing for the replication of genes because it brings with it the necessities of life that are required to maintain a healthy DNA survival machine and also provides for more opportunities for a DNA survival machine to couple with other DNA survival machines and to replicate its genes. Fear of a perceived enemy is another gene-induced emotion because it is a known fact that an enemy can destroy the DNA survival machines that are used to house genes as they move about from place to place.

Meme-complexes can do wonderful things, as is evidenced by the incredible standard of living enjoyed by the modern world, thanks to the efforts of the scientific meme-complex, or the great works of art, music, and literature handed down to us from the Baroque, Classical, and Romantic periods, not to mention the joys of jazz, rock and roll, and the blues. However, other meme-complexes, like the memes of war, can also turn incredibly nasty. Just since the Scientific Revolution of the 17th century we have seen the Thirty Years War (1618 – 1648), the Salem witch hunts (1692), the French Reign of Terror (1793 – 1794), American slavery (1654 – 1865), World War I (all sides) (1914 – 1918), the Stalinist Soviet Union (1929 – 1953), National Socialism (1933 – 1945), McCarthyism (1949 – 1958), Mao’s Cultural Revolution (1966 – 1976), and Pol Pot’s reign of terror (1975 – 1979).

The problem is that when human beings get wrapped up into a horrible meme-complex, they can do horrendous things without even being aware of the fact. This is because in order to survive, the first thing that most meme-complexes do is to use a meme that turns off human thought and reflection. To paraphrase Descartes, “I think, therefore I am” a heretic. So if you ever questioned any of the participants caught up in any of the above atrocious events, you would find that the vast majority would not have any qualms about their deadly activities whatsoever. In fact, they would question your loyalty and patriotism for even bringing up the subject. For example, during World War I there were few dissenters beyond Albert Einstein in Germany and Bertrand Russell in Great Britain, and both suffered the consequences of not being onboard with the World War I meme-complex. Unquestioning blind obedience to a meme-complex through unconditional group-think is definitely a good survival strategy for any meme-complex.

In the modern world, during times of distress, we now see a very interesting interplay between the genes, memes and software of war. This certainly was true during the Arab Spring which began on December 18, 2010 and was made possible by the spreading of the memes of revolution via social media software. The trouble with the memes of war is that, like all meme-complexes, once they are established they are very conservative and not very open to new memes that might jeopardize the ongoing survival of the meme-complex, and consequently, they are very hard to change or eliminate. Remember, every meme-complex is less than one generation away from oblivion. So normally, meme-complexes are very resistant to the Darwinian processes of innovation and natural selection, and just settle down into a state of coexistence with the other meme-complexes that they interact with. But during periods of stress, very violent and dangerous war-like meme-complexes can break out of this equilibrium, rapidly forming a new war-like meme-complex in a manner similar to the Punctuated Equilibrium model of Stephen Jay Gould and Niles Eldredge (1972), which holds that species are usually very stable and in equilibrium with their environment and only rarely change when required.

In times of peace, the genes, memes and software enter into an uneasy alliance of parasitic/symbiotic relationships, but in times of war, this uneasy truce breaks down, as we have again seen in the Middle East. The Middle East is currently plagued by a number of warring religious meme-complexes that are in the process of destroying the Middle East, as did the warring Catholic and Protestant religious meme-complexes of the Thirty Years War (1618 – 1648), which nearly destroyed Europe. But at the same time that the Thirty Years War raged in Europe, people like Kepler, Galileo and Descartes were laying the foundations of the 17th century Scientific Revolution which led to the 18th century European Enlightenment. So perhaps the warring meme-complexes of a region have to eliminate the belligerent genes of the region before rational thought can once again prevail.

Application to the Foreign Policy of the United States
The foreign policy of the United States keeps getting into trouble because Americans do not understand the enduring nature of meme-complexes. Because all successful meme-complexes have survived the rigors of Darwinian natural selection, they are very hardy forms of self-replicating information and not easily dislodged or eliminated once they have become endemic in a region. Yes, by occupying a region it is possible to temporarily suppress what the local meme-complexes can do, but it is very difficult to totally eliminate them from the scene because successful meme-complexes have learned to simply hide when confronted by a hostile intruding meme-complex, only later to reemerge when the hostile meme-complex has gone. The dramatic collapse of South Vietnam in less than two months (March 10 – April 30 1975) after spending more than a decade trying to alter the meme-complexes of the region is evidence of that fact. Similarly, the dramatic collapse of Iraq and Afghanistan after another decade of futile attempts to subdue the local meme-complexes of the region that are thousands of years old is another example of a failed foreign policy stemming from a naïve understanding of the hardiness of meme-complexes. History has taught us that the only way to permanently suppress the local meme-complexes of a region is to establish a permanent empire to rule the region with a heavy hand, and this is something Americans are loath to do, having once freed ourselves from such an empire.

Currently, in the United States the polls are showing that Americans, on one hand, do not want to get involved in the Middle East again, but on the other hand, perceive that the foreign policy of the United States is weak and that we are not showing leadership. Apparently, Americans are now so confused by the varying warring factions in the Middle East that they can no longer even tell who the potential enemy is. This confusion also stems from an old 20th century meme that world leadership equates to military action, which is probably no longer true in the 21st century because the 21st century will be marked by the rise of software to supremacy as the dominant form of self-replicating information on the planet. This self-contradictory assessment troubling the minds of Americans is further exacerbated by an old 20th century meme currently floating about that, if the Middle East should further spin out of control, governmental safe havens will be established for the training of combatants that might again strike the United States as they did with the September 11, 2001 attacks, specifically those on the World Trade Center and the Pentagon. But in the modern world, with the dramatic rise of software, there is no longer a need for physical safe havens in the Middle East to train and equip combatants. Indeed, training combatants to effectively attack modern 21st century countries, and the technology that they rely upon, is best done in locations with modern 21st century technology and good Internet connectivity close at hand. For example, Timothy McVeigh, Terry Nichols and Michael Fortier conspired to conduct the Oklahoma City bombing attack that killed 168 people and injured over 600 on April 19, 1995 by training within the United States itself. Similarly, the September 11, 2001 combatants also trained within the United States prior to the attack. After all, it’s hard to learn how to fly a modern jetliner in a cave.
Ironically, in the 21st century it would actually be a good defensive strategy to try to isolate your enemies to the deserts and caves of the Middle East because deserts and caves have such poor Internet connectivity and access to modern technology.

For example, currently I am employed in the Middleware Operations group of a major U.S. corporation, and I work out of my home office in Roselle, IL, a northwest suburb of Chicago. The rest of our onshore Middleware Operations group is also scattered throughout the suburbs of Chicago and hardly ever goes into our central office for work. And about 2/3 of Middleware Operations works out of an office in Bangalore India. But the whole team can collaborate very effectively in a remote manner using CISCO software. We use CISCO IP Communicator for voice over IP phone conversations and CISCO WebEx for online web-meetings. We use CISCO WebEx Connect for instant messaging and the sharing of desktops to view the laptops of others for training purposes. Combined with standard corporate email, these technologies allow a large group of Middleware Operations staff to work together from locations scattered all over the world, without ever actually being physically located in the same place. In fact, when the members of Middleware Operations do come into the office for the occasional group meeting, we usually just use the same CISCO software products to communicate while sitting in our cubicles, even when we are sitting in adjacent cubicles! After all, the CISCO collaborative software works better than leaning over somebody else’s laptop and trying to see what is going on. I believe that many enemies of the United States now also work together in a very similar distributed manner as a network of agents scattered all over the world. Now that memes can move so easily over the Internet and are no longer confined to particular regions, even the establishment of regional empires will no longer be able to suppress them.

So in the 21st century dominated by software, the only thing that the enemies of the United States really need is money. From a 21st century military perspective, control of territory is now an obsolete 20th century meme because all an enemy really needs is money and the complicit cooperation of worldwide financial institutions to do things like launch cyber-attacks, create and deliver dirty bombs, purchase surface-to-air missiles to down commercial aircraft or purchase nuclear weapons to FedEx to targets. For the modern 21st century economies, it really makes more sense to beef up your Cyber Defense capabilities rather than trying to control territories populated by DNA survival machines infected with very self-destructive war-like meme-complexes that tend to splinter and collapse on their own. So for the present situation, the most effective military action that the United States could take would be to help the world to cut off the money supply to the Middle East by ending the demand for oil and natural gas by converting to renewable sources of energy. This military action would also have the added benefit of preventing many additional future wars fought over the control of Middle Eastern oil and wars that would be induced by global climate change as it severely disrupts the economies of the world (see How to Use Your IT Skills to Save the World and 400 PPM - The Dawn of the SophomorEocene for more details).

Since the “real world” of human affairs only exists in our minds, we can change it by simply changing the way we think by realizing that we are indeed DNA survival machines with minds infected with memes and software that are not necessarily acting in our own best interests. We are sentient beings in a Universe that has become self-aware and perhaps the only form of intelligence in our galaxy. What a privilege! The good news is that conscious intelligence is something new. It is not a mindless form of self-replicating information, bent on replicating at all costs with all the associated downsides of a ruthless nature. We can do much better with this marvelous opportunity once we realize what is really going on. It is up to all of us to make something of this unique opportunity that we can all be proud of – that’s our responsibility as sentient beings.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston

Monday, May 19, 2014

Software Embryogenesis

In my last posting An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer I proposed that the chromatin and chromosomes of multicellular eukaryotic organisms, like ourselves, might have arisen in nature as a common solution to the same problem that IT faced when trying to process 14 miles of magnetic computer tape just to process the rudimentary account data for 50 million customers. The problem was how do you quickly find the account information for a single customer on 14 miles of tape? Multicellular organisms faced this very same challenge when large-scale multicellular organisms first appeared during the Cambrian explosion 530 million years ago. How does each cell in a multicellular organism, consisting of billions or trillions of differentiated cells, find the genes that it needs in order to differentiate into the correct cell type for the tissue that it is forming? For example, humans are made up of about 100 trillion eukaryotic cells, and each cell of the 100 trillion contains about 23,000 genes for coding proteins, stored on a small percentage of the 6 feet of DNA that is carefully wound up around a large number of histone proteins, and is packaged into 23 pairs of chromosomes within each cell, like the magnetic computer tape of yore that was wound up around 2,400-foot reels, and was carefully stored in the tape racks of times gone by. The 100 trillion eukaryotic cells of a human are composed of several hundred different cell types, and each cell at each location within a human body must somehow figure out what kind of cell to become. So how does each differentiated cell within a human being find the proper genes, and at the proper time, to develop into a human baby from a single fertilized egg cell? This is a very perplexing problem because each human being begins as a spherically symmetric fertilized egg cell. 
How can it possibly grow and differentiate into 100 trillion cells, composed of several hundred different cell types, and ultimately forming the myriad varied tissues within a body that perform the functions of life? In biology the study of this incredible feat is called embryogenesis or developmental biology, and this truly amazing process from a data processing perspective is certainly worthy of investigation from an IT perspective.

Human Embryogenesis
Most multicellular organisms follow a surprisingly similar sequence of steps to form a complex body, composed of billions or trillions of eukaryotic cells, from a single fertilized egg. This is a sure sign of some inherited code at work that has been tweaked many times to produce a multitude of complex body plans or phyla by similar developmental processes. Since many multicellular life forms follow a similar developmental theme let us focus, as always, upon ourselves and use the development of human beings as our prime example of how developmental biology works. For IT professionals and other readers not familiar with embryogenesis, it would be best now to view this short video before proceeding:

Medical Embryology - Difficult Concepts of Early Development Explained Simply

Basically, a fertilized egg, or zygote, begins to divide many times over, without the zygote really increasing in size at all. After a number of divisions, the zygote becomes a ball of undifferentiated cells that are all the same and is known as a morula. The morula then develops an interior hollow center called a blastocoel. The hollow ball of cells is known as a blastula and all the cells in the blastula are undifferentiated, meaning that they are all still identical in nature.

Figure 1 – A fertilized egg, or zygote, divides many times over to form a solid sphere of cells called a morula. The morula then develops a central hole to become a hollow ball of cells known as a blastula. The blastula consists of identical cells. When gastrulation begins some cells within the blastula begin to form three layers of differentiated cells – the ectoderm, mesoderm, and endoderm. The above figure does not show the amnion which forms just outside of the infolded cells that create the gastrula. See Figure 2 for the location of the amnion.

The next step is called gastrulation. In gastrulation one side of the blastula breaks symmetry and folds into itself and eventually forms three differentiated layers – the ectoderm, mesoderm and endoderm. The amnion forms just outside of the gastrulation infold.

Figure 2 – In gastrulation three layers of differentiated cells form - the ectoderm, mesoderm and endoderm by cells infolding and differentiating.

Figure 3 – Above is a close up view showing the ectoderm, mesoderm and endoderm forming from the primitive streak.

The cells of the endoderm go on to differentiate into the internal organs or guts of a human being. The cells of the mesoderm, or the middle layer, go on to form the muscles and connective tissues that do most of the heavy lifting. Finally, the cells of the ectoderm go on to differentiate into the external portions of the human body, like the skin and nerves.

Figure 4 – Some examples of the cell types that develop from the endoderm, mesoderm and ectoderm.

Figure 5 – A human being develops from the cells in the ectoderm, mesoderm and endoderm as they differentiate into several hundred different cell types.

This is all incredibly impressive from a data processing perspective. The nagging question in biology has always been if each of the 100 trillion cells in the human body all have the very same 23,000 genes strung out along some small percentage of the 6 feet of DNA found in the 23 pairs of chromosomes of each cell, how do the 100 trillion cells figure out what to do?

Alan Turing’s Morphogenesis Model
In biology the currently accepted paradigm for how the spherically symmetric cells of a blastula differentiate into the 100 trillion cells of the human body, forming very complex tissues and organs, stems from a paper that Alan Turing published in 1952 entitled The Chemical Basis of Morphogenesis. Yes, the very same Alan Turing of early computer science fame. In 1936 Alan Turing developed the mathematical concept of the Turing Machine in On Computable Numbers, with an Application to the Entscheidungsproblem that today underlies the architecture for all modern computers. Alan Turing’s work was completely conceptual in nature, and in the paper he proposed the theoretical concept of a Turing Machine. A Turing Machine was composed of a read/write head and an infinitely long paper tape. On the paper tape was stored a sequential series of 1s and 0s, and the read/write head could move back and forth along the paper tape in a motion based upon the 1s and 0s that it read. The read/write head could also write 1s and 0s to the paper tape as well. In Turing’s paper he mathematically proved that such an arrangement could be used to encode any mathematical algorithm, like multiplying two very large numbers together and storing the result on the paper tape. In many ways a Turing Machine is much like a ribosome reading mRNA and writing out the amino acids of a polypeptide chain that eventually fold up into an operational protein.
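For IT professionals who would like to see the read/write head in action, here is a minimal sketch of a Turing Machine simulator in Python. The rule table, the state names and the choice of binary increment as the example algorithm are illustrative assumptions and are not taken from Turing's paper.

```python
from collections import defaultdict

def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a one-tape Turing Machine until it enters the 'halt' state.

    rules maps (state, symbol_read) -> (symbol_to_write, move, next_state),
    where move is "L", "R" or "N" (no move).
    """
    # A sparse dictionary stands in for the infinitely long paper tape.
    cells = defaultdict(lambda: blank, enumerate(tape))
    head, steps = 0, 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("machine did not halt")
        symbol = cells[head]                       # read
        write, move, state = rules[(state, symbol)]
        cells[head] = write                        # write
        head += {"L": -1, "R": 1, "N": 0}[move]    # move the head
        steps += 1
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)

# Hypothetical rule table: add 1 to a binary number written on the tape.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),   # scan right to the end of the number
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),   # found the end, start carrying
    ("carry", "1"): ("0", "L", "carry"),   # 1 + 1 = 0, carry moves left
    ("carry", "0"): ("1", "N", "halt"),    # 0 + 1 = 1, done
    ("carry", "_"): ("1", "N", "halt"),    # ran off the left edge, new digit
}

print(run_turing_machine(INCREMENT, "1011"))
```

The machine itself knows nothing about arithmetic. All of the behavior lives in the rule table, and swapping in a different table yields a different algorithm on the same read/write head and tape.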

Figure 6 - A Turing Machine had a read/write head and an infinitely long paper tape. The read/write head could read instructions on the tape that were encoded as a sequence of 1s and 0s, and could write out the results of following the instructions on the paper tape back to the tape as a sequence of 1s and 0s.

Figure 7 – A ribosome read/write head behaves much like the read/write head of a Turing Machine. The ribosome reads an mRNA tape that was transcribed earlier from a section of DNA tape that encodes the information in a gene. The ribosome read/write head then reads the A, C, G, and U nucleobases that code for amino acids three at a time. As each three-base codon, a kind of 3-letter byte, is read on the mRNA tape, the ribosome writes out an amino acid to a growing polypeptide chain, as tRNA units bring in one amino acid at a time. The polypeptide chain then goes on to fold up into a 3-D protein molecule.

In a sense, all modern computers are loosely based upon the concept of a Turing Machine. Turing did not realize it, but at the same time he was formulating the concept of a Turing Machine back in 1936, Konrad Zuse was constructing his totally mechanical Z1 computer in the bathroom of his parents’ apartment in Germany, and the Z1 really did use a paper tape to store the program and data that it processed, much like a Turing Machine. Neither one of these early computer pioneers had any knowledge of the other at the time. For more about how Konrad Zuse not only independently developed many of Alan Turing’s mathematical concepts, but also implemented them in practical terms in the form of the world’s very first real computers, see the following article that was written in his own words:

Figure 8 - A reconstructed mechanical Z1 computer completed by Konrad Zuse in 1989. The Z1 was not a full-fledged modern computer, like Zuse’s Z3 computer that became operational in May of 1941, because it read programs from a punched tape that were not stored in the mechanical memory of the Z1. In that regard, the Z1 was more like a Turing Machine than are modern computers.

Now back to Turing and morphogenesis. The essential element of Turing’s model for the development of embryos was the concept of a morphogen. A morphogen is an organic molecule that is found within the cells of an embryo or diffuses between the cells of an embryo and that can affect embryonic development. In Turing’s model when a morphogen reached a critical concentration it could activate or inhibit some of the genes in the growing embryo that controlled the differentiation and migration of the embryonic cells. Today we call morphogens that are secreted by one cell and diffuse to other nearby cells paracrine factors, and they are primarily protein molecules that are generated by the cells of a developing embryo. The problem with morphogenesis was that if all the cells in the hollow ball of cells that formed a blastula were identical, how could embryonic development get initiated? Turing proposed that there would naturally be some slight variations in the concentrations of the morphogens from place to place along the surface of the blastula, and eventually, these variations, or instabilities, in the concentrations of the morphogen molecules would naturally cause the blastula to break spherical symmetry. It’s like trying to balance a pencil on its point. Initially the pencil stands straight up, and is totally symmetric with respect to all directions. But eventually the pencil will fall down, due to a slight instability, and then it will point in some specific direction, like due north.

Turing proposed that the spherical symmetry of the blastula could be broken in a similar manner, by means of varying diffusion rates of the morphogen molecules. For example, suppose the genes within a human being can generate two proteins A and B. Protein A can enhance the generation of protein A itself, and can also enhance the generation of another protein B by epigenetic means, like binding to the promoter sections of the DNA for the genes that make proteins A and B. Now suppose that protein B can also inhibit the production of protein A by similar means and that protein B is a smaller molecule that diffuses faster than protein A. A negative feedback loop will then develop between proteins A and B. If protein A increases, it will enhance the production of protein B in the nearby cells of the blastula, which will then inhibit the production of protein A in the local vicinity, and consequently, will help to keep the local production of protein A in check. Proteins A and B will then arrive at some equilibrium level that never changes due to the controlling negative feedback loops operating in the vicinity of the cells. But what if in one isolated section of the blastula an instability should develop, and the concentration of protein A spontaneously peaks far above normal? This will produce more of protein A in the neighboring cells, and also more of protein B too because protein A enhances the production of both proteins A and B. But because protein B can diffuse faster than protein A, the concentration level of protein B at some distance from the protein A peak will be higher than normal and will suppress the production of protein A in a surrounding ring centered upon the protein A peak. The end result will be a protein A peak surrounded by a ring of protein B, like the foothills around a mountain peak. This will break the spherical symmetry of the blastula because now we no longer have constant concentrations of protein A and B throughout the blastula. 
Once the spherical symmetry of the blastula has been broken an explosive cascade of logical operations is unleashed as a torrent of morphogens, or paracrine factors, is released in a large number of parallel chain reactions that transform the spherically symmetric blastula into a highly nonsymmetrical human being with great rapidity.
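The role of the faster-diffusing protein B can be checked with a little arithmetic. Below is a minimal sketch in Python of a linear stability analysis for a generic two-morphogen model. It is not a simulation of a real embryo. The reaction strengths in the matrix (including a natural decay term for protein B, which the description above does not mention) and the diffusion rates are illustrative assumptions, chosen only to match the signs described above: A enhances A and B, while B inhibits A. The function returns the growth rate of a small ripple in concentration with spatial wavenumber k; a negative rate means the ripple dies away, and a positive rate means it grows into a pattern.

```python
import cmath

# Assumed reaction strengths near equilibrium (rows: effect on A, effect on B).
#   A enhances A (+1.0)   B inhibits A (-1.0)
#   A enhances B (+2.0)   B decays     (-1.5)
REACTIONS = ((1.0, -1.0), (2.0, -1.5))

def growth_rate(k, diffusion_a, diffusion_b, reactions=REACTIONS):
    """Largest growth rate of a ripple of wavenumber k in the linearized model."""
    (aa, ab), (ba, bb) = reactions
    # Diffusion smooths a ripple at a rate proportional to k squared.
    m11 = aa - diffusion_a * k * k
    m22 = bb - diffusion_b * k * k
    trace = m11 + m22
    det = m11 * m22 - ab * ba
    # Eigenvalues of a 2x2 matrix from its trace and determinant.
    root = cmath.sqrt(trace * trace - 4 * det)
    return max(((trace + root) / 2).real, ((trace - root) / 2).real)

wavenumbers = [i / 10 for i in range(0, 31)]
same_speed = max(growth_rate(k, 1.0, 1.0) for k in wavenumbers)
fast_b = max(growth_rate(k, 1.0, 10.0) for k in wavenumbers)
print("B diffuses as fast as A:", same_speed)
print("B diffuses 10x faster:  ", fast_b)
```

With these assumed numbers the first value is negative and the second is positive. When both proteins diffuse at the same rate, the negative feedback loop damps every ripple and the blastula stays symmetric. When protein B diffuses ten times faster, ripples of an intermediate wavelength grow, which is the peak-and-foothill pattern described above.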

Figure 9 – A spontaneous spike in the concentration of protein A can create a permanent peak of protein A surrounded by a foothill ring of protein B and break the spherical symmetry of the hollow ball of cells that form a blastula.

The huge expanse of logical operations that are encoded in the network of genes, combined with epigenetic information that is encoded within the structures of the chromosomes themselves, is quite remarkable because not only do these operations have to rapidly develop the embryo into a viable organism that can live on its own, but they also have to keep the growing embryo alive at all stages of development, as it greatly changes in size, shape and function.

Figure 10 – The cascades of morphogens, or paracrine factors, rapidly change the spherical blastula into a highly nonsymmetrical human being.

Many of the morphogen, or paracrine factor, cascades are very similar for all multicellular organisms, leading to very similar patterns of development, a sure sign that inherited reusable code is in action.

Figure 11 – The development of embryos across species is remarkably similar because of the reuse of the code found within the cascades of morphogens, or paracrine factors.

I recently finished the ninth edition of Developmental Biology (2010) by Scott F. Gilbert, a 711 page college textbook. Now that I am 62 years old, I frequently like to read current college textbooks from cover to cover, without the least worry about problem sets or exams. The striking realization that I came to from reading this textbook was that for IT professionals struggling with the new SOA architecture, it is critical to focus upon the network of logical operations performed by the billions of Objects that the SOA architecture generates, and to focus less upon the individual methods within any given Object. There will be more on that in the next section.

The Development of Embryos in Commercial Software
With the advent of SOA (Service Oriented Architecture) about 10 years ago we have seen the evolution of a very similar pattern of embryogenesis in commercial software. For more about SOA please see:

Service-oriented architecture

As I have mentioned in many previous softwarephysics postings, commercial software has been evolving about 100 million times faster than biological software over the last 70 years, or 2.2 billion seconds, ever since Konrad Zuse cranked up his Z3 computer in May of 1941, and the architectural history of commercial software has essentially recapitulated the evolutionary history of life on Earth over this same period of time through a process of convergence. Over the years, the architecture of commercial software has passed through a very lengthy period of prokaryotic architecture (1941 – 1972), followed by a period of single-celled eukaryotic architecture (1972 – 1992). Multicellular organization took off next with the Object-Oriented revolution of the early 1990s, especially with the arrival of Java in 1995. About 10 years ago, commercial software entered into a Cambrian explosion of its own with the advent of SOA (Service Oriented Architecture) in which large-scale multicellular applications first appeared, chiefly in the form of high-volume corporate websites. For more on this see the SoftwarePaleontology section of SoftwareBiology.

Object-Oriented Programming Techniques Allow for the Multicellular Organization of Software
Before proceeding with the development of embryos in commercial software, we first need to spend some time exploring how multicellular organization is accomplished in commercial software. Multicellular organization in commercial software is based upon the use of Object-Oriented programming languages. Object-Oriented programming actually began in 1962, but it did not catch on at first. In the late 1980s, the use of the very first significant Object-Oriented programming language, known as C++, started to appear in corporate IT, but Object-Oriented programming really did not become significant in IT until 1995 when both Java and the Internet Revolution arrived at the same time. The key idea in Object-Oriented programming is naturally the concept of an Object. An Object is simply a cell. Object-oriented languages use the concept of a Class, which is a set of instructions for building an Object (cell) of a particular cell type in the memory of a computer. Depending upon whom you cite, there are several hundred different cell types in the human body, but in IT we generally use many thousands of cell types or Classes in commercial software. For a brief overview of these concepts go to the webpage below and follow the links by clicking on them.

Lesson: Object-Oriented Programming Concepts

A Class defines the data that an Object stores in memory and also the methods that operate upon the Object data. Remember, an Object is simply a cell. Methods are like biochemical pathways that consist of many steps or lines of code. A public method is a biochemical pathway that can be invoked by sending a message to a particular Object, like using a ligand molecule secreted from one Object to bind to the membrane receptors on another Object. This binding of a ligand to a public method of an Object can then trigger a cascade of private internal methods within an Object or cell.

Figure 12 – A Class contains the instructions for building an Object in the memory of a computer and basically defines the cell type of an Object. The Class defines the data that an Object stores in memory and also the methods that can operate upon the Object data.

Figure 13 – Above is an example of a Bicycle Object. The Bicycle Object has three data elements - speed in mph, cadence in rpm, and a gear number. These data elements define the state of a Bicycle Object. The Bicycle Object also has three methods – changeGears, applyBrakes, and changeCadence that can be used to change the values of the Bicycle Object’s internal data elements.

Figure 14 – Above is some very simple Java code for a Bicycle Class. Real Class files have many data elements and methods and are usually hundreds of lines of code in length.
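To make the cell analogy concrete, here is a minimal sketch of the Bicycle Class described in Figures 13 and 14, written in Python rather than Java for brevity. The three data elements and the three public methods follow the description in Figure 13. The private helper method and the gearing formula are illustrative assumptions added only to show a public method triggering an internal method, and are not part of the original figures.

```python
class Bicycle:
    """A Class: the instructions for building Bicycle Objects (cells)."""

    def __init__(self):
        # The state of one Bicycle Object.
        self.speed = 0     # mph
        self.cadence = 0   # rpm
        self.gear = 1

    # Public methods: the "membrane receptors" that other Objects can call.
    def change_gears(self, new_gear):
        self.gear = new_gear
        self._recompute_speed()

    def change_cadence(self, new_cadence):
        self.cadence = new_cadence
        self._recompute_speed()

    def apply_brakes(self, decrement):
        self.speed = max(0, self.speed - decrement)

    # Private internal method: an internal pathway triggered by a public call.
    def _recompute_speed(self):
        # Hypothetical gearing rule, for illustration only.
        self.speed = self.gear * self.cadence // 10


# Two Objects instantiated from one Class, like two cells of one cell type.
commuter = Bicycle()
racer = Bicycle()
racer.change_gears(3)
racer.change_cadence(80)
print(commuter.speed, racer.speed)
```

Each Object keeps its own state, so changing the gear on one Bicycle Object has no effect on the other, just as two cells of the same cell type carry the same instructions but can be in different internal states.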

Figure 15 – Many different Objects can be created from a single Class just as many cells can be created from a single cell type. The above List Objects are created by instantiating the List Class three times and each List Object contains a unique list of numbers. The individual List Objects have public methods to insert or remove numbers from the Objects and also an internal sort method that could be called whenever the public insert or remove methods are called. The internal sort method automatically sorts the numbers in the List Object whenever a number is added or removed from the Object.

Figure 16 – Objects communicate with each other by sending messages. Really one Object calls the exposed public methods of another Object and passes some data to the Object it calls, like one cell secreting a ligand molecule that then plugs into a membrane receptor on another cell.

Figure 17 – In Turing’s model cells in a growing embryo communicate with each other by sending out ligand molecules called morphogens or paracrine factors that bind to membrane receptors on other cells.

Figure 18 – Calling a public method of an Object can initiate the execution of a cascade of private internal methods within the Object. Similarly, when a paracrine factor molecule plugs into a receptor on the surface of a cell, it can initiate a cascade of internal biochemical pathways. In the above figure an Ag protein plugs into a BCR receptor and initiates a cascade of biochemical pathways or methods within a cell.

Embryonic Growth and Differentiation of a High-Volume Corporate Website
When a high-volume corporate website, consisting of many millions of lines of code running on hundreds of servers, starts up and begins taking traffic, billions of Objects (cells) begin to be instantiated in the memory of the servers in a matter of minutes and then begin to exchange messages with each other in order to perform the functions of the website. Essentially, when the website boots up, it quickly grows to a mature adult through a period of very rapid embryonic growth and differentiation, as billions of Objects are created and differentiated to form the tissues of the website organism. These Objects then begin exchanging messages with each other by calling public methods on other Objects to invoke cascades of private internal methods which are then executed within the called Objects.

For example, today most modern high-volume corporate websites use the MVC pattern – the Model-View-Controller pattern. In the mid-1990s IT came upon the concept of application patterns. An application pattern is basically a phylum, a basic body plan for an application, and the MVC pattern is the most familiar. For example, when you order something from Amazon, you are using an MVC application. The Model is the endoderm or “guts” of the application that stores all of the data on tables in relational databases. A database table is like an Excel spreadsheet, containing many rows of data, and each table consists of a number of columns with differing datatypes and sizes. For example, there may be columns containing strings, numbers and dates of various sizes in bytes. Most tables will have a Primary Key, like a CustomerID, that uniquely identifies each row of data. By joining tables together via their columns it is possible to create composite rows of data. For example, by combining all the rows in the Customers table with the rows in the Orders table via the CustomerID column in each table, it is possible to find information about all of the orders a particular customer has placed. Amazon has a database Model consisting of thousands of tables of data that store all of the information about their products on millions of rows, like descriptions of the products and how many they have in stock, as well as tables on all of their orders and customers. The View in an MVC application comprises the ectoderm tissues of the application and defines how the application looks to the end-user from the outside. The View consists mainly of screens and reports. When you place an order with Amazon, you do so by viewing their products online and then filling out webpage screens with data. When you place an order, the View code takes in the data and validates it for errors. 
Reports are static output, like the final webpage you see with your order information and the email you receive confirming your order. The Controller code of an MVC application forms the muscular mesoderm connective tissue that connects the View (ectoderm) layer to the Model (endoderm) layer and does most of the heavy lifting. The Controller code has to connect the data from the Model and format it into the proper View that the end-user sees on the surface. The Controller code also has to take the data from the View and create orders from it and send instructions to the warehouse to put the order together. The Controller has to also do things like debit your credit card. So as you can see, Controller code, like mesoderm, is the most muscular code and also the most costly to build.
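The three layers described above can be sketched in a few lines of Python using the sqlite3 module from the standard library. This is a toy illustration under stated assumptions, not how a real retailer structures its code: the Customers and Orders tables follow the example in the text, but the sample rows and the function names build_model, orders_for, render and show_orders are hypothetical.

```python
import sqlite3

def build_model():
    """Model (endoderm): the data, held in relational tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY,
                             CustomerID INTEGER, Item TEXT);
        INSERT INTO Customers VALUES (1, 'Ada'), (2, 'Alan');
        INSERT INTO Orders VALUES (10, 1, 'tape reel'),
                                  (11, 2, 'notebook'),
                                  (12, 1, 'punch cards');
    """)
    return db

def orders_for(db, customer_id):
    """Model query: join Customers to Orders via the CustomerID column."""
    cursor = db.execute(
        "SELECT c.Name, o.Item FROM Customers c "
        "JOIN Orders o ON o.CustomerID = c.CustomerID "
        "WHERE c.CustomerID = ? ORDER BY o.OrderID",
        (customer_id,),
    )
    return cursor.fetchall()

def render(rows):
    """View (ectoderm): what the end-user sees on the surface."""
    return "\n".join(f"{name} ordered {item}" for name, item in rows)

def show_orders(db, customer_id):
    """Controller (mesoderm): connects the Model to the View."""
    return render(orders_for(db, customer_id))

db = build_model()
print(show_orders(db, 1))
```

The View never touches the database and the Model knows nothing about formatting. All of the traffic between them passes through the Controller, which is why that layer ends up doing most of the heavy lifting.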

Figure 19 – The endoderm of an MVC application forms the “guts” of the application and consists of a large number of database Objects that hold the selected data from thousands of relational database tables.

Figure 20 – An online order screen is displayed by Objects in your browser that form the ectoderm layer of an MVC application. The information on the screen comes from HTML that is sent to your browser from the middleware (mesoderm) layer of an MVC application.

The mesoderm layer of website MVC applications runs on middleware that lies between the ectoderm Objects running on an end-user’s browser and the database Objects running on the endoderm layer. Figure 21 shows that the middleware (mesoderm) layer is composed of components that are protected by several layers of firewalls to ward off attacks from the outside. The middleware feeds HTML to the end-user’s browser that kicks off Objects within the end-user’s browser that display the HTML as webpages.

Figure 21 - The mesoderm layer of website MVC applications runs on middleware that lies between the ectoderm Objects running on an end-user’s browser and the database Objects running on the endoderm layer. The middleware (mesoderm) layer is composed of components that are protected by several layers of firewalls to ward off attacks from the outside. The middleware feeds HTML to the end-user’s browser that kicks off Objects within the end-user’s browser that display the HTML as webpages.

This is accomplished with mesoderm middleware running on J2EE Application Server software like IBM’s WebSphere or Oracle’s WebLogic. A J2EE Application Server contains a WEB Container that stores pools of Servlet Objects and an EJB Container that stores pools of EJB Objects (see Figure 22). The EJB Objects get data from relational databases (DB) and process the data and then pass the information on to Servlet Objects. The Servlet Objects generate HTML based upon the data processed by the EJB Objects and pass the HTML to HTTP webservers like Apache. The HTTP webservers then send out the HTML to the Objects in your browser to be displayed upon your PC or smartphone. When you fill out an order screen on your PC to purchase an item, the flow of information is reversed and ultimately updates the data in the relational databases (DB). Each J2EE Application Server runs in its own JVM (Java Virtual Machine), and a modern high-volume corporate website might be powered by thousands of J2EE Application Servers in JVMs, running on dozens of physical servers, and each J2EE Application Server might contain millions of Objects.

With SOA (Service Oriented Architecture) some of the J2EE Application Servers run in Service Cells that provide basic services to other J2EE Application Servers running in Consumer Cells. The Objects in Service Cells perform basic functions, like looking up a customer’s credit score or current account information, and provide the information as a service via SOAP or REST calls to Objects in Consumer Cells. Essentially, the Objects in a Service Cell of J2EE Application Servers perform the services that the cells in an organ, like the lungs or kidneys perform, for other somatic cells elsewhere in the body of an organism.

Figure 22 - A J2EE Application Server contains a WEB Container that stores pools of Servlet Objects and an EJB Container that stores pools of EJB Objects. The EJB Objects get data from relational databases (DB), process the data, and then pass the information to Servlet Objects. The Servlet Objects generate HTML based upon the data processed by the EJB Objects and pass the HTML to HTTP webservers like Apache.

As you can see the middleware mesoderm tissues, composed of billions of Objects running on thousands of J2EE Application Server JVMs, do most of the heavy lifting in the MVC applications running on high-volume corporate websites. This is accomplished by running the middleware software on banks of clustered servers with load balancers between each layer that spray traffic to the next layer. This allows for great flexibility, and allows MVC applications to scale to any load by simply adding more servers to each layer to handle more Objects in the middleware mesoderm tissues.

Figure 23 – As you can see the middleware mesoderm tissues, composed of billions of Objects, do most of the heavy lifting in the MVC applications running on high-volume corporate websites. This is accomplished by running the middleware software on banks of clustered servers with load balancers between each layer that spray traffic to the next layer. This allows for great flexibility, and allows MVC applications to scale to any load by simply adding more servers to each layer to handle more Objects in the middleware mesoderm tissues.
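The idea of a load balancer spraying traffic across a bank of clustered servers can be sketched as follows. This is a minimal round-robin sketch in Python under stated assumptions: the LoadBalancer class and the server names are hypothetical, and commercial load balancers use more elaborate policies than simple rotation.

```python
class LoadBalancer:
    """Spray incoming requests across a bank of servers in rotation."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._count = 0   # number of requests routed so far

    def add_server(self, name):
        # Scaling out: more servers in the layer, no change to the callers.
        self.servers.append(name)

    def route(self, request):
        # The request itself is ignored; only its arrival order matters here.
        server = self.servers[self._count % len(self.servers)]
        self._count += 1
        return server


balancer = LoadBalancer(["jvm1", "jvm2"])
first_wave = [balancer.route(n) for n in range(4)]
balancer.add_server("jvm3")
second_wave = [balancer.route(n) for n in range(3)]
print(first_wave, second_wave)
```

Because the callers only ever talk to the load balancer, servers can be added to a layer without changing any of the code in the layers above or below it.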

When you log in to a high-volume corporate website many thousands of Objects are created for your particular session. These Objects consume a certain amount of memory on the banks of servers in each layer of middleware, and this can lead to problems. For example, one of the most common forms of software disease is called an OOM (Out Of Memory) condition. As I mentioned previously, there are several billion Objects (cells) running at any given time within the middleware mesoderm tissues of a major corporation. These Objects are created and destroyed as users log in and then later leave a corporate website. These Objects reside in the JVMs of J2EE Appservers. These JVMs run a “garbage collection” task every few minutes to release the memory used by the “dead” Objects left behind by end-users who have logged off the website. The garbage collection task frees up memory in the JVM so that new “live” Objects can be created for new end-users logging into the website. In biology the programmed death of cells is called apoptosis. For example, between 50 and 70 billion cells die each day due to apoptosis in the average human adult, and when apoptosis fails, it can be a sign of cancer and the uncontrolled growth of tumor cells. Similarly, sometimes, for seemingly unknown reasons, Objects in the JVMs refuse to die, and begin to proliferate in a neoplastic and uncontrolled manner, similar to the cells in a cancerous tumor, until the JVM finally runs out of memory and can no longer create new Objects. The JVM essentially dies at this point and generates a heap dump file. MidOps has a tool that allows us to look at the heap dump of the JVM that died. The tool is much like the microscope that my wife used to look at the frozen and permanent sections of a biopsy sample when she was a practicing pathologist. The heap dump will show us information about the tens of millions of Objects that were in the JVM at the time of its death. 
Object counts in the heap dump will show us which Objects metastasized, but will not tell us why they did so. So after a lot of analysis by a lot of people, nobody can really figure out why the OOM event happened and that does not make IT management happy. IT management always wants to know what the “root cause” of the problem was so that we can remove it. I keep trying to tell them that it is like trying to find the “root cause” of a thunderstorm! Yes, we can track the motions of large bodies of warm and cold air intermingling over the Midwest, but we cannot find the “root cause” of a particular thunderstorm over a particular suburb of Chicago because the thunderstorm is an emergent behavior of a complex nonlinear network of software Objects. See Software Chaos for more details.
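To make the point about Object counts concrete, here is a minimal Python sketch of the kind of comparison a heap dump analysis boils down to. It assumes we have two per-class Object counts, one from a healthy baseline and one from the heap dump taken at the time of the OOM; the class names and numbers are invented for illustration and do not come from any real tool or incident.

```python
from collections import Counter

def fastest_growing(before, after, top=3):
    """Return the classes whose Object counts grew the most between two snapshots."""
    growth = Counter({cls: after[cls] - before.get(cls, 0) for cls in after})
    return [(cls, n) for cls, n in growth.most_common(top) if n > 0]

# Hypothetical class names and counts, invented for illustration only.
snapshot_healthy = {"SessionBean": 40_000, "CartItem": 120_000, "String": 900_000}
snapshot_at_oom = {"SessionBean": 41_000, "CartItem": 9_500_000, "String": 1_100_000}

suspects = fastest_growing(snapshot_healthy, snapshot_at_oom)
print(suspects)
```

A listing like this points at which Objects proliferated, but, just as described above, it says nothing about why they did so.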

Do Biologists Have It All Wrong?
As we have seen, the evolution of the architecture of commercial software over the past 70 years, or 2.2 billion seconds, has closely followed the same architectural path that life on Earth followed over the past 4 billion years. The reason for this is that both commercial software and living things are forms of self-replicating information. For more on that see A Brief History of Self-Replicating Information. Because both commercial software and living things have converged upon very similar solutions to combat the second law of thermodynamics in a nonlinear Universe, I contend that the study of commercial software by biologists would provide just as much research value as studying any alien life forms that we might eventually find on Mars, or the moons Europa, Enceladus or Titan, if we should ever get there. And why bother, when all you have to do is spend some time in the nearby IT department of a major corporation in the city where your university resides?

Based upon this premise, I would like to highlight some of Professor Dick Gordon’s work on embryogenesis which goes well beyond Alan Turing’s theory that gradients of morphogens are solely responsible for embryonic growth and differentiation. I recently attended the 2014 winter session of Professor Gordon’s Embryo Physics course which met every Wednesday afternoon at 4:00 PM CT in a Second Life virtual world session. I would highly recommend this course to all IT professionals willing to think a bit outside of the box. For more on this very interesting ongoing course please see:

Embryogenesis Explained

Dick Gordon’s theory of embryogenesis is best explained by his The Hierarchical Genome and Differentiation Waves - Novel Unification of Development, Genetics and Evolution (1999). But here is a nice summary of his theory that he presented as a lecture to the students of the Embryo Physics class on March 20, 2012:

Cause and Effect in the Interaction between Embryogenesis and the Genome

Basically, his theory of morphogenesis is that the genes in the genomes of multicellular organisms that control embryogenesis are organized in a hierarchical manner and that as mechanical differentiation waves pass through the cells of a growing embryo, they trigger cascades of epigenetic events within the cells of the embryo that cause them to split along differentiation trees. As more and more differentiation waves impinge upon the differentiating cells of an embryo, the cells continue down continuously bifurcating differentiation trees. This model differs significantly from Alan Turing’s model of morphogenesis that relies upon the varying diffusion rates of morphogens, creating chemical gradients that turn genes on and off. In Dick Gordon’s model it is the arrival of mechanical expansion and contraction waves at each cell that determines how the cell will differentiate by turning specific genes on and off in cascades, and consequently, the differentiation waves determine what proteins each cell ultimately produces and in what quantities. In his theory, each cell has a ring of microtubules that essentially performs the functions of a seismometer that senses the passage of differentiation waves and is called a cell state splitter. When an expansion differentiation wave arrives, it causes the cell state splitter to expand, and when a contraction differentiation wave arrives, it causes the cell state splitter to contract. The expansion or contraction of the cell state splitter then causes the nucleus of the cell to distort in a similar manner.

Figure 24 – A circular ring of microfilaments performs the functions of a seismometer that senses the passage of expansion and contraction differentiation waves passing by and is called a cell state splitter. The cell state splitter then passes along the signal to the nucleus of the cell (From slide 75 of Dick Gordon’s lecture).

The distortion of the cell’s nucleus then causes one of two possible gene cascades to fire within the cell. Dick Gordon calls this binary logical operation the nuclear state splitter.

Figure 25 – Changes in the cell state splitter seismometer, caused by a passing contraction or expansion differentiation wave, trigger the nuclear state splitter to fire in one of two possible gene cascades (From slide 96 of Dick Gordon’s lecture).

Figure 26 – Groups of cells of one cell type bifurcate along differentiation tree paths when a cell state splitter seismometer fires (From slide 51 of Dick Gordon’s lecture).

Figure 27 – As each contraction or expansion wave impinges upon a cell it causes the cell to split down one branch or the other of a differentiation tree by launching a gene cascade within the cell (From slide 52 of Dick Gordon’s lecture).
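For IT readers, the binary logic described above can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the labels "E" and "C" for the two branches, the root name "egg", and the path notation are all hypothetical and are not taken from Dick Gordon’s work. Each arriving wave simply moves the cell one level further down a bifurcating tree.

```python
def differentiate(waves, root="egg"):
    """Walk a binary differentiation tree, taking one branch per arriving wave."""
    branch_for = {"expansion": "E", "contraction": "C"}  # hypothetical branch labels
    cell_type = root
    for wave in waves:
        if wave not in branch_for:
            raise ValueError(f"unknown wave type: {wave!r}")
        # Each wave fires one of two gene cascades, i.e. picks one of two branches.
        cell_type = f"{cell_type}/{branch_for[wave]}"
    return cell_type

cell_one = differentiate(["contraction", "expansion", "expansion"])
cell_two = differentiate(["contraction", "expansion", "contraction"])
print(cell_one)
print(cell_two)
```

Two cells that share the first two waves but differ in the third end up on sibling leaves of the same subtree, which is the behavior that Figure 26 and Figure 27 depict.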

The distinguishing characteristic of Dick Gordon’s model is that the information needed by a cell to differentiate properly is not biochemically passed from one cell to another. Rather the information is transmitted via differentiation waves. For example, in Alan Turing’s morphogen model, morphogenic proteins are passed from one cell to another as paracrine factors that diffuse from one cell to another along a gradient. Or the morphogenic proteins pass directly from one cell to another across their cell membranes. Or morphogen-generating cells are transported to other sites within an embryo as the embryo grows and then do both of the above.

Figure 28 – In Alan Turing’s model for morphogenesis the information necessary for a cell to differentiate properly is passed from cell to cell purely in a biochemical manner (From slide 92 of Dick Gordon’s lecture).

In Dick Gordon’s model it is the passage of differentiation waves that transmits the information required for cells to differentiate. In Figure 29 we see that as a differentiation wave passes by and each cell gets squished, the cell state splitter of the cell launches a cascade of genes that generate internal morphogenic proteins within the cell that cause the cell to differentiate. Superficially, this leaves one with the false impression that there is a gradient of such morphogenic proteins in play.

Figure 29 – As a differentiation wave passes by, each cell gets squished, and the cell state splitter of the cell launches a cascade of genes that generate internal morphogenic proteins within the cell (From slide 93 of Dick Gordon’s lecture).

I am certainly not a biologist, but from the perspective of an IT professional and a former exploration geophysicist, I find that Dick Gordon’s theory merits further investigation for a number of reasons.

1. From an IT Perspective it seems that the genomes of eukaryotic multicellular organisms may have adopted hierarchical indexed access methods to locate groups of genes and to turn them on or off during embryogenesis
As I pointed out in An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer, IT professionals had a very similar problem when trying to find a particular customer record out of 50 million customer records stored on 14 miles of magnetic tape back in the 1970s. To overcome that problem, IT moved the customer data to disk drives and invented hierarchical indexed access methods using hierarchical indexes, like ISAM and VSAM to quickly find the single customer record. Today we store all commercial data on relational databases, but those relational databases still use hierarchical indexes under the hood. For example, suppose you have 200 customers, rather than 50 million, and would like to find the information on customer 190. If the customer data were stored as a sequential file on magnetic tape, you would have to read through the first 189 customer records before you finally got to customer 190. However, if the customer data were stored on a disk drive, using an indexed sequential access method like ISAM or VSAM, you could get to the customer after 3 reads that get you to the leaf page containing records 176 – 200, and you would only have to read 14 records on the leaf page before you got to record 190. Similarly, the differentiating cells within a growing embryo must have a difficult time finding the genes they need among the 23,000 genes of our genome that are stored on some small percentage of the 6 feet of DNA tape in each cell. So it is possible that the chromatin and chromosomes of the eukaryotic cells found within multicellular organisms provide for a hierarchical indexed access method to locate groups of genes and individual genes, and that the passage of Dick Gordon’s differentiation waves provide for the epigenetic means to initiate the hierarchical indexed access methods needed to differentiate the cells of a growing embryo. Comparing Figure 27 and Figure 30 you will find that they both form hierarchical tree structures. 
In Figure 30 if you think of the intermediate levels as being composed of gene cascades, rather than pointers to customer records, you essentially get an upside down version of Dick Gordon’s Figure 27.

Figure 30 – To find customer 190 out of 200 on a magnetic tape would require sequentially reading 189 customer records. Using the above hierarchical B-Tree index would only require 3 reads to get to the leaf page containing records 176 – 200. Then an additional 14 reads would get you to customer record 190.
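The arithmetic in the customer 190 example can be checked with a small Python sketch of a hierarchical index. It assumes 25 records per leaf page (matching the 176 – 200 leaf page above) and a fanout of 4 entries per index page, which is my own choice made so that 200 customers produce a three-level tree; real ISAM and VSAM implementations are far more elaborate than this.

```python
PAGE_SIZE = 25  # records per leaf page, so 200 customers fill 8 leaf pages
FANOUT = 4      # assumed number of entries per index page

def high_key(page):
    """Highest customer number reachable through a page."""
    last = page[-1]
    return last[0] if isinstance(last, tuple) else last

def build_index(n_customers, page_size=PAGE_SIZE, fanout=FANOUT):
    """Build leaf pages, then stack index levels on top until one root remains."""
    level = [list(range(start, min(start + page_size, n_customers + 1)))
             for start in range(1, n_customers + 1, page_size)]
    while len(level) > 1:
        level = [[(high_key(child), child) for child in level[i:i + fanout]]
                 for i in range(0, len(level), fanout)]
    return level[0]

def lookup(root, key):
    """Return (page reads to reach the leaf, records read on the leaf before the key)."""
    page, page_reads = root, 1          # reading the root is the first page read
    while isinstance(page[0], tuple):   # index pages hold (high key, child) entries
        page = next(child for high, child in page if key <= high)
        page_reads += 1
    return page_reads, page.index(key)

root = build_index(200)
reads, scanned = lookup(root, 190)
print(reads, scanned)   # page reads to the leaf, then records scanned on the leaf
```

Compared with the 189 sequential reads needed on tape, the index gets to the right leaf page in 3 reads and then scans 14 records on that page.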

2. Where is the system clock that controls embryogenesis?
The current paradigm seems to rely heavily upon Alan Turing's theory of chemical morphogenesis, where cascades of regulatory proteins are generated at different foci within a growing embryo, and the concentration gradients of these regulatory proteins turn genes on and off in neighboring cells by enhancing or suppressing the promoters of genes. Based upon a great deal of experimental work, I think much of this may be true at a superficial level. But based upon my IT experience, I also think something is missing. Alan Turing's theory of chemical morphogenesis is very much like the arrays of logic gates on a CPU chip that come together to form a basic function in the instruction set of a CPU, like adding two binary numbers together in two registers. At the lowest level we have billions of transistor switches turning on and off in a synchronized manner to form AND, OR, and NOT logic gates which are then combined to form the instruction set of a CPU. So we have billions of transistor switches causing billions of other transistor switches to turn on or off in cascades, just as we have billions or trillions of genes turning on and off in cascades of morphogenic proteins. But where is the overall system clock in a growing embryo? We definitely have a system clock in all CPUs that synchronizes all of the switch cascades.

Figure 31 – All CPUs have a system clock that fires many billions of times each second. The system clock sends out an electromagnetic wave as a series of pulses to all of the billions of transistors on a CPU chip. As each pulse arrives at a particular transistor, the transistor must determine if it is to keep its current state of being a 1 or a 0, or to change its state based upon the current state of the logic gate that it finds itself in. Similarly, each cell in a growing embryo must make the same decision via its state splitter when a differentiation wave passes by.
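As a rough illustration of gates combining into an instruction under the control of a clock, here is a Python sketch of a bit-serial adder built only from AND, OR, and NOT. It is a deliberate simplification of my own: each loop iteration stands in for one clock pulse, registers are lists of bits with the least significant bit first, and real CPUs add all bits of a register within a single clock cycle rather than one bit per pulse.

```python
def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def NOT(a):
    return 1 - a

def XOR(a, b):
    # Exclusive OR composed from the three basic gates.
    return AND(OR(a, b), NOT(AND(a, b)))

def full_adder(a, b, carry_in):
    """Add three bits, returning (sum bit, carry out)."""
    partial = XOR(a, b)
    return XOR(partial, carry_in), OR(AND(a, b), AND(partial, carry_in))

def clocked_add(reg_a, reg_b):
    """Bit-serial addition: one full-adder step per simulated clock pulse."""
    carry, result = 0, []
    for bit_a, bit_b in zip(reg_a, reg_b):  # each iteration models one clock pulse
        sum_bit, carry = full_adder(bit_a, bit_b, carry)
        result.append(sum_bit)
    return result, carry

def to_bits(n, width):
    return [(n >> i) & 1 for i in range(width)]  # least significant bit first

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

sum_bits, carry_out = clocked_add(to_bits(6, 4), to_bits(3, 4))
print(from_bits(sum_bits), carry_out)
```

The carry bit is the only state that survives from one pulse to the next, which is why the steps must be synchronized: without the clock, a gate could read a carry that has not yet settled.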

This is why I like Dick Gordon’s theory of differentiation waves that traverse throughout a growing embryo and which perform the same function as a system clock in a CPU by coordinating the turning on and off of genes across all of the cells in an embryo. In a sense, Dick Gordon’s theory can be used to view a growing embryo as a mechanical-chemical computer, using a mechanical system clock driven by mechanical differentiation waves to synchronize events, like Konrad Zuse’s mechanical Z1 computer that was loosely based upon Alan Turing’s conceptual Turing machine. Figure 32 shows a diagram from Konrad Zuse's May 1936 patent for a mechanical binary switching element, using mechanical flat sliding rods that were a fundamental component of the Z1, and which essentially performed the function of Dick Gordon’s cell state splitter. Ironically, Zuse’s patent was granted in the same year that Alan Turing developed the mathematical concept of the Turing Machine in On Computable Numbers, with an Application to the Entscheidungsproblem. Again, Turing and Zuse were not aware of each other’s work at the time, and were soon to find themselves on opposite sides during World War II. For more on the Z1 see:

Figure 32 – A diagram from Zuse's May 1936 patent for a binary switching element, using mechanical flat sliding rods that were a fundamental component of the Z1.

In commercial software, at the level of Objects communicating with other Objects by calling public methods, it superficially appears as though differentiated Objects are brought into existence by paracrine morphogens passing from one Object to another, but don’t forget that this is an abstraction of many abstractions. At the lowest level, it is all really controlled by the system clocks on thousands of CPU chips in hundreds of servers sending out electromagnetic pulses across the surfaces of the chips.

3. Differentiation waves enable the GPS units of growing embryos
As a former geophysicist, what I really like about Dick Gordon’s theory is that the sequence and arrival times of differentiation waves will depend upon a cell’s location in a growing 3-D embryo. In that regard, it is very much like seismology. When an earthquake occurs many different types of mechanical waves are generated and begin to propagate along the Earth’s surface and throughout the Earth’s body. P-waves are compressional longitudinal waves that have the highest velocities, and therefore, arrive first at all recording stations as primary or p-waves. S-waves are transverse waves that have a lower velocity than p-waves, and therefore, arrive later at all recording stations as secondary or s-waves. By measuring the number of seconds between the arrival of the first p-waves and the arrival of the first s-waves, it is possible to tell how far away the epicenter of the earthquake is from a recording station, and by timing the arrival of the first p-waves and the first s-waves from an earthquake at a number of recording stations, it is possible to triangulate the epicenter of the earthquake. For more on that see:

For example, in Figure 35 we see that for the Kobe earthquake of January 17, 1995 a recording station in Manila received p-waves and s-waves from the earthquake first. Receiving stations in Stockholm and Honolulu both received p-waves and s-waves from the earthquake at later times, and the number of seconds between the arrival of the p-waves and s-waves at those distant locations was greater than it was for the Manila seismic station because Stockholm and Honolulu are both much farther away from Kobe than is Manila. By plotting the arrival times for all three recording stations, it was possible for geophysicists to triangulate the location of the epicenter of the earthquake.
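The distance estimate from the p-wave to s-wave delay can be sketched in Python as follows. The constant velocities of 6.0 km/s and 3.5 km/s are assumed round numbers for illustration only; over distances like Kobe to Stockholm the velocities change with depth, so this straight-line formula is a classroom simplification rather than what a seismologist would actually use.

```python
def epicentral_distance_km(sp_interval_s, vp_km_s=6.0, vs_km_s=3.5):
    """Estimate distance to an earthquake from the S-minus-P arrival delay.

    Both waves travel the same distance d, so d/vs - d/vp = delay,
    which rearranges to d = delay * vp * vs / (vp - vs).
    The default velocities are assumed values, not measured ones.
    """
    if vp_km_s <= vs_km_s:
        raise ValueError("p-waves must be faster than s-waves")
    return sp_interval_s * (vp_km_s * vs_km_s) / (vp_km_s - vs_km_s)

near = epicentral_distance_km(10.0)   # short delay, nearby station
far = epicentral_distance_km(60.0)    # longer delay, more distant station
print(near, far)
```

A single station only yields a distance, that is, a circle of possible epicenters around the station, which is why at least three stations are needed to pin down one location.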

GPS units work just the opposite. With a GPS system we have a single recording station and electromagnetic waves coming from multiple “earthquakes” in the sky onboard a number of GPS satellites that orbit the Earth with an orbital radius of 16,500 miles. Again, by comparing the arrival times of the electromagnetic waves from several GPS satellites, it is possible to triangulate the position of a GPS receiver on the Earth.

Similarly, because each seismic recording station on the Earth has a unique position on the Earth’s surface, the first p-waves and s-waves arrive at different times and in different sequences when several earthquakes at different locations on the Earth are all happening around the same time. Since there are always many little earthquakes going on all over the Earth all the time, the Earth’s natural seismicity can be used as a very primitive natural GPS system. Essentially, by comparing p-wave arrival times at one seismic station with the p-wave arrival times of surrounding seismic stations, you can also figure out the location of the seismic station relative to the others.

I imagine this would be a great way to create a closed feedback loop between the genes and a developing embryo. Since each cell in a growing embryo occupies a unique 3-D location, and thus will experience a unique timing and sequencing of differentiation waves, it’s a wonderful way for the genes in an individual cell to obtain the cell’s GPS location in an embryo and to differentiate accordingly by switching on certain genes, while switching other genes off. One could still maintain that all of this remains under the control of the genes, in the manner of Richard Dawkins’ The Extended Phenotype (1982). But communicating via waves seems like a much more efficient way to coordinate the growth and differentiation of an entire embryo, rather than trying to rely on morphogens diffusing across the bulk mass of an embryo that is essentially many light years in diameter at the molecular level. Indeed most information transmission in the Universe is accomplished via waves. It would only make sense that living things would stumble upon this fact at the cellular level. We certainly see macroscopic organisms using sound and light waves for communications, in addition to the primitive communications that are accomplished by the diffusion of scent molecules.
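The idea that position is encoded in the order of wave arrivals can be sketched in Python. This is a toy model under my own assumptions: it is 2-D rather than 3-D, the three wave origins A, B and C and their coordinates are hypothetical, and every wave travels at the same constant speed.

```python
import math

def arrival_signature(cell_xy, sources, speed=1.0):
    """Return the wave names in the order in which they reach the cell."""
    arrivals = []
    for name, (x, y), launch_time in sources:
        distance = math.hypot(cell_xy[0] - x, cell_xy[1] - y)
        arrivals.append((launch_time + distance / speed, name))
    return [name for _, name in sorted(arrivals)]

# Hypothetical wave origins: (name, (x, y), launch time), all launched together.
sources = [("A", (0.0, 0.0), 0.0), ("B", (10.0, 0.0), 0.0), ("C", (0.0, 10.0), 0.0)]

near_a = arrival_signature((1.0, 2.0), sources)
near_b = arrival_signature((9.0, 1.0), sources)
print(near_a, near_b)
```

Cells in different regions see the same waves in a different order, so the arrival sequence alone already tells a cell roughly where it sits; adding the actual arrival times would narrow the location further.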

Figure 33 – When built-up tectonic strain is released near the Earth’s surface, sudden rock motions release mechanical waves that propagate away from the earthquake focus. When these mechanical waves reach you, they are felt as an earthquake.

Figure 34 – P-waves have the highest velocity and therefore arrive at each recording station first.

Figure 35 – By recording the arrival times of the p-waves and s-waves at a number of recording stations, it is possible to triangulate the epicenter of an earthquake. For example, by comparing the arrival times of the p-waves and s-waves from an earthquake at recording stations in Stockholm, Manila and Honolulu it was possible to pinpoint the epicenter of the January 17, 1995 earthquake in Kobe, Japan.

Figure 36 – GPS systems do just the opposite. You can find your location on the Earth by comparing the arrival times of electromagnetic waves from several different satellites at the same time. If you think of each arriving satellite signal as a separate earthquake, it is possible to use them to triangulate your position.

Figure 37 – A conceptual view of the seismic waves departing from an earthquake in Italy.

Figure 38 – An ectoderm contraction wave in amphibian embryos. At hourly intervals, the image was digitally subtracted from the one 5 minutes earlier, showing the moving ectoderm contraction wave. The arc shaped wave moves faster at its ends than in the middle, reforming a circle which then vanishes at what will be the anterior (head) end of the embryo. (These sets of images are from three different embryos.) The Bar in slide 10 = 1 mm. (Reprint of slides 80 and 81 of Dick Gordon’s lecture).

4. Déjà vu all over again
The current model that biologists use for morphogenesis reminds me very much of the state of affairs that classical geology found itself in back in 1960, before the advent of plate tectonics. I graduated from the University of Illinois in 1973 with a B.S. in physics, only to find that the end of the Space Race and a temporary lull in the Cold War had left very few prospects for a budding physicist. So on the advice of my roommate, a geology major, I headed up north to the University of Wisconsin in Madison to obtain an M.S. in geophysics, with the hope of obtaining a job with an oil company exploring for oil. These were heady days for geology because we were at the very tail end of the plate tectonics revolution that totally changed the fundamental models of geology. The plate tectonics revolution peaked during the five year period 1965 – 1970. Having never taken a single course in geology during all of my undergraduate studies, I was accepted into the geophysics program with many deficiencies in geology, so I had to take many undergraduate geology courses to get up to speed in this new science. The funny thing was that the geology textbooks of the time had not yet had time to catch up with the new plate tectonics revolution of the previous decade, so they still embraced the “classical” geological models of the past which now seemed a little bit silly in light of the new plate tectonics model. But this was also very enlightening. It was like looking back at the prevailing thoughts in physics prior to Newton or Einstein. What the classical geological textbooks taught me was that over the course of several hundred years, the geologists had figured out what had happened, but not why it had happened. Up until 1960 geology was mainly an observational science relying upon the human senses of sight and touch, and by observing and mapping many outcrops in detail, the geologists had figured out how mountains had formed, but not why.

In classical geology, most geomorphology was thought to arise from local geological processes. For example, in classical geology, fold mountains formed off the coast of a continent when a geosyncline formed because the continental shelf underwent a dramatic period of subsidence for some unknown reason. Then very thick layers of sedimentary rock were deposited into the subsiding geosyncline, consisting of alternating layers of sand and mud that turned into sandstones and shales, intermingled with limestones that were deposited from the carbonate shells of dead sea life floating down or from coral reefs. Next, for some unknown reason, the sedimentary rocks were laterally compressed into folded structures that slowly rose from the sea. More compression then followed, exceeding the ability of the sedimentary rock to deform plastically, resulting in thrust faults forming that uplifted blocks of sedimentary rock even higher. As compression continued, some of the sedimentary rocks were then forced down into great depths within the Earth and were then placed under great pressures and temperatures. These sedimentary rocks were then far from the thermodynamic equilibrium of the Earth’s surface where they had originally formed, and thus the atoms within recrystallized into new metamorphic minerals. At the same time, for some unknown reason, huge plumes of granitic magma rose from deep within the Earth’s interior as granitic batholiths. Then over several hundred millions of years, the overlying folded sedimentary rocks slowly eroded away, revealing the underlying metamorphic rocks and granitic batholiths, allowing human beings to cut them into slabs and to polish them into pretty rectangular slabs for the purpose of slapping them up onto the exteriors of office buildings and onto kitchen countertops. 
In 1960, classical geologists had no idea why the above sequence of events, producing very complicated geological structures, seemed to happen over and over again many times over the course of billions of years. But with the advent of plate tectonics (1965 – 1970), all was suddenly revealed. It was the lateral movement of plates on a global scale that made it all happen. With plate tectonics, everything finally made sense. Fold mountains did not form from purely local geological factors in play. There was the overall controlling geological process of global plate tectonics making it happen. For a comparison of the geomorphology of fold mountains with the morphogenesis of an embryo, please take a quick look at the two videos down below:

Fold Mountains

Medical Embryology - Difficult Concepts of Early Development Explained Simply

Figure 39 – Fold mountains occur when two tectonic plates collide. A descending oceanic plate first causes subsidence offshore of a continental plate which forms a geosyncline that accumulates sediments. When all of the oceanic plate between two continents has been consumed, the two continental plates collide and compress the accumulated sediments in the geosyncline into fold mountains. This is how the Himalayas formed when India crashed into Asia.

Now the plate tectonics revolution was really made possible by the availability of geophysical data. It turns out that most of the pertinent action of plate tectonics occurs under the oceans, at the plate spreading centers and subduction zones, far removed from the watchful eyes of geologists in the field with their notebooks and trusty hand lenses. Geophysics really took off after World War II, when universities were finally able to get their hands on cheap war surplus gear. By mapping variations in the Earth’s gravitational and magnetic fields and by conducting deep oceanic seismic surveys, geophysicists were finally able to figure out what was happening at the plate spreading centers and subduction zones. Actually, the geophysicist and meteorologist Alfred Wegener had figured this all out in 1912 with his theory of Continental Drift, but at the time Wegener was ridiculed by the geological establishment. You see, Wegener had been an arctic explorer and had noticed that sometimes sea ice split apart, like South America and Africa, only later to collide again to form mountain-like pressure ridges. Unfortunately, Wegener froze to death in 1930 trying to provision some members of his last exploration party to Greenland, never knowing that one day he would finally be vindicated. In many ways I suspect that Dick Gordon might be another Alfred Wegener and that his embryogenesis model built upon differentiation waves, cell state splitters and differentiation trees might just be the plate tectonics of embryology. Frankly, geophysicists would just love to know that geologists came from seismic waves traveling over the surfaces and through the bodies of growing embryos!

Final Thoughts
Based upon this posting and An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer, together with my experience of watching commercial software evolve over the past 42 years, it may have gone down like this. About 4.0 billion years ago some very ancient prokaryotic archaea began to wrap their DNA around histone proteins to compact and stabilize the DNA under the harsh conditions that the archaea are so fond of. Basically, these ancient archaea accidentally discovered the value of using computer tape reels to store their DNA on tape racks. This innovation was found to be far superior to the simple free-floating DNA loops of normal prokaryotic bacteria that basically stored their DNA tape loosely sprawled all over the computer room floor. During this same interval of time a number of parasitic bacteria took up residence within these archaea, and entered into parasitic/symbiotic relationships with them to form the other organelles of eukaryotic cells in accordance with the Endosymbiosis theory of Lynn Margulis. These early single-celled eukaryotes then serendipitously discovered that storing DNA on tape reels that were systematically positioned on tape racks in identifiable locations in the form of chromosomes, allowed for epigenetic factors to control the kinds and amounts of proteins that were to be generated at any given time. This was a far superior access method for genes compared to the simple sequential access methods used by the prokaryotes. With time, this innovation that originally was meant to stabilize DNA under harsh conditions was further exapted into full-fledged indexed hierarchical B-Tree access methods like ISAM and VSAM. With the Cambrian explosion 530 million years ago these pre-existing indexed hierarchical B-Tree access methods were further exapted into Dick Gordon’s hierarchical differentiation trees by exapting the passage of differentiation waves through the body of developing embryos. 
I am guessing that originally the differentiation waves in a growing embryo served some other unrelated useful purpose, or perhaps they were just naturally occurring mechanical waves that arose to relieve the strains that accumulated in a growing embryo, like little earthquakes, or perhaps they were just a spandrel for some other totally unrelated function. However, once the cells in a growing embryo discovered the advantages of using wave communications to keep development in sync, there was no turning back.

For both biologists and IT professionals, the question most worth pondering about all of this is how all that information can possibly be encoded within a single fertilized egg. Somehow it must be encoded in stretches of DNA that we call genes, stretches of DNA that we do not call genes, in the complex structure of the chromatin and chromosomes of the single cell, and the complex structure of the fertilized egg itself, creating a complex network of interacting logical operations that ultimately produce a mature newborn.

Figure 40 – The Star Child from Stanley Kubrick’s 2001: A Space Odyssey gazes down upon the planet from which it came and wonders how it all can be.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston