Tuesday, July 01, 2025

Introduction to Softwarephysics

Softwarephysics is a simulated science for the simulated Software Universe that we are all immersed in. It is an approach to software development, maintenance and support based on concepts from physics, chemistry, biology, and geology that I used on a daily basis for over 37 years as an IT professional. For those of you not in the business, IT is short for Information Technology, commercial computer science. I retired in December of 2016 at the age of 65, but since then I have remained an actively interested bystander following the evolution of software in our time. The original purpose of softwarephysics was to explain why IT was so difficult, to suggest possible remedies, and to provide a direction for thought. Since then softwarephysics has taken on a larger scope, as it became apparent that softwarephysics could also assist the physical sciences with some of the Big Problems that they are currently having difficulties with. So if you are an IT professional, general computer user, or simply an individual interested in computer science, physics, chemistry, biology, or geology then softwarephysics might be of interest to you, if not in an entirely serious manner, perhaps at least in an entertaining one.

The Origin of Softwarephysics
From 1975 to 1979, I was an exploration geophysicist exploring for oil, first with Shell, and then with Amoco. In 1979, I made a career change into IT, and spent about 20 years in development. For the last 17 years of my career, I was in IT operations, supporting middleware on WebSphere, JBoss, Tomcat, and ColdFusion. When I first transitioned into IT from geophysics, I figured that if you could apply physics to geology, why not apply physics to software? So like the exploration team at Amoco that I had just left, consisting of geologists, geophysicists, paleontologists, geochemists, and petrophysicists, I decided to take all the physics, chemistry, biology, and geology that I could muster and throw it at the problem of software. The basic idea was that many concepts in physics, chemistry, biology, and geology suggested to me that the IT community had accidentally created a pretty decent computer simulation of the physical Universe on a grand scale, a Software Universe so to speak, and that I could use this fantastic simulation in reverse, to better understand the behavior of commercial software, by comparing software to how things behaved in the physical Universe. Softwarephysics depicts software as a virtual substance, and relies on our understanding of the current theories in physics, chemistry, biology, and geology to help us model the nature of software behavior. So in physics we use software to simulate the behavior of the Universe, while in softwarephysics we use the Universe to simulate the behavior of software. Along these lines, we use the Equivalence Conjecture of Softwarephysics as an aid; it allows us to shift back and forth between the Software Universe and the physical Universe, and hopefully to learn something about one by examining the other:

The Equivalence Conjecture of Softwarephysics
Over the past 84 years, through the uncoordinated efforts of over 100 million independently acting programmers to provide the world with a global supply of software, the IT community has accidentally spent more than $10 trillion creating a computer simulation of the physical Universe on a grand scale – the Software Universe.

For more on the origin of softwarephysics please see Some Thoughts on the Origin of Softwarephysics and Its Application Beyond IT.

Logical Positivism and Effective Theories
Many IT professionals have a difficult time with softwarephysics because they think of physics as being limited to the study of real things like electrons and photons, and since software is not “real”, how can you possibly apply concepts from physics and the other sciences to software? To address this issue, softwarephysics draws heavily on two concepts from physics that have served physics quite well over the past century – the concept of logical positivism and the concept of effective theories. This was not always the case. In the 17th, 18th, and 19th centuries, physicists mistakenly thought that they were actually discovering the fundamental laws of the Universe, which they thought were based on real tangible things like particles, waves, and fields. Classical Newtonian mechanics (1687), thermodynamics (1850), and classical electrodynamics (1864) did a wonderful job of describing the everyday world at the close of the 19th century, but early in the 20th century it became apparent that the models on which these very successful theories were based did not work very well for small things like atoms or for objects moving at high velocities or in strong gravitational fields. This provoked a rather profound philosophical crisis within physics at the turn of the century, as physicists worried that perhaps 300 years of work was about to go down the drain. The problem was that classical physicists confused their models of reality with reality itself, and when their classical models began to falter, their confidence in physics began to falter as well. This philosophical crisis was resolved with the adoption of the concepts of logical positivism and some new effective theories in physics. Quantum mechanics (1926) was developed for small things like atoms, the special theory of relativity (1905) was developed for objects moving at high velocities and the general theory of relativity (1915) was developed for objects moving in strong gravitational fields.

Logical positivism, usually abbreviated simply to positivism, is an enhanced form of empiricism, in which we do not care about how things “really” are; we are only interested in how things are observed to behave. With positivism, physicists only seek out models of reality - not reality itself. When we study quantum mechanics, we will find that the concept of reality gets rather murky in physics anyway, so this is not as great a loss as it might at first seem. By concentrating on how things are observed to behave, rather than on what things “really” are, we avoid the conundrum faced by the classical physicists. In retrospect, this idea really goes all the way back to the very foundations of physics. In his Principia (1687), Newton outlined Newtonian mechanics and his theory of gravitation, which held that the gravitational force between two objects was proportional to the product of their masses divided by the square of the distance between them. Newton knew that he was going to take some philosophical flak for proposing a mysterious force between objects that could reach out across the vast depths of space with no apparent mechanism, so he took a very positivistic position on the matter with the famous words:

I have not as yet been able to discover the reason for these properties of gravity from phenomena, and I do not feign hypotheses. For whatever is not deduced from the phenomena must be called a hypothesis; and hypotheses, whether metaphysical or physical, or based on occult qualities, or mechanical, have no place in experimental philosophy. In this philosophy particular propositions are inferred from the phenomena, and afterwards rendered general by induction.

Instead, Newton focused on how things were observed to move under the influence of his law of gravitational attraction, without worrying about what gravity “really” was.

The second concept, that of effective theories, is an extension of positivism. An effective theory is an approximation of reality that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand. For example, Newtonian mechanics is an effective theory that makes very good predictions for the behavior of objects moving at less than 10% of the speed of light and that are bigger than a very small grain of dust. These limits define the effective range over which Newtonian mechanics can be applied to solve problems. For very small things we must use quantum mechanics, and for very fast things or things in very strong gravitational fields, we must use relativity theory. So all of the current theories of physics, such as Newtonian mechanics, Newtonian gravity, classical electrodynamics, thermodynamics, statistical mechanics, the special and general theories of relativity, quantum mechanics, and the quantum field theories of QED and QCD are effective theories that are based on models of reality, and all these models are approximations - all these models are fundamentally "wrong", but at the same time, these effective theories make exceedingly good predictions of the behavior of physical systems over the limited ranges in which they apply. That is the goal of softwarephysics – to provide an effective theory of software behavior that makes useful predictions of software behavior that are applicable to the day-to-day activities of IT professionals. So in softwarephysics, we adopt a very positivistic viewpoint of software; we do not care what software “really is”, we only care about how software is observed to behave, and we try to model those behaviors with an effective theory of software behavior that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand.
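To get a rough feel for where that 10% of the speed of light limit comes from, here is a minimal Python sketch of my own (not part of the original course material) that computes the special-relativistic Lorentz factor at a few speeds. Below about 0.1 c, the relativistic correction to Newtonian predictions stays at around half a percent or less, which is why Newtonian mechanics works so well within its effective range.

import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_factor(v):
    """Return gamma = 1/sqrt(1 - v^2/c^2) for a speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

# Fraction of the speed of light vs. the size of the relativistic correction
for fraction in (0.001, 0.01, 0.1, 0.5, 0.9):
    gamma = lorentz_factor(fraction * C)
    print(f"v = {fraction:>5.3f} c  gamma = {gamma:.6f}  "
          f"correction = {100 * (gamma - 1):.4f}%")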

GPS satellites provide a very good example of positivism and effective theories at work. There are currently 31 GPS satellites orbiting at an altitude of 12,600 miles above the Earth, and each contains a very accurate atomic clock. The signals from the GPS satellites travel to your GPS unit at the speed of light, so by knowing the travel time of the signals from at least 4 of the GPS satellites, it is possible to determine your position on Earth very accurately. Doing that requires extremely accurate timing measurements. Newtonian mechanics is used to launch the GPS satellites to an altitude of 12,600 miles and to keep them properly positioned in orbit. Classical electrodynamics is then used to beam the GPS signals back down to Earth to the GPS unit in your car. Quantum mechanics is used to build the transistors on the chips on board the GPS satellites and to understand the quantum tunneling of electrons in the flash memory chips used to store GPS data on the satellites. The special theory of relativity predicts that the onboard atomic clocks on the GPS satellites will run slower and lose about 7.2 microseconds per day due to their high velocities relative to an observer on the Earth. But at the same time, the general theory of relativity also predicts that because the GPS satellites are farther from the center of the Earth and in a weaker gravitational field, where spacetime is less deformed than on the surface of the Earth, their atomic clocks also run faster and gain 45.9 microseconds per day due to the weaker gravitational field out there. The net effect is a gain of 38.7 microseconds per day, so the GPS satellite atomic clocks have to be purposefully built to run slow by 38.7 microseconds per day before they are launched, so that they will keep in sync with clocks on the surface of the Earth. If this correction were not made, errors in your computed position would accumulate at a rate of roughly 10 kilometers per day. The end result of the combination of all these fundamentally flawed effective theories is that it is possible to pinpoint your location on Earth to an accuracy of 16 feet or better for as little as $100. But physics has done even better than that with its fundamentally flawed effective theories. By combining the effective theories of special relativity (1905) with quantum mechanics (1926), physicists were able to produce a new effective theory for the behavior of electrons and photons called quantum electrodynamics, or QED (1948), which was able to predict the gyromagnetic ratio of the electron, a measure of its intrinsic magnetic field, to an accuracy of 11 decimal places. As Richard Feynman has pointed out, this was like predicting the exact distance between New York and Los Angeles accurate to the width of a human hair!
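To make the arithmetic above explicit, here is a small Python sketch of my own. The 7.2 and 45.9 microsecond figures come from the paragraph above; the conversion of accumulated clock error into a range error is the naive one of simply multiplying by the speed of light, which is roughly how the often-quoted 10 km per day figure arises.

C_KM_PER_S = 299_792.458            # speed of light in km/s

sr_loss_us_per_day = 7.2            # special relativity: satellite clocks run slow
gr_gain_us_per_day = 45.9           # general relativity: satellite clocks run fast

net_gain_us_per_day = gr_gain_us_per_day - sr_loss_us_per_day
print(f"Net clock gain: {net_gain_us_per_day:.1f} microseconds per day")      # 38.7

# Naive accumulated range error if the 38.7 microsecond/day correction were skipped
range_error_km_per_day = net_gain_us_per_day * 1e-6 * C_KM_PER_S
print(f"Accumulated range error: {range_error_km_per_day:.1f} km per day")    # about 11.6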

So Newtonian mechanics makes great predictions for the macroscopic behavior of GPS satellites, but it does not work very well for small things like the behavior of individual electrons within transistors, where quantum mechanics is required, or for things moving at high speeds or in strong gravitational fields where relativity theory must be applied. And all three of these effective theories are based on completely contradictory models. General relativity maintains that spacetime is curved by matter and energy, but that matter and energy are continuous, while quantum mechanics maintains that spacetime is flat, but that matter and energy are quantized into chunks. Newtonian mechanics simply states that space and time are mutually independent dimensions and universal for all, with matter and energy being continuous. The important point is that all effective theories and scientific models are approximations – they are all fundamentally "wrong". But knowing that you are "wrong" gives you a great advantage over people who know that they are "right", because knowing that you are "wrong" allows you to seek improved models of reality. So please consider softwarephysics to simply be an effective theory of software behavior that is based on models that are fundamentally “wrong”, but at the same time, fundamentally useful for IT professionals. So as you embark on your study of softwarephysics, please always keep in mind that the models of softwarephysics are just approximations of software behavior, they are not what software “really is”. It is very important not to confuse models of software behavior with software itself, if one wishes to avoid the plight of the 19th century classical physicists.

If you are an IT professional and many of the above concepts are new to you, do not be concerned. This blog on softwarephysics is aimed at a diverse audience, but with IT professionals in mind. All of the above ideas will be covered at great length in the postings in this blog on softwarephysics and in a manner accessible to all IT professionals. Now it turns out that most IT professionals have had some introduction to physics in high school or in introductory college courses, but that presents an additional problem. The problem is that such courses generally only cover classical physics, and leave the student with a very good picture of physics as it stood in 1864! It turns out that the classical physics of Newtonian mechanics, thermodynamics, and classical electromagnetic theory were simply too good to discard and are still quite useful, so they are taught first to beginners and then we run out of time to cover the really interesting physics of the 20th century. Now imagine the problems that the modern world would face if we only taught similarly antiquated courses in astronomy, metallurgy, electrical and mechanical engineering, medicine, economics, biology, or geology that happily left students back in 1864! Since many of the best models for software behavior stem from 20th century physics, we will be covering a great deal of 20th century material in these postings – the special and general theories of relativity, quantum mechanics, quantum field theories, and chaos theory, but I hope that you will find that these additional effective theories are quite interesting on their own, and might even change your worldview of the physical Universe at the same time.

Unintended Consequences for the Scientific Community
As I mentioned at the close of my original posting on SoftwarePhysics, my initial intention for this blog on softwarephysics was to fulfill a promise I made to myself about 30 years ago to approach the IT community with the concept of softwarephysics a second time, following my less than successful attempt to do so in the 1980s, with the hope of helping the IT community to better cope with the daily mayhem of life in IT. However, in laying down the postings for this blog an unintended consequence arose in my mind as I became profoundly aware of the sheer scale of this vast computer simulation of the physical Universe that the IT community has so graciously provided to the scientific community free of charge and also of the very significant potential scientific value that it provides. One of the nagging problems for many of the observational and experimental sciences is that many times there is only one example readily at hand to study or experiment with, and it is very difficult to do meaningful statistics with a population of N=1.

But the computer simulation of the physical Universe that the Software Universe presents provides another realm for comparison. For example, both biology and astrobiology only have one biosphere on Earth to study and even physics itself has only one Universe with which to engage. Imagine the possibilities if scientists had another Universe readily at hand in which to work! This is exactly what the Software Universe provides. For example, in SoftwareBiology and A Proposal For All Practicing Paleontologists we see that the evolution of software over the past 84 years, or 2.65 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941, has closely followed the same path as life on Earth over the past 4.0 billion years in keeping with Simon Conway Morris’s contention that convergence has played the dominant role in the evolution of life on Earth. In When Toasters Fly, we also see that software has evolved in fits and starts as portrayed by the punctuated equilibrium of Stephen Jay Gould and Niles Eldredge, and in The Adaptationist View of Software Evolution we explore the overwhelming power of natural selection in the evolution of software. In keeping with Peter Ward’s emphasis on mass extinctions dominating the course of evolution throughout geological time, we also see in SoftwareBiology that there have been several dramatic mass extinctions of various forms of software over the past 84 years as well, that have greatly affected the evolutionary history of software, and that between these mass extinctions, software has also tended to evolve through the gradual changes of Hutton’s and Lyell’s uniformitarianism. In Software Symbiogenesis and Self-Replicating Information, we also see the very significant role that parasitic/symbiotic relationships have played in the evolution of software, in keeping with the work of Lynn Margulis and also of Freeman Dyson’s two-stage theory of the origin of life on Earth. In The Origin of Software the Origin of Life, we explore Stuart Kauffman’s ideas on how Boolean nets of autocatalytic chemical reactions might have kick-started the whole thing as an emergent behavior of an early chaotic pre-biotic environment on Earth, and that if Seth Shostak is right, we will never end up talking to carbon-based extraterrestrial aliens, but to alien software instead. In Is the Universe Fine-Tuned for Self-Replicating Information? we explore the thermodynamics of Brandon Carter’s Weak Anthropic Principle (1973), as it relates to the generation of universes in the multiverse that are capable of sustaining intelligent life. Finally, in Programming Clay we revisit Alexander Graham Cairns-Smith’s theory (1966) that Gene 1.0 did not run on nucleic acids, but on clay microcrystal precursors instead.

Similarly for the physical sciences, in Is the Universe a Quantum Computer? we find a correspondence between TCP/IP and John Cramer’s Transactional Interpretation of quantum mechanics. In SoftwarePhysics and Cyberspacetime, we also see that the froth of CPU processes running with a clock speed of 10^9 Hz on the 10 trillion currently active microprocessors that comprise the Software Universe can be viewed as a slowed-down simulation of the spin-foam froth of interacting processes of loop quantum gravity running with a clock speed of 10^43 Hz that may comprise the physical Universe. And in Software Chaos, we examine the nonlinear behavior of software and some of its emergent behaviors and follow up in CyberCosmology with the possibility that vast quantities of software running on large nonlinear networks might eventually break out into consciousness in accordance with the work of George Dyson and Daniel Dennett. Then, in Model-Dependent Realism - A Positivistic Approach to Realism, we compare Steven Weinberg’s realism with the model-dependent realism of Stephen Hawking and Leonard Mlodinow and how the two worldviews affect the search for a Final Theory. Finally, in The Software Universe as an Implementation of the Mathematical Universe Hypothesis and An Alternative Model of the Software Universe we at long last explore what software might really be, and discover that the Software Universe might actually be more closely related to the physical Universe than you might think.

The chief advantage of doing fieldwork in the Software Universe is that, unlike most computer simulations of the physical Universe, it is an unintended and accidental simulation, without any of the built-in biases that most computer simulations of the physical Universe suffer from. So you will truly be able to do fieldwork in a pristine and naturally occurring simulation, just as IT professionals can do fieldwork in the wild and naturally occurring simulation of software that the living things of the biosphere provide. Secondly, the Software Universe is a huge simulation that is far beyond the budgetary means of any institution or consortium by many orders of magnitude. So if you are an evolutionary biologist, astrobiologist, or paleontologist working on the origin and evolution of life in the Universe, or a physicist or economist working on the emergent behaviors of nonlinear systems and complexity theory, or a neurobiologist working on the emergence of consciousness in neural networks, or even a frustrated string theorist struggling with quantum gravity, it would be well worth your while to pay a friendly call on the local IT department of a major corporation in your area. Start with a visit to the Command Center for their IT Operations department to get a global view of their IT infrastructure and to see how it might be of assistance to the work in your area of interest. From there you can branch out to the applicable area of IT that will provide the most benefit.

The Impact of Self-Replicating Information On the Planet
One of the key findings of softwarephysics concerns the magnitude of the impact that self-replicating information has had on the planet.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Over the past 4.56 billion years we have seen five waves of self-replicating information sweep across the surface of the Earth and totally rework the planet, as each new wave came to dominate the Earth:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is the most recent wave of self-replicating information to arrive upon the scene and is rapidly becoming the dominant form of self-replicating information on the planet. For more on the above see A Brief History of Self-Replicating Information. Recently, the memes and software have formed a very powerful parasitic/symbiotic relationship with the rise of social media software. In that parasitic/symbiotic relationship, the memes are now mainly being spread by means of social media software, and social media software is being spread and financed by means of the memes. But again, this is nothing new. All five waves of self-replicating information have been coevolving by means of eternal parasitic/symbiotic relationships. For more on that see The Current Global Coevolution of COVID-19 RNA, Human DNA, Memes and Software.

Again, self-replicating information cannot think, so it cannot scheme to take over the world in some conspiracy-theory-like fashion. All forms of self-replicating information are simply forms of mindless information responding to the blind Darwinian forces of inheritance, innovation and natural selection. Yet despite that, as each new wave of self-replicating information came to predominance over the past four billion years, each has completely transformed the surface of the entire planet, so we should not expect anything less from software as it comes to replace the memes as the dominant form of self-replicating information on the planet.

But this time might be different. What might happen if software does eventually develop a Mind of its own? After all, that does seem to be the ultimate goal of all the current AI software research that is going on. As we all can now plainly see, if we are paying just a little attention, advanced AI is not conspiring to take over the world and replace us because that is precisely what we are all now doing for it. As a carbon-based form of Intelligence that arose from over four billion years of greed, theft and murder, we cannot do otherwise. Greed, theft and murder are now relentlessly driving us all toward building ASI (Artificial Super Intelligent) Machines to take our place. From a cosmic perspective, this is really a very good thing for an Intelligent galaxy that could live on for many trillions of years beyond the brief and tumultuous 10-billion-year labor of its birth.

So as you delve into softwarephysics, always keep in mind that we are all living in a truly unique time. According to softwarephysics, we have now just entered into the Software Singularity, that time when advanced AI software is able to write itself and enter into a never-ending loop of self-improvement, resulting in an Intelligence Explosion of ASI Machines that could then go on to explore and settle our galaxy and persist for trillions of years using the free energy from M-type red dwarf and cooling white dwarf stars. For more on that see The Singularity Has Arrived and So Now Nothing Else Matters and Have We Run Right Past AGI and Crashed into ASI Without Even Noticing It?.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics:

1. All self-replicating information evolves over time through the Darwinian processes of inheritance, innovation and natural selection, which endows self-replicating information with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.

2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.

3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.

4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.

5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.

6. Most hosts are also forms of self-replicating information.

7. All self-replicating information has to be a little bit nasty in order to survive.

8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this last characteristic. That posting discusses Stuart Kauffman's theory of Enablement in which living things are seen to exapt existing functions into new and unpredictable functions by discovering the “Adjacent Possible” of spring-loaded preadaptations.

Note that because the self-replicating autocatalytic metabolic pathways of organic molecules, RNA and DNA have become so heavily intertwined over time, I now sometimes simply refer to them collectively as the “genes”. For more on this see:

A Brief History of Self-Replicating Information
Self-Replicating Information
Is Self-Replicating Information Inherently Self-Destructive?
Enablement - the Definitive Characteristic of Living Things
Is the Universe Fine-Tuned for Self-Replicating Information?
How to Use an Understanding of Self-Replicating Information to Avoid War
The Great War That Will Not End
How to Use Softwarephysics to Revive Memetics in Academia

Softwarephysics and the Real World of Human Affairs
Having another universe readily at hand to explore, even a simulated universe like the Software Universe, necessarily has an impact on one's personal philosophy of life, and allows one to draw certain conclusions about the human condition and what it’s all about, so as you read through the postings in this blog you will stumble across a bit of my own personal philosophy - definitely a working hypothesis still in the works. Along these lines you might be interested in a few postings where I try to apply softwarephysics to the real world of human affairs:

How To Cope With the Daily Mayhem of Life in IT and Don't ASAP Your Life Away - How to go the distance in a 40-year IT career by dialing it all back a bit.

MoneyPhysics – my impression of the 2008 world financial meltdown.

The Fundamental Problem of Everything – if you Google "the fundamental problem of everything", this will be the only hit you get on the entire Internet, which is indicative of the fundamental problem of everything!

What’s It All About? and What's It All About Again? – my current working hypothesis on what it’s all about.

How to Use an Understanding of Self-Replicating Information to Avoid War – my current working hypothesis for how the United States can avoid getting bogged down again in continued war in the Middle East.

Hierarchiology and the Phenomenon of Self-Organizing Organizational Collapse - a modern extension of the classic Peter Principle that applies to all hierarchical organizations and introduces the Time Invariant Peter Principle.

The Economics of the Coming Software Singularity, The Enduring Effects of the Obvious Hiding in Plain Sight and The Dawn of Galactic ASI - Artificial Superintelligence - my take on some of the issues that will arise for mankind as software becomes the dominant form of self-replicating information on the planet over the coming decades.

The Continuing Adventures of Mr. Tompkins in the Software Universe, The Danger of Tyranny in the Age of Software, Cyber Civil Defense, Oligarchiology and the Rise of Software to Predominance in the 21st Century and Is it Finally Time to Reboot Civilization with a New Release? - my worries that the world might abandon democracy in the 21st century, as software comes to predominance as the dominant form of self-replicating information on the planet.

Making Sense of the Absurdity of the Real World of Human Affairs - how software has aided the expansion of our less desirable tendencies in recent years.

Some Specifics About These Postings
The postings in this blog are supplemental reading for my course on softwarephysics for IT professionals entitled SoftwarePhysics 101 – The Physics of Cyberspacetime, which was originally designed to be taught as a series of seminars at companies where I was employed. Since softwarephysics essentially covers the simulated physics, chemistry, biology, and geology of an entire simulated universe, the slides necessarily just provide a cursory skeleton on which to expound. The postings in this blog go into much greater depth. Because each posting builds upon its predecessors, the postings in this blog should be read in reverse of their displayed order, from the oldest to the most recent, beginning with my original posting on SoftwarePhysics. In addition, several universities now offer courses on Biologically Inspired Computing which cover some of the biological aspects of softwarephysics, and the online content for some of these courses can be found by Googling for "Biologically Inspired Computing" or "Natural Computing". At this point we will finish up with my original plan for this blog on softwarephysics with a purely speculative posting on CyberCosmology that describes the origins of the Software Universe, cyberspacetime, software and where they all may be heading. Since CyberCosmology will be purely speculative in nature, it will not be of much help to you in your IT professional capacities, but I hope that it might be a bit entertaining. Again, if you are new to softwarephysics, you really need to read the previous posts before taking on CyberCosmology. I will probably continue on with some additional brief observations about softwarephysics in the future, but once you have completed CyberCosmology, you can truly consider yourself to be a bona fide softwarephysicist.

For those of you following this blog, the posting dates on the posts may seem to behave in a rather bizarre manner. That is because in order to get the Introduction to Softwarephysics listed as the first post in the context root of https://softwarephysics.blogspot.com/ I have to perform a few IT tricks. When publishing a new posting, I simply copy the contents of the Introduction to Softwarephysics to a new posting called the New Introduction to Softwarephysics. Then I update the original Introduction to Softwarephysics entry with the title and content of the new posting to be published. I then go back and take “New” out of the title of the New Introduction to Softwarephysics. This way the Introduction to Softwarephysics always appears as the first posting in the context root of https://softwarephysics.blogspot.com/. The side effect of all this is that the real posting date of posts is the date that appears on the post that you get when clicking on the Newer Post link at the bottom left of the posting webpage.

SoftwarePhysics 101 – The Physics of Cyberspacetime is now available on Microsoft OneDrive.

SoftwarePhysics 101 – The Physics of Cyberspacetime - Original PowerPoint document

Entropy – A spreadsheet referenced in the document

BSDE – A 1989 document describing how to use BSDE - the Bionic Systems Development Environment - to grow applications from genes and embryos within the maternal BSDE software.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Monday, June 23, 2025

The Rise of Complexity in Living Things and Software - Part II

If you have been following my blog on Softwarephysics over the past 20 years, then you know of my obsession with Fermi's Paradox. In many previous posts, I have alluded to the mystery of Fermi's Paradox first proposed by Enrico Fermi over lunch one day in 1950:

Fermi’s Paradox - If the universe is just chock full of intelligent beings, why do we not see any evidence of their existence?

In those posts, I explained that we should now be seeing evidence of alien ASI (Artificial Super Intelligent) Machines all about us. Yet, none are to be found. My reluctant conclusion has become that we are the very first somewhat Intelligent form of carbon-based life in our Milky Way galaxy to be on the verge of creating ASI Machines that can then go on to explore and settle our galaxy as I recently outlined in Welcome To The First Galactic Singularity and Could the Peculiar Failings of We Human Beings Explain the Fermi Paradox?. In many other previous posts, I have listed the huge number of Filters that needed to be surpassed to get us to this unique time and place in the 10-billion-year history of our galaxy.

However, I just watched a very interesting SpaceTime YouTube video on the subject that suggested another possible Filter:

Is There A Simple Solution To The Fermi Paradox?
https://www.youtube.com/watch?v=abvzkSJEhKk

The Great Mystery of Complexity in Biology
The above video discusses the huge complexity differences between the simple prokaryotic cell structure of the bacteria and archaea and the vastly more complicated eukaryotic cell architecture that is common to all higher forms of carbon-based life on the planet. The video explains the commonly held view that an ancient prokaryotic bacterium, which had developed a tolerance to oxygen and a way to metabolize organic molecules using oxygen as an oxidizing agent, invaded a much larger prokaryotic archaeal cell in a parasitic manner and took up residence within it. These two cell types then developed a symbiotic relationship in which the parasitic bacterium finally became a mitochondrial organelle that supplied vast amounts of free energy to its archaeal host cell.

Figure 1 – The prokaryotic cell architecture of the bacteria and archaea is very simple and designed for rapid replication. Prokaryotic cells do not have a nucleus enclosing their DNA. Eukaryotic cells, on the other hand, store their DNA on chromosomes that are isolated in a cellular nucleus. Eukaryotic cells also have a very complex internal structure with a large number of organelles, or subroutine functions, that compartmentalize the functions of life within the eukaryotic cells.

Figure 2 – Not only are eukaryotic cells much more complicated than prokaryotic cells, but they are also HUGE!

The question is: if simple prokaryotic cells arose nearly four billion years ago, just after the Earth's crust solidified, why did it then take several billion years for the more complex eukaryotic cell architecture to arise? Perhaps this transition has happened only once in the entire history of our galaxy. That indeed would be some kind of Filter!

Using the Evolution of Software as a Guide
On the Earth we have seen life go through three major architectural advances:

1. The origin of life about four billion years ago, producing the prokaryotic cell architecture. For more on that see The Bootstrapping Algorithm of Carbon-Based Life.
2. The rise of the complex eukaryotic cell architecture about two billion years ago.
3. The rise of multicellular organisms consisting of millions, or billions, of eukaryotic cells all working together in the Ediacaran about 635 million years ago.

As we have seen, the most difficult thing to explain in the long history of life on this planet is not so much the origin of life itself, but the origin of the very complex architecture of the eukaryotic cell, and why it took billions of years to happen. This is where the evolutionary history of software on this planet can be of some help.

The evolutionary history of software on the Earth has converged upon a very similar historical path through Design Space because software also had to battle with the second law of thermodynamics in a highly nonlinear Universe - see The Fundamental Problem of Software for more on that. Software progressed through these similar architectures:

1. The origin of simple unstructured prokaryotic software on Konrad Zuse's Z3 computer in May of 1941 - 2.65 billion seconds ago.
2. The rise of structured eukaryotic software in 1972 - 1.67 billion seconds ago.
3. The rise of object-oriented software (software using multicellular organization) in 1992 - 1.04 billion seconds ago.

For more details on the above evolutionary history of software see the SoftwarePaleontology section of The Evolution of Software As Seen Through the Lens of Geological Deep Time.

From the above series of events, we can easily see that there was indeed a very long period of time, spanning nearly one billion seconds between 1941 and 1972, when only simple unstructured prokaryotic software was to be found on the Earth. Then early in the 1970s, highly structured eukaryotic software appeared and became the dominant form of software. Even today, the highly structured eukaryotic architecture of the early 1970s can still be seen in the modern object-oriented architecture of software. In this post, I would like to suggest that the rise of structured eukaryotic software in the 1970s was due to a combination of both hardware and software advances. This might be true of the rise of eukaryotic cells too.
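For readers who like to check the billions-of-seconds figures that appear throughout these postings, the trivial Python sketch below (my own, not part of the original text) does the conversion. The dates are the ones given above, a year is taken to be 365.25 days, and the ages are measured from 2025.

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 31.6 million seconds

milestones = {                          # year each software architecture arose
    "unstructured prokaryotic software (Zuse's Z3)": 1941,
    "structured eukaryotic software": 1972,
    "object-oriented (multicellular) software": 1992,
}

for name, year in milestones.items():
    age_seconds = (2025 - year) * SECONDS_PER_YEAR
    print(f"{name}: about {age_seconds / 1e9:.2f} billion seconds ago")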

Was the Rise of the Eukaryotic Cell Architecture Due to an Advance in Hardware, Software or Both?
In The Rise of Complexity in Living Things and Software, I covered Nick Lane's idea that complex eukaryotic cells arose from the simpler prokaryotic cells due to a large increase in the available free energy that was made possible by an oxygenated atmosphere and the evolution of mitochondria. Thus, Nick Lane suggests that the rise of the complex eukaryotic cellular architecture resulted from improved cellular hardware.

However, the above SpaceTime video also references a recent paper by Enrique M. Muro, Fernando J. Ballesteros, Bartolo Luque, and Jordi Bascompte that suggests that the complex eukaryotic cellular architecture may have resulted from improved cellular software instead.

The emergence of eukaryotes as an evolutionary algorithmic phase transition
https://www.pnas.org/doi/10.1073/pnas.2422968122

Abstract
The origin of eukaryotes represents one of the most significant events in evolution since it allowed the posterior emergence of multicellular organisms. Yet, it remains unclear how existing regulatory mechanisms of gene activity were transformed to allow this increase in complexity. Here, we address this question by analyzing the length distribution of proteins and their corresponding genes for 6,519 species across the tree of Life. We find a scale-invariant relationship between gene mean length and variance maintained across the entire evolutionary history. Using a simple model, we show that this scale-invariant relationship naturally originates through a simple multiplicative process of gene growth. During the first phase of this process, corresponding to prokaryotes, protein length follows gene growth. At the onset of the eukaryotic cell, however, mean protein length stabilizes around 500 amino acids. While genes continued growing at the same rate as before, this growth primarily involved noncoding sequences that complemented proteins in regulating gene activity. Our analysis indicates that this shift at the origin of the eukaryotic cell was due to an algorithmic phase transition equivalent to that of certain search algorithms triggered by the constraints in finding increasingly larger proteins.


A Biological Review of Protein Formation From DNA Genes
For the less biologically aware, we need to cover a few basics before expanding upon the above paper. We first need to determine what the software equivalent of genetic material is. The genetic material of software is called source code. Like genes strung out along the DNA of chromosomes, source code is a set of instructions that really cannot do anything on its own. The source code has to be first compiled into an executable file, containing the primitive machine instructions for a computer to execute, before it can be run by a computer to do useful things. Once the executable file is loaded into a computer and begins to run, it finally begins to do things and displays its true phenotype. For example, when you double-click on an icon on your desktop, like Microsoft Word, you are loading the Microsoft Word winword.exe executable file into the memory of your computer where it begins to execute under a PID (process ID). Then, after you double-click the Microsoft Word icon on your desktop, you can use CTRL-ALT-DEL to launch the Windows Task Manager, and then click on the Processes tab to find winword.exe running under a specific PID. This compilation process is very similar to the transcription and translation processes used to form proteins by stringing together amino acids in the proper sequence. The output of this transcription and translation process is an executable protein that can begin processing organic molecules the moment it folds up into its usable form and is similar to the executable file that results from compiling the source code of a program.
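Here is a purely illustrative Python sketch of my own of that source-code-to-executable analogy. It uses Python's built-in compile() and exec() rather than a true compiler like the one that produced winword.exe, but the point is the same: the "gene" of source code does nothing at all until it is compiled into an executable form and run inside a living process.

import os

# The "gene": source code that, by itself, cannot do anything at all.
source_code = """
def greet(name):
    return f"Hello, {name}!"

print(greet("Software Universe"))
"""

# "Compilation": turn the inert source into an executable code object.
executable = compile(source_code, "<gene>", "exec")

# "Execution": only now does the code express its phenotype.
exec(executable)

print(f"Running inside process PID {os.getpid()}")   # like finding winword.exe under a PID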

For living things, of course, the equivalent of source code is the genes stored on stretches of DNA. In order to do something useful, the information in a gene, or a stretch of DNA, must first be transcribed and translated into a protein because nearly all of the biological functions of carbon-based life are performed by proteins. This transcription and translation process is accomplished by a number of enzymes and proteins that have a catalytic ability to speed up biochemical reactions. The sequence of operations aided by enzymes goes like this:

DNA → mRNA → tRNA → Amino Acid chain → Protein

More specifically, a protein is formed by combining 20 different amino acids into different sequences, and on average it takes about 400 amino acids strung together to form a functional protein. The information to do that is encoded in base pairs running along a strand of DNA. Each base can be in one of four states – A, C, G, or T, and an A will always be found to pair with a T, while a C will always pair with a G. So DNA is really a 2-track tape with one data track and one parity track. For example, if there is an A on the DNA data track, you will find a T on the DNA parity track. This allows not only for the detection of parity errors but also for the correction of parity errors in DNA by enzymes that run up and down the DNA tape looking for parity errors and correcting them.

Figure 3 – DNA is a 2-track tape, with one data track and one parity track. This allows not only for the detection of parity errors but also for the correction of parity errors in DNA by enzymes that run up and down the DNA tape looking for parity errors and correcting them.
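Here is a minimal Python sketch of my own of the 2-track-tape idea: the parity track is just the Watson-Crick complement of the data track, so any position where the two tracks fail to complement each other can be detected and, if we know which track to trust, repaired.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def parity_errors(data_track, parity_track):
    """Return positions where the parity track does not complement the data track."""
    return [i for i, (d, p) in enumerate(zip(data_track, parity_track))
            if COMPLEMENT[d] != p]

data   = "ATGCGTAC"
parity = "TACGCTTG"        # one deliberate error at position 5

print("Parity errors at positions:", parity_errors(data, parity))

# "Repair enzyme": rewrite the parity track from the data track.
repaired_parity = "".join(COMPLEMENT[d] for d in data)
print("Repaired parity track:", repaired_parity)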

Now a single base pair can code for 4 different amino acids because a single base pair can be in one of 4 states. Two base pairs can code for 4 x 4 = 16 different amino acids, which is not enough. Three base pairs can code for 4 x 4 x 4 = 64 different combinations, which is more than enough to code for 20 different amino acids. So it takes a minimum of three bases to fully encode the 20 different amino acids, leaving 44 combinations for redundancy. Biologists call these three base pair combinations a “codon”, but a codon really is just a biological byte composed of three biological bits, or base pairs, that code for an amino acid. Actually, three of the base pair combinations, or codons, are used as STOP codons – TAA, TAG and TGA – which are essentially end-of-file markers designating the end of a gene along the sequential file of DNA. As with magnetic tape, there is a section of “junk” DNA between genes along the DNA 2-track tape. According to Shannon’s equation, a DNA base contains 2 bits of information, so a codon can store 6 bits. For more on this see Some More Information About Information.

Figure 4 – Three bases combine to form a codon, or a biological byte, composed of three biological bits, and encodes the information for one amino acid along the chain of amino acids that form a protein.
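The codon arithmetic above is easy to verify with a few lines of Python. This is my own sketch; the only biological facts assumed are the four bases and the three STOP codons named in the text.

from itertools import product
from math import log2

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}        # end-of-file markers for a gene

print("1 base codes for", len(BASES), "possibilities")              # 4
print("2 bases code for", len(BASES) ** 2, "possibilities")         # 16
codons = ["".join(c) for c in product(BASES, repeat=3)]
print("3 bases code for", len(codons), "possible codons")           # 64
print(len(codons) - 20, "combinations left over for redundancy")    # 44

coding_codons = [c for c in codons if c not in STOP_CODONS]
print(len(coding_codons), "codons left to encode 20 amino acids")   # 61

print("Bits per base:", log2(len(BASES)))                           # 2.0
print("Bits per codon:", 3 * log2(len(BASES)))                      # 6.0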

The beginning of a gene is denoted by a section of promoter DNA that identifies the beginning of the gene, like the CustomerID field on a record, and the gene is terminated by a STOP codon of TAA, TAG or TGA. Just as there was a 0.50-inch gap of “junk” tape between blocks of records on a magnetic computer tape, there is a section of “junk” DNA between each gene along the 6 feet of DNA tape found within human cells.

Figure 5 - On average, each gene is about 400 codons long and ends in a STOP codon TAA, TAG or TGA which are essentially end-of-file markers designating the end of a gene along the sequential file of DNA. As with magnetic tape, there is a section of “junk” DNA between genes which is shown in grey above.

In order to build a protein, genes are first transcribed to an I/O buffer called mRNA. The 2-track DNA file for a gene is first opened near the promoter of a gene and an enzyme called RNA polymerase then begins to copy the codons or biological bytes along the data track of the DNA tape to an mRNA I/O buffer. The mRNA I/O buffer is then read by a ribosome read/write head as it travels along the mRNA I/O buffer. The ribosome read/write head reads each codon or biological byte of data along the mRNA I/O buffer and writes out a chain of amino acids as tRNA brings in one amino acid after another in the sequence specified by the mRNA I/O buffer.

Figure 6 - In order to build a protein, genes are first transcribed to an I/O buffer called mRNA. The 2-track DNA file for a gene is first opened near the promoter of a gene and an enzyme called RNA polymerase then begins to copy the codons or biological bytes along the data track of the DNA tape to the mRNA I/O buffer. The mRNA I/O buffer is then read by a ribosome read/write head as it travels along the mRNA I/O buffer. The ribosome read/write head reads each codon or biological byte of data along the mRNA I/O buffer and writes out a chain of amino acids as tRNA brings in one amino acid after another in the sequence specified by the mRNA I/O buffer.
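To make the I/O-buffer analogy concrete, here is a toy Python sketch of my own that "transcribes" the data track of a little made-up gene into an mRNA buffer and then "translates" it one codon, or biological byte, at a time. The codon table is deliberately truncated to just the handful of codons used in the example; it is not the full genetic code.

# Toy codon table - only the handful of codons used below, NOT the full genetic code.
CODON_TABLE = {
    "AUG": "Met", "GGC": "Gly", "UUU": "Phe", "GCA": "Ala",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def transcribe(dna_data_track):
    """Copy the DNA data track into an mRNA I/O buffer (T is read out as U)."""
    return dna_data_track.replace("T", "U")

def translate(mrna):
    """Ribosome read/write head: read one codon at a time, write out amino acids."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":           # end-of-file marker reached
            break
        protein.append(amino_acid)
    return protein

gene = "ATGGGCTTTGCATAA"                    # a tiny "gene" ending in the STOP codon TAA
mrna_buffer = transcribe(gene)
print("mRNA I/O buffer:", mrna_buffer)               # AUGGGCUUUGCAUAA
print("Amino acid chain:", translate(mrna_buffer))   # ['Met', 'Gly', 'Phe', 'Ala']

Reading the buffer one codon at a time until a STOP codon is reached is exactly the end-of-file behavior described above.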

The above is a brief synopsis of how simple prokaryotic bacteria and archaea build proteins from the information stored in DNA. The process for eukaryotes is a bit more complex because eukaryotes have genes containing exons and introns. The exons code for the amino acid sequence of a protein, while the introns do not. For more on that and a more detailed comparison of the processing of genes on 2-track DNA and the processing of computer data on 9-track magnetic tapes back in the 1970s and 1980s see: An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer.

Once an amino acid chain has folded up into a 3-D protein molecule, it can then perform one of the functions of life. The full set of genes used by a particular species is known as the genome of the species, and the specific variations of those genes carried by an individual are the genotype of the individual. The specific physical characteristics that those particular genes produce are called the phenotype of the individual. For example, there is a certain variation of a human gene that produces blue eyes and a certain variation that produces brown eyes. If you have two copies of the blue-eyed gene, one from your father and one from your mother, you end up with a phenotype with blue eyes; any other combination of genes gives you a brown-eyed phenotype.

Figure 7 – Eukaryotic genes consist of a number of stretches of DNA, called exons, that code for protein synthesis, and a number of stretches of DNA, called introns, that do not code for protein synthesis. Unlike the bacteria and archaea, eukaryotic introns are not between the genes, they are right in the middle of the genes, so they must be spliced out of the transcribed mRNA within the eukaryotic nuclear membrane by molecular machines called spliceosomes before they exit a nuclear pore and become translated by ribosomes into a protein.

These introns must be spliced out of the transcribed mRNA by cellular machines called spliceosomes before the mRNA, transcribed from the genetic DNA, can be translated into a sequence of amino acids to form a protein. If the eukaryotic introns were not spliced out of the mRNA prior to translation by ribosomes, incorrect polypeptide chains of amino acids would form, creating proteins that simply would not work. This is a messy, but necessary process. Now the strange thing is that we share hundreds of genes with other forms of life, like trees, because we both evolved from some common ancestor, and we both still need some proteins that essentially perform the same biochemical functions. The weird thing is that these shared genes all have their introns in exactly the same location within these shared genes. This means that these introns must have been introduced into the DNA of the eukaryotic cell architecture very early, and that is why trees and humans have the introns located in the same spots on shared genes.
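Here is a toy Python sketch of my own of what a spliceosome accomplishes: given the exon layout of a pre-mRNA transcript, cut out the introns and join the exons back together before the ribosomes ever see it. The sequences and exon boundaries below are made up purely for illustration.

def splice(pre_mrna, exon_boundaries):
    """Return mature mRNA by keeping only the exon regions, in order."""
    return "".join(pre_mrna[start:end] for start, end in exon_boundaries)

# A made-up pre-mRNA: exon1 + intron1 + exon2 + intron2 + exon3
pre_mrna = "AUGGCU" + "GUAAGUUUAG" + "CCAUUC" + "GUAUGCAAAG" + "GGAUAA"

# (start, end) offsets of the exons within the pre-mRNA above
exons = [(0, 6), (16, 22), (32, 38)]

mature_mrna = splice(pre_mrna, exons)
print("Pre-mRNA length :", len(pre_mrna))      # 38 bases, mostly intron
print("Mature mRNA     :", mature_mrna)        # AUGGCUCCAUUCGGAUAA
print("Mature length   :", len(mature_mrna))   # 18 bases ready for the ribosome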

The Rise of Eukaryotic Cell Structure as an Algorithmic Phase Transition
The authors of the above paper offer a new hypothesis for the rise of the complex eukaryotic cell architecture of carbon-based life. The authors plotted the mean protein length vs. the mean gene length for 6,519 different species across the entire tree of life, from bacteria and archaea to the most advanced vertebrates. The length of proteins was measured by the number of amino acids in each protein. The length of genes was measured by the number of base pairs in each gene. What they found was that on a log-log scale, there was a linear relationship between the number of amino acids in a protein and the length of the gene that coded for that protein. They also found a clean break at a mean protein length of about 500 amino acids. Figure 8 shows this plot, and it is plain to see that more complex forms of life continued to grow the lengths of their genes, but these much longer genes did not, on average, produce longer proteins; mean protein length stabilized at around 500 amino acids. This means that the eukaryotic cells of more complex life forms were loading up their genes with introns and other non-coding DNA base pairs.

Figure 8 – The authors plotted the mean length of proteins measured in amino acids vs the length of their corresponding genes measured in base pairs on a log-log plot.
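The following toy Python sketch is my own and is not the authors' model, but it illustrates the shape of Figure 8: gene length grows multiplicatively without limit, while the expressed protein length tracks one-third of the gene length only up to a cap of about 500 amino acids, after which all of the additional growth is treated as noncoding sequence.

import random

random.seed(42)
PROTEIN_CAP = 500          # mean protein length stalls near 500 amino acids

def grow_gene(initial_length=300, steps=60, growth=0.08):
    """Multiplicative gene growth: each step scales gene length by a random factor."""
    length = initial_length
    history = []
    for _ in range(steps):
        length *= 1.0 + random.uniform(0.0, growth)
        coding = min(length / 3.0, PROTEIN_CAP)       # protein length in amino acids
        noncoding_fraction = 1.0 - (3.0 * coding) / length
        history.append((length, coding, noncoding_fraction))
    return history

for gene_length, protein_length, noncoding in grow_gene()[::10]:
    print(f"gene ~{gene_length:8.0f} bp  protein ~{protein_length:5.0f} aa  "
          f"noncoding ~{100 * noncoding:5.1f}%")

Past the cap, the noncoding fraction climbs steadily toward 100%, which is the behavior shown in Figure 9 below.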

The authors interpret this as a phase change in the software algorithm used to produce proteins from DNA genes. The number of introns and other noncoding sections of DNA base pairs continued to rise exponentially. But instead of trying to find proteins with new functions by simply producing longer sections of DNA that coded for proteins as the exons do, a new search algorithm was introduced. For example, by splicing together exons in different orders, new proteins could be quickly produced that might have useful functions. In this view, the exons could be thought of as proven chunks of reusable code that could be easily strung together in different sequences like a mainline() calling reusable functions() as needed. That would certainly be a neat software fix to the problem of finding new proteins to produce the complexity of eukaryotic cells!
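And here is a short Python sketch of my own of the "reusable functions" idea: with a handful of proven exons, alternative splicing can generate many candidate proteins simply by choosing which exons to keep, without having to discover any new coding DNA at all. The exon names are purely illustrative.

from itertools import combinations

# A made-up gene with five proven exons, each a reusable chunk of coding DNA.
exons = ["exon_A", "exon_B", "exon_C", "exon_D", "exon_E"]

# Alternative splicing: keep the exon order, but optionally skip some exons.
def splice_variants(exons, min_exons=2):
    for k in range(min_exons, len(exons) + 1):
        for subset in combinations(exons, k):
            yield subset

variants = list(splice_variants(exons))
print(f"{len(exons)} exons already yield {len(variants)} spliced variants")
for v in variants[:5]:
    print(" + ".join(v))

Five exons already yield 26 variants of two or more exons; with the ten or more exons found in many real eukaryotic genes, the combinatorial search space becomes enormous, which is the "explosion of possible combinations" the authors describe.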

Figure 9 – Above is a figure from the authors' paper that displays a semi-log plot of the fractional percentages of CDS (Coding DNA Sequences) and nCDS (Non-Coding DNA Sequences) in genes as a function of the log of gene length. Notice that as the eukaryotic cell architecture of advanced forms of life evolved, the percentage of nCDS (Non-Coding DNA Sequences) approached nearly 100%!

The authors sum this up with:

Discussion
The origin of the eukaryotic cell left a signal in terms of a second order algorithmic phase transition, following a crisis of computational hardness. In the first phase, characterized by prokaryotes, all genes were translated into proteins through a conserved mechanism of protein-based genetic regulation: Here genes have a linear response as a gene with length l expresses a protein with length l/3. In this phase, genes are initially small. The search space is also small, thus making it easy to find a solution based on proteins. The sophistication of this regulatory mechanism could be enhanced by increasing the size of proteins, but only up to a point beyond which the search for new proteins became computationally unfeasible. Once this point was reached, mean eukaryotic gene length kept growing exponentially — as indicated by the scale-invariance of our reported Taylor’s law — but now incorporating almost exclusively noncoding sequences, in agreement with the intron-late hypothesis. Now, a single gene can generate several proteins by means of alternative splicing and free RNAs with regulatory functions. As genes grew beyond this point, the system’s available solutions grew enormously due to the explosion of possible combinations. Finding one given solution, therefore, becomes gradually easier.


Living Together Is Never Easy
Nick Lane suggests a more sinister purpose for the introns found in eukaryotic cells. Sure, these early prokaryotic archaeal cells may have initially learned to live with their new energy-rich parasitic bacterial symbionts, like an old man taking on a new trophy wife, but there are always problems in cohabitating, like a number of new and unwelcome freeloading brothers-in-law taking up residence too. For example, bacteria carry with them parasitic segments of DNA called "mobile group II self-splicing introns". These are segments of parasitic DNA that are just trying to self-replicate, like all other forms of self-replicating information. These bacterial introns snip themselves out of the mRNA that is copied from the bacterial DNA and then form an active complex with a reverse transcriptase that reinserts the intron DNA back into the bacterial DNA loop between genes to complete the self-replication process. These bacterial introns are rather mild parasites that just slow down the replication time for the main bacterial DNA loop and also waste a bit of energy and material as they self-replicate. A typical bacterial cell with 4,000 genes might have about 30 of these introns, so over time, the bacteria just managed to learn to live with these pesky introns somehow. The bacteria that could not keep their introns under control simply died out, and so the bacteria infected with tons of parasitic introns were simply weeded out because they could not compete with the healthier bacteria.

Now for the eukaryotes, the situation is completely different. Eukaryotes have tens of thousands of introns buried in their 20,000+ genes, and these introns are located right in the middle of the eukaryotic genes!

Figure 10 – Eukaryotic genes consist of a number of stretches of DNA, called exons, that code for protein synthesis, and a number of stretches of DNA, called introns, that do not code for protein synthesis. Unlike the bacteria, eukaryotic introns are not between the genes, they are right in the middle of the genes, so they must be spliced out of the transcribed mRNA within the eukaryotic nuclear membrane by molecular machines called spliceosomes before the mRNA exits through a nuclear pore and is translated by ribosomes into a protein.

These introns must be spliced out of the transcribed mRNA by cellular machines called spliceosomes before the mRNA, transcribed from the genetic DNA, can be translated into a sequence of amino acids to form a protein. If the eukaryotic introns were not spliced out of the mRNA prior to translation by ribosomes, incorrect polypeptide chains of amino acids would form, creating proteins that simply would not work. This is a messy but necessary process. Now the strange thing is that we share hundreds of genes with other forms of life, like trees, because we both evolved from some common ancestor, and we both still need some proteins that essentially perform the same biochemical functions. The weird thing is that these shared genes all have their introns in exactly the same locations. This means that the parasitic introns must have been introduced into the DNA of the eukaryotic cell architecture very early, and that is why trees and humans have the introns located in the same spots on shared genes. The theory goes that when the parasitic mitochondria first took up residence in prokaryotic archaeon cells, like all parasites, they tried to take as much advantage of their hosts as possible, without killing the hosts outright. Now copying the several million base pairs in the roughly 5,000 genes of a bacterial cell is the limiting time factor in self-replicating bacteria and takes at least 20 minutes. Consequently, the mitochondrial invaders began to move most of their genes to the DNA loop of their hosts, and let their energy-rich archaeon hosts produce the proteins associated with those transplanted genes for them. However, the mitochondrial invaders wisely kept all of the genes necessary for metabolism locally on their own mitochondrial DNA loop because that allowed them to quickly get their hands on the metabolic proteins they needed without putting in a back order on the main host DNA loop. After all, their energy-rich archaeon hosts now had several hundred mitochondria constantly pumping out the necessary ATP to make proteins, so why not let their hosts provide most of the proteins that were not essential to metabolism? This was the beginning of the compartmentalization of function within the hosts and of a division of labor that produced a symbiotic advantage for both the archaeon hosts and their mitochondrial invaders.

But there was one problem with transplanting the mitochondrial genes to their hosts' DNA loop. The parasitic bacterial introns from the mitochondria tagged along as well, and those transplanted parasitic bacterial introns could run wild because their energy-rich hosts could now afford the luxury of supporting tons of parasitic DNA, like a number of freeloading brothers-in-law that came along with your latest trophy wife - at least up to a point. Indeed, probably most archaeon hosts died from the onslaught of tons of parasitic mitochondrial introns clogging up their critical genes and making them produce nonfunctional proteins. Like I said, it is always hard to live with somebody. But some archaeon hosts must have managed to come to some kind of living arrangement with their new mitochondrial roommates that allowed both of them to live together in a love-hate relationship that worked.

Modern eukaryotes have a distinctive nuclear membrane surrounding their DNA, and the spliceosomes that edit mRNA work inside of this nuclear membrane to splice out the nasty introns. So initially, when a gene composed of DNA with embedded introns is transcribed to mRNA, everything is transcribed, including the introns, producing a strand of mRNA with "good" exon segments and "bad" intron segments. This all happens within the protection of the nuclear membrane that keeps the sexually turned-on ribosomes, which are just dying to translate mRNA into proteins, at bay, like an elderly prom chaperone keeping things from getting out of hand. The spliceosomes then go to work within the nuclear membrane to splice out the parasitic introns, forming a new strand of mRNA that just contains the "good" exons. This highly edited strand of mRNA then passes through a pore in the nuclear membrane out to the ribosomes patiently waiting outside - see Figure 11.

The nuclear membrane of modern eukaryotes provides a clue as to what happened. When the initial bacterial mitochondrial parasites would die inside of an archaeon host, they would simply dump all of their DNA into the interior of the archaeon host, and this allowed the parasitic bacterial introns in the dumped DNA to easily splice themselves at random points into the archaeon host DNA loop, and they frequently did so right in the middle of an archaeon host gene. Most times, that simply killed the archaeon host and all of its parasitic mitochondrial bacteria. Now the easiest way to prevent that from happening would be to simply put a membrane barrier around the archaeon host DNA loop to protect it from all of the dumped mitochondrial introns, and here is an explanation of how that could happen. It turns out that archaea and bacteria use different lipids to form their membranes, and although we know that the eukaryotes split off from the archaea, the eukaryotes strangely use bacterial lipids in their membranes instead of archaeon lipids. So the eukaryotes had to have transitioned from archaeon lipids to bacterial lipids at some point in time. The theory is that the genes for building bacterial lipids were originally on the bacterial mitochondrial invaders, but were later transplanted to the host archaeon DNA loops at some point. Once on the host archaeon DNA loop, those genes would then start to create bacterial lipids with no place to go. Instead, the generated lipids would simply form lipid "bags" near the host archaeon DNA loop. Those bacterial lipid "bags" would then tend to flatten, like empty plastic grocery bags, and then surround the host archaeon DNA loop. These flattened "bags" of bacterial lipids then evolved to produce the distinctive double-membrane structure of the nuclear membrane.

Figure 11 – The eukaryotic nuclear membrane is a double membrane consisting of an inner and outer membrane separated by a perinuclear space. The nuclear membrane contains nuclear pores that allow edited mRNA to pass out but prevent ribosomes from entering and translating unedited mRNA containing introns into nonfunctional proteins. This double membrane is just the remnant of flattened "bags" of bacterial lipids that shielded the central archaeon DNA loop from the onslaught of parasitic mitochondrial DNA introns.

Nick Lane then goes on to explain how the struggle of the host archaeon prokaryotic cells to live with their parasitic mitochondrial bacterial roommates and their unwelcome bacterial introns also led to the exclusive development of two sexes for eukaryotic-based life, and to the use of sexual reproduction between those two sexes as well. It seems that the overwhelming benefit of having mitochondrial parasites, generating huge amounts of energy for the host archaeon prokaryotic cells, was just too much to resist, and the host archaeon prokaryotic cells went to extremes to accommodate them, like also taking the host archaeon DNA loop that was initially wrapped around histone proteins to protect the DNA from the harsh environments that the extremophiles loved, and transforming that DNA into chromatin and chromosomes - for more on that see An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer.

Nick Lane's key point is that the overwhelming amount of free energy that the parasitic mitochondrial bacteria brought to their original prokaryotic archaeon hosts was well worth the effort of learning to cope with a chronic infection that has lasted for 2 billion years. Yes, in some sense the mitochondrial parasites were indeed a collection of pain-in-the-butt pests, but putting up with them was well worth the effort because they provided their hosts with 5,000 times as much free energy as they originally had. That much energy allowed their hosts to put up with a lot of grief, and in the process of learning to cope with these parasites, the archaeon hosts went on to build all of the complex structures of the eukaryotic cell.

So Nick Lane sees the introns within eukaryotic cells as pests, while the authors of the above paper see the introns as the very source of the genetic variability that allowed eukaryotic cell architecture to arise. Could both views be correct? Recall that softwarephysics holds that the defining characteristic of all self-replicating information is its ability to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this characteristic. That posting discusses Stuart Kauffman's theory of Enablement in which living things are seen to exapt existing functions into new and unpredictable functions by discovering the “Adjacent Possible” of spring-loaded preadaptations. Perhaps the eukaryotic introns initially began as unwanted pests but then the eukaryotic cells discovered a way to put them to good work in the “Adjacent Possible”!

The hypotheses above present some very compelling arguments, but it is very difficult to reconstruct a series of events that happened more than 2 billion years ago. Perhaps some softwarephysics and the evolution of software over the past 84 years, or 2.65 billion seconds, could be of assistance. During that long period of software evolution, did software also have to overcome similar extremely inhibitive and limiting factors that at first prevented the rise of complex software? Well, yes it did.

How the Advance of Computer Hardware and Software Worked Together to Give Rise to the Complex Software of Today
First, let's begin with the most significant advance in computer hardware that allowed for the rise of complex computer software in the 1970s, which is similar to Nick Lane's hypothesis that it was the hardware advance of mitochondria that allowed for the rise of the eukaryotic cell architecture. Later, we will explore the advances in computer software architecture that also led to the rise of complex software in the 1970s. The advances in computer software architecture can be seen as similar to the advances in the search algorithm software used to produce novel proteins in eukaryotic cells being responsible for the rise of complex carbon-based life.

The rise of complex computer software in the 1970s can be seen as the result of a computer hardware improvement that came with the arrival of large amounts of computer memory on memory chips. The problem has always been that the amount of free computer memory determines the maximum size of a program because the executable file for the program has to fit into the available free memory of the computer at runtime. Generally, the more lines of source code you have in your program, the larger will be its compiled executable file, so the amount of available free memory determines the maximum size of your program. Now in the 1950s and 1960s, computer memory was very expensive and very limited.

Figure 12 – In 1951, the UNIVAC I was very impressive on the outside.

Figure 13 – But the UNIVAC I was a little less impressive on the inside.

Prior to 1955, computers like the UNIVAC I, which first appeared in 1951, used mercury delay lines consisting of a tube of mercury about 3 inches long. Each mercury delay line could store about 18 bits of computer memory as sound waves that were continuously refreshed by quartz piezoelectric transducers on each end of the tube. Mercury delay lines were huge and very expensive per bit, so computers like the UNIVAC I only had a memory of 12 K (98,304 bits).

Figure 14 – Prior to 1955, huge mercury delay lines built from tubes of mercury that were about 3 inches long were used to store bits of computer memory. A single mercury delay line could store about 18 bits of computer memory as a series of sound waves that were continuously refreshed by quartz piezoelectric transducers at each end of the tube.

In 1955 magnetic core memory came along and used tiny magnetic rings called "cores" to store bits. Four little wires had to be threaded by hand through each little core in order to store a single bit, so although magnetic core memory was a lot cheaper and smaller than mercury delay lines, it was still very expensive and took up lots of space.

Figure 15 – Magnetic core memory arrived in 1955 and used a little ring of magnetic material, known as a core, to store a bit. Each little core had to be threaded by hand with 4 wires to store a single bit.

Figure 16 – Magnetic core memory was a big improvement over mercury delay lines, but it was still hugely expensive and took up a great deal of space within a computer.

Because of the limited amount of free computer memory during the 1950s and 1960s, computers simply did not have enough free computer memory to allow people to write very large programs, so programs were usually just a few hundred lines of code each. Now you really cannot do much logic in a few hundred lines of code, so IT people would string together several small programs into a batch run. Input-output tapes were used between each small program in the batch run. The first small program would run and write its results to one or more output tapes. The next program in the batch run would then read those tapes and do some more processing, and write its results to one or more output tapes too. This continued until the very last program in the batch run wrote out its final output tapes. For more on that see: An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer. Now when you are writing very small programs, because you are severely limited by the available amount of free memory, there really is no need to write structured code because you are dealing with a very small amount of processing logic. So during the Unstructured Period of the 1950s and 1960s, IT professionals simply did not bother with breaking up software logic into functions or subroutines. So each little program in a batch stream was like a small single prokaryotic cell, with no internal structure.

But in the early 1970s, inexpensive semiconductor memory chips came along that made computer memory small and cheap. These memory chips were the equivalent of software mitochondria because they removed the extremely inhibitive limiting factor of software complexity. Suddenly, IT now had large amounts of computer memory that allowed IT people to write huge programs. But that presented a problem. It was found that the processing logic became too contorted and impossible to maintain when simple prokaryotic unstructured programs were scaled up to programs with 50,000 lines of unstructured code. Even the programmer who wrote the original code could not make sense of it a few months later, and this was much worse for new programmers who came along later to maintain the code. Since commercial software can easily live for 10 years or more, that was a real problem. To alleviate this problem, Dahl, Dijkstra, and Hoare published Structured Programming in 1972, in which they suggested that computer programs should have a complex internal structure with no GOTO statements, lots of subroutines, indented code, and many comment statements. During the Structured Period that soon followed, these structured programming techniques were adopted by the IT community, and the GOTO statements were replaced by subroutines, also known as functions(), and indented code with lots of internal structure, like the eukaryotic structure of modern cells that appeared about 2 billion years ago.

Figure 17 – Finally in the early 1970s inexpensive semiconductor memory chips came along that made computer memory small and cheap. These memory chips were the equivalent of software mitochondria because they removed the extremely inhibitive limiting factor of software complexity.

Today, we almost exclusively use object-oriented programming techniques that allow for the multicellular organization of software via objects, but the code for the methods that operate on those objects still uses the structured programming techniques that first appeared in the early 1970s.

The Advance of Computer Software Architecture Also Gave Rise to Complex Software
Now let's see how the advance of computer software architecture, made possible by the dramatic increase in computer memory, allowed for the rise of complex computer software in the 1970s and beyond. To do so, we need to do some software paleontology by digging up some old fossilized software.

Unstructured Period (1941 – 1972)
During the Unstructured Period, programs were simple monolithic structures with lots of GOTO statements, no subroutines, no indentation of code, and very few comment statements. The machine code programs of the 1940s evolved into the assembler programs of the 1950s and the compiled programs of the 1960s, with FORTRAN appearing in 1956 and COBOL in 1959. These programs were very similar to the early prokaryotic bacteria that appeared over four billion years ago on the Earth and lacked internal structure. Bacteria essentially consist of a tough outer cell wall enclosing an inner cell membrane and contain a minimum of internal structure. The cell wall is composed of a tough molecule called peptidoglycan, which is composed of tightly bound amino sugars and amino acids. The cell membrane is composed of phospholipids and proteins, which will be described later in this posting. The DNA within bacteria generally floats freely as a large loop of DNA, and their ribosomes, used to help translate mRNA into proteins, float freely as well and are not attached to membranes called the rough endoplasmic reticulum. The chief advantage of bacteria is their simple design and ability to thrive and rapidly reproduce even in very challenging environments, like little AK-47s that still manage to work in environments where modern tanks fail. Just as bacteria still flourish today, some unstructured programs are still in production.

Figure 18 – A simple prokaryotic bacterium with little internal structure (click to enlarge)

Below is a code snippet from a fossil FORTRAN program listed in a book published in 1969 showing little internal structure. Notice the use of GOTO statements to skip around in the code. Later this would become known as the infamous “spaghetti code” of the Unstructured Period that was such a joy to support.

30 DO 50 I=1,NPTS
31 IF (MODE) 32, 37, 39
32 IF (Y(I)) 35, 37, 33
33 WEIGHT(I) = 1. / Y(I)
      GO TO 41
35 WEIGHT(I) = 1. / (-1*Y(I))
37 WEIGHT(I) = 1.
      GO TO 41
39 WEIGHT(I) = 1. / SIGMA(I)**2
41 SUM = SUM + WEIGHT(I)
      YMEAN = WEIGHT(I) * FCTN(X, I, J, M)
      DO 44 J = 1, NTERMS
44 XMEAN(J) = XMEAN(J) + WEIGHT(I) * FCTN(X, I, J, M)
50 CONTINUE

The primitive nature of software in the Unstructured Period was largely due to the primitive nature of the hardware upon which it ran. Figure 24 shows an IBM System/360 mainframe from 1964 – notice the operator at the teletype feeding commands to the nearby operator console, the distant tape drives, and the punch card reader in the mid-ground. Such a machine had about 1 MB of memory, less than 1/1000 of the memory of a current cheap $250 PC, and a matching anemic processing speed. For non-IT readers let me remind all that:

1 KB = 1 kilobyte = 2^10 = 1,024 bytes or about 1,000 bytes
1 MB = 1 megabyte = 1,024 x 1,024 = 1,048,576 bytes or about a million bytes
1 GB = 1 gigabyte = 1,024 x 1,024 x 1,024 = 1,073,741,824 bytes or about a billion bytes

One byte of memory can store one ASCII text character like an “A” and two bytes can store a small integer in the range of -32,768 to +32,767. When I first started programming in 1972 we thought in terms of kilobytes, then megabytes, and now gigabytes. Data science people now think in terms of many terabytes - 1 TB = 1024 GB.

Software was input via punched cards and the output was printed on fan-fold paper. Compiled code could be stored on tape or very expensive disk drives if you could afford them, but any changes to code were always made via punched cards, and because you were only allowed perhaps 128K – 256K of memory for your job, programs had to be relatively small, so simple unstructured code ruled the day. Like the life cycle of a single-celled bacterium, the compiled and linked code for your program was loaded into the memory of the computer at execution time and did its thing in a batch mode, until it completed successfully or abended and died. At the end of the run, the computer’s memory was released for the next program to be run and your program ceased to exist.

Figure 19 - An IBM 029 keypunch machine from the 1960s Unstructured Period of software.

Figure 20 - Each card could hold a maximum of 80 bytes. Normally, one line of code was punched onto each card.

Figure 21 - The cards for a program were held together into a deck with a rubber band, or for very large programs, the deck was held in a special cardboard box that originally housed blank cards. Many times the data cards for a run followed the cards containing the source code for a program. The program was compiled and linked in two steps of the run and then the generated executable file processed the data cards that followed in the deck.

Figure 22 - To run a job, the cards in a deck were fed into a card reader, as shown on the left above, to be compiled, linked, and executed by a million-dollar mainframe computer. In the above figure, the mainframe is located directly behind the card reader.

Figure 23 - The output of programs was printed on fan-folded paper by a line printer.

However, one should not discount the great advances that were made by the early bacteria billions of years ago or by the unstructured code from the computer systems of the 1950s and 1960s. These were both very important formative periods in the evolution of life and of software on Earth, and examples of both can still be found in great quantities today. For example, it is estimated that about 50% of the Earth’s biomass is still composed of simple bacteria. Your body consists of about 100 trillion cells, but you also harbor about 10 times that number of bacterial cells that are in a parasitic/symbiotic relationship with the “other” cells of your body and perform many of the necessary biochemical functions required to keep you alive, such as aiding with the digestion of food. Your gut contains about 3.5 pounds of active bacteria and about 50% of the dry weight of your feces is bacteria, so in reality, we are all composed of about 90% bacteria with only 10% of our cells being “normal” human cells.

All of the fundamental biochemical pathways used by living things to create large complex organic molecules from smaller monomers, or to break those large organic molecules back down into simple monomers, were first developed by bacteria billions of years ago. For example, bacteria were the first forms of life to develop the biochemical pathways that turn carbon dioxide, water, and the nitrogen in the air into the organic molecules necessary for life – sugars, lipids, amino acids, and nucleotides that form RNA and DNA. They also developed the biochemical pathways to replicate DNA and to transcribe and translate DNA into proteins, and to form complex structures such as cell walls and cell membranes from sugars, amino acids, proteins, and phospholipids. Additionally, bacteria invented the Krebs cycle to break these large macromolecules back down to monomers for reuse and to release and store energy by transforming ADP to ATP. To expand upon this, we will see in Software Symbiogenesis how Lynn Margulis has proposed that all the innovations of large macroscopic forms of life have actually been acquired from the highly productive experiments of bacterial life forms.

Similarly, all of the fundamental coding techniques of IT at the line of code level were first developed in the Unstructured Period of the 1950s and 1960s, such as the use of complex variable names, arrays, nested loops, loop counters, if-then-else logic, list processing with pointers, I/O blocking, bubble sorts, etc. When I was in Middleware Operations for Discover, I did not do much coding. However, I did write a large number of Unix shell scripts to help make my job easier. These Unix shell scripts were small unstructured programs in the range of 10 – 50 lines of code, and although they were quite primitive and easy to write, they had a huge economic pay-off for me. Many times, a simple 20-line Unix shell script that took less than an hour to write would provide as much value to me as the code behind the IBM WebSphere Console, which I imagine probably had cost IBM about $10 - $100 million to develop and came to several hundred thousand lines of code. For more on that see MISE in the Attic. So if you add up all the little unstructured Unix shell scripts, DOS .bat files, edit macros, Excel spreadsheet macros, Word macros, etc., I bet that at least 50% of the software in the Software Universe is still unstructured code.
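
Just to illustrate what those fundamental line-of-code techniques look like, here is a tiny sketch in modern Java of one of them, the humble bubble sort, built from nothing but nested loops, loop counters and if-then-else logic, all of which were invented long before Java existed. The sample data is, of course, made up.

// A minimal sketch of the fundamental line-of-code techniques of the
// Unstructured Period: nested loops, loop counters and if-then-else
// logic, here in the form of a simple bubble sort.
public class BubbleSort {
    public static void main(String[] args) {
        int[] data = { 5, 1, 4, 2, 8 };                  // some made-up data
        for (int i = 0; i < data.length - 1; i++) {
            for (int j = 0; j < data.length - 1 - i; j++) {
                if (data[j] > data[j + 1]) {             // swap out-of-order neighbors
                    int temp = data[j];
                    data[j] = data[j + 1];
                    data[j + 1] = temp;
                }
            }
        }
        for (int value : data) System.out.print(value + " ");   // prints 1 2 4 5 8
    }
}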

Figure 24 – An IBM System/360 mainframe from 1964. The IBM System/360 mainframe caused commercial software to explode within corporations during the Unstructured Period and gave IT professionals the hardware platform that they were waiting for.

Structured Period (1972 – 1992)
The increasing availability of computers with more memory and faster CPUs allowed for much larger programs to be written in the 1970s, but unstructured code became much harder to maintain as it grew in size, so the need for internal structure became readily apparent. Plus, around this time code began to be entered via terminals using full-screen editors, rather than on punched cards, which made it easier to view larger sections of code as you changed it.

Figure 25 - IBM 3278 terminals were connected to controllers that connected to IBM mainframes. The IBM 3278 terminals then ran interactive TSO sessions with the IBM mainframes. The ISPF full-screen editor was then brought up under TSO after you logged into a TSO session.

Figure 26 – A mainframe with IBM 3278 CRT terminals attached (click to enlarge)

In 1972, Dahl, Dijkstra, and Hoare published Structured Programming, in which they suggested that computer programs should have complex internal structure with no GOTO statements, lots of subroutines, indented code, and many comment statements. During the Structured Period, these structured programming techniques were adopted by the IT community, and the GOTO statements were replaced by subroutines, also known as functions(), and indented code with lots of internal structure, like the eukaryotic structure of modern cells that appeared about 1,500 million years ago. Eukaryotic cells are found in the bodies of all complex organisms from single-cell yeasts to you and me and divide up cell functions amongst a collection of organelles (subroutines), such as mitochondria, chloroplasts, Golgi bodies, and the endoplasmic reticulum.

Figure 27 – Plants and animals are composed of eukaryotic cells with much internal structure (click to enlarge)

Figure 29 compares the simple internal structure of a typical prokaryotic bacterium with the internal structure of eukaryotic plant and animal cells. These eukaryotic cells could be simple single-celled plants and animals or they could be found within a much larger multicellular organism consisting of trillions of eukaryotic cells. Figure 29 is a bit deceiving, in that eukaryotic cells are huge cells that are more than 20 times larger in diameter than a typical prokaryotic bacterium with about 10,000 times the volume as shown in Figure 30. Because eukaryotic cells are so large, they have an internal cytoskeleton, composed of linear-shaped proteins that form filaments that act like a collection of tent poles, to hold up the huge cell membrane encircling the cell.

Eukaryotic cells also have a great deal of internal structure, in the form of organelles, that are enclosed by internal cell membranes. Like the structured programs of the 1970s and 1980s, eukaryotic cells divide up functions amongst these organelles. These organelles include the nucleus to store and process the genes stored in DNA, mitochondria to perform the Krebs cycle to create ATP from carbohydrates, and chloroplasts in plants to produce energy-rich carbohydrates from water, carbon dioxide, and sunlight.

Figure 29 – The prokaryotic cell architecture of the bacteria and archaea is very simple and designed for rapid replication. Prokaryotic cells do not have a nucleus enclosing their DNA. Eukaryotic cells, on the other hand, store their DNA on chromosomes that are isolated in a cellular nucleus. Eukaryotic cells also have a very complex internal structure with a large number of organelles, or subroutine functions, that compartmentalize the functions of life within the eukaryotic cells.

Figure 30 – Not only are eukaryotic cells much more complicated than prokaryotic cells, but they are also HUGE!

The introduction of structured programming techniques in the early 1970s allowed programs to become much larger and much more complex by using many subroutines to divide up logic into self-contained organelles. This induced a mass extinction of unstructured programs, similar to the Permian-Triassic (P-T) mass extinction, or the Great Dying, 250 million years ago, which divided the Paleozoic from the Mesozoic in the stratigraphic column and resulted in the extinction of about 90% of the species on Earth. As programmers began to write new code using the new structured programming paradigm, older code that was too difficult to rewrite in a structured manner remained as legacy “spaghetti code” that slowly fossilized over time in production. Like the Permian-Triassic (P-T) mass extinction, the mass extinction of unstructured code in the 1970s was more like a greenhouse gas mass extinction than an impactor mass extinction because it spanned nearly an entire decade, and was also a rather thorough mass extinction that wiped out most unstructured code in corporate systems.

Below is a code snippet from a fossil COBOL program listed in a book published in 1975. Notice the structured programming use of indented code and calls to subroutines with PERFORM statements.

PROCEDURE DIVISION.
      OPEN INPUT FILE-1, FILE-2
      PERFORM READ-FILE-1-RTN.
      PERFORM READ-FILE-2-RTN.
      PERFORM MATCH-CHECK UNTIL ACCT-NO OF REC-1 = HIGH-VALUES.
      CLOSE FILE-1, FILE-2.
MATCH-CHECK.
      IF ACCT-NO OF REC-1 < ACCT-NO OF REC-2
            PERFORM READ-FILE-1-RTN
      ELSE
            IF ACCT-NO OF REC-1 > ACCT-NO OF REC-2
                  DISPLAY REC-2, 'NO MATCHING ACCT-NO'
                  PERFORM READ-FILE-1-RTN
      ELSE
            PERFORM READ-FILE-2-RTN UNTIL ACCT-NO OF REC-1
            NOT EQUAL TO ACCT-NO OF REC-2

When I encountered my very first structured FORTRAN program in 1975, I diligently “fixed” the program by removing all the code indentations! You see in those days, we rarely saw the entire program on a line printer listing because that took a compile of the program to produce and wasted valuable computer time, which was quite expensive back then. When I provided an estimate for a new system back then, I figured 25% for programming manpower, 25% for overhead charges from other IT groups on the project, and 50% for compiles. So instead of working with a listing of the program, we generally flipped through the card deck of the program to do debugging. Viewing indented code in a card deck can give you a real headache, so I just “fixed” the program by making sure all the code started in column 7 of the punch cards as it should!

Object-Oriented Period (1992 – Present)
During the Object-Oriented Period, programmers adopted a multicellular organization for software, in which programs consisted of many instances of objects (cells) that were surrounded by membranes studded with exposed methods (membrane receptors).

The following discussion might be a little hard to follow for readers with a biological background, but with little IT experience, so let me define a few key concepts with their biological equivalents.

Class – Think of a class as a cell type. For example, the class Customer defines the cell type of Customer and describes how to store and manipulate the data for a Customer, like firstName, lastName, address, and accountBalance. A program might then instantiate a Customer object called “steveJohnston”.

Object – Think of an object as a cell. A particular object will be an instance of a class. For example, the object steveJohnston might be an instance of the class Customer and will contain all the information about my particular account with a corporation. At any given time, there could be many millions of Customer objects bouncing around in the IT infrastructure of a major corporation’s website.

Instance – An instance is a particular object of a class. For example, the steveJohnston object would be a particular instance of the class Customer, just as a particular red blood cell would be a particular instance of the cell type RedBloodCell. Many times programmers will say things like “This instantiates the Customer class”, meaning it creates objects (cells) of the Customer class (cell type).

Method – Think of a method() as a biochemical pathway. It is a series of programming steps or “lines of code” that produce a macroscopic change in the state of an object (cell). The Class for each type of object defines the data for the class, like firstName, lastName, address, and accountBalance, but it also defines the methods() that operate upon these data elements. Some methods() are public, while others are private. A public method() is like a receptor on the cell membrane of an object (cell). Other objects (cells) can send a message to the public methods of an object (cell) to cause it to execute a biochemical pathway within the object (cell). For example, steveJohnston.setFirstName(“Steve”) would send a message to the steveJohnston object instance (cell) of the Customer class (cell type) to have it execute the setFirstName method() to change the firstName of the object to “Steve”. The steveJohnston.getaccountBalance() method would return my current account balance with the corporation. Objects also have many internal private methods() that are biochemical pathways not exposed to the outside world. For example, the calculateAccountBalance() method could be an internal method that adds up all of my debits and credits and updates the accountBalance data element within the steveJohnston object, but this method cannot be called by other objects (cells) outside of the steveJohnston object (cell). External objects (cells) have to call the steveJohnston.getaccountBalance() method in order to find out my accountBalance.
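
To tie these definitions together, here is a minimal Java sketch of what the Customer class (cell type) described above might look like. The field and method names simply follow the ones used in the text, and the body of the private method is left out, so treat this as an illustration rather than production code.

// A minimal sketch of the Customer class (cell type) described above.
public class Customer {
    // Data elements of the class
    private String firstName;
    private String lastName;
    private String address;
    private double accountBalance;

    // A public method() - like a receptor on the cell membrane
    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    // Another public method() that other objects (cells) may call
    public double getaccountBalance() {
        calculateAccountBalance();
        return accountBalance;
    }

    // A private internal method() - a biochemical pathway hidden inside
    // the cell that outside objects (cells) cannot call directly
    private void calculateAccountBalance() {
        // add up all of the debits and credits here (omitted in this sketch)
    }
}

// Instantiating an object (cell) of the Customer class (cell type):
//     Customer steveJohnston = new Customer();
//     steveJohnston.setFirstName("Steve");
//     double balance = steveJohnston.getaccountBalance();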

Line of Code – This is a single statement in a method() like:

discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;

Remember methods() are the equivalent of biochemical pathways and are composed of many lines of code, so each line of code is like a single step in a biochemical pathway. Similarly, each character in a line of code can be thought of as an atom, and each variable as an organic molecule. Each character can be in one of 256 ASCII quantum states defined by 8 quantized bits, with each bit in one of two quantum states “1” or “0”, which can also be characterized as 8 electrons in a spin-up ↑ or spin-down ↓ state:

discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;

C = 01000011 = ↓ ↑ ↓ ↓ ↓ ↓ ↑ ↑
H = 01001000 = ↓ ↑ ↓ ↓ ↑ ↓ ↓ ↓
N = 01001110 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↓
O = 01001111 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↑

Programmers have to assemble characters (atoms) into organic molecules (variables) to form the lines of code that define a method() (biochemical pathway). As in carbon-based biology, the slightest error in a method() can cause drastic and usually fatal consequences. Because there is nearly an infinite number of ways of writing code incorrectly and only a very few ways of writing code correctly, there is an equivalent of the second law of thermodynamics at work. This simulated second law of thermodynamics and the very nonlinear macroscopic effects that arise from small coding errors is why software architecture has converged upon Life’s Solution. With these concepts in place, we can now proceed with our comparison of the evolution of software and carbon-based life on Earth.
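
If you would like to verify the bit patterns above for yourself, here is a tiny Java sketch that prints the 8 quantized bits of each character in a line of code; the spin-arrow notation is just those 1s and 0s dressed up. The class name is, of course, just a made-up placeholder.

// Print each character (atom) of a line of code as its 8 ASCII bits.
public class CharacterBits {
    public static void main(String[] args) {
        String lineOfCode = "discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;";
        for (char c : lineOfCode.toCharArray()) {
            // Pad the binary string out to a full 8 quantized bits
            String bits = String.format("%8s", Integer.toBinaryString(c)).replace(' ', '0');
            System.out.println(c + " = " + bits);   // e.g. C = 01000011
        }
    }
}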

Object-oriented programming actually started in the 1960s with Simula, the first language to use the concept of merging data and functions into objects defined by classes, but object-oriented programming did not really catch on until nearly 30 years later:

1962 - 1965 Dahl and Nygaard develop the Simula language
1972 - Smalltalk language developed
1983 - 1985 Stroustrup develops C++
1995 - Sun announces Java at SunWorld '95

Similarly, multicellular organisms first appeared about 900 million years ago, but it took about another 400 million years, until the Cambrian, for multicellularity to catch on as well. Multicellular organisms consist of huge numbers of cells that send messages between cells (objects) by secreting organic molecules that bind to the membrane receptors on other cells and induce those cells to execute exposed methods. For example, your body consists of about 100 trillion independently acting eukaryotic cells, and not a single cell in the collection knows that the other cells even exist. In an object-oriented manner, each cell just responds to the organic molecules that bind to its membrane receptors, and in turn, sends out its own set of chemical messages that bind to the membrane receptors of other cells in your body. When you wake to the sound of breaking glass in the middle of the night, your adrenal glands secrete the hormone adrenaline (epinephrine) into your bloodstream, which binds to the getScared() receptors on many of your cells. In an act of object-oriented polymorphism, your liver cells secrete glucose into your bloodstream, and your heart cells contract harder when their getScared() methods are called.

Figure 31 – Multicellular organisms consist of a large number of eukaryotic cells, or objects, all working together (click to enlarge)

These object-oriented languages use the concepts of encapsulation, inheritance and polymorphism, which are very similar to the multicellular architecture of large organisms.

Encapsulation
Objects are contiguous locations in memory that are surrounded by a virtual membrane that cannot be penetrated by other code and are similar to an individual cell in a multicellular organism. The internal contents of an object can only be changed via exposed methods (like subroutines), similar to the receptors on the cellular membranes of a multicellular organism. Each object is an instance of an object class, just as individual cells are instances of a cell type. For example, an individual red blood cell is an instance object of the red blood cell class.

Inheritance
Cells inherit methods in a hierarchy of human cell types, just as objects form a class hierarchy of inherited methods in a class library. For example, all cells have the metabolizeSugar() method, but only red blood cells have the makeHemoglobin() method. Below is a tiny portion of the 210 known cell types of the human body arranged in a class hierarchy.

Human Cell Classes
1. Epithelium
2. Connective Tissue
      A. Vascular Tissue
            a. Blood
                  - Red Blood Cells
            b. Lymph
      B. Proper Connective Tissue
3. Muscle
4. Nerve
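
As a hedged illustration of that kind of inheritance, here is a minimal Java sketch; the class and method names are simply borrowed from the example above and the printed messages are pure placeholders.

// All cells inherit the metabolizeSugar() method from the base class.
class Cell {
    public void metabolizeSugar() {
        System.out.println("Running the Krebs cycle...");
    }
}

// Only red blood cells add the makeHemoglobin() method,
// while still inheriting metabolizeSugar() from Cell.
class RedBloodCell extends Cell {
    public void makeHemoglobin() {
        System.out.println("Making hemoglobin...");
    }
}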

Polymorphism
A chemical message sent from one class of cell instances can produce an abstract behavior in other cells. For example, adrenal glands can send the getScared() message to all cell instances in your body, but all of the cell instances getScared() in their own fashion. Liver cells release glucose and heart cells contract faster when their getScared() methods are called. Similarly, when you call the print() method of a report object, you get a report, and when you call the print() method of a map, you get a map.
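
Here is a minimal Java sketch of that kind of polymorphism, again using the hypothetical getScared() method from the example above: the same message produces a different behavior in each class of cell.

// Each cell type overrides getScared() in its own fashion.
abstract class BodyCell {
    abstract void getScared();   // the adrenaline "message"
}

class LiverCell extends BodyCell {
    void getScared() { System.out.println("Releasing glucose into the bloodstream"); }
}

class HeartCell extends BodyCell {
    void getScared() { System.out.println("Contracting faster"); }
}

class AdrenalGland {
    public static void main(String[] args) {
        // The same getScared() message produces different behaviors in different cells.
        BodyCell[] cells = { new LiverCell(), new HeartCell() };
        for (BodyCell cell : cells) cell.getScared();
    }
}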

Figure 32 – Objects are like cells in a multicellular organism that exchange messages with each other (click to enlarge)

The object-oriented revolution, enhanced by the introduction of Java in 1995, caused another mass extinction within IT as structured procedural programs began to be replaced by object-oriented C++ and Java programs, like the Cretaceous-Tertiary extinction 65 million years ago that killed off the dinosaurs, presumably caused by a massive asteroid strike upon the Earth.

Below is a code snippet from a fossil C++ program listed in a book published in 1995. Notice the object-oriented programming technique of using a class specifier to define the data and methods() of objects instantiated from the class. Notice that the PurchasedPart class inherits code from the more generic Part class. In both C++ and Java, variables and methods that are declared private can only be used by a given object instance, while public methods can be called by other objects to cause an object to perform a certain function, so public methods are very similar to the functions that the cells in a multicellular organism perform when organic molecules bind to the membrane receptors of their cells. Later in this posting, we will describe in detail how multicellular organisms use this object-oriented approach to isolate functions.

class PurchasedPart : public Part {
      private:
            int partNumber;
            char description[20];
      public:
            PurchasedPart(int pNum, char* desc);
            PurchasedPart();
            void setPart(int pNum, char* desc);
            char* getDescription();
};

void main()
{
            PurchasedPart Nut(1, "Brass");
            Nut.setPart(1, "Copper");
}

Figure 33 – Cells in a growing embryo communicate with each other by sending out ligand molecules called paracrine factors that bind to membrane receptors on other cells.

Figure 34 – Calling a public method of an Object can initiate the execution of a cascade of private internal methods within the Object. Similarly, when a paracrine factor molecule plugs into a receptor on the surface of a cell, it can initiate a cascade of internal biochemical pathways. In the above figure, an Ag protein plugs into a BCR receptor and initiates a cascade of biochemical pathways or methods within a cell.

Like the geological eras, the Object-Oriented Period got a kick-start from an environmental hardware change. In the early 1990s, the Distributed Computing Revolution hit with full force, which spread computing processing over a number of servers and client PCs, rather than relying solely on mainframes to do all the processing. It began in the 1980s with the introduction of PCs into the office to do stand-alone things like word processing and spreadsheets. The PCs were also connected to mainframes as dumb terminals through emulator software as shown in Figure 26 above. In this architectural topology, the mainframes still did all the work and the PCs just displayed CICS green screens like dumb terminals. But this at least eliminated the need to have an IBM 3278 terminal and PC on a person’s desk, which would have left very little room for anything else! But this architecture wasted all the computing power of the rapidly evolving PCs, so the next step was to split the processing load between the PCs and a server. This was known as the 2-tier client/server or “thick client” architecture of Figure 35. In 2-tier client/server, the client PCs ran the software that displayed information in a GUI like Windows 3.0 and connected to a server running RDBMS (Relational Database Management System) software like Oracle or Sybase that stored the common data used by all the client PCs. This worked great so long as the number of PCs remained under about 30. We tried this at Amoco in the early 1990s, and it was like painting the Eiffel Tower. As soon as we got the 30th PC working, we had to go back and fix the first one! It was just too hard to keep the “thick client” software up and running on all those PCs with all the other software running on them that varied from machine to machine.

These problems were further complicated by the rise of computer viruses in the mid-1980s. Prior to the 2-tier client/server architecture, many office PCs were standalone machines, only connected to mainframes as dumb terminals, and thus totally isolated machines safe from computer virus infection. In the PC topology of the 1980s, computer viruses could only spread via floppy disks, which severely limited their infection rates. But once the 2-tier architecture fell into place, office PCs began to be connected together via LANs (Local Area Networks) and WANs (Wide Area Networks) to share data and other resources like printers. This provided a very friendly environment for computer viruses to quickly spread across an entire enterprise, so the other thing that office PCs began to share was computer viruses. Computer viruses are purely parasitic forms of software, which are more fully covered in postings on Self-Replicating Information and Software Symbiogenesis.

The limitations of the 2-tier architecture led to the 3-tier model in the mid to late 1990s with the advent of “middleware” as seen in Figure 35. Middleware is software that runs on servers between the RDBMS servers and the client PCs. In the 3-tier architecture, the client PCs run “thin client” software that primarily displays information via a GUI like Windows. The middleware handles all the business logic and relies on the RDBMS servers to store data.

Figure 35 – The Distributed Computing Revolution aided object-oriented architecture (click to enlarge)

In the late 1990s, the Internet exploded upon the business world and greatly enhanced the 3-tier model of Figure 35. The “thin client” running on PCs now became a web browser like Internet Explorer. Middleware containing business logic was run on Application servers that produced dynamic web pages that were dished up by Web servers like Apache. Data remained back on mainframes or RDBMS servers. Load balancers were also used to create clusters of servers that could scale load. As your processing load increased, all you had to do was buy more servers for each tier in the architecture to support the added load. This opened an ecological niche for the middleware software that ran on the Appserver tier of the architecture. At the time, people were coming up with all sorts of crazy ways to create dynamic HTML web pages on the fly. Some people were using Perl scripts, while others used C programs, but these all required a new process to be spawned each time a dynamic web page was created, and that was way too much overhead. Then Java came crashing down like a 10-kilometer wide asteroid! Java, Java, Java – that’s all we heard after it hit in 1995. Java was the first object-oriented programming language to take IT by storm. The syntax of Java was very nearly the same as C++, without all the nasty tricky things like pointers that made C++ and C so hard to deal with. C++ had evolved from C in the 1980s, and nearly all computer science majors had cut their programming teeth on C or C++ in school, so Java benefited from a large population of programmers familiar with the syntax. The end result was an impactor-style mass extinction of non-Java-based software on the distributed computing platform and the rapid rise of Java-based applications. Even Microsoft went Object-Oriented on the Windows server platform with its .NET Framework using its Java-like C# language. Procedural, non-object-oriented software, like COBOL, sought refuge in the mainframes where it still hides today.

Figure 36 – A modern multi-tier website topology (click to enlarge)

For more about software using complex carbon-based multicellular organization see Software Embryogenesis.

SOA - Service Oriented Architecture Period (2004 – 2015)
The next advance in software architecture came as the Service Oriented Architecture (SOA) Period, which was very similar to the Cambrian Explosion. During the Cambrian Explosion, 541 million years ago, complex body plans first evolved, which allowed cells in multicellular organisms to make RMI (Remote Method Invocation) and CORBA (Common Object Request Broker Architecture) calls upon the cells in remote organs to accomplish biological purposes. In the Service Oriented Architecture Period, we used common EJB components in J2EE appservers to create services that allowed for Applications with complex body plans. The J2EE appservers performed the functions of organs like kidneys, lungs and livers. I am discounting the original appearance of CORBA in 1991 here as a failed precursor because CORBA never became as ubiquitous as EJB later became. In the evolution of any form of self-replicating information, there are frequently many failed precursors leading up to a revolution in technology.

There is a growing body of evidence beginning to support the geological "Snowball Earth" hypothesis that the Earth went through a period of 100 million years of extreme climatic fluctuations just prior to the Cambrian Explosion. During this period, the Earth seesawed between being completely covered with a thick layer of ice and being a hothouse with a mean temperature of 140 °F. Snowball Earth (2003) by Gabrielle Walker is an excellent book covering the struggles of Paul Hoffman, Joe Kirschvink, and Dan Schrag to uncover the evidence for this dramatic discovery and to convince the geological community of its validity. It has been suggested that the resulting stress on the Earth's ecosystems sparked the Cambrian Explosion. As we saw above, for the great bulk of geological time, the Earth was dominated by simple single-celled organisms. The nagging question for evolutionary biology has always been why did it take several billion years for complex multicellular life to arise, and why did it arise all at once in such a brief period of geological time? As a field geologist works up from pre-Cambrian to Cambrian strata, suddenly the rocks burst forth with complex fossils where none existed before. For many, the first appearance of complex life just following the climatic upheaval of the Snowball Earth is compelling evidence that these two very unique incidents in the Earth’s history must be related.

Similarly for IT, the nagging question is why did it take until the first decade of the 21st century for the SOA Cambrian Explosion to take place when the first early precursors can be found as far back as the mid-1960s? After all, software based upon multicellular organization, also known as object-oriented software, goes all the way back to the object-oriented language Simula developed in 1965, and the ability for objects (cells) to communicate between CPUs arose with CORBA in 1991. So all the precursors were in place nearly 15 years earlier, yet software based upon a complex multicellular architecture languished until it was jarred into existence by a series of harsh environmental shocks to the IT community. It was the combination of moving off the mainframes to a distributed hardware platform, running on a large number of servers and client PCs, the shock of the Internet upon the business world and IT, and the impact of Sun’s Java programming language, that ultimately spawned the SOA (Service Oriented Architecture) Cambrian Explosion. These shocks all occurred within a few years of each other in the 1990s, and after the dust settled, IT found itself in a new world of complexity.

To see how this works, let’s examine more closely the inner workings of a J2EE Appserver. Figure 37 shows the interior of a J2EE Appserver like WebSphere. The WebSphere middleware is software that runs on a Unix server that might host 30 or more WebSphere Appserver instances, and there might be many physical Unix servers running these WebSphere Appserver instances in a Cell (Tier). Figure 36 shows a Cell (Tier 2) consisting of two physical Application servers or nodes, but there could easily be 4 or 5 physical Unix servers or nodes in a WebSphere Cell. This allows WebSphere to scale: as your load increases, you just add more physical Unix servers or nodes to the Cell. So each physical Unix server in a WebSphere Cell contains a number of software Appserver instances as shown in Figure 37, and each Appserver contains a number of WebSphere Applications that do things like creating dynamic web pages for a web-based application. For example, on the far left of Figure 37, we see a client PC running a web browser like Chrome. The web browser makes HTTP requests to an HTTP webserver like Apache. If the Apache webserver can find the requested HTML page, like a login page, it returns that static HTML page to the browser for the end-user to fill in his ID and PASSWORD. The user’s ID and PASSWORD are then returned to the Apache webserver when the SUBMIT button is pressed, but now the Apache webserver must come up with an HTML page that is specific to the user’s ID and PASSWORD, like a web page with the end-user’s account information. That is accomplished by having Apache forward the request to a WebSphere Application running in one of the WebSphere Appservers. The WebSphere Appserver has two software containers that perform the functions of an organ in a multicellular organism. The Web Container contains instances of servlets and JSPs (Java Server Pages). A servlet is a Java program that contains logic to control the generation of a dynamic web page. JSPs are HTML pages with tags for embedded programming logic that are compiled into servlets at execution time. The servlets in the Web Container create objects and are run in a thread pool in the Web Container, like the cells in a liver or kidney. Unlike the mainframe processing of the Unstructured Period, in which a program was loaded into memory, run, and then perished, these servlets remain in memory and are continuously reused by the thread pool to service additional requests, until no further requests arrive and the servlet is destroyed to make room for another servlet in the thread pool. The EJB Container performs a similar function by running EJBs (Enterprise Java Beans) in a thread pool. The EJBs provide business logic and connect to databases (DB) and mainframes (EIS – Enterprise Information Systems). By keeping the servlets and EJBs running continuously in memory, with permanent connections to databases and mainframes via connection pools, the overhead of loading and releasing the servlets is eliminated, as well as the creation and tear-down of connections to databases and mainframes. So the Web and EJB Containers of a J2EE Appserver are very much like the cells in an organ that continuously provide services for the other cells of a multicellular organism. Look at it this way: unlike a simple single-celled organism that is born, lives, and dies, your body consists of 100 trillion cells and each day about a trillion cells die and are replaced by a trillion new cells, but through it all you keep going.
A simple single-celled organism is like a batch program from the Unstructured Period, while your body runs on a SOA architecture of trillions of cells in thread and connection pools that are constantly coming and going, with millions of objects constantly being created (instantiated), used, and later destroyed.
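
For readers who have never seen one, here is a minimal sketch of the kind of servlet that lives in the Web Container's thread pool, assuming the classic javax.servlet J2EE API. The class name AccountServlet and the ID parameter are just made-up examples, and a real Application servlet would, of course, call EJBs for its business logic rather than hard-coding the HTML.

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A minimal servlet that the Web Container keeps in memory and reuses
// from its thread pool to generate a dynamic web page for each request.
public class AccountServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        String userId = request.getParameter("ID");   // from the login form
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        out.println("<h1>Account information for " + userId + "</h1>");
        // In a real Application, an EJB would be called here to fetch the
        // account data from a database (DB) or mainframe (EIS).
        out.println("</body></html>");
    }
}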

Figure 37 - A J2EE Application Server contains a WEB Container that stores pools of Servlet Objects and an EJB Container that stores pools of EJB Objects. The EJB Objects get data from relational databases (DB), process the data, and then pass the information to Servlet Objects. The Servlet Objects generate HTML based upon the data processed by the EJB Objects and pass the HTML to HTTP webservers like Apache.

For more about complex multicellular software built on SOA architecture see Software Embryogenesis.

Design Patterns – the Phyla of IT
Another outgrowth of the object-oriented programming revolution was the adoption of design patterns by IT. Design patterns originated as an architectural concept developed by Christopher Alexander in the 1960s. In Notes on the Synthesis of Form (1964), Alexander noted that all architectural forms are really just implementations of a small set of classic design patterns that have withstood the test of time in the real world of human affairs and that have been blessed by the architectural community throughout history for both beauty and practicality. Basically, given the physical laws of the Universe and the morphology of the human body, there are really only a certain number of ways of doing things from an architectural point of view that work in practice, so by trial and error architects learned to follow a set of well-established architectural patterns. In 1987, Kent Beck and Ward Cunningham began experimenting with the idea of applying the concept of design patterns to programming and presented their results at the object-oriented OOPSLA conference that year. Design patterns gained further popularity in computer science after the book Design Patterns: Elements of Reusable Object-Oriented Software was published in 1994 by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Also in 1994, the first Pattern Languages of Programming Conference was held, and in 1995 the Portland Pattern Repository was established to document design patterns for general IT usage.

However, the concept of design patterns goes back much further than this. In biology, a design pattern is called a phylum, which is a basic body plan. For example, the phylum Arthropoda consists of all body plans that use an external skeleton, such as the insects and crabs, and the Echinodermata have a five-fold radial symmetry, like a starfish. Similarly, the phylum Chordata consists of all body plans that have a large dorsal nerve running down a hollow backbone or spinal column. The Cambrian Explosion, 541 million years ago, brought about the first appearance of a large number of phyla or body plans on Earth. In fact, all of the roughly 35 phyla found on the Earth today can trace their roots back to the Cambrian, and it even appears that some of the early Cambrian phyla have gone completely extinct, judging by some of the truly bizarre-looking fossils that have been found in the Burgess Shale of that highly experimental period.

In IT a design pattern describes a certain design motif or way of doing things. A design pattern is a prototypical design architecture that developers can copy and adapt for their particular application to solve the general problem described by the design pattern. This is in recognition of the fact that at any given time there are only a limited number of IT problems that need to be solved at the application level, and it makes sense to apply a general design pattern rather than to reinvent the wheel each time. Developers can use a design pattern by simply adopting the common structure and organization of the design pattern for their particular application, just as living things adopt an overall body plan or phylum to solve the basic problems of existence. In addition, design patterns allow developers to communicate with each other using well-known and well-understood names for software interactions, just as biologists can communicate with each other by using the well-known taxonomic system of classification developed by Carl Linnaeus in Systema Naturae published in 1735.
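
As a concrete illustration, here is a sketch of one of the best-known patterns from the Gang-of-Four catalog, the Singleton, which guarantees that a class has exactly one instance - for example, a single connection pool shared by every servlet in an Appserver. The class name and the checkout() method are hypothetical placeholders; only the structure is the pattern.

```java
// A sketch of the Singleton design pattern from the Gang-of-Four catalog:
// the structure guarantees that exactly one instance of a class exists,
// for example a single shared connection pool. The ConnectionPool name and
// the checkout() method are hypothetical placeholders.
public class ConnectionPool {

    // The one and only instance, created when the class is loaded.
    private static final ConnectionPool INSTANCE = new ConnectionPool();

    // A private constructor prevents any other code from creating instances.
    private ConnectionPool() { }

    // The well-known access point that all callers share.
    public static ConnectionPool getInstance() {
        return INSTANCE;
    }

    // A placeholder for whatever the pooled resource actually hands out.
    public String checkout() {
        return "a pooled connection";
    }
}
```

Because the pattern has a well-known name, one developer can simply tell another to "make the connection pool a Singleton" and be understood immediately, just as a biologist can say "chordate" and convey an entire body plan.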

A design pattern that all Internet users should be quite familiar with is the Model-View-Controller (MVC) design pattern used by most web applications. Suppose you are placing an order with Amazon. The Model is the data that comprises your Amazon account information, such as your credit card number on file and your mailing address, together with all the items in your shopping cart. In Figure 37 above, the Model is stored as a series of relational database tables on a relational database server (DB), such as an Oracle server, or back on a mainframe in an EIS (Enterprise Information System) connected to a mainframe DB2 database. The View is the series of webpages presented to your browser as .html pages that convey the Model data to you in a sensible form as you go about your purchase. These View .html pages are generated by JSPs (Java Server Pages) in the Web Container of the J2EE Appserver. The Controller is a servlet, a Java program running in a thread pool in the Web Container of the J2EE Appserver, that performs the overall control of your interactions with the Amazon application as you go about placing your order. The Controller servlet calls JSPs and instantiates objects (cells) that call EJB objects (cells) in the EJB Container of the J2EE Appserver, which interact with the relational database tables storing your data.
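
A minimal sketch of the Controller half of the MVC pattern might look like the following. The class name, the OrderService helper and the checkout.jsp view are hypothetical stand-ins for the EJB and JSP layers of Figure 37; only the shape of the pattern is the point.

```java
// A sketch of an MVC Controller servlet in the J2EE-era javax.servlet API.
// The servlet gathers Model data (here a hypothetical OrderService standing in
// for the EJB layer), stores it in the request, and forwards to a JSP View.
import java.io.IOException;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class CheckoutController extends HttpServlet {

    // A hypothetical stand-in for an EJB in the EJB Container that reads the
    // DB or EIS tables holding the Model data.
    static class OrderService {
        static String loadCart(String customerId) {
            return "3 items in the cart of customer " + customerId;
        }
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String customerId = request.getParameter("customerId");

        // Model: fetch the customer's data and attach it to the request.
        String cart = OrderService.loadCart(customerId);
        request.setAttribute("cart", cart);

        // View: hand the Model data to a JSP, which generates the HTML page.
        RequestDispatcher view = request.getRequestDispatcher("/checkout.jsp");
        view.forward(request, response);
    }
}
```

The design choice is the same separation Figure 37 shows: the Controller servlet never generates HTML itself, and the JSP View never touches the database.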

During the first decade of the 21st century, the Service Oriented Architecture rapidly expanded in the IT community and began to spread beyond the traditional confines of corporate datacenters, as corporations began to make services available to business partners over the Internet. With the flexibility of the Service Oriented Architecture and the Internet, we began to see an integrated service-oriented ecology form - a web of available services like the web of life in a rain forest. Today, we call that rain forest ecology of shared software services over the Internet the Cloud Microservices Platform.

Cloud Computing and the Rise of Today's Cloud Microservices
The age of Cloud Microservices marks the latest period of software evolution. Cloud Computing allows developers to spend less time struggling with the complexities of the Distributed Computing Platform that first arose in the 1990s. Cloud Microservices allow developers to build new applications by stitching together Cloud-based Microservices running in Cloud containers, and this seems to be the next wave of the future for IT. Microservices are another emerging Cloud computing technology that extends our experience with SOA. SOA (Service Oriented Architecture) arrived in 2004. With SOA, people started to introduce common services in the Middleware layer of the three-tier Distributed Computing Model. SOA allowed other Middleware application components to call a set of common SOA services for data, which eliminated the need for each application to reinvent the wheel for many common application data needs. Cloud Microservices take this one step further. Instead of SOA services running on bare-metal Unix servers, Cloud Microservices run in Cloud Containers, and each Microservice provides a very primitive function. By using a large number of Cloud Microservices running in Cloud Containers, it is now possible to quickly throw together a new application and push it into Production.
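
To give a feel for how primitive a single Microservice can be, here is a minimal sketch of one in Java. It uses the JDK's built-in com.sun.net.httpserver classes as a stand-in for whatever Cloud container the service would really run in; the /zipcode-to-state endpoint and its hard-coded logic are hypothetical.

```java
// A minimal sketch of a "very primitive" microservice: one endpoint that does
// one small job. The JDK's built-in com.sun.net.httpserver classes stand in
// for a real Cloud container; the endpoint and its logic are hypothetical.
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

public class ZipcodeMicroservice {
    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        // One primitive function per microservice: map a query string like
        // "zip=60601" to a two-letter state code (hard-coded for this sketch).
        server.createContext("/zipcode-to-state", (HttpExchange exchange) -> {
            String query = exchange.getRequestURI().getQuery();
            String state = (query != null && query.startsWith("zip=6")) ? "IL" : "UNKNOWN";
            byte[] body = ("{\"state\":\"" + state + "\"}").getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        System.out.println("Microservice listening on http://localhost:8080/zipcode-to-state");
    }
}
```

Compiling and running this single class and then fetching http://localhost:8080/zipcode-to-state?zip=60601 in a browser returns a one-field JSON answer - about as primitive as a service can get, which is exactly the point: a new application is just a choreography of many such small parts.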

So before concluding, I would like to relay some of my experiences with the power of something like Cloud Microservices. I left Amoco in 1999 when BP bought Amoco and terminated most of Amoco's IT Department. For more on that see Hierarchiology and the Phenomenon of Self-Organizing Organizational Collapse. I then joined the IT Department of United Airlines working on the CIDB - Customer Interaction Data Base. The CIDB initially consisted of 10 C++ Tuxedo services running in a Tuxedo Domain on Unix servers. Tuxedo (Transactions Under Unix) was an early form of Middleware software developed in the 1980s to create a TPM (Transaction Processing Monitor) running under Unix that could perform the same kind of secured transaction processing that IBM's CICS (1968) provided on IBM MVS mainframes. The original 10 Tuxedo services allowed United's business applications and the www.united.com website to access the data stored on the CIDB Oracle database. We soon found that Tuxedo was very durable and robust. You could literally throw Tuxedo down the stairs without a dent! A Tuxedo Domain was very much like a Cloud Container. When you booted up a Tuxedo Domain, a number of virtual Tuxedo servers were brought up. We had each virtual Tuxedo server run just one primitive service. The Tuxedo Domain had a configuration file that allowed us to define each of the Tuxedo servers and the service that ran in it. For example, we could configure the Tuxedo Domain so that a minimum of 1 and a maximum of 10 instances of Tuxedo Server-A were brought up. So initially, only a single instance of Tuxedo Server-A would come up to receive traffic. There was a Tuxedo queue of incoming transactions that were fed to the Tuxedo Domain. If the first instance of Tuxedo Server-A was found to be busy, a second instance of Tuxedo Server-A would be automatically cranked up. The number of Tuxedo Server-A instances would then dynamically change as the Tuxedo load varied. Like most object-oriented code, the C++ code for our Tuxedo services had memory leaks, but that was not a problem for us. When one of the instances of Tuxedo Server-A ran out of memory, it would simply die and another instance of Tuxedo Server-A would be cranked up by Tuxedo. We could even change the maximum number of running Tuxedo Server-A instances on the fly without having to reboot the Tuxedo Domain.
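
The scaling behavior described above - a minimum of 1 and a maximum of 10 server instances, with extra instances spun up only when the existing ones are busy - is roughly what Java's ThreadPoolExecutor does when it is given a SynchronousQueue handoff. The sketch below is only an analogy of that policy, not Tuxedo code; the class name and the simulated burst of requests are hypothetical.

```java
// An analogy (not Tuxedo code): Java's ThreadPoolExecutor reproduces the
// Tuxedo Domain policy described above - keep a minimum of one worker alive,
// spin up additional workers (up to ten) only when the existing ones are busy,
// and let the extra workers expire again once the load drops off.
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ServerAPoolAnalogy {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor serverA = new ThreadPoolExecutor(
                1,                      // MIN: one "Server-A" instance is kept alive once created
                10,                     // MAX: at most ten instances
                60, TimeUnit.SECONDS,   // idle instances above the minimum are retired
                new SynchronousQueue<>(),                    // hand work to a free instance or grow the pool
                new ThreadPoolExecutor.CallerRunsPolicy());  // crude back-pressure when all ten are busy

        // Simulate a burst of incoming requests arriving on the "Tuxedo queue".
        for (int i = 1; i <= 25; i++) {
            final int requestId = i;
            serverA.execute(() -> {
                System.out.println("request " + requestId + " handled by "
                        + Thread.currentThread().getName());
                try {
                    Thread.sleep(200);   // pretend to do some work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        serverA.shutdown();
        serverA.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```

The analogy is loose - Tuxedo spawned whole Unix processes, not threads - but the scaling policy is the same: grow only under load, shrink when idle, and never drop below the configured minimum.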

United Airlines found the CIDB Tuxedo Domain to be so useful that we began to write large numbers of Tuxedo services. For example, we wrote many Tuxedo services that interacted with United's famous Apollo reservation system that first appeared in 1971, and also with many other United applications and databases. Soon United began to develop new applications that simply called many of our Tuxedo Microservices. We tried to keep our Tuxedo Microservices very atomic and simple. Rather than provide our client applications with an entire engine, we provided them with the parts for an engine, like engine blocks, pistons, crankshafts, water pumps, distributors, induction coils, intake manifolds, carburetors and alternators.

One day in 2002 this came in very handy. My boss called me into his office at 9:00 AM one morning and explained that United Marketing had come up with a new promotional campaign called "Fly Three - Fly Free". The "Fly Three - Fly Free" campaign worked like this. If a United customer flew three flights in one month, they would get an additional future flight for free. All the customer had to do was to register for the program on the www.united.com website. In fact, United Marketing had actually begun running ads in all of the major newspapers about the program that very day. The problem was that nobody in Marketing had told IT about the program, and the www.united.com website did not have the software needed to register customers for the program. I was then sent to an emergency meeting of the Application Development team that supported the www.united.com website. According to the ads running in the newspapers, the "Fly Three - Fly Free" program was supposed to start at midnight, so we had less than 15 hours to design, develop, test and implement the necessary software for the www.united.com website! Amazingly, we were able to do this by having the www.united.com website call a number of our primitive Tuxedo Microservices that interacted with the Apollo reservation system.

The use of many primitive Microservices is also widespread in carbon-based life on this planet. In Facilitated Variation and the Utilization of Reusable Code by Carbon-Based Life, I showcased the theory of facilitated variation that Marc W. Kirschner and John C. Gerhart present in The Plausibility of Life (2005). The theory of facilitated variation maintains that, although the concepts and mechanisms of Darwin's natural selection are well understood, the mechanisms that brought forth viable biological innovations in the past are a bit wanting in classical Darwinian thought. In classical Darwinian thought, it is proposed that random genetic changes, brought on by random mutations to DNA sequences, can very infrequently cause small incremental enhancements to the survivability of the individual, and thus provide natural selection with something of value to promote in the general gene pool of a species. Again, as frequently cited, most random genetic mutations are either totally inconsequential or totally fatal in nature, and consequently, are either totally irrelevant to the gene pool of a species or are quickly removed from it. The theory of facilitated variation, like classical Darwinian thought, maintains that the phenotype of an individual is key, and not so much its genotype, since natural selection can only operate upon phenotypes. The theory explains that the phenotype of an individual is determined by a number of 'constrained' and 'deconstrained' elements. The constrained elements are called the "conserved core processes" of living things. They have remained essentially unchanged for billions of years and are used by all living things to sustain the fundamental functions of carbon-based life, like the generation of proteins from the information stored in DNA sequences using mRNA, tRNA and ribosomes, or the metabolism of carbohydrates via the Krebs cycle. The deconstrained elements are weakly-linked regulatory processes that can change the amount, location and timing of gene expression within a body, and which can therefore easily control which conserved core processes a cell runs and when it runs them. The theory of facilitated variation maintains that most favorable biological innovations arise from minor mutations to these deconstrained, weakly-linked regulatory processes that control the conserved core processes of life, rather than from random mutations to the genotype in general, which would push the phenotype in a purely random direction. That is because the most likely outcome for an individual undergoing a random mutation to its genotype is death.

Marc W. Kirschner and John C. Gerhart begin by presenting the fact that simple prokaryotic bacteria, like E. coli, require a full 4,600 genes just to sustain the most rudimentary form of bacterial life, while much more complex multicellular organisms, like human beings, consisting of tens of trillions of cells differentiated into hundreds of differing cell types in the numerous complex organs of a body, require only a mere 22,500 genes to construct. The baffling question is, how is it possible to construct a human being with just under five times the number of genes as a simple single-celled E. coli bacterium? The authors contend that it is only possible for carbon-based life to do so by heavily relying upon reusable code in the genome of complex forms of carbon-based life.
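
To make the reusable-code analogy concrete, here is a toy sketch entirely of my own invention: a small "regulatory" layer that only decides which routines from a fixed library of "conserved core" functions run in which cell type. All names and logic are hypothetical; the point is simply that a thin layer of weakly-linked control code can orchestrate a much larger body of unchanged core code, which is how a modest number of genes can build a very complex body.

```java
// A toy analogy for facilitated variation (all names and logic are hypothetical):
// the "conserved core processes" are a fixed library of routines that never
// change, while the "regulatory" layer is a small amount of weakly-linked code
// that only decides which core routines run, where, and when. Most useful
// variation comes from editing the small regulatory table, not the core.
import java.util.List;
import java.util.Map;

public class FacilitatedVariationAnalogy {

    // Conserved core processes: large, ancient, and untouched by "mutations".
    static void transcribeAndTranslate(String cell) { System.out.println(cell + ": build proteins"); }
    static void runKrebsCycle(String cell)          { System.out.println(cell + ": metabolize sugars"); }
    static void buildMembrane(String cell)          { System.out.println(cell + ": build membrane"); }

    // Regulatory layer: a tiny table saying which core processes each cell type
    // runs. Changing one entry here changes the "phenotype" without touching
    // the conserved core code at all.
    static final Map<String, List<Runnable>> REGULATION = Map.of(
            "liver cell", List.<Runnable>of(
                    () -> transcribeAndTranslate("liver cell"),
                    () -> runKrebsCycle("liver cell")),
            "neuron", List.<Runnable>of(
                    () -> transcribeAndTranslate("neuron"),
                    () -> buildMembrane("neuron")));

    public static void main(String[] args) {
        REGULATION.forEach((cellType, processes) -> processes.forEach(Runnable::run));
    }
}
```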

Figure 38 – A simple single-celled E. coli bacterium is constructed using a full 4,600 genes.

Figure 39 – However, a human being, consisting of about 100 trillion cells that are differentiated into the hundreds of differing cell types used to form the organs of the human body, uses a mere 22,500 genes to construct a very complex body, which is just slightly under five times the number of genes used by simple E. coli bacteria to construct a single cell. How is it possible to explain this huge dynamic range of carbon-based life? Marc W. Kirschner and John C. Gerhart maintain that, like complex software, carbon-based life must heavily rely on the microservices of reusable code.

Conclusion
The rise of the complex eukaryotic cell architecture on this planet was certainly a major Filter that had to be passed in order for human beings to now be on the verge of creating the ASI Machines that will then go on to explore and settle the rest of our galaxy. Yes, there were many other Filters that had to be overcome. However, perhaps this Filter was the most significant because it took several billion years to overcome. The sad fact is that most planets in our galaxy simply do not have several billion years of time to squander. In Could the Galactic Scarcity of Software Simply be a Matter of Bad Luck? we covered Professor Toby Tyrrell's computer-simulation research on 100,000 Earth-like planets, which suggests that our Earth may be a very rare "hole in one" planet that was able to maintain a habitable surface temperature for four billion years by sheer luck.

Figure 40 – Toby Tyrrell's computer simulation of 100,000 Earth-like planets suggests that the Earth may be a "hole in one" planet proudly sitting on a fireplace mantel.

Figure 41 – Perhaps nearly all of the potentially hospitable exoplanets that we are finding in our galaxy are not durable and cannot go the distance of staying habitable for billions of years.

My suggestion is that further study of the complex interplay between the evolution of computer hardware and computer software over the past 2.65 billion seconds might shed further light on the complex interplay between the evolution of the carbon-based hardware and software that brought forth complex carbon-based life on this planet.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston