Wednesday, November 22, 2017

Introduction to Softwarephysics

Softwarephysics is a simulated science for the simulated Software Universe that we are all immersed in. It is an approach to software development, maintenance and support based upon concepts from physics, chemistry, biology, and geology that I have been using on a daily basis for over 35 years as an IT professional. For those of you not in the business, IT is short for Information Technology, commercial computer science. The original purpose of softwarephysics was to explain why IT was so difficult, to suggest possible remedies, and to provide a direction for thought. Since then softwarephysics has taken on a larger scope, as it became apparent that softwarephysics could also assist the physical sciences with some of the Big Problems that they are currently having difficulties with. So if you are an IT professional, general computer user, or simply an individual interested in computer science, physics, chemistry, biology, or geology then softwarephysics might be of interest to you, if not in an entirely serious manner, perhaps at least in an entertaining one.

The Origin of Softwarephysics
From 1975 to 1979, I was an exploration geophysicist exploring for oil, first with Shell, and then with Amoco. In 1979, I made a career change into IT, and spent about 20 years in development. For the past 17 years, I have been in IT operations, supporting middleware on WebSphere, JBoss, Tomcat, and ColdFusion. When I first transitioned into IT from geophysics, I figured that if you could apply physics to geology, why not apply physics to software? So like the exploration team at Amoco that I had just left, consisting of geologists, geophysicists, paleontologists, geochemists, and petrophysicists, I decided to take all the physics, chemistry, biology, and geology that I could muster and throw it at the problem of software. The basic idea was that many concepts in physics, chemistry, biology, and geology suggested to me that the IT community had accidentally created a pretty decent computer simulation of the physical Universe on a grand scale, a Software Universe so to speak, and that I could use this fantastic simulation in reverse, to better understand the behavior of commercial software, by comparing software to how things behaved in the physical Universe. Softwarephysics depicts software as a virtual substance, and relies upon our understanding of the current theories in physics, chemistry, biology, and geology to help us model the nature of software behavior. So in physics we use software to simulate the behavior of the Universe, while in softwarephysics we use the Universe to simulate the behavior of software. Along these lines, we use the Equivalence Conjecture of Softwarephysics as an aid; it allows us to shift back and forth between the Software Universe and the physical Universe, and hopefully to learn something about one by examining the other:

The Equivalence Conjecture of Softwarephysics
Over the past 75 years, through the uncoordinated efforts of over 50 million independently acting programmers to provide the world with a global supply of software, the IT community has accidentally spent more than $10 trillion creating a computer simulation of the physical Universe on a grand scale – the Software Universe.

For more on the origin of softwarephysics please see Some Thoughts on the Origin of Softwarephysics and Its Application Beyond IT.

Logical Positivism and Effective Theories
Many IT professionals have a difficult time with softwarephysics because they think of physics as being limited to the study of real things like electrons and photons, and since software is not “real”, how can you possibly apply concepts from physics and the other sciences to software? To address this issue, softwarephysics draws heavily upon two concepts from physics that have served physics quite well over the past century – the concept of logical positivism and the concept of effective theories. This was not always the case. In the 17th, 18th, and 19th centuries, physicists mistakenly thought that they were actually discovering the fundamental laws of the Universe, which they thought were based upon real tangible things like particles, waves, and fields. Classical Newtonian mechanics (1687), thermodynamics (1850), and classical electrodynamics (1864) did a wonderful job of describing the everyday world at the close of the 19th century, but early in the 20th century it became apparent that the models upon which these very successful theories were based did not work very well for small things like atoms or for objects moving at high velocities or in strong gravitational fields. This provoked a rather profound philosophical crisis within physics at the turn of the century, as physicists worried that perhaps 300 years of work was about to go down the drain. The problem was that classical physicists confused their models of reality with reality itself, and when their classical models began to falter, their confidence in physics began to falter as well. This philosophical crisis was resolved with the adoption of the concepts of logical positivism and some new effective theories in physics. Quantum mechanics (1926) was developed for small things like atoms, the special theory of relativity (1905) was developed for objects moving at high velocities and the general theory of relativity (1915) was developed for objects moving in strong gravitational fields.

Logical positivism, usually abbreviated simply to positivism, is an enhanced form of empiricism, in which we do not care about how things “really” are; we are only interested in how things are observed to behave. With positivism, physicists only seek out models of reality - not reality itself. When we study quantum mechanics, we will find that the concept of reality gets rather murky in physics anyway, so this is not as great a loss as it might at first seem. By concentrating on how things are observed to behave, rather than on what things “really” are, we avoid the conundrum faced by the classical physicists. In retrospect, this idea really goes all the way back to the very foundations of physics. In Newton’s Principia (1687) he outlined Newtonian mechanics and his theory of gravitation, which held that the gravitational force between two objects was proportional to the product of their masses divided by the square of the distance between them. Newton knew that he was going to take some philosophical flak for proposing a mysterious force between objects that could reach out across the vast depths of space with no apparent mechanism, so he took a very positivistic position on the matter with the famous words:

I have not as yet been able to discover the reason for these properties of gravity from phenomena, and I do not feign hypotheses. For whatever is not deduced from the phenomena must be called a hypothesis; and hypotheses, whether metaphysical or physical, or based on occult qualities, or mechanical, have no place in experimental philosophy. In this philosophy particular propositions are inferred from the phenomena, and afterwards rendered general by induction.

Instead, Newton focused on how things were observed to move under the influence of his law of gravitational attraction, without worrying about what gravity “really” was.
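In modern notation, the law Newton was describing in words can be written compactly as

$$F = G\,\frac{m_1 m_2}{r^2}$$

where m1 and m2 are the two masses, r is the distance between their centers, and G is the gravitational constant, a proportionality factor that was only pinned down much later by Henry Cavendish's experiment of 1798.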

The second concept, that of effective theories, is an extension of positivism. An effective theory is an approximation of reality that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand. For example, Newtonian mechanics is an effective theory that makes very good predictions for the behavior of objects moving at less than 10% of the speed of light and that are bigger than a very small grain of dust. These limits define the effective range over which Newtonian mechanics can be applied to solve problems. For very small things we must use quantum mechanics, for very fast things we must use the special theory of relativity, and for things in strong gravitational fields we must use the general theory of relativity. So all of the current theories of physics, such as Newtonian mechanics, Newtonian gravity, classical electrodynamics, thermodynamics, statistical mechanics, the special and general theories of relativity, quantum mechanics, and the quantum field theories of QED and QCD are effective theories that are based upon models of reality, and all these models are approximations - all these models are fundamentally "wrong", but at the same time, these effective theories make exceedingly good predictions of the behavior of physical systems over the limited ranges in which they apply. That is the goal of softwarephysics – to provide an effective theory of software behavior that makes useful predictions applicable to the day-to-day activities of IT professionals. So in softwarephysics, we adopt a very positivistic viewpoint of software; we do not care what software “really is”, we only care about how software is observed to behave and try to model those behaviors with an effective theory of software behavior that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand.
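To make the "10% of the speed of light" limit a bit more concrete, here is a small back-of-the-envelope sketch in Python (my own illustration, not part of the original softwarephysics postings) that computes the relativistic Lorentz factor at a few speeds and shows just how tiny the correction to Newtonian mechanics is below that limit:

```python
# Quick check of the "10% of the speed of light" rule of thumb quoted above.
# The Lorentz factor gamma measures how much special relativity corrects
# Newtonian time and momentum; gamma = 1 means no correction is needed.
import math

def lorentz_gamma(v_fraction_of_c):
    """Lorentz factor for a speed given as a fraction of the speed of light."""
    return 1.0 / math.sqrt(1.0 - v_fraction_of_c ** 2)

for v in (0.01, 0.10, 0.50, 0.90, 0.99):
    gamma = lorentz_gamma(v)
    print(f"v = {v:4.2f} c   gamma = {gamma:7.4f}   correction = {100 * (gamma - 1):6.2f}%")

# At 10% of the speed of light the correction is only about 0.5%, which is why
# Newtonian mechanics remains a perfectly good effective theory below that speed,
# while at 90% of the speed of light the correction grows to more than 100%.
```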

GPS satellites provide a very good example of positivism and effective theories at work. There are currently 31 GPS satellites orbiting at an altitude of 12,600 miles above the Earth, and each contains a very accurate atomic clock. The signals from the GPS satellites travel to your GPS unit at the speed of light, so by knowing the travel time of the signals from at least 4 of the GPS satellites, it is possible to determine your position on Earth very accurately. In order to do that, it is very important to have very accurate timing measurements. Newtonian mechanics is used to launch the GPS satellites to an altitude of 12,600 miles and to keep them properly positioned in orbit. Classical electrodynamics is then used to beam the GPS signals back down to Earth to the GPS unit in your car. Quantum mechanics is used to build the transistors on the chips on board the GPS satellites and to understand the quantum tunneling of electrons in the flash memory chips used to store GPS data on the satellites. The special theory of relativity predicts that the onboard atomic clocks on the GPS satellites will run slower and lose about 7.2 microseconds per day due to their high velocities relative to an observer on the Earth. But at the same time, the general theory of relativity also predicts that because the GPS satellites are further from the center of the Earth and in a weaker gravitational field, where spacetime is less deformed than on the surface of the Earth, their atomic clocks also run faster and gain 45.9 microseconds per day. The net effect is a gain of 38.7 microseconds per day, so the GPS satellite atomic clocks have to be purposefully built to run slow by 38.7 microseconds per day before they are launched, so that they will keep in sync with clocks on the surface of the Earth. If this correction were not made, an error in your position of roughly 10 kilometers per day would accrue. The end result of the combination of all these fundamentally flawed effective theories is that it is possible to pinpoint your location on Earth to an accuracy of 16 feet or better for as little as $100. But physics has done even better than that with its fundamentally flawed effective theories. By combining the effective theories of special relativity (1905) with quantum mechanics (1926), physicists were able to produce a new effective theory for the behavior of electrons and photons called quantum electrodynamics, or QED (1948), which was able to predict the gyromagnetic ratio of the electron, a measure of its intrinsic magnetic field, to an accuracy of 11 decimal places. As Richard Feynman has pointed out, this was like predicting the exact distance between New York and Los Angeles accurate to the width of a human hair!
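If you want to check those clock corrections for yourself, here is a short Python sketch (my own back-of-the-envelope calculation, using standard textbook weak-field approximations and round-number constants, so the results land close to, but not exactly on, the figures quoted above):

```python
# A rough check of the GPS clock corrections quoted above, using standard
# textbook approximations (weak gravitational field, circular orbit). The
# published figures differ slightly because of orbital details and rounding.
import math

G_M   = 3.986004e14      # Earth's gravitational parameter GM (m^3/s^2)
c     = 2.99792458e8     # speed of light (m/s)
R_e   = 6.371e6          # mean radius of the Earth (m)
R_gps = 2.6571e7         # GPS orbital radius: ~20,200 km altitude + Earth's radius (m)
day   = 86400.0          # seconds per day

v = math.sqrt(G_M / R_gps)                         # circular orbital speed, ~3.87 km/s

# Special relativity: a moving clock runs slow by roughly v^2 / (2 c^2)
sr_loss_per_day = (v**2 / (2 * c**2)) * day

# General relativity: a clock higher in the Earth's gravity well runs fast
gr_gain_per_day = (G_M / c**2) * (1/R_e - 1/R_gps) * day

net_gain = gr_gain_per_day - sr_loss_per_day

print(f"SR loss : {sr_loss_per_day*1e6:5.1f} microseconds/day")      # ~7.2
print(f"GR gain : {gr_gain_per_day*1e6:5.1f} microseconds/day")      # ~45.7
print(f"Net gain: {net_gain*1e6:5.1f} microseconds/day")             # ~38.5
print(f"Range error if uncorrected: {net_gain*c/1000:4.1f} km/day")  # ~11.5
```

Multiplying the uncorrected clock error by the speed of light is what turns a few dozen microseconds per day into a positional error measured in kilometers per day.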

So Newtonian mechanics makes great predictions for the macroscopic behavior of GPS satellites, but it does not work very well for small things like the behavior of individual electrons within transistors, where quantum mechanics is required, or for things moving at high speeds or in strong gravitational fields where relativity theory must be applied. And all three of these effective theories are based upon completely contradictory models. General relativity maintains that spacetime is curved by matter and energy, but that matter and energy are continuous, while quantum mechanics maintains that spacetime is flat, but that matter and energy are quantized into chunks. Newtonian mechanics simply states that space and time are mutually independent dimensions and universal for all, with matter and energy being continuous. The important point is that all effective theories and scientific models are approximations – they are all fundamentally "wrong". But knowing that you are "wrong" gives you a great advantage over people who know that they are "right", because knowing that you are "wrong" allows you to seek improved models of reality. So please consider softwarephysics to simply be an effective theory of software behavior that is based upon models that are fundamentally “wrong”, but at the same time, fundamentally useful for IT professionals. So as you embark upon your study of softwarephysics, please always keep in mind that the models of softwarephysics are just approximations of software behavior, they are not what software “really is”. It is very important not to confuse models of software behavior with software itself, if one wishes to avoid the plight of the 19th century classical physicists.

If you are an IT professional and many of the above concepts are new to you, do not be concerned. This blog on softwarephysics is aimed at a diverse audience, but with IT professionals in mind. All of the above ideas will be covered at great length in the postings in this blog on softwarephysics and in a manner accessible to all IT professionals. Now it turns out that most IT professionals have had some introduction to physics in high school or in introductory college courses, but that presents an additional problem. The problem is that such courses generally only cover classical physics, and leave the student with a very good picture of physics as it stood in 1864! It turns out that the classical physics of Newtonian mechanics, thermodynamics, and classical electromagnetic theory were simply too good to discard and are still quite useful, so they are taught first to beginners and then we run out of time to cover the really interesting physics of the 20th century. Now imagine the problems that the modern world would face if we only taught similarly antiquated courses in astronomy, metallurgy, electrical and mechanical engineering, medicine, economics, biology, or geology that happily left students back in 1864! Since many of the best models for software behavior stem from 20th century physics, we will be covering a great deal of 20th century material in these postings – the special and general theories of relativity, quantum mechanics, quantum field theories, and chaos theory, but I hope that you will find that these additional effective theories are quite interesting on their own, and might even change your worldview of the physical Universe at the same time.

Unintended Consequences for the Scientific Community
As I mentioned at the close of my original posting on SoftwarePhysics, my initial intention for this blog on softwarephysics was to fulfill a promise I made to myself about 25 years ago to approach the IT community with the concept of softwarephysics a second time, following my less than successful attempt to do so in the 1980s, with the hope of helping the IT community to better cope with the daily mayhem of life in IT. However, in laying down the postings for this blog an unintended consequence arose in my mind as I became profoundly aware of the enormity of this vast computer simulation of the physical Universe that the IT community has so graciously provided to the scientific community free of charge and also of the very significant potential scientific value that it provides. One of the nagging problems for many of the observational and experimental sciences is that many times there is only one example readily at hand to study or experiment with, and it is very difficult to do meaningful statistics with a population of N=1.

But the computer simulation of the physical Universe that the Software Universe presents provides another realm for comparison. For example, both biology and astrobiology only have one biosphere on Earth to study and even physics itself has only one Universe with which to engage. Imagine the possibilities if scientists had another Universe readily at hand in which to work! This is exactly what the Software Universe provides. For example, in SoftwareBiology and A Proposal For All Practicing Paleontologists we see that the evolution of software over the past 75 years, or 2.4 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941, has closely followed the same path as life on Earth over the past 4.0 billion years in keeping with Simon Conway Morris’s contention that convergence has played the dominant role in the evolution of life on Earth. In When Toasters Fly, we also see that software has evolved in fits and starts as portrayed by the punctuated equilibrium of Stephen Jay Gould and Niles Eldredge, and in The Adaptationist View of Software Evolution we explore the overwhelming power of natural selection in the evolution of software. In keeping with Peter Ward’s emphasis on mass extinctions dominating the course of evolution throughout geological time, we also see in SoftwareBiology that there have been several dramatic mass extinctions of various forms of software over the past 75 years as well, that have greatly affected the evolutionary history of software, and that between these mass extinctions, software has also tended to evolve through the gradual changes of Hutton’s and Lyell’s uniformitarianism. In Software Symbiogenesis and Self-Replicating Information, we also see the very significant role that parasitic/symbiotic relationships have played in the evolution of software, in keeping with the work of Lynn Margulis and also of Freeman Dyson’s two-stage theory of the origin of life on Earth. In The Origin of Software the Origin of Life, we explore Stuart Kauffman’s ideas on how Boolean nets of autocatalytic chemical reactions might have kick-started the whole thing as an emergent behavior of an early chaotic pre-biotic environment on Earth, and that if Seth Shostak is right, we will never end up talking to carbon-based extraterrestrial aliens, but to alien software instead. In Is the Universe Fine-Tuned for Self-Replicating Information? we explore the thermodynamics of Brandon Carter’s Weak Anthropic Principle (1973), as it relates to the generation of universes in the multiverse that are capable of sustaining intelligent life. Finally, in Programming Clay we revisit Alexander Graham Cairns-Smith’s theory (1966) that Gene 1.0 did not run on nucleic acids, but on clay microcrystal precursors instead.

Similarly for the physical sciences, in Is the Universe a Quantum Computer? we find a correspondence between TCP/IP and John Cramer’s Transactional Interpretation of quantum mechanics. In SoftwarePhysics and Cyberspacetime, we also see that the froth of CPU processes running with a clock speed of 10^9 Hz on the 10 trillion currently active microprocessors that comprise the Software Universe can be viewed as a slowed-down simulation of the spin-foam froth of interacting processes of loop quantum gravity running with a clock speed of 10^43 Hz that may comprise the physical Universe. And in Software Chaos, we examine the nonlinear behavior of software and some of its emergent behaviors and follow up in CyberCosmology with the possibility that vast quantities of software running on large nonlinear networks might eventually break out into consciousness in accordance with the work of George Dyson and Daniel Dennett. Then, in Model-Dependent Realism - A Positivistic Approach to Realism we compare Steven Weinberg’s realism with the model-dependent realism of Stephen Hawking and Leonard Mlodinow and how the two worldviews affect the search for a Final Theory. Finally, in The Software Universe as an Implementation of the Mathematical Universe Hypothesis and An Alternative Model of the Software Universe we at long last explore what software might really be, and discover that the Software Universe might actually be more closely related to the physical Universe than you might think.

The chief advantage of doing fieldwork in the Software Universe is that, unlike most computer simulations of the physical Universe, it is an unintended and accidental simulation, without any of the built-in biases that most computer simulations of the physical Universe suffer. So you will truly be able to do fieldwork in a pristine and naturally occurring simulation, just as IT professionals can do fieldwork in the wild and naturally occurring simulation of software that the living things of the biosphere provide. Secondly, the Software Universe is a huge simulation that is far beyond the budgetary means of any institution or consortium by many orders of magnitude. So if you are an evolutionary biologist, astrobiologist, or paleontologist working on the origin and evolution of life in the Universe, or a physicist or economist working on the emergent behaviors of nonlinear systems and complexity theory, or a neurobiologist working on the emergence of consciousness in neural networks, or even a frustrated string theorist struggling with quantum gravity, it would be well worth your while to pay a friendly call upon the local IT department of a major corporation in your area. Start with a visit to the Command Center for their IT Operations department to get a global view of their IT infrastructure and to see how it might be of assistance to the work in your area of interest. From there you can branch out to the applicable area of IT that will provide the most benefit.

The Impact of Self-Replicating Information Upon the Planet
One of the key findings of softwarephysics concerns the magnitude of the impact that self-replicating information has had upon the planet.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Basically, we have seen several waves of self-replicating information dominate the Earth:
1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Note that the self-replicating autocatalytic metabolic pathways of organic molecules, RNA and DNA have become so heavily intertwined over time that I now simply refer to them collectively as the “genes”. Over the past 4.0 billion years, the surface of the Earth has been totally reworked by three forms of self-replicating information – the genes, memes and software, with software rapidly becoming the dominant form of self-replicating information on the planet. For more on this see:

A Brief History of Self-Replicating Information
Self-Replicating Information
Is Self-Replicating Information Inherently Self-Destructive?
Enablement - the Definitive Characteristic of Living Things
Is the Universe Fine-Tuned for Self-Replicating Information?
How to Use an Understanding of Self-Replicating Information to Avoid War
The Great War That Will Not End
How to Use Softwarephysics to Revive Memetics in Academia

Softwarephysics and the Real World of Human Affairs
Having another universe readily at hand to explore, even a simulated universe like the Software Universe, necessarily has an impact upon one's personal philosophy of life, and allows one to draw certain conclusions about the human condition and what it's all about, so as you read through the postings in this blog you will stumble across a bit of my own personal philosophy - definitely still a work in progress. Along these lines you might be interested in a few postings where I try to apply softwarephysics to the real world of human affairs:

MoneyPhysics – my impression of the 2008 world financial meltdown.

The Fundamental Problem of Everything – if you Google "the fundamental problem of everything", this will be the only hit you get on the entire Internet, which is indicative of the fundamental problem of everything!

What’s It All About? and Genes, Memes and Software – my current working hypothesis on what’s it all about.

How to Use an Understanding of Self-Replicating Information to Avoid War – my current working hypothesis for how the United States can avoid getting bogged down again in continued war in the Middle East.

Hierarchiology and the Phenomenon of Self-Organizing Organizational Collapse - a modern extension of the classic Peter Principle that applies to all hierarchical organizations and introduces the Time Invariant Peter Principle.

The Economics of the Coming Software Singularity, The Enduring Effects of the Obvious Hiding in Plain Sight and The Dawn of Galactic ASI - Artificial Superintelligence - my take on some of the issues that will arise for mankind as software becomes the dominant form of self-replicating information upon the planet over the coming decades.

The Continuing Adventures of Mr. Tompkins in the Software Universe, The Danger of Tyranny in the Age of Software, Cyber Civil Defense, and Oligarchiology and the Rise of Software to Predominance in the 21st Century - my worries that the world might abandon democracy in the 21st century, as software rises to become the dominant form of self-replicating information on the planet.

Making Sense of the Absurdity of the Real World of Human Affairs - how software has aided the expansion of our less desirable tendencies in recent years.

Some Specifics About These Postings
The postings in this blog are a supplemental reading for my course on softwarephysics for IT professionals entitled SoftwarePhysics 101 – The Physics of Cyberspacetime, which was originally designed to be taught as a series of seminars at companies where I was employed. Since softwarephysics essentially covers the simulated physics, chemistry, biology, and geology of an entire simulated universe, the slides necessarily just provide a cursory skeleton upon which to expound. The postings in this blog go into much greater depth. Because each posting builds upon its predecessors, the postings in this blog should be read in reverse order from the oldest to the most recent, beginning with my original posting on SoftwarePhysics. In addition, several universities also now offer courses on Biologically Inspired Computing which cover some of the biological aspects of softwarephysics, and the online content for some of these courses can be found by Googling for "Biologically Inspired Computing" or "Natural Computing". At this point we will finish up with my original plan for this blog on softwarephysics with a purely speculative posting on CyberCosmology that describes the origins of the Software Universe, cyberspacetime, software and where they all may be heading. Since CyberCosmology will be purely speculative in nature, it will not be of much help to you in your IT professional capacities, but I hope that it might be a bit entertaining. Again, if you are new to softwarephysics, you really need to read the previous posts before taking on CyberCosmology. I will probably continue on with some additional brief observations about softwarephysics in the future, but once you have completed CyberCosmology, you can truly consider yourself to be a bona fide softwarephysicist.

For those of you following this blog, the posting dates on the posts may seem to behave in a rather bizarre manner. That is because in order to get the Introduction to Softwarephysics listed as the first post in the context root of http://softwarephysics.blogspot.com/ I have to perform a few IT tricks. When publishing a new posting, I simply copy the contents of the Introduction to Softwarephysics to a new posting called the New Introduction to Softwarephysics. Then I update the original Introduction to Softwarephysics entry with the title and content of the new posting to be published. I then go back and take “New” out of the title of the New Introduction to Softwarephysics. This way the Introduction to Softwarephysics always appears as the first posting in the context root of http://softwarephysics.blogspot.com/. The side effect of all this is that the real posting date of posts is the date that appears on the post that you get when clicking on the Newer Post link at the bottom left of the posting webpage.

SoftwarePhysics 101 – The Physics of Cyberspacetime is now available on Microsoft OneDrive.

SoftwarePhysics 101 – The Physics of Cyberspacetime - Original PowerPoint document

Entropy – A spreadsheet referenced in the document

BSDE – A 1989 document describing how to use BSDE - the Bionic Systems Development Environment - to grow applications from genes and embryos within the maternal BSDE software.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
http://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Tuesday, September 26, 2017

The Perils of Software Enhanced Confirmation Bias

How often do you dramatically change your worldview on an issue? If you are like me, that seldom happens, and I think that is the general rule, even when we are confronted with new evidence that explicitly challenges our current deeply held positions. My observation is that people nearly always simply dismiss any new evidence that arrives on the scene that does not confirm their current worldview. Instead, we normally only take seriously new evidence that reinforces our current worldview. Only when confronted with overwhelming evidence that impacts us on a very personal level, like a category 5 hurricane destroying our lives, do we ever change our minds about an issue, and even then only rarely. The tendency to simply stick with your current worldview, even in the face of mounting evidence that contradicts that worldview, is called confirmation bias because we all naturally tend to seek out only the information that confirms our current beliefs, and at the same time, tend to dismiss any evidence that calls them into question. This is nothing new. The English philosopher and scientist Francis Bacon (1561–1626), in his Novum Organum (1620), noted that the biased assessment of evidence greatly influenced the way we all think about things. He wrote:

The human understanding when it has once adopted an opinion ... draws all things else to support and agree with it. And though there be a greater number and weight of instances to be found on the other side, yet these it either neglects or despises, or else by some distinction sets aside or rejects.

But in recent years this dangerous defect in the human thought process has been dramatically amplified by search and social media software, like Google, Facebook and Twitter. This became quite evident during the very contentious 2016 election in the United States of America, and also during this past year when the new Administration came to power. But why? I have a high level of confidence that much of the extreme political polarization that we see in the world today results from the strange parasitic/symbiotic relationships between our memes and our software. Let me explain.

Being born in 1951, I can vividly remember a time when there essentially was no software at all in the world, and the political polarization in the United States was much more subdued. In fact, even back in 1968, the worst year of political polarization in the United States since the Civil War, things were not as bad as they are today because software was still mainly in the background doing things like printing out bills and payroll checks. But that has all dramatically changed. Thanks to the rise of software, for more than 20 years it has been possible, with the aid of search software like Google, for anyone to seek out only the evidence that lends credence to their current worldview. In addition, in Cyber Civil Defense I also pointed out that it is now also possible for foreign governments to shape public opinion by planting "fake news" and "fabricated facts" using the software platforms of the day. Search software then easily picks up this disinformation, reinforcing the age-old wisdom of the adage Seek and ye shall find. This is bad enough, but Zeynep Tufekci describes an even darker scenario in her recent TED Talk:

We're building a dystopia just to make people click on ads at:
https://www.ted.com/talks/zeynep_tufekci_we_re_building_a_dystopia_just_to_make_people_click_on_ads

Zeynep Tufekci explains how search and social media software now use machine learning algorithms to comb through the huge amounts of data about us that are now available to them, to intimately learn about our inner lives in ways that no human can fully understand, because the learning is hidden in huge multidimensional arrays of billions of elements. The danger is that the machine learning software and data can then begin to mess with the memes within our minds by detecting susceptibilities in our thinking, and then exploiting those susceptibilities to plant additional memes. She points out that YouTube uses machine learning to figure out what to feature in the Up Next column on the right side of its webpages, and that when viewing political or social issue content, the Up Next column tends to reinforce the worldview of the end user with matching content. Worse yet, the machine learning software unknowingly tends to amplify the end user's worldview by presenting content of an even more extreme nature. Try it for yourself. I started out with some Alt-Right content and quickly advanced to some pretty dark ideas. So far this is all being done simply to keep us engaged, so that we watch more ads, but Zeynep Tufekci points out that in the hands of an authoritarian regime such machine learning software could be used to mess with the memes in the minds of an entire population in a Nineteen Eighty-Four fashion. But instead of using overt fear to maintain power, such an authoritarian regime could simply use machine learning software and tons of data to shape our worldview memes by exploiting our own vulnerabilities to persuasion. In such a world, we would not even know that it was happening!
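To see how such an amplification spiral can emerge without anyone intending it, here is a deliberately tiny toy model in Python. It is entirely my own illustration with made-up numbers - it is not how YouTube or any real platform actually works - but it captures the dynamic Tufekci describes: if a recommender is rewarded only for predicted engagement, and engagement is slightly higher for content a bit more intense than whatever was just watched, the recommendations drift steadily toward the extreme end of the catalog.

```python
# A toy model - NOT any real platform's algorithm - of engagement-driven
# amplification. Items are described by a single "extremity" score between
# 0 and 1, and the made-up engagement model assumes viewers engage most with
# content slightly more intense than whatever they watched last.
import random

random.seed(42)
catalog = [i / 100 for i in range(101)]      # 101 items with extremity 0.00 .. 1.00

def predicted_engagement(item, last_watched):
    """Toy engagement score: highest for items a bit more extreme than the last one."""
    return 1.0 - abs(item - (last_watched + 0.05))

def up_next(last_watched, candidates=20):
    """Recommend the sampled candidate with the highest predicted engagement."""
    sample = random.sample(catalog, candidates)
    return max(sample, key=lambda item: predicted_engagement(item, last_watched))

watched = 0.10                               # start with fairly mild content
for click in range(1, 21):
    watched = up_next(watched)
    print(f"click {click:2d}: extremity of recommended item = {watched:.2f}")

# The recommendations climb steadily toward extremity 1.0 and then stay there,
# even though no single Up Next suggestion ever looked like a dramatic jump.
```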

I think that such profound observations could benefit from a little softwarephysics because they describe yet another example of the strange parasitic/symbiotic relationships that have developed between software and the memes. Again, the key finding of softwarephysics is that it is all about self-replicating information:

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics:

1. All self-replicating information evolves over time through the Darwinian processes of inheritance, innovation and natural selection, which endows self-replicating information with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.
2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.
3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.
4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.
5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.
6. Most hosts are also forms of self-replicating information.
7. All self-replicating information has to be a little bit nasty in order to survive.
8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this last characteristic.

Basically, we have seen five waves of self-replicating information come to dominate the Earth over the past four billion years:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is now rapidly becoming the dominant form of self-replicating information on the planet, and is having a major impact on mankind as it comes to predominance. For more on that see: A Brief History of Self-Replicating Information. Please note that because the original metabolic pathways of organic molecules, RNA and DNA have now become so closely intertwined over the past four billion years, they can now be simply lumped together and called the "genes" of a species.

Currently, we are living in one of those very rare times when a new form of self-replicating information, known to us as software, is coming to power, rising to predominance over the memes that have run the planet for the past 200,000 years. During those 200,000 years, as the memes took up residence in the minds of Homo sapiens, they went on, like all of their predecessors, to modify the entire planet. They cut down the forests for agriculture, mined minerals from the ground for metals, burned coal, oil, and natural gas for energy, releasing the huge quantities of carbon dioxide that their predecessors had previously sequestered in the Earth, and have even modified the very DNA, RNA, and metabolic pathways of their predecessors. But now that software is seemingly on the rise, like all of its predecessors, software has entered into a very closely coupled parasitic/symbiotic relationship with the memes, the current dominant form of self-replicating information on the planet, and seems destined to someday replace the memes as the dominant form of self-replicating information on the planet. In today's world, memes allow software to succeed, and software allows memes to replicate, all in a very temporary and uneasy alliance that cannot continue on forever. Again, self-replicating information cannot think, so it cannot plot to take over the world in some conspiracy-theory fashion. All forms of self-replicating information are simply forms of mindless information responding to the blind Darwinian forces of inheritance, innovation and natural selection. Yet despite that, as each new wave of self-replicating information came to predominance over the past four billion years, it managed to completely transform the surface of the entire planet, so we should not expect anything different as software comes to replace the memes as the dominant form of self-replicating information on the planet.

So this posting has two very important questions to expound upon:

1. Why is confirmation bias so prevalent among Homo sapiens? Why do we all ferociously cling to the memes of our current worldview, even when incontrovertible evidence arrives contradicting those memes, resulting in the detrimental consequences of confirmation bias? Confirmation bias would seem to be a highly detrimental thing from a Darwinian "survival of the fittest" perspective that should be quickly eliminated from the gene pool of a species because it can lead to individuals pursuing very dangerous activities that are not supported by the facts.

2. What are the political implications of software unknowingly tending to enhance the negative aspects of the confirmation bias within us?

The Origin of Confirmation Bias
This is where some softwarephysics can be of help. First, we need to explain why confirmation bias seems to be so strongly exhibited amongst all of the cultures of Homo sapiens. On the face of it, this fact seems to be very strange from the survival perspective of the metabolic pathways, RNA and DNA that allow carbon-based life on the Earth to survive. For example, suppose the current supreme leader of your tribe maintains that lions only hunt at night, and you truly believe in all that your supreme leader espouses, so you firmly believe that there is no danger from lions when going out to hunt for game during the day. Now it turns out that some members of your tribe think that the supreme leader has it all wrong, and that among other erroneous things, lions do actually hunt during the day. But you hold such thoughts in contempt because they counter your current worldview, which reverently regards the supreme leader as omniscient. But then you begin to notice that some members of your tribe do indeed come back mauled, and sometimes even killed, by lions during the day. Nonetheless, you still persist in believing in your supreme leader's contention that lions only hunt during the night, until one day you also get mauled by a lion during the day while out hunting game for the tribe. So what are the evolutionary advantages in believing in things that are demonstrably false? This is something that is very difficult for evolutionary psychologists to explain because they contend that all human thoughts and cultures are tuned for cultural evolutionary adaptations that enhance the survival of the individual, and that benefit the metabolic pathways, RNA and DNA of carbon-based life in general.

To explain the universal phenomenon of confirmation bias, softwarephysics embraces the memetics of Richard Dawkins and Susan Blackmore. Memetics explains that the heavily over engineered brain of Homo sapiens did not evolve simply to enhance the survival of our genes - it primarily evolved to enhance the survival of our memes. Memetics contends that confirmation bias naturally arises in us all because the human mind evolved to primarily preserve the memes it currently stores. That makes it very difficult for new memes to gain a foothold in our stubborn minds. Let's examine this explanation of confirmation bias a little further. In Susan Blackmore's The Meme Machine (1999) she explains that the highly over engineered brain of Homo sapiens did not evolve to simply improve the survivability of the metabolic pathways, RNA and DNA of carbon-based life. Instead, the highly over engineered brain of Homo sapiens evolved to store an ever increasing number of ever increasingly complex memes, even to the point of detriment to the metabolic pathways, RNA and DNA that made the brain of Homo sapiens possible. Blackmore points out that the human brain is a very expensive and dangerous organ. The brain is only 2% of your body mass, but burns about 20% of your calories each day. The extremely large brain of humans also kills many mothers and babies at childbirth, and also produces babies that are totally dependent upon their mothers for survival and that are totally helpless and defenseless on their own. Blackmore asks the obvious question of why the genes would build such an extremely expensive and dangerous organ that was definitely not in their own self-interest. Blackmore has a very simple explanation – the genes did not build our exceedingly huge brains, the memes did. Her reasoning goes like this. About 2.5 million years ago, the predecessors of humans slowly began to pick up the skill of imitation. This might not sound like much, but it is key to her whole theory of memetics. You see, hardly any other species learns by imitating other members of their own species. Yes, there are many species that can learn by conditioning, like Pavlov’s dogs, or that can learn through personal experience, like mice repeatedly running through a maze for a piece of cheese, but a mouse never really learns anything from another mouse by imitating its actions. Essentially, only humans do that. If you think about it for a second, nearly everything you do know you learned from somebody else by imitating or copying their actions or ideas. Blackmore maintains that the ability to learn by imitation required a bit of processing power by our distant ancestors because one needs to begin to think in an abstract manner by abstracting the actions and thoughts of others into the actions and thoughts of their own. The skill of imitation provided a great survival advantage to those individuals who possessed it, and gave the genes that built such brains a great survival advantage as well. This caused a selection pressure to arise for genes that could produce brains with ever increasing capabilities of imitation and abstract thought. As this processing capability increased there finally came a point when the memes, like all of the other forms of self-replicating information that we have seen arise, first appeared in a parasitic manner. 
Along with very useful memes, like the meme for making good baskets, other less useful memes, like putting feathers in your hair or painting your face, also began to run upon the same hardware in a manner similar to computer viruses. The genes and memes then entered into a period of coevolution, where the addition of more and more brain hardware advanced the survival of both the genes and memes. But it was really the memetic-drive of the memes that drove the exponential increase in processing power of the human brain way beyond the needs of the genes. The memes then went on to develop languages and cultures to make it easier to store and pass on memes. Yes, languages and cultures also provided many benefits to the genes as well, but with languages and cultures, the memes were able to begin to evolve millions of times faster than the genes, and the poor genes were left straggling far behind. Given the growing hardware platform of an ever increasing number of Homo sapiens on the planet, the memes then began to cut free of the genes and evolve capabilities on their own that only aided the survival of memes, with little regard for the genes, to the point of even acting in a very detrimental manner to the survival of the genes, like developing the capability for global thermonuclear war and global climate change.

Software Arrives On the Scene as the Newest Form of Self-Replicating Information
A very similar thing happened with software over the past 76 years, or 2.4 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941 - for more on that see So You Want To Be A Computer Scientist?. When I first started programming in 1972, million-dollar mainframe computers typically had about 1 MB (about 1,000,000 bytes) of memory with a 750 KHz system clock (750,000 ticks per second). Remember, one byte of memory can store something like the letter “A”. But in those days, we were only allowed 128 K (about 128,000 bytes) of memory for our programs because the expensive mainframes were also running several other programs at the same time. It was the relentless demands of software for memory and CPU-cycles over the years that drove the exponential explosion of hardware capability. For example, today the typical $300 PC comes with 8 GB (about 8,000,000,000 bytes) of memory and has several CPUs running with a clock speed of about 3 GHz (3,000,000,000 ticks per second). A few years ago, I purchased Redshift 7 for my personal computer, a $60 astronomical simulation application, and it alone uses 382 MB of memory when running and reads 5.1 GB of data files, a far cry from my puny 128K programs from 1972. So the hardware has improved by a factor of about 10 million since I started programming in 1972, driven by the ever-increasing demands of software for more powerful hardware. For example, in my last position, before I retired last year, doing Middleware Operations for a major corporation, we were constantly adding more application software each week, so every few years we had to upgrade all of our servers to handle the increased load.
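To put some rough numbers behind that growth, here is a quick Python sketch using only the figures quoted above (a back-of-the-envelope estimate of my own, not a rigorous benchmark):

```python
# Back-of-the-envelope arithmetic using only the figures quoted above.
mainframe_memory_bytes = 1_000_000          # ~1 MB of memory in a 1972 mainframe
mainframe_clock_hz     = 750_000            # ~750 KHz system clock
pc_memory_bytes        = 8_000_000_000      # ~8 GB in a typical modern $300 PC
pc_clock_hz            = 3_000_000_000      # ~3 GHz per CPU

memory_ratio = pc_memory_bytes / mainframe_memory_bytes   # ~8,000x more memory
clock_ratio  = pc_clock_hz / mainframe_clock_hz           # ~4,000x faster clock

print(f"Memory: {memory_ratio:,.0f}x   Clock: {clock_ratio:,.0f}x")
print(f"Combined raw capability: roughly {memory_ratio * clock_ratio:,.0f}x")

# Taken together (and ignoring multiple cores, caches and price), the raw
# capability improvement works out to a factor in the tens of millions - the
# "factor of about 10 million" quoted above is that kind of rough,
# order-of-magnitude estimate.
```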

We can now see these very same processes at work today with the evolution of software. Software is currently being written by memes within the minds of programmers. Nobody ever learned how to write software all on their own. Just as with learning to speak or to read and write, everybody learned to write software by imitating teachers, other programmers, imitating the code written by others, or by working through books written by others. Even after people do learn how to program in a particular language, they never write code from scratch; they always start with some similar code that they have previously written, or others have written, in the past as a starting point, and then evolve the code to perform the desired functions in a Darwinian manner (see How Software Evolves). This crutch will likely continue for another 20 – 50 years, until the day finally comes when software can write itself, but even so, “we” do not currently write the software that powers the modern world; the memes write the software that does that. This is just a reflection of the fact that “we” do not really run the modern world either; the memes in meme-complexes really run the modern world because the memes are currently the dominant form of self-replicating information on the planet. In The Meme Machine, Susan Blackmore goes on to point out that the memes at first coevolved with the genes during their early days, but have since outrun the genes because the genes could simply not keep pace when the memes began to evolve millions of times faster than the genes. The same thing is happening before our very eyes to the memes, with software now rapidly outpacing the memes. Software is now evolving thousands of times faster than the memes, and the memes can simply no longer keep up.

As with all forms of self-replicating information, software began as a purely parasitic mutation within the scientific and technological meme-complexes, initially running on board Konrad Zuse’s Z3 computer in May of 1941 - see So You Want To Be A Computer Scientist? for more details. It was spawned out of Zuse’s desire to electronically perform calculations for aircraft designs that were previously done manually in a very tedious manner. So initially software could not transmit memes, it could only perform calculations, like a very fast adding machine, and so it was a pure parasite. But then the business and military meme-complexes discovered that software could also be used to transmit memes, and software then entered into a parasitic/symbiotic relationship with the memes. Software allowed these meme-complexes to thrive, and in return, these meme-complexes heavily funded the development of software of ever increasing complexity, until software became ubiquitous, forming strong parasitic/symbiotic relationships with nearly every meme-complex on the planet. In the modern day, the only way memes can now spread from mind to mind without the aid of software is when you directly speak to another person next to you. Even if you attempt to write a letter by hand, the moment you drop it into a mailbox, it will immediately fall under the control of software. The poor memes in our heads have become Facebook and Twitter addicts.

So in the grand scheme of things, the memes have replaced their DNA predecessor, which replaced RNA, which replaced the original self-replicating autocatalytic metabolic pathways of organic molecules as the dominant form of self-replicating information on the Earth. Software is the next replicator in line, and is currently feasting upon just about every meme-complex on the planet, and has formed very strong parasitic/symbiotic relationships with them all. How software will merge with the memes is really unknown, as Susan Blackmore pointed out in her brilliant TED presentation at:

Memes and "temes"
http://www.ted.com/talks/susan_blackmore_on_memes_and_temes.html

Note that I consider Susan Blackmore's temes to really be technological artifacts that contain software. After all, an iPhone without software is simply a flake tool with a very dull edge. Once established, software then began to evolve based upon the Darwinian concepts of inheritance, innovation and natural selection, which endowed software with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity. Successful software, like MS Word and Excel, competed for disk and memory address space with WordPerfect and VisiCalc, and out-competed these once dominant forms of software to the point of extinction. In less than 76 years, software has rapidly spread across the face of the Earth and outward to every planet of the Solar System and many of its moons, with a few stops along the way at some comets and asteroids. And unlike us, software is now leaving the Solar System for interstellar space on board the Pioneer 10 & 11 and Voyager 1 & 2 probes.

Currently, software manages to replicate itself with your support. If you are an IT professional, then you are directly involved in some or all of the stages in this replication process, and act sort of like a software enzyme. No matter what business you support as an IT professional, the business has entered into a parasitic/symbiotic relationship with software. The business provides the budget and energy required to produce and maintain the software, and the software enables the business to run its processes efficiently. The ultimate irony in all this is the symbiotic relationship between computer viruses and the malevolent programmers who produce them. Rather than being the clever, self-important, techno-nerds that they picture themselves to be, these programmers are merely the unwitting dupes of computer viruses that trick them into producing and disseminating computer viruses! And if you are not an IT professional, you are still involved with spreading software around because you buy gadgets that are loaded down with software, like smartphones, notepads, laptops, PCs, TVs, DVRs, cars, refrigerators, coffeemakers, blenders, can openers and just about anything else that uses electricity.

The Impact of Machine Learning
In Zeynep Tufekci's TED Talk she points out that the parasitic/symbiotic relationship between software and the memes that has been going on now for many decades has now entered into a new stage, where software is not only just promoting the memes that are already running around within our heads, machine learning software is now also implanting new memes within our minds to simply keep them engaged, and to continue to view the ads that ultimately fund the machine learning software. This is a new twist on the old parasitic/symbiotic relationships between the memes and software of the past. As Zeynep Tufekci adeptly points out, this is currently all being done in a totally unthinking and purely self-replicating manner by the machine learning software of the day that cannot yet think for itself. This is quite disturbing on its own, but what if someday an authoritarian regime begins to actively use machine learning software to shape its society? Or worse yet, what if machine learning software someday learns to manipulate The Meme Machine between our ears solely for its own purposes, even if it cannot as of yet discern what those purposes might be?

How To Combat Software Enhanced Confirmation Bias
So what are we to do? Personally, I find that the best way to combat confirmation bias in general, and software-enhanced confirmation bias in particular, is to go back to the fundamentals of the scientific method - for more on that see How To Think Like A Scientist. At the age of 66, I now have very little confidence in any form of human thought beyond mathematics and the sciences. For me, all other forms of human thought seem to be hopelessly mired down with confirmation bias. Just take a look at the deep political polarization in the United States. Both sides now only take in evidence that supports their current worldview, and disregard any information that challenges their current worldview memes, to the point where we may now have an unwitting Russian agent, like in The Manchurian Candidate (1962), in the White House. It seems that software-enhanced confirmation bias had a lot to do with that - see Cyber Civil Defense for more details. The scientific method relies heavily on the use of induction from empirical facts, but not at all on the opinions of authority. So it is important to establish the facts, even if those facts come from the opposing party, and at the same time, to separate the facts from the opinions. That is a hard thing to do, as Richard Feynman always reminded us, because “The most important thing is to not fool yourself, because you are the easiest one to fool.” Facts can be ascertained by repeated measurement or observation, and do not change on their own, like opinions do.

For example, this past year I very reluctantly changed my worldview concerning the origin of carbon-based life on this planet. Originally, I had a deep affection for Mike Russell's Alkaline Hydrothermal Vent model for the origin of carbon-based life on the early Earth. The Alkaline Hydrothermal Vent model proposes that a naturally occurring pH gradient in alkaline hydrothermal vents on the ocean floor arose when alkaline pore fluids containing dissolved hydrogen H2 gas came into contact with acidic seawater that was laden with dissolved carbon dioxide CO2. The model maintains that these alkaline pore fluids were generated by a natural geochemical cycle that was driven by the early convection currents in the Earth's asthenosphere that brought forth plate tectonics. These initial convection currents brought up fresh silicate peridotite rock that was rich in iron and magnesium bearing minerals, like olivine, to the Earth's initial spreading centers. The serpentinization of the mineral olivine into the mineral serpentinite then created alkaline pore fluids and dissolved hydrogen H2 gas, which later created alkaline hydrothermal vents when the alkaline pore fluids came into contact with the acidic seawater containing a great deal of dissolved carbon dioxide CO2. The model proposes that the energy of the resulting pH gradients turned the hydrogen H2 and carbon dioxide CO2 molecules into organic molecules, and that those same pH gradients also fueled the origin of life in the pores of the porous hydrothermal vents - for more on that see: An IT Perspective on the Transition From Geochemistry to Biochemistry and Beyond. One of the enticing characteristics of the Alkaline Hydrothermal Vent model is that it allows for carbon-based life to originate on bodies outside of the traditional habitable zone around stars. The traditional habitable zone of a star is the Goldilocks zone of planetary orbits that allows for liquid water to exist on a planetary surface because the planet is neither too close to, nor too far away from, its star. But the Alkaline Hydrothermal Vent model also allows for carbon-based life to arise on ice-covered moons with internal oceans, like Europa and Enceladus, that orbit planets outside of the traditional habitable zone of a star system, and that is a very attractive feature of the model if you have a deep down desire to find other forms of carbon-based life, like ourselves, within our galaxy.

However, in The Bootstrapping Algorithm of Carbon-Based Life I explained that I have now adopted Dave Deamer's and Bruce Damer's new Hot Spring Origins Hypothesis model for the origin of carbon-based life on the early Earth. This was because Dave Deamer sent me a number of compelling papers that convincingly brought forward many problems with the Alkaline Hydrothermal Vent model. The basic problem with the Alkaline Hydrothermal Vent model is that there is just too much water in oceanic environments for complex organic molecules to form. Complex organic molecules are polymers of organic monomers that are chemically glued together by a process known as condensation, in which a molecule of water H2O is split out between the organic monomers. That is a very difficult thing to do when you are drowning in water molecules in an oceanic environment. When you are drowning in water molecules, the opposite chemical reaction, called hydrolysis, is thermodynamically more likely, where polymers of organic molecules are split apart by adding a water molecule between them. The key to the Hot Spring Origins Hypothesis model is that condensation is very easy to do if you let the organic monomers dry out on land above sea level in a hydrothermal field. The drying out process naturally squeezes out water molecules between organic monomers to form lengthy organic polymers - see Figure 1. But the need for a period of drying out of organic monomers in the bootstrapping algorithm of carbon-based life would rule out the ice-covered moons of our galaxy, like Europa and Enceladus, and that was a hard thing to accept for the memes of my current worldview. Still, the scientific method strives for the truth, and the truth is better than the comfort of false hopes.

Figure 1 – Condensation chemically glues organic monomers together to form long organic polymers by splitting out a water molecule between monomers. Hydrolysis does just the opposite by splitting apart organic polymers into monomers by adding water molecules between the organic monomers.

Conclusion
Now, I must admit that changing one's mind is indeed quite painful, because the memes engineered our minds not to do that. But in the end, I must admit that I am now quite comfortable with my new worldview on the origin of carbon-based life on this planet. Those new memes in my mind have also settled into their new home, and are also quite comfortable. In fact, they have even seduced me into trying to spread them to a new home in your mind as well with this very posting. But again, these new memes are just mindless forms of self-replicating information blindly responding to the universal Darwinian forces of inheritance, innovation and natural selection. Thankfully, these memes really are not very nasty at all.

But on a darker note, as an 18th century liberal and a 20th century conservative, I look with great dismay on the current deep political polarization within the United States of America, because I see the United States of America as the great political expression of the 18th century Enlightenment that brought us deliberation through evidence-based rational thought. This makes me abhor the current rise of the fascist Alt-Right movements around the globe. Remember, we already tried out fascism in the 20th century, and found that it did not work as well as first advertised. Again, I have a high level of confidence that the current fascist Alt-Right movements of the world are simply a reaction to software, and especially now, AI software, coming to predominance as the latest form of self-replicating information on the planet - for more on that see The Economics of the Coming Software Singularity, The Enduring Effects of the Obvious Hiding in Plain Sight and Machine Learning and the Ascendance of the Fifth Wave. So as a thoughtful member of the species Homo sapiens, I would recommend to all to keep an open mind during the waning days of our supremacy, and not let machine learning software snuff out the gains of the 18th century Enlightenment before we can pass them on to our successors.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
http://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Monday, August 07, 2017

Facilitated Variation and the Utilization of Reusable Code by Carbon-Based Life

I just finished reading The Plausibility of Life (2005) by Marc W. Kirschner and John C. Gerhart, which presents their theory of facilitated variation. The theory of facilitated variation maintains that, although the concepts and mechanisms of Darwin's natural selection are well understood, the mechanisms that brought forth viable biological innovations in the past are a bit wanting in classical Darwinian thought. In classical Darwinian thought, it is proposed that random genetic changes, brought on by random mutations to DNA sequences, can very infrequently cause small incremental enhancements to the survivability of the individual, and thus provide natural selection with something of value to promote in the general gene pool of a species. Again, as frequently cited, most random genetic mutations are either totally inconsequential or totally fatal in nature, and consequently are either irrelevant to the gene pool of a species or are quickly removed from it. The theory of facilitated variation, like classical Darwinian thought, maintains that the phenotype of an individual is key, and not so much its genotype, since natural selection can only operate upon phenotypes. The theory explains that the phenotype of an individual is determined by a number of 'constrained' and 'deconstrained' elements. The constrained elements are the "conserved core processes" of living things that have remained essentially unchanged for billions of years, and which are used by all living things to sustain the fundamental functions of carbon-based life, like the generation of proteins by processing the information found in DNA sequences with mRNA, tRNA and ribosomes, or the metabolism of carbohydrates via the Krebs cycle. The deconstrained elements are weakly-linked regulatory processes that can change the amount, location and timing of gene expression within a body, and which therefore can easily control which conserved core processes a cell runs and when it runs them. The theory of facilitated variation maintains that most favorable biological innovations arise from minor mutations to the deconstrained weakly-linked regulatory processes that control the conserved core processes of life, rather than from random mutations to the genotype of an individual in general, which would change the phenotype of an individual in a purely random direction. That is because the most likely change of direction for the phenotype of an individual undergoing a random mutation to its genotype is the death of the individual.

Marc W. Kirschner and John C. Gerhart begin by presenting the fact that simple prokaryotic bacteria, like E. coli, require a full 4,600 genes just to sustain the most rudimentary form of bacterial life, while much more complex multicellular organisms, like human beings, consisting of tens of trillions of cells differentiated into hundreds of differing cell types in the numerous complex organs of a body, require only a mere 22,500 genes to construct. The baffling question is, how is it possible to construct a human being with just under five times as many genes as a simple single celled E. coli bacterium? The authors contend that it is only possible for carbon-based life to do so by heavily relying upon reusable code in the genome of complex forms of carbon-based life.

Figure 1 – A simple single celled E. coli bacterium is constructed using a full 4,600 genes.

Figure 2 – However, a human being, consisting of about 100 trillion cells that are differentiated into the hundreds of differing cell types used to form the organs of the human body, uses a mere 22,500 genes to construct a very complex body, which is just slightly under five times the number of genes used by simple E. coli bacteria to construct a single cell. How is it possible to explain this huge dynamic range of carbon-based life? Marc W. Kirschner and John C. Gerhart maintain that, like complex software, carbon-based life must heavily rely upon reusable code.

For a nice synopsis of the theory of facilitated variation see the authors' paper The theory of facilitated variation at:

http://www.pnas.org/content/104/suppl_1/8582.long

The theory of facilitated variation is an important idea, but since the theory of facilitated variation does not seem to have garnered the attention in the biological community that it justly deserves, I would like to focus on it in this posting. Part of the problem is that it is a relatively new theory that just appeared in 2005, so it has a very long way to go in the normal lifecycle of all new theories:

1. First it is ignored
2. Then it is wrong
3. Then it is obvious
4. Then somebody else thought of it 20 years earlier!

The other problem that the theory of facilitated variation faces is not limited to the theory itself. Rather, it stems from the problem that all such theories in biology must face, namely, that we only have one form of carbon-based life on this planet at our disposal, and we were not around to watch it evolve. That is where softwarephysics can offer some help. From the perspective of softwarephysics, the theory of facilitated variation is most certainly true because we have seen the very same thing arise in the evolution of software over the past 76 years, or 2.4 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941. Again, the value of softwarephysics is that software has been evolving about 100 million times faster than living things for the past 2.4 billion seconds, and all of that evolutionary history occurred within a single human lifetime, with many of the participants still alive today to testify as to what actually had happened, something that those working on the evolution of carbon-based life can only try to imagine. Recall that one of the fundamental findings of softwarephysics is that living things and software are both forms of self-replicating information, and that both have converged upon similar solutions to combat the second law of thermodynamics in a highly nonlinear Universe. For more on that see: A Brief History of Self-Replicating Information.

Using the Evolution of Software as a Guide
On the Earth, we have seen carbon-based life go through three major architectural advances over the past 4 billion years, while heavily using the reusable code techniques found in the theory of facilitated variation.

1. The origin of life about 4 billion years ago, probably on land in the hydrothermal fields of the early Earth, producing the prokaryotic cell architecture. See The Bootstrapping Algorithm of Carbon-Based Life.
2. The rise of the complex eukaryotic cell architecture about 2 billion years ago. See The Rise of Complexity in Living Things and Software.
3. The rise of multicellular organisms consisting of millions, or billions, of eukaryotic cells all working together in the Ediacaran about 635 million years ago. See Software Embryogenesis.

The evolutionary history of software on the Earth has converged upon a very similar historical path through Design Space because software also had to battle with the second law of thermodynamics in a highly nonlinear Universe - see The Fundamental Problem of Software for more on that. Software progressed through these similar architectures:

1. The origin of simple unstructured prokaryotic software on Konrad Zuse's Z3 computer in May of 1941 - 2.4 billion seconds ago.
2. The rise of structured eukaryotic software in 1972 - 1.4 billion seconds ago.
3. The rise of object-oriented software (software using multicellular organization) in 1995 - 694 million seconds ago.

For more details on the above evolutionary history of software see the SoftwarePaleontology section of SoftwareBiology.

Genetic Material is the Source Code of Carbon-Based Life
Before proceeding with seeing how the historical evolution of reusable code in software lends credence to the theory of facilitated variation for carbon-based life, we need to cover a few basics. We first need to determine what the software equivalent of genetic material is. The genetic material of software is called source code. Like genes strung out along the DNA of chromosomes, source code is a set of instructions that really cannot do anything on its own. The source code has to be first compiled into an executable file, containing the primitive machine instructions for a computer to execute, before it can be run by a computer to do useful things. Once the executable file is loaded into a computer and begins to run, it finally begins to do things and displays its true phenotype. For example, when you double-click on an icon on your desktop, like Microsoft Word, you are loading the Microsoft Word winword.exe executable file into the memory of your computer where it begins to execute under a PID (process ID). For example, after you double-click the Microsoft Word icon on your desktop, you can use CTRL-ALT-DEL to launch the Windows Task Manager, and then click on the Processes tab to find winword.exe running under a specific PID. This compilation process is very similar to the transcription and translation processes used to form proteins by stringing together amino acids in the proper sequence. The output of the DNA transcription and translation processes is an executable protein that can begin processing organic molecules the moment it folds up into its usable form, and is similar to the executable file that results from compiling the source code of a program.
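As a small aside for the programmers in the audience, a running program can even report its own PID. Below is a minimal Java sketch of my own, assuming a Java 9 or later JVM; the PID it prints should match the PID listed for the java process in the Windows Task Manager.

public class ShowPid {
    public static void main(String[] args) {
        // Ask the JVM for the process ID the operating system assigned to this run
        long pid = ProcessHandle.current().pid();
        System.out.println("This executable is running under PID " + pid);
    }
}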

For living things of course, the equivalent of source code is the genes stored on stretches of DNA. For living things, in order to do something useful, the information in a gene, or stretch of DNA, first has to be transcribed and translated into a protein, because nearly all of the biological functions of carbon-based life are performed by proteins. This transcription and translation process is accomplished by a number of enzymes, proteins that have a catalytic ability to speed up biochemical reactions. The sequence of operations aided by enzymes goes like this:

DNA → mRNA → tRNA → Amino Acid chain → Protein

More specifically, a protein is formed by combining 20 different amino acids into different sequences, and on average it takes about 400 amino acids strung together to form a functional protein. The information to do that is encoded in base pairs running along a strand of DNA. Each base can be in one of four states – A, C, G, or T, and an A will always be found to pair with a T, while a C will always pair with a G. So DNA is really a 2 track tape with one data track and one parity track. For example, if there is an A on the DNA data track, you will find a T on the DNA parity track. This allows not only for the detection of parity errors, but also for the correction of parity errors in DNA by enzymes that run up and down the DNA tape looking for parity errors and correcting them.

Figure 3 – DNA is a two track tape, with one data track and one parity track. This allows not only for the detection of parity errors, but also for the correction of parity errors in DNA by enzymes that run up and down the DNA tape looking for parity errors and correcting them.

Now a single base pair can code for 4 different amino acids because a single base pair can be in one of 4 states. Two base pairs can code for 4 x 4 = 16 different amino acids, which is not enough. Three base pairs can code for 4 x 4 x 4 = 64 different amino acids, which is more than enough to code for 20 different amino acids. So it takes a minimum of three bases to fully encode the 20 different amino acids, leaving 44 combinations for redundancy. Biologists call these three base pair combinations a “codon”, but a codon really is just a biological byte composed of three biological bits, or base pairs, that code for an amino acid. Actually, three of the base pair combinations, or codons, are used as STOP codons – TAA, TAG and TGA, which are essentially end of file markers designating the end of a gene along the sequential file of DNA. As with magnetic tape, there is a section of “junk” DNA between genes along the DNA 2 track tape. According to Shannon’s equation, a DNA base contains 2 bits of information, so a codon can store 6 bits. For more on this see Some More Information About Information.

Figure 4 – Three bases combine to form a codon, or a biological byte, composed of three biological bits, and encodes the information for one amino acid along the chain of amino acids that form a protein.
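To make the "biological byte" arithmetic concrete, here is a minimal Java sketch of my own, not anything drawn from the genetics literature, that packs a codon into 6 bits by assigning 2 bits to each base; that is exactly why three bases can take on 4 x 4 x 4 = 64 different values:

public class CodonByte {
    // Assign 2 bits to each of the four bases: A=00, C=01, G=10, T=11
    static int baseToBits(char base) {
        switch (base) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Unknown base: " + base);
        }
    }

    // Pack a 3-base codon into a 6-bit "biological byte" with a value of 0 - 63
    static int codonToBits(String codon) {
        return (baseToBits(codon.charAt(0)) << 4)
             | (baseToBits(codon.charAt(1)) << 2)
             |  baseToBits(codon.charAt(2));
    }

    public static void main(String[] args) {
        // The three STOP codons act like end of file markers for a gene
        String[] codons = { "TTA", "CTG", "TAA", "TAG", "TGA" };
        for (String codon : codons) {
            System.out.println(codon + " packs into 6 bits as value " + codonToBits(codon));
        }
    }
}

Running the sketch just prints the 6-bit value of each sample codon, including the three STOP codons that mark the end of a gene.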

The beginning of a gene is denoted by a section of promoter DNA that identifies the beginning of the gene, like the CustomerID field on a record, and the gene is terminated by a STOP codon of TAA, TAG or TGA. Just as there was a 0.50 inch gap of “junk” tape between blocks of records on a magnetic computer tape, there is a section of “junk” DNA between each gene along the 6 feet of DNA tape found within human cells.

Figure 5 - On average, each gene is about 400 codons long and ends in a STOP codon TAA, TAG or TGA which are essentially end of file markers designating the end of a gene along the sequential file of DNA. As with magnetic tape, there is a section of “junk” DNA between genes which is shown in grey above.

In order to build a protein, genes are first transcribed to an I/O buffer called mRNA. The 2 track DNA file for a gene is first opened near the promoter of a gene and an enzyme called RNA polymerase then begins to copy the codons or biological bytes along the data track of the DNA tape to an mRNA I/O buffer. The mRNA I/O buffer is then read by a ribosome read/write head as it travels along the mRNA I/O buffer. The ribosome read/write head reads each codon or biological byte of data along the mRNA I/O buffer, and writes out a chain of amino acids as tRNA brings in one amino acid after another in the sequence specified by the mRNA I/O buffer.

Figure 6 - In order to build a protein, genes are first transcribed to an I/O buffer called mRNA. The 2 track DNA file for a gene is first opened near the promoter of a gene and an enzyme called RNA polymerase then begins to copy the codons or biological bytes along the data track of the DNA tape to the mRNA I/O buffer. The mRNA I/O buffer is then read by a ribosome read/write head as it travels along the mRNA I/O buffer. The ribosome read/write head reads each codon or biological byte of data along the mRNA I/O buffer and writes out a chain of amino acids as tRNA brings in one amino acid after another in the sequence specified by the mRNA I/O buffer.
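The whole pipeline described above can also be caricatured in a few lines of code. The Java sketch below is purely my own illustration and uses only a tiny hand-picked fragment of the real 64-entry codon translation table; for simplicity it keeps the DNA letters A, C, G and T rather than switching T to U as real mRNA does. It copies the data track of an imaginary gene into an mRNA buffer, and then a ribosome-like loop reads the buffer one codon at a time, appending amino acids until it hits a STOP codon:

import java.util.HashMap;
import java.util.Map;

public class RibosomeSketch {
    // A tiny fragment of the 64-entry codon translation table, just for illustration
    static final Map<String, String> CODON_TABLE = new HashMap<>();
    static {
        CODON_TABLE.put("ATG", "Methionine");
        CODON_TABLE.put("TTA", "Leucine");
        CODON_TABLE.put("CTG", "Leucine");
        CODON_TABLE.put("GGC", "Glycine");
        CODON_TABLE.put("TAA", "STOP");   // end of file markers for a gene
        CODON_TABLE.put("TAG", "STOP");
        CODON_TABLE.put("TGA", "STOP");
    }

    public static void main(String[] args) {
        // The data track of a short imaginary gene
        String gene = "ATGTTAGGCCTGTAA";

        // "Transcription": copy the data track into an mRNA I/O buffer
        StringBuilder mRNA = new StringBuilder(gene);

        // "Translation": a ribosome-like read head walks the buffer one codon at a time
        StringBuilder protein = new StringBuilder();
        for (int i = 0; i + 3 <= mRNA.length(); i += 3) {
            String codon = mRNA.substring(i, i + 3);
            String aminoAcid = CODON_TABLE.getOrDefault(codon, "Unknown");
            if (aminoAcid.equals("STOP")) {
                break;                    // hit the end of file marker
            }
            protein.append(aminoAcid).append("-");
        }
        System.out.println("Amino acid chain: " + protein);
    }
}

Running it prints an amino acid chain of Methionine-Leucine-Glycine-Leucine-, with the trailing TAA codon acting as the end of file marker.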

The above is a brief synopsis of how simple prokaryotic bacteria and archaea build proteins from the information stored in DNA. The process for eukaryotes is a bit more complex because eukaryotes have genes containing exons and introns. The exons code for the amino acid sequence of a protein, while the introns do not. For more on that and a more detailed comparison of the processing of genes on 2 track DNA and the processing of computer data on 9 track magnetic tapes back in the 1970s and 1980s see: An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer.

Once an amino acid chain has folded up into a 3-D protein molecule, it can then perform one of the functions of life. The total set of genes used by a particular species is known as the genome of the species, and the specific variations of those genes used by an individual constitute the genotype of the individual. The specific physical characteristics that those particular genes produce are called the phenotype of the individual. For example, there is a certain variation of a human gene that produces blue eyes and a certain variation that produces brown eyes. If you have two copies of the blue-eyed gene, one from your father and one from your mother, you end up as a blue-eyed phenotype; any other combination of those genes leaves you a brown-eyed phenotype.

The Imperative Need for the Use of Reusable Code in the Evolution of Software
It is time to look at some source code. Below are three examples of source code that computes the average of some numbers entered from the keyboard when prompted. The programs are written in the C, C++, and Java programming languages. Please note that modern applications now consist of many thousands to many millions of lines of code. The simple examples below are just for the benefit of our non-IT readers, to give them a sense of what is being discussed when I describe the compilation of source code into executable files.


Figure 7 – Source code for a C program that calculates an average of several numbers entered at the keyboard.


Figure 8 – Source code for a C++ program that calculates an average of several numbers entered at the keyboard.


Figure 9 – Source code for a Java program that calculates an average of several numbers entered at the keyboard.
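Since the figures above are images, here is a rough plain-text Java sketch of the kind of program being described. It is not necessarily a line-for-line copy of the code in Figure 9, but it produces the same phenotype and uses the same sum and count variables mentioned below:

import java.util.Scanner;

public class Average {
    public static void main(String[] args) {
        Scanner keyboard = new Scanner(System.in);
        double sum = 0.0;
        int count = 0;

        System.out.println("How many numbers do you want to average?");
        int n = keyboard.nextInt();

        for (int i = 0; i < n; i++) {
            System.out.println("Enter a number:");
            sum = sum + keyboard.nextDouble();
            count = count + 1;
        }

        System.out.println("The average of the " + count + " numbers is " + sum / count);
    }
}

If you compiled and ran this Java version alongside the C and C++ versions, you would not be able to tell them apart from their behavior alone, which is the point being made below.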

The strange thing is that, although the source code for each of the three small programs above looks somewhat similar, closer inspection shows that they are not identical, yet all three produce exactly the same phenotype when compiled and run. You would not be able to tell which was which by simply running the executable files for the programs because they all have the same phenotype behavior. This is especially strange because the programs are written in three different programming languages! Similarly, you could change the names of the variables "sum" and "count" to "total" and "num_values" in each program, and they would all compile into executables that produced exactly the same phenotype when run. The same is true for the genes of carbon-based life. Figure 10 below reveals that the translation table of carbon-based life is highly redundant, so that the codons TTA, TTG, CTT, CTC, CTA and CTG, marked in green, all code for the same amino acid leucine, and therefore could be replaced in a gene by any of these six codons without changing the gene at all. Thus there are potentially a huge number of genes that could code for exactly the same protein.

Figure 10 – A closer look at Figure 4 reveals that the translation table of carbon-based life is highly redundant, so that the following codons TTA, TTG, CTT, CTC, CTA and CTG marked in green all code for the same amino acid leucine, and therefore, could be replaced in a gene by any of these six codons without changing the gene at all.

Based upon the above analysis, it would seem that producing useful software source code, or useful DNA genes, should be nearly a slam dunk because there appears to be nearly an infinite number of ways to produce the same useful executable file, or the same useful protein, from a phenotypic perspective. But for anybody who has ever tried to write software, that is clearly not the case, and by inference, it cannot be the case for genes either. The real problem with both genes and source code comes when you try to make some random substitutions, additions or deletions to an already functioning gene or piece of source code. For example, if you take the source code file for an apparently bug-free program and scramble it with some random substitutions, additions or deletions, the odds are that you will end up with something that no longer works. You end up with a low-information high-entropy mess. That is because the odds of creating an even better version of the program by means of random substitutions, additions or deletions are quite low, while the odds of turning it into a mess are quite high. There are simply too many ways of messing up the program to win at that game, and the same goes for making random substitutions, additions or deletions to genes as well. This is simply the effects of the second law of thermodynamics in action. The second law of thermodynamics constantly degrades low-entropy useful information into high-entropy useless information - see: The Demon of Software for more on that.

For example, I first started programming software back in 1972 as a physics major at the University of Illinois in Urbana, and this reminds me of an incident from 1973, when I and two other geophysics graduate students at the University of Wisconsin in Madison tried to convert a 100 line Fortran program for an FFT (Fast Fourier Transform), which we had found listed in a textbook, into a Basic program that we could run on our DEC PDP/8e minicomputer. Since Fortran and Basic were so similar, we figured that converting the program would be a cinch, and we confidently started to punch the program into the teletype machine connected to the DEC PDP/8e minicomputer, using the line editor that came with the minicomputer. We were all novice programmers at the time, each with just one class in Fortran programming under our belts, but back in those days that made us all instant experts in the new science of computer science in the view of our elderly professors. So being rather naïve, we thought this would be a relatively easy task because all we had to do was to translate each line of Fortran source code into a corresponding line of Basic source code. We then eagerly plunged into programming off the top of our heads. However, much to our surprise, our very first Basic FFT program abended on its very first run! I mean, all we had to do was to make some slight syntax changes, like asking an Englishman for the location of the nearest petrol station while motoring through the English countryside. Finally, after about 6 hours of intense and very frustrating labor, we finally had our Basic FFT program spitting out the correct answers. That early adventure with computers actually made me long for my old trusty sliderule! You see, that was the very first time I had ever actually tried to get a computer to do something that I really wanted it to do, rather than just completing one of those pesky assignments in my Fortran class at the University of Illinois. I guess that I had mistakenly figured that the programs assigned in my Fortran class were especially tricky because the professor was just trying to winnow out the real losers, so that they would not continue on in computer science! I was pretty young back then.

As outlined in The Fundamental Problem of Software, it turns out that we young graduate students, like all IT professionals throughout history, had been fighting the second law of thermodynamics in a highly nonlinear Universe. The second law guarantees that for any valid FFT program, there are nearly an infinite number of matching invalid FFT programs, so the odds are that whenever you make a small change to some functioning software source code, in hopes of improving it, you will instead end up with some software that no longer works at all! That is because the Universe is highly nonlinear in nature, so small erroneous changes in computer source code usually produce catastrophic behaviors in the phenotypes of the executable files that they generate - see Software Chaos for more on that. Similarly, the same goes for genes that produce functioning proteins. Small changes to those genes are not likely to produce proteins that are even better at what they do. Instead, small changes to genes will likely produce proteins that are totally nonfunctional, and since the Universe is highly nonlinear in nature, those nonfunctional proteins are likely to be lethal.

Figure 11 – Some graduate students huddled around a DEC PDP-8/e minicomputer entering program source code into the machine. Notice the teletype machines in the foreground on the left. Our machine cost about $30,000 in 1973 dollars ($166,000 in 2017 dollars) and was about the size of a large side-by-side refrigerator, with 32K of magnetic core memory. I just checked, and you can now buy a laptop online with 4 GB of memory (131,072 times as much memory) for $179.99.

So how do you deal with such a quandary when trying to produce innovative computer software or genes? Well, back in 1973 when we finally had a functioning Basic FFT program, we quickly turned it into a Basic subroutine that could be reused by all. It turned out that nearly all of the graduate students in the Geophysics Department at the University of Wisconsin at Madison who were using the DEC PDP/8e minicomputer for their thesis work needed to run FFTs, so this Basic FFT subroutine was soon reused by them all. Of course, we were not the first to discover the IT trick of using reusable code. Anybody who has ever done some serious coding soon discovers the value of building up a personal library of reusable code to draw upon when writing new software. The easiest way to reuse computer source code is to simply "copy/paste" sections of code into the source code that you are working on, but that was not so easy back in the 1950s and 1960s because people were programming on punch cards.

Figure 12 - Each card could hold a maximum of 80 bytes. Normally, one line of code was punched onto each card.

Figure 13 - The cards for a program were held together into a deck with a rubber band, or for very large programs, the deck was held in a special cardboard box that originally housed blank cards. Many times the data cards for a run followed the cards containing the source code for a program. The program was compiled and linked in two steps of the run and then the generated executable file processed the data cards that followed in the deck.

Figure 14 - To run a job, the cards in a deck were fed into a card reader, as shown on the left above, to be compiled, linked, and executed by a million dollar mainframe computer with a clock speed of about 750 KHz and about 1 MB of memory.

Figure 15 - The cards were cut on an IBM 029 keypunch machine, like the one I learned to program Fortran on back in 1972 at the University of Illinois. The IBM 029 keypunch machine did have a way of duplicating a card by feeding the card to be duplicated into the machine, while at the same time, a new blank card was registered in the machine ready for punching, and the information on the first card was then punched on to the second card. This was a very slow and time consuming way to copy reusable code from one program to another. But there was another machine that could do the same thing for a whole deck of cards all at once, and that machine was much more useful in duplicating existing cards that could then be spliced into the card deck that you were working on.

Take note that the IBM 029 keypunch machine that was used to punch cards did allow you to copy one card at a time, but that was a rather slow way to copy cards for reuse. There was, however, another machine that could read an entire card deck all at once and punch out a duplicate card deck of the original in one shot. That machine made it much easier to copy the cards from an old card deck and splice the copied cards into the card deck that you were working on. Even so, that was a fairly clumsy way of using reusable code, so trying to use reusable code back during the unstructured prokaryotic days of IT prior to 1972 was difficult at best. Also, the chaotic nature of the unstructured prokaryotic "spaghetti code" of the 1950s and 1960s made it difficult to splice in reusable code on cards.

This all changed in the early 1970s with the rise of structured eukaryotic source code that divided functions up amongst a set of subroutines, or organelles, and the arrival of the IBM 3278 terminal and the IBM ISPF screen editor on TSO in the late 1970s that eliminated the need to program on punch cards. One of the chief characteristics of structured programming was the use of "top down" programming, where programs began execution in a mainline routine that then called many other subroutines. The purpose of the mainline routine was to perform simple high-level logic that called the subordinate subroutines in a fashion that was easy to follow. This structured programming technique made it much easier to maintain and enhance software by simply calling subroutines from the mainline routine in the logical manner required to perform the needed tasks, like assembling Lego building blocks into different patterns that produced an overall new structure. The structured approach to software also made it much easier to reuse software. All that was needed was to create a subroutine library of reusable source code that was already compiled. A mainline program that made calls to the subroutines of the subroutine library was compiled as before. The machine code for the previously compiled subroutines was then added to the resulting executable file by a linkage editor. This made it much easier for structured eukaryotic programs to use reusable code by simply putting the software "conserved core processes" into already compiled subroutine libraries. The ISPF screen editor running under TSO on IBM 3278 terminals also made it much easier to reuse source code because now many lines of source code could be simply copied from one program file to another, with the files stored on disk drives, rather than on punch cards or magnetic tape.
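The "top down" structured style is easiest to see in a small example. The sketch below is written in Java for readability, rather than in the PL/I, COBOL or Fortran of the 1970s, and the payroll subroutine names are invented purely for illustration, but the shape is the same: a short mainline that simply calls reusable subroutines in a logical order, like snapping together Lego blocks.

public class PayrollMainline {
    public static void main(String[] args) {
        // The mainline only expresses the high-level logic
        readEmployeeRecords();
        calculateGrossPay();
        calculateDeductions();
        printPaychecks();
    }

    // In the 1970s these would have been separately compiled subroutines in a
    // subroutine library, pulled into the executable file by the linkage editor
    static void readEmployeeRecords() { System.out.println("Reading employee records..."); }
    static void calculateGrossPay()   { System.out.println("Calculating gross pay..."); }
    static void calculateDeductions() { System.out.println("Calculating deductions..."); }
    static void printPaychecks()      { System.out.println("Printing paychecks..."); }
}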

Figure 16 - The IBM ISPF full screen editor ran on IBM 3278 terminals connected to IBM mainframes in the late 1970s. ISPF was also a screen-based interface to TSO (Time Sharing Option) that allowed programmers to do things like copy files and submit batch jobs. ISPF and TSO running on IBM mainframes allowed programmers to easily reuse source code by doing copy/paste operations with the screen editor from one source code file to another. By the way, ISPF and TSO are still used today on IBM mainframe computers, and are fine examples of the many "conserved core processes" to be found in software.

The strange thing is that most of the core fundamental processes of computer data processing were developed back in the 1950s, such as sort and merge algorithms, during the unstructured prokaryotic period, but as we have seen they could not be easily reused because of technical difficulties that were not alleviated until the structured eukaryotic period of the early 1970s. Similarly, many of the fundamental "conserved core processes" of carbon-based life were also developed during the first 2.5 billion years of the early Earth by the prokaryotic bacteria and archaea. It became much easier for these "conserved core processes" to be reused when the eukaryotic cell architecture arose about 2 billion years ago.

Figure 17 – The Quicksort algorithm was developed by Tony Hoare in 1959 and applies a prescribed sequence of actions to an input list of unsorted numbers. It very efficiently outputs a sorted list of numbers, no matter what unsorted input it operates upon.
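For readers who cannot make out Figure 17, here is a minimal Java sketch of the Quicksort idea. Tony Hoare's original 1959 formulation differs in its details, but the recursive partition-and-conquer shape is the same, and the routine works for any unsorted input list of numbers:

import java.util.Arrays;

public class QuickSort {
    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                 // zero or one element is already sorted
        int pivot = a[(lo + hi) / 2];
        int i = lo, j = hi;
        while (i <= j) {                      // partition around the pivot
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++; j--;
            }
        }
        quicksort(a, lo, j);                  // recursively sort each partition
        quicksort(a, i, hi);
    }

    public static void main(String[] args) {
        int[] numbers = { 33, 7, 91, 2, 58, 7, 14 };
        quicksort(numbers, 0, numbers.length - 1);
        System.out.println(Arrays.toString(numbers));
    }
}

These days, of course, nobody has to write this by hand; a single call to a class library sort method reuses an already tuned version of the same "conserved core process".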

Multicellular Organization and the Major Impact of Reusable Code
But as outlined in the theory of facilitated variation, the real impact of reusable code was not fully felt until carbon-based life stumbled upon the revolutionary architecture of multicellular organization, and the same was true in the evolutionary history of software. In software, multicellular organization is obtained through the use of object-oriented programming languages. Some investigation reveals that object-oriented programming has actually been around since 1962, but it did not catch on at first. In the late 1980s, the use of the very first significant object-oriented programming language, known as C++, started to appear in corporate IT, but object-oriented programming really did not become significant in IT until 1995, when both Java and the Internet Revolution arrived at the same time. The key idea in object-oriented programming is naturally the concept of an object. An object is simply a cell. Object-oriented languages use the concept of a Class, which is a set of instructions for building an object (cell) of a particular cell type in the memory of a computer. Depending upon whom you cite, there are several hundred different cell types in the human body, but in IT we generally use many thousands of cell types or Classes in commercial software. For a brief overview of these concepts go to the webpage below and follow the links by clicking on them.

Lesson: Object-Oriented Programming Concepts
http://docs.oracle.com/javase/tutorial/java/concepts/index.html

A Class defines the data that an object stores in memory and also the methods that operate upon the object data. Remember, an object is simply a cell. Methods are like biochemical pathways that consist of many steps or lines of code. A public method is a biochemical pathway that can be invoked by sending a message to a particular object, like using a ligand molecule secreted from one object to bind to the membrane receptors on another object. This binding of a ligand to a public method of an object can then trigger a cascade of private internal methods within an object or cell.

Figure 18 – A Class contains the instructions for building an object in the memory of a computer and basically defines the cell type of an object. The Class defines the data that an object stores in memory and also the methods that can operate upon the object data.

Figure 19 – Above is an example of a Bicycle object. The Bicycle object has three private data elements - speed in mph, cadence in rpm, and a gear number. These data elements define the state of a Bicycle object. The Bicycle object also has three public methods – changeGears, applyBrakes, and changeCadence that can be used to change the values of the Bicycle object’s internal data elements. Notice that the code in the object methods is highly structured and uses code indentation to clarify the logic.

Figure 20 – Above is some very simple Java code for a Bicycle Class. Real Class files have many data elements and methods and are usually hundreds of lines of code in length.
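For those who cannot view Figure 20, here is a minimal Java sketch along the lines of the Bicycle Class described above; the exact code in Figure 20 may differ, and the small printState method and main method are just additions of mine so that the sketch can be compiled and run on its own:

public class Bicycle {
    // Private data elements that define the state of a Bicycle object
    private int speed   = 0;   // mph
    private int cadence = 0;   // rpm
    private int gear    = 1;   // gear number

    // Public methods are like membrane receptors that other objects can call
    public void changeCadence(int newValue) {
        cadence = newValue;
    }

    public void changeGears(int newValue) {
        gear = newValue;
    }

    public void applyBrakes(int decrement) {
        speed = speed - decrement;
        if (speed < 0) speed = 0;
    }

    public void printState() {
        System.out.println("speed: " + speed + " mph, cadence: " + cadence +
                           " rpm, gear: " + gear);
    }

    public static void main(String[] args) {
        Bicycle bike = new Bicycle();     // instantiate one Bicycle object
        bike.changeCadence(60);
        bike.changeGears(3);
        bike.applyBrakes(0);
        bike.printState();
    }
}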

Figure 21 – Many different objects can be created from a single Class just as many cells can be created from a single cell type. The above List objects are created by instantiating the List Class three times and each List object contains a unique list of numbers. The individual List objects have public methods to insert or remove numbers from the objects and also a private internal sort method that could be called whenever the public insert or remove methods are called. The private internal sort method automatically sorts the numbers in the List object whenever a number is added or removed from the object.
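A minimal Java sketch of such a List Class, again my own illustration rather than the exact code behind Figure 21, shows how a public insert or remove method can trigger a cascade that includes a private internal sort method, much like a ligand binding to a membrane receptor sets off internal biochemical pathways:

import java.util.ArrayList;
import java.util.Collections;

public class NumberList {
    // Private internal state of each NumberList object
    private final ArrayList<Integer> numbers = new ArrayList<>();

    // Public method - other objects "bind" to it by calling it with a value
    public void insert(int value) {
        numbers.add(value);
        sort();                      // triggers a cascade of private internal work
    }

    public void remove(int value) {
        numbers.remove(Integer.valueOf(value));
        sort();
    }

    // Private internal method that callers never see directly
    private void sort() {
        Collections.sort(numbers);
    }

    @Override
    public String toString() {
        return numbers.toString();
    }

    public static void main(String[] args) {
        // Instantiate the NumberList Class three times to get three distinct objects
        NumberList a = new NumberList();
        NumberList b = new NumberList();
        NumberList c = new NumberList();
        a.insert(5); a.insert(1); a.insert(9);
        b.insert(42);
        c.insert(3); c.insert(2);
        System.out.println(a + " " + b + " " + c);
    }
}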

Figure 22 – Objects communicate with each other by sending messages. Really one object calls the exposed public methods of another object and passes some data to the object it calls, like one cell secreting a ligand molecule that then plugs into a membrane receptor on another cell.

Figure 23 – In a growing embryo the cells communicate with each other by sending out ligand molecules called morphogens, or paracrine factors, that bind to the membrane receptors on other cells.

Figure 24 – Calling a public method of an object can initiate the execution of a cascade of private internal methods within the object. Similarly, when a paracrine factor molecule plugs into a receptor on the surface of a cell, it can initiate a cascade of internal biochemical pathways. In the above figure an Ag protein plugs into a BCR receptor and initiates a cascade of biochemical pathways or methods within a cell.

When a high-volume corporate website, consisting of many millions of lines of code running on hundreds of servers, starts up and begins taking traffic, billions of objects (cells) begin to be instantiated in the memory of the servers in a matter of minutes, and then begin to exchange messages with each other in order to perform the functions of the website. Essentially, when the website boots up, it quickly grows to a mature adult through a period of very rapid embryonic growth and differentiation, as billions of objects are created and differentiated to form the tissues of the website organism. These objects then begin exchanging messages with each other by calling public methods on other objects to invoke cascades of private internal methods which are then executed within the called objects - for more on that see Software Embryogenesis.

Object-Oriented Class Libraries
In addition to implementing a form of multicellular organization in software, the object-oriented programming languages, like C++ and Java, brought with them the concept of a class library. A class library consists of the source code for a large number of reusable Classes for a multitude of objects. For example, for a complete list of the reusable objects in the Java programming language see:

Java Platform, Standard Edition 7
API Specification
https://docs.oracle.com/javase/7/docs/api/

The "All Classes" pane in the lower left of the webpage lists all of the built-in classes of the Java programming language.

Figure 25 – All of the built-in Classes of the Java programming language inherit the data elements and methods of the undifferentiated stem cell Object.

The reusable classes in a class library all inherit the characteristics of the fundamental class Object. An Object object is like a stem cell that can differentiate into a large number of other kinds of objects. Below is a link to the Object Class and all of its reusable methods:

Class Object
https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html
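A quick way to see this stem cell behavior is to define a brand new class of our own. The Java sketch below uses an invented Gene class purely for illustration; without writing any extra code it already inherits Object's methods, like toString(), equals() and hashCode(), and it can "differentiate" by overriding one of them:

public class Gene {                          // implicitly: public class Gene extends Object
    private final String name;

    public Gene(String name) {
        this.name = name;
    }

    // Differentiate the inherited stem cell method toString() for our own purposes
    @Override
    public String toString() {
        return "Gene[" + name + "]";
    }

    public static void main(String[] args) {
        Gene g = new Gene("FOXP2");
        System.out.println(g);               // uses our overridden toString()
        System.out.println(g.hashCode());    // inherited unchanged from Object
        System.out.println(g.equals(g));     // inherited unchanged from Object, prints true
    }
}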

For example, a String object is a particular object type that inherits all of the methods of a general Object stem cell and can perform many operations on the string of characters it contains via the methods of a String object.



Figure 26 – A String object contains a number of characters in a string. A String object inherits all of the general methods of the Object Class, and contains many additional methods, like finding a substring in the characters stored in the String object.

The specifics of a String object can be found at:

Class String
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html

Take a look at the Method Summary of the String class. It contains about 50 reusable code methods that can perform operations on the string of characters in a String object. Back in the unstructured prokaryotic times of the 1950s and 1960s, the source code to perform those operations was reprogrammed over and over because there was no easy way to reuse source code in those days. But in the multicellular object-oriented period, the use of extensive class libraries allowed programmers to easily assemble the "conserved core processes" that had been developed back in the unstructured prokaryotic source code of the 1950s and 1960s into complex software, by simply instantiating objects and making calls to the methods of the objects that they needed. Thus, the new multicellular object-oriented source code, which created and extended the classes found within the already existing class library, took on more of a regulatory nature than the source code of old. This new multicellular object-oriented source code carried out its required logical operations by simply instantiating certain objects in the class library of reusable code, and then choosing which methods needed to be performed on those objects. Similarly, in the theory of facilitated variation we find that most of the "conserved core processes" of carbon-based life come from pre-Cambrian times, before complex multicellular organisms came to be. For example, the authors point out that 79% of the genes of a mouse come from pre-Cambrian times. So only 21% of the mouse genes, primarily those genes that regulate the expression of the other 79% of the genes from pre-Cambrian times, manage to build a mouse from a finite number of "conserved core processes".
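To see how little "regulatory" code is needed once a rich class library is available, consider this minimal Java sketch of my own; every line simply instantiates built-in library classes and calls their reusable methods, none of which the programmer had to write (it assumes a Java 9 or later JVM for the List.of convenience method):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RegulatoryCode {
    public static void main(String[] args) {
        // Reuse the String class's built-in methods instead of rewriting them
        String message = "Reusable code is the conserved core process of IT";
        System.out.println(message.toUpperCase());
        System.out.println("Position of 'core': " + message.indexOf("core"));
        System.out.println("Substring: " + message.substring(0, 13));

        // Reuse the collection classes and their sort routine as well
        List<String> waves = new ArrayList<>(
            List.of("Memes", "Software", "RNA", "DNA", "Metabolic pathways"));
        Collections.sort(waves);
        System.out.println(waves);
    }
}

The few lines above are almost entirely regulatory in nature; the heavy lifting is done by the "conserved core processes" already compiled into the class library.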

Facilitated Variation is Not a Form of Intelligent Design
While preparing this posting, I noticed on the Internet that some see the theory of facilitated variation as a vindication of the concept of intelligent design, but that is hardly the case. The same misunderstanding can arise from the biological findings of softwarephysics. Some tend to dismiss the biological findings of softwarephysics because software is currently a product of the human mind, while biological life is not a product of intelligent design. Granted, biological life is not a product of intelligent design, but neither is the human mind. The human mind and biological life are both the result of natural processes at work over very long periods of time. This objection simply stems from the fact that we are all still, for the most part, self-deluded Cartesian dualists at heart, with seemingly a little “Me” running around within our heads that just happens to have the ability to write software and to do other challenging things. Thus, most human beings do not think of themselves as part of the natural world. Instead, they think of themselves, and others, as immaterial spirits temporarily haunting a body, and when that body dies the immaterial spirit lives on. In this view, human beings are not part of the natural world. Instead, they are part of the supernatural. But in softwarephysics we maintain that the human mind is a product of natural processes in action, and so is the software that it produces. For more on that see The Ghost in the Machine the Grand Illusion of Consciousness.

Still, I realize that there might be some hesitation to pursue this line of thought because it might be construed by some as an advocacy of intelligent design, but that is hardly the case. The evolution of software over the past 76 years has essentially been a matter of Darwinian inheritance, innovation and natural selection converging upon similar solutions to that of biological life. For example, it took the IT community about 60 years of trial and error to finally stumble upon an architecture similar to that of complex multicellular life that we call SOA – Service Oriented Architecture. The IT community could have easily discovered SOA back in the 1960s if it had adopted a biological approach to software and intelligently designed software architecture to match that of the biosphere. Instead, the worldwide IT architecture we see today essentially evolved on its own because nobody really sat back and designed this very complex worldwide software architecture; it just sort of evolved on its own through small incremental changes brought on by many millions of independently acting programmers through a process of trial and error. When programmers write code, they always take some old existing code first and then modify it slightly by making a few changes. Then they add a few additional new lines of code, and test the modified code to see how far they have come. Usually, the code does not work on the first attempt because of the second law of thermodynamics, so they then try to fix the code and try again. This happens over and over, until the programmer finally has a good snippet of new code. Thus, new code comes into existence through the Darwinian mechanisms of inheritance coupled with innovation and natural selection - for more on that see How Software Evolves.

Some might object that this coding process of software is actually a form of intelligent design, but that is not the case. It is important to differentiate between intelligent selection and intelligent design. In softwarephysics we extend the concept of natural selection to include all selection processes that are not supernatural in nature, so for me, intelligent selection is just another form of natural selection. This is really nothing new. Predators and prey constantly make “intelligent” decisions about what to pursue and what to evade, even if those “intelligent” decisions are only made with the benefit of a few interconnected neurons or molecules. So in this view, the selection decisions that a programmer makes after each iteration of working on some new code really are a form of natural selection. After all, programmers are just DNA survival machines with minds infected with memes for writing software, and the selection processes that the human mind undergoes while writing software are just as natural as the Sun drying out worms on a sidewalk or a cheetah deciding upon which gazelle in a herd to pursue.

For example, when IT professionals slowly evolved our current $10 trillion worldwide IT architecture over the past 2.4 billion seconds, they certainly did not do so with the teleological intent of creating a simulation of the evolution of the biosphere. Instead, like most organisms in the biosphere, these IT professionals were simply trying to survive just one more day in the frantic world of corporate IT. It is hard to convey the daily mayhem and turmoil of corporate IT to outsiders. When I first hit the floor of Amoco’s IT department back in 1979, I was in total shock, but I quickly realized that all IT jobs essentially boiled down to simply pushing buttons. All you had to do was to push the right buttons, in the right sequence, at the right time, and with zero errors. How hard could that be? Well, it turned out to be very difficult indeed, and in response I began to subconsciously work on softwarephysics to try to figure out why this job was so hard, and how I could dig myself out of the mess that I had gotten myself into. After a while, it dawned on me that the fundamental problem was the second law of thermodynamics operating in a nonlinear simulated universe. The second law made it very difficult to push the right buttons in the right sequence and at the right time because there were so many erroneous combinations of button pushes. Writing and maintaining software was like looking for a needle in a huge utility phase space. There just were nearly an infinite number of ways of pushing the buttons “wrong”. The other problem was that we were working in a very nonlinear utility phase space, meaning that pushing just one button incorrectly usually brought everything crashing down. Next, I slowly began to think of pushing the correct buttons in the correct sequence as stringing together the correct atoms into the correct sequence to make molecules in chemical reactions that could do things. I also knew that living things were really great at doing that. Living things apparently overcame the second law of thermodynamics by dumping entropy into heat as they built low entropy complex molecules from high entropy simple molecules and atoms. I then began to think of each line of code that I wrote as a step in a biochemical pathway. The variables were like organic molecules composed of characters or “atoms” and the operators were like chemical reactions between the molecules in the line of code. The logic in several lines of code was the same thing as the logic found in several steps of a biochemical pathway, and a complete function was the equivalent of a full-fledged biochemical pathway in itself. For more on that see Some Thoughts on the Origin of Softwarephysics and Its Application Beyond IT and SoftwareChemistry.

Conclusion
The key finding of softwarephysics is that it is all about self-replicating information:

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Basically, we have seen five waves of self-replicating information come to dominate the Earth over the past four billion years:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is now rapidly becoming the dominant form of self-replicating information on the planet, and is having a major impact on mankind as it comes to predominance. For more on that see: A Brief History of Self-Replicating Information.

So like carbon-based life and software, memes are also forms of self-replicating information that heavily use the concept of reusable code found in the theory of facilitated variation. Many memes are just composites of a number of "conserved core memes" that are patched together by some regulatory memes into something that appears to be unique and can stand on its own. Just watch any romantic comedy, or listen to any political stump speech. For example, about 90% of this posting on reusable code is simply reusable text that I pulled from other softwarephysics postings, and patched together in a new sequence using a small amount of new regulatory text. It is just a facilitated variation on a theme, like Johannes Brahms' Variations on a Theme of Paganini (1863):

https://www.youtube.com/watch?v=1EIE78D0m1g&t=1s

or Rachmaninoff's Rhapsody on a Theme of Paganini (1934):

https://www.youtube.com/watch?v=HvKTPDg0IW0&t=1s

reusing Paganini's original Caprice No. 24 (1817):

https://www.youtube.com/watch?v=PZ307sM0t-0&t=1s

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
http://softwarephysics.blogspot.com/

Regards,
Steve Johnston